Apache Beam

A unified programming model for batch and streaming

Beam provides an advanced unified programming model, allowing you to implement batch and streaming data processing jobs that can run on any supported execution engine.

Unified

Use a single programming model for both batch and streaming use cases.

Portable

Execute pipelines on multiple execution environments, including Apache Flink, Apache Spark, and Google Cloud Dataflow.

Extensible

Write and share new SDKs, IO connectors, and transformation libraries.
Open Source
Beam is an Apache Software Foundation project, available under the Apache License, Version 2.0. Beam is an open source community, and contributions are appreciated! If you'd like to contribute, please see the Contribute section.
Overview
Apache Beam is a unified programming model you can use to create data processing pipelines. You start by building a program that defines the pipeline using one of the open source Beam SDKs. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Flink, Apache Spark, and Google Cloud Dataflow.
Documentation
If you'd like to use Beam for your data processing tasks, start with the Get Started section for an overview, quickstart, and examples. Then dive into the Documentation section for in-depth concepts and reference materials for the Beam Model, SDKs, and Runners.