Google Cloud Dataflow is #19 in Top 23 Big Data platforms

Last updated: January 02, 2020
Build, deploy, and run data processing pipelines that scale to solve your key business challenges. Google Cloud Dataflow enables reliable execution of large-scale data processing scenarios such as ETL, analytics, real-time computation, and process orchestration.

Positions in ratings

#19 in Top 23 Big Data platforms


The best alternatives to Google Cloud Dataflow are: Apache Spark, Apache Kafka, Google Cloud Dataproc, Amazon EMR

Latest news about Google Cloud Dataflow

2015. Google partners with Cloudera to bring Cloud Dataflow to Apache Spark

Google announced that it has teamed up with the Hadoop specialists at Cloudera to bring its Cloud Dataflow programming model to the Apache Spark data processing engine. With Google Cloud Dataflow, developers can create and monitor data processing pipelines without having to worry about the underlying data processing cluster. As Google likes to stress, the service evolved out of the company’s internal tools for processing large datasets at Internet scale. Not all data processing tasks are the same, though, and sometimes you may want to run a task in the cloud, on premises, or on a different processing engine. With Cloud Dataflow, at least in its ideal state, data analysts will be able to use the same system to create their pipelines, no matter which underlying architecture they want to run them on.
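
The idea of defining a pipeline once and choosing the execution engine later can be illustrated with a small, self-contained Python sketch. This is not the Cloud Dataflow SDK or its API: the names Pipeline, LocalRunner, and ThreadedRunner are hypothetical and exist only to show how a pipeline definition might stay separate from the engine that runs it.

```python
from concurrent.futures import ThreadPoolExecutor


class Pipeline:
    """Hypothetical engine-agnostic pipeline: an ordered list of per-element steps."""

    def __init__(self):
        self.steps = []

    def map(self, fn):
        # Record the step only; nothing is executed until a runner is chosen.
        self.steps.append(fn)
        return self


def apply_steps(pipeline, element):
    """Push one element through every step of the pipeline, in order."""
    for step in pipeline.steps:
        element = step(element)
    return element


class LocalRunner:
    """Hypothetical runner: processes elements one by one in the current thread."""

    def run(self, pipeline, data):
        return [apply_steps(pipeline, item) for item in data]


class ThreadedRunner:
    """Hypothetical runner: processes elements on a thread pool (a stand-in for a cluster)."""

    def __init__(self, workers=4):
        self.workers = workers

    def run(self, pipeline, data):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # Executor.map returns results in input order.
            return list(pool.map(lambda item: apply_steps(pipeline, item), data))


# The pipeline is defined once, with no reference to any engine.
pipeline = Pipeline().map(str.strip).map(str.upper).map(len)
records = [" spark ", "dataflow", " hadoop"]

# The same definition runs unchanged on either runner.
local_result = LocalRunner().run(pipeline, records)
threaded_result = ThreadedRunner().run(pipeline, records)
print(local_result == threaded_result)
```

In this sketch the pipeline code never mentions how or where it runs, so swapping the runner does not require rewriting the pipeline. That is the property the announcement describes, shown here only in miniature.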