We performed a comparison between Cloudera DataFlow and Databricks based on real PeerSpot user reviews.
Find out in this report how the two Streaming Analytics solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"This solution is very scalable and robust."
"The initial setup was not so difficult"
"DataFlow's performance is okay."
"What I like about Databricks is that it's one of the most popular platforms that give access to folks who are trying not just to do exploratory work on the data but also go ahead and build advanced modeling and machine learning on top of that."
"Databricks provides a consistent interface for data engineers to work with data in a consistent language on a single integrated platform for ingesting, processing, and serving data to the end user."
"Databricks gives us the ability to build a lakehouse framework and do everything implicit to this type of database structure. We also like the ability to stream events. Databricks covers a broad spectrum, from reporting and machine learning to streaming events. It's important for us to have all these features in one platform."
"The solution is an impressive tool for data migration and integration."
"The load distribution capabilities are good, and you can perform data processing tasks very quickly."
"Its lightweight and fast processing are valuable."
"We like that this solution can handle a wide variety and velocity of data engineering, either in batch mode or real-time."
"One of the features provides nice interactive clusters, or compute instances that you don't really need to manage often."
"Although their workflow is pretty neat, it still requires a lot of transformation coding; especially when it comes to Python and other demanding programming languages."
"It's an outdated legacy product that doesn't meet the needs of modern data analysts and scientists."
"It is not easy to use the R language. Though I don't know if it's possible, I believe it is possible, but it is not the best language for machine learning."
"The solution could improve by providing better automation capabilities. For example, working together with more of a DevOps approach, such as continuous integration."
"Anyone who doesn't know SQL may find the product difficult to work with."
"A lot of people are required to manage this solution."
"The solution has some scalability and integration limitations when consolidating legacy systems."
"The tool should improve its integration with other products."
"Databricks would have more collaborative features than it has. It should have some more customization for the jobs."
"Databricks has a lack of debuggers, and it would be good to see more components."
"I have seen better user interfaces, so that is something that can be improved."
Cloudera DataFlow is ranked 13th in Streaming Analytics with 3 reviews, while Databricks is ranked 2nd in Streaming Analytics with 78 reviews. Cloudera DataFlow is rated 6.6, while Databricks is rated 8.2. The top reviewer of Cloudera DataFlow writes "A scalable and robust platform for analyzing data". On the other hand, the top reviewer of Databricks writes "A nice interface with good features for turning off clusters to save on computing". Cloudera DataFlow is most compared with Confluent, Amazon MSK, Informatica Data Engineering Streaming, Hortonworks Data Platform, and Spring Cloud Data Flow, whereas Databricks is most compared with Amazon SageMaker, Informatica PowerCenter, Dataiku Data Science Studio, Microsoft Azure Machine Learning Studio, and Dremio.
We monitor all Streaming Analytics reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.