We performed a comparison between Apache Spark and Cloudera Distribution for Hadoop based on real PeerSpot user reviews.
Find out in this report how the two Hadoop solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The product’s most valuable feature is the SQL tool. It enables us to create a database and publish it."
"The most valuable feature of Apache Spark is its flexibility."
"Spark helps us reduce startup time for our customers and gives a very high ROI in the medium term."
"The main feature that we find valuable is that it is very fast."
"The solution is scalable."
"With Spark, we parallelize our operations, efficiently accessing both historical and real-time data."
"The most valuable feature of this solution is its capacity for processing large amounts of data."
"It provides a scalable machine learning library."
"We also really like the Cloudera community. You can have any question and will have your answer within a few hours."
"We had a data warehouse before all the data. We can process a lot more data structures."
"Very good end-to-end security features."
"Customer service and support were able to fix whatever the issue was."
"We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there are a lot of things that need to improve. I believe they are working on that."
"I don't see any performance issues."
"The features I find most valuable is that the solution is that it is easy to install and to work with. It starts with the installation and from there on the management is very simple and centralized."
"Cloudera is a very manageable solution with good support."
"The solution must improve its performance."
"The migration of data between different versions could be improved."
"The setup I worked on was really complex."
"Apache Spark's GUI and scalability could be improved."
"Apart from the restrictions that come with its in-memory implementation. It has been improved significantly up to version 3.0, which is currently in use."
"I know there is always discussion about which language to write applications in and some people do love Scala. However, I don't like it."
"Apache Spark is very difficult to use. It would require a data engineer. It is not available for every engineer today because they need to understand the different concepts of Spark, which is very, very difficult and it is not easy to learn."
"When you first start using this solution, it is common to run into memory errors when you are dealing with large amounts of data."
"This is a very expensive solution."
"The security of this solution could be improved. There should also be a way to basically have a blockchain enabled storage with the HDFS."
"While the deployed product is generally functional, there are instances where it presents difficulties."
"The procedure for operations could be simplified."
"The user infrastructure and user interface needs to be improved, as well as the performance. The GUI needs to be better."
"The solution is not fit for on-premise distributions."
"They should focus on upgrading their technical capabilities in the market."
"Cloudera Distribution for Hadoop is not always completely stable in some cases, which can be a concern for big data solutions."
Apache Spark is ranked 1st in Hadoop with 60 reviews while Cloudera Distribution for Hadoop is ranked 2nd in Hadoop with 47 reviews. Apache Spark is rated 8.4, while Cloudera Distribution for Hadoop is rated 8.0. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". On the other hand, the top reviewer of Cloudera Distribution for Hadoop writes "Good end-to-end security features and we like that it's cloud independent". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA and Azure Stream Analytics, whereas Cloudera Distribution for Hadoop is most compared with Amazon EMR, HPE Ezmeral Data Fabric, Cassandra, ScyllaDB and MongoDB.
We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.