We performed a comparison between Apache Hadoop and Snowflake based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The most valuable feature is the database."
"The scalability of Apache Hadoop is very good."
"Hadoop File System is compatible with almost all the query engines."
"What I like about Apache Hadoop is that it's for big data, in particular big data analysis, and it's the easier solution. I like the data processing feature for AI/ML use cases the most because some solutions allow me to collect data from relational databases, while Hadoop provides me with more options for newer technologies."
"What comes with the standard setup is what we mostly use, but Ambari is the most important."
"The performance is pretty good."
"I liked that Apache Hadoop was powerful, had a lot of tools, and the fact that it was free and community-developed."
"Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."
"It's ultra-fast at handling queries, which is what we find very convenient."
"The technical support is pretty good, particularly if you are a more technical user."
"I like the idea that you can assign roles and responsibilities, limiting access to data."
"It is a very well-distributed system. It has different data engines for different applications. Many applications can use different computational engines at the same time. In terms of data processing, the feeling was similar to working with a relational database but in a scalable way."
"The snapshot feature is good, the rollback feature is good and the interface is user-friendly."
"The syntax is advanced which reduces the time to write code."
"Data Science capabilities are the most valuable feature."
"The most valuable feature of Snowflake is it's an all-in-one data warehousing solution."
"The load optimization capabilities of the product are an area of concern where improvements are required."
"Hadoop's security could be better."
"The solution is very expensive."
"The price could be better. I think we would use it more, but the company didn't want to pay for it. Hortonworks doesn't exist anymore, and Cloudera killed the free version of Hadoop."
"Since it is an open-source product, there won't be much support."
"It could be more user-friendly."
"In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency."
"It requires a great deal of learning curve to understand. The overall Hadoop ecosystem has a large number of sub-products. There is ZooKeeper, and there are a whole lot of other things that are connected. In many cases, their functionalities are overlapping, and for a newcomer or our clients, it is very difficult to decide which of them to buy and which of them they don't really need. They require a consulting organization for it, which is good for organizations such as ours because that's what we do, but it is not easy for the end customers to gain so much knowledge and optimally use it."
"They should improve the reporting tools."
"The solution should offer an on-premises version also. We have some requirements where we would prefer to use it as a template."
"It would be better if they had a data profile tool that tells me where the gaps are in my time series data."
"Snowflake has to build more capabilities because they have only built very few adapters, but they're growing and they're building. They should provide provisions to collect ETL pipeline capabilities, reduce developer work, and make more rapid application development, rather than some customizations. There are very few options, but they are building. I hope they will build ETL rapid application development provisions with more variety."
"These days, they are pushing users towards the GUI or graphical version. However, I am more familiar with the classic version. I'd like to continue to work with it using the older approach."
"The cost is a bit high."
"There could be better ELT tools that are appropriate for Snowflake. We decided on Matillion and it seemed to be the only one. There need to be better choices, it would be great if Snowflake provided an ELT solution that people could use. Additionally, if there was a pure cloud-based ELT tool it would be useful."
"To ensure the proper functioning of Snowflake as an MDS, it relies heavily on other partner tools."
Apache Hadoop is ranked 5th in Data Warehouse with 33 reviews while Snowflake is ranked 1st in Data Warehouse with 92 reviews. Apache Hadoop is rated 7.8, while Snowflake is rated 8.4. The top reviewer of Apache Hadoop writes "Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge". On the other hand, the top reviewer of Snowflake writes "Good usability, good data sharing and elastic compute features, and requires less DBA involvement". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Teradata and BigQuery, whereas Snowflake is most compared with BigQuery, Azure Data Factory, Teradata, Vertica and Teradata Cloud Data Warehouse. See our Apache Hadoop vs. Snowflake report.
See our list of best Data Warehouse vendors and best Cloud Data Warehouse vendors.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
Apache Hadoop is for data lake use cases, but getting data out of Hadoop for meaningful analytics takes quite a bit of work, whether you use Spark, Hive, Presto, or something similar. The way I look at Snowflake and Hadoop is that they complement each other: you can use Hadoop for the data lake and Snowflake for the data warehouse. Depending on the size of the company, you can use Snowflake for the data lake use case too. Snowflake is SQL-friendly, and you don't need to jump through hoops to get data in and out of it.