We performed a comparison between Amazon EMR and Azure Data Factory based on real PeerSpot user reviews.
Find out in this report how the two Cloud Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"When we grade big jobs from on-prem to the cloud, we do it in EMR with Spark."
"The initial setup is straightforward."
"This is the best tool for hosts and it's really flexible and scalable."
"It has a variety of options and support systems."
"One of the valuable features about this solution is that it's managed services, so it's pretty stable, and scalable as much as you wish. It has all the necessary distributions. With some additional work, it's also possible to change to a Spark version with the latest version of EMR. It also has Hudi, so we are leveraging Apache Hudi on EMR for change data capture, so then it comes out-of-the-box in EMR."
"In Amazon EMR it is easy to rebuild anything, easy to upgrade and has good fault tolerance."
"We are using applications, such as Splunk, Livy, Hadoop, and Spark. We are using all of these applications in Amazon EMR and they're helping us a lot."
"The initial setup is pretty straightforward."
"I think it makes it very easy to understand what data flow is and so on. You can leverage the user interface to do the different data flows, and it's great. I like it a lot."
"The solution has a good interface and the integration with GitHub is very useful."
"UI is easy to navigate and I can retrieve VTL code without knowing in-depth coding languages."
"The trigger scheduling options are decently robust."
"The flexibility that Azure Data Factory offers is great."
"One of the most valuable features of Azure Data Factory is the drag-and-drop interface. This helps with workflow management because we can just drag any tables or data sources we need. Because of how easy it is to drag and drop, we can deliver things very quickly. It's more customizable through visual effect."
"The user interface is very good. It makes me feel very comfortable when I am using the tool."
"It is very modular. It works well. We've used Data Factory and then made calls to libraries outside of Data Factory to do things that it wasn't optimized to do, and it worked really well. It is obviously proprietary in regards to Microsoft created it, but it is pretty easy and direct to bring in outside capabilities into Data Factory."
"There is room for improvement in pricing."
"There were times where they would release new versions and it seemed to end up breaking old versions, which is very strange."
"The dashboard management could be better. Right now, it's lacking a bit."
"As people are shifting from legacy solutions to other technologies, Amazon EMR needs to add more features that give more flexibility in managing user data."
"The product must add some of the latest technologies to provide more flexibility to the users."
"Modules and strategies should be better handled and notified early in advance."
"Amazon EMR can improve by adding some features, such as metastore services and HiveServer2. Additionally, the user interface could be better, similar to what Apache service provides, cross-platform services."
"The problem for us is it starts very slow."
"The solution needs to be more connectable to its own services."
"You cannot use a custom data delimiter, which means that you have problems receiving data in certain formats."
"Areas for improvement in Azure Data Factory include connectivity and integration. When you use integration runtime, whenever there's a failure, the backup process in Azure Data Factory takes time, so this is another area for improvement."
"There are limitations when processing more than one GD file."
"Currently, our company requires a monitoring tool, and that isn't available in Azure Data Factory."
"The performance could be better. It would be better if Azure Data Factory could handle a higher load. I have heard that it can get overloaded, and it can't handle it."
"Occasionally, there are problems within Microsoft itself that impacts the Data Factory and causes it to fail."
"There is room for improvement primarily in its streaming capabilities. For structured streaming and machine learning model implementation within an ETL process, it lags behind tools like Informatica."
Amazon EMR is ranked 8th in Cloud Data Warehouse with 20 reviews, while Azure Data Factory is ranked 3rd in Cloud Data Warehouse with 81 reviews. Amazon EMR is rated 7.8, while Azure Data Factory is rated 8.0. The top reviewer of Amazon EMR writes "Provides efficient data processing features and has good scalability". On the other hand, the top reviewer of Azure Data Factory writes "The data factory agent is quite good but pricing needs to be more transparent". Amazon EMR is most compared with Snowflake, Cloudera Distribution for Hadoop, Amazon Redshift, Apache Spark and Microsoft Azure Synapse Analytics, whereas Azure Data Factory is most compared with Informatica PowerCenter, Informatica Cloud Data Integration, Alteryx Designer, Snowflake and IBM InfoSphere DataStage. See our Amazon EMR vs. Azure Data Factory report.
See our list of best Cloud Data Warehouse vendors.
We monitor all Cloud Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.