Syncsort simplifies big data integration with Hadoop and Spark 2.0
 
            
Syncsort, a global provider of Big Iron to Big Data solutions, announced new advancements in its industry-leading Big Data integration solution, DMX-h, that enable organisations to accelerate business objectives by speeding development, adapting to evolving data management requirements and leveraging rapid innovation in Big Data technology.
New, unmatched Integrated Workflow capabilities and Spark 2.0 integration dramatically simplify Hadoop and Spark application development, enabling organisations to extract maximum value from all their enterprise data assets.
“As Hadoop implementations continue to grow, with more diverse and complex use cases, and a constantly evolving Big Data technology stack, organisations require an increasingly efficient and flexible application development environment,” said Tendü Yoğurtçu, general manager of Syncsort’s Big Data business.
“By enhancing our single software environment with our new integrated workflow capability, we give customers an even simpler, more flexible way to create and manage their data pipelines. We also extend our design-once, deploy-anywhere architecture with support for Apache Spark 2.0, and make it easy for customers to take advantage of the benefits of Spark 2.0 and integrated workflow without spending time and resources redeveloping their jobs.”
Building an end-to-end data pipeline can be time-consuming and complicated, with various workloads executed on multiple compute frameworks, all of which need to be orchestrated and kept up to date. For example, an organisation might need to access a data warehouse or mainframe, run batch integration for large historical reference data in Hadoop MapReduce, and run streaming analytics and machine learning workflows with Apache Spark. Delays in development prevent business users from getting the insights they need for decision-making.
With Integrated Workflow, organisations can now manage varied workloads, such as batch ETL on very large repositories of historical data and referencing business rules during data ingest, in a single workflow.
The new feature greatly simplifies and speeds development of the entire data pipeline, from accessing critical enterprise data, to transforming that data, and ultimately analysing it for business insights.
Built into Syncsort DMX-h’s design-once, deploy-anywhere architecture, Integrated Workflow empowers developers to:
- Dramatically reduce development time and resources by writing jobs in one environment, such as a laptop, and running them anywhere, including MapReduce, Spark 1.x or Spark 2.0, on premises or in the cloud.
- Adopt new technologies at the pace that best suits their business, with the ability to run each workload on the compute framework best suited to that workload.
- Leverage existing skill sets by using a graphical interface to easily create and combine sophisticated workflows into one job, even if they run on different compute frameworks.
As a result, developers gain unparalleled simplicity and flexibility to adapt to changing workloads, allowing them to deliver faster time to insight while minimising development and opportunity costs.
Syncsort introduced Spark support in its last major release of DMX-h, allowing customers to take the same jobs initially designed for MapReduce and run them natively in Spark. With the new release, developers can now leverage the same capability to seamlessly take advantage of the enhancements made in Spark 2.0.
They can visually design data transformations once and run the jobs in MapReduce, Spark 1.x or Spark 2.0 simply by changing the compute framework. No rewriting, reconfiguring or recompiling is required.
