Enterprises are struggling to manage the many dimensions of ever-increasing data. This was one of the most discussed challenges at events such as Hadoop Summit in San Jose and Spark Summit in San Francisco this month. The discussions centered on the twin challenges of storing the data and turning it into meaningful information, as well as on how data can be leveraged in real time for decision making. Present IT readiness and market conditions make it possible to store huge volumes of data, but the problem lies in sorting and analyzing the data gathered from customers across multiple sources. A further challenge is concurrent analysis, which must cover not only historical data but also real-time data.
In this digital world, most customer interactions happen in real time and are short-lived. Advancements in streaming analytics and in-memory processing are now letting enterprises take advantage of real-time data for decision making and operational intelligence. Companies like IBM acted immediately by endorsing Apache Spark, which is an in-memory system. The company plans to adopt the Apache project by training about 3,500 researchers, data scientists and engineers to use Spark. This should step up the adoption of real-time analytics.
Doug Henschen, currently at Constellation Research, wrote in his blog about statements by IBM executives on integrating Spark across the company's analytics portfolio, and about the features of Spark, which cover data conversion, distribution, scheduling, and joining of the data, followed by its concurrent analysis. Spark has an advantage over Hadoop, which relies on MapReduce, and it is also able to host SQL queries. IBM further states that it will run its own analytics offerings, including SystemML, SPSS and IBM Streams. Henschen concluded that these features will be beneficial in the long run, with Spark supporting Scala, Java and Python. At Spark Summit, Amazon Web Services announced a free Spark service on Amazon Elastic MapReduce, while IBM announced plans to provide Spark services on Bluemix, z Systems and SoftLayer. It is expected that IBM's endorsement of Spark will mark a milestone for the company in handling big data.