13.6.2014

comSysto and MapR at TDWI Europe 2014

Map-Reducing Everywhere

This year’s TDWI Europe conference takes place in Munich from June 23rd to 25th. The conference is one of the major hubs for the Data Warehousing and Business Intelligence scene, and comSysto and our partner MapR are happy to be giving one of the talks.

On Tuesday, June 24th, our colleague and Big Data expert Carsten Hufe will speak alongside Michael Hausenblas, Chief Data Engineer EMEA at MapR. Their talk, called “Map-Reducing Überall” (Map-Reducing Everywhere), revolves around a Big Data web application that comSysto developed together with a telecommunications customer. The application is based on anonymised and aggregated data from a mobile network and uses Hadoop and the MapReduce programming paradigm.

During development, the team employed MapReduce in various ways: MapReduce in classical ETL processes, MapReduce for aggregation on AWS Elastic MapReduce, MapReduce in MongoDB, MapReduce in the statistical software R for analyzing data quality, and MapReduce for data import and index creation for the JumboDB database. JumboDB is a database that is optimised for read-heavy scenarios and that comSysto developed for the project.

The two will kick off their session with an introduction to HDFS and MapReduce, its role in Big Data processing, and the nature of “Map-Reducible” problems. They’ll highlight some of the important ecosystem components like Hive and Pig and discuss the implications of the arrival of YARN in Hadoop 2.0.
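For readers who are new to the paradigm: a problem fits MapReduce well when each input record can be turned into key/value pairs on its own (map) and all values for one key can then be combined independently of the other keys (reduce). The following is only a minimal single-machine sketch in Python, not code from the project or from the talk. The record layout (a cell ID plus a byte count) and all function names are assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: turn one input record into (key, value) pairs."""
    cell_id, bytes_used = record  # hypothetical record layout
    yield cell_id, bytes_used


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample data: (cell ID, bytes transferred)
records = [("cell-a", 120), ("cell-b", 30), ("cell-a", 80)]
totals = map_reduce(records)
```

Because every record is mapped independently and every key is reduced independently, a framework such as Hadoop can spread both phases across many machines. This sketch simply runs them one after the other in a single process.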

After that, they’ll take a deep dive into the various challenges the development team of the Big Data project faced. These include handling huge amounts of data that had to be queryable on demand and providing near real-time performance. The talk will explain how the team coped with these challenges and developed the new, read-optimised database JumboDB, and how they accomplished all this with limited hardware resources and budget.

Using the mobile network data as a motivating example, Carsten and Michael will also discuss the pros and cons of a unified programming model across different areas and report on successes as well as problems with MapReduce. You can find details about the talk (in German) here.

If you’re interested in the subject and would like to have more information, please don’t hesitate to contact us! And if you’re going to TDWI Europe, please feel free to visit us at the joint comSysto/MapR booth. We’re looking forward to seeing you there!