Name: K MapReduce: A Scalable Tool for Data-Processing and Search/Ensemble Applications on Large-Scale Supercomputers
Start: 2013-09-24T16:30:00-0400
End: 2013-09-24T16:55:00-0400

The attendees list includes all authors (even though they may not be attending), speakers, artists, etc.

View the full conference website here: IEEE Cluster 2013 Conference

KMR (K MapReduce) is a high-performance MapReduce system for MPI environments, targeting large-scale supercomputers such as the K computer. Its objectives are to ease the programming of data-processing applications and to achieve efficiency by exploiting the large amount of memory available on large-scale supercomputers. KMR shuffles key-value pairs in a highly scalable way using log-step message-passing algorithms. Multi-threaded mapping and reducing allow KMR to achieve further efficiency on modern multi-core machines. Sorting is used extensively inside shuffling and reducing, and KMR optimizes it by sorting on fixed-length packed keys instead of the raw variable-length keys. Besides the MapReduce operations, KMR provides routines for collective file reading with affinity-aware optimizations. This paper presents the results of experimental performance studies of KMR on the K computer. Our affinity-aware file loading improves performance by about 42% over a non-optimized implementation. We also show how KMR can be used to implement real-world scientific applications, including meta-genome search and replica-exchange molecular dynamics.
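The packed-key idea in the abstract can be illustrated with a small, single-process sketch. This is not KMR's API or implementation (KMR runs on MPI). The names `pack_key`, `sort_pairs`, `reduce_sorted`, and the 8-byte `PACKED_WIDTH` are hypothetical choices made for illustration. The sketch assumes that a fixed-length packed key is formed from a key's leading bytes and that the raw key is consulted only to break ties.

```python
import struct

# Assumed width: how many leading bytes of a key are folded into one integer.
PACKED_WIDTH = 8


def pack_key(raw: bytes) -> int:
    """Pack the first PACKED_WIDTH bytes of a key into an unsigned integer.

    Short keys are zero-padded on the right. Big-endian packing keeps the
    integer order consistent with the bytewise order of the prefix.
    """
    prefix = raw[:PACKED_WIDTH].ljust(PACKED_WIDTH, b"\x00")
    return struct.unpack(">Q", prefix)[0]


def sort_pairs(pairs):
    """Sort (key, value) pairs by packed key, using the raw key only on ties."""
    return sorted(pairs, key=lambda kv: (pack_key(kv[0]), kv[0]))


def reduce_sorted(pairs, reducer):
    """Group adjacent equal keys in sorted pairs and call reducer(key, values)."""
    out = []
    group_key, group_vals = None, []
    for key, value in pairs:
        if group_vals and key != group_key:
            out.append(reducer(group_key, group_vals))
            group_vals = []
        group_key = key
        group_vals.append(value)
    if group_vals:
        out.append(reducer(group_key, group_vals))
    return out
```

Most comparisons are then settled by a single integer comparison, and the variable-length key is examined only when two packed prefixes are equal. Calling `reduce_sorted(sort_pairs(pairs), lambda k, vs: (k, sum(vs)))` on byte-string keys gives a word-count style reduction over the sorted pairs.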

Speakers

IEEE Cluster 13 Conference

Motohiko Matsuda

Naoya Maruyama

Shinichiro Takizawa
