KMR (K MapReduce) is a high-performance MapReduce system for MPI environments, targeting large-scale supercomputers such as the K computer. Its objectives are to ease the programming of data-processing tasks and to achieve efficiency by exploiting the large amount of memory available on such machines. KMR shuffles key-value pairs in a highly scalable way using log-step message-passing algorithms. Multi-threaded mapping and reducing further improve efficiency on modern multi-core machines. Sorting is used extensively inside shuffling and reducing, and KMR optimizes it by comparing fixed-length packed keys instead of the raw variable-length keys. Besides the MapReduce operations, KMR provides routines for collective file reading with affinity-aware optimizations. This paper presents the results of experimental performance studies of KMR on the K computer. Our affinity-aware file loading improves performance by about 42% over a non-optimized implementation. We also show how KMR can be used to implement real-world scientific applications, including meta-genome search and replica-exchange molecular dynamics.
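
As a rough illustration of the packed-key idea mentioned above, the following Python sketch sorts key-value pairs by a fixed-length integer derived from each key's leading bytes and consults the raw variable-length key only to break ties. It is not KMR's implementation: the names `pack_key` and `sort_pairs`, the 8-byte width, and the zero padding are all assumptions made for this example.

```python
def pack_key(key: bytes, width: int = 8) -> int:
    """Pack the first `width` bytes of `key` into one unsigned integer.

    Hypothetical helper: short keys are zero-padded on the right, and the
    big-endian encoding keeps integer order consistent with byte order.
    """
    prefix = key[:width].ljust(width, b"\x00")
    return int.from_bytes(prefix, "big")


def sort_pairs(pairs, width: int = 8):
    """Sort (key, value) pairs by packed key, using the raw key only on ties."""
    # Most comparisons are decided by the fixed-length integer alone; the
    # variable-length key is compared only when two packed keys are equal.
    return sorted(pairs, key=lambda kv: (pack_key(kv[0], width), kv[0]))


pairs = [
    (b"banana", 1),
    (b"apple", 2),
    (b"applesauce-long", 3),
    (b"applesauce", 4),
    (b"a", 5),
]
ordered = sort_pairs(pairs)
```

Under these assumptions the result matches a plain sort on the raw keys, since a key that sorts earlier bytewise never has a larger packed value.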