Metis is a multi-core MapReduce library.
All source files, except for those in the app directory, are subject to the LICENSE distributed with the Metis release.
Metis is tested on 64-bit Linux. While the previous version of Metis also works on Corey, this version may not.
Some example MapReduce applications are included. For a list:
$ ls app/*.c
foo.c wc.c wr.c ...
To build and run the example foo.c:
$ ./configure
$ make
$ obj/foo [args]
To link with a specific memory allocator:
$ ./configure --with-malloc=<jemalloc|flow>
$ make clean
$ make
Flow is our re-implementation of Streamflow, and may be open-sourced in the future.
The ./test/run_all.py script runs all the tests mentioned in the Metis technical report. To run the tests, download the data files into the top-level directory of the Metis source tree, unpack them, and execute the following command to generate the inputs for all applications:
$ make data_gen
As our previous work, "An Analysis of Linux Scalability to Many Cores", shows, Metis can take advantage of Linux super pages to reduce contention on page faults. To enable this feature, Metis currently relies on the flow allocator, which allocates memory from the OS in super pages.
Note that there was a scalability bottleneck in the Linux kernel's hugepage allocator. We have not yet checked whether Linux has fixed it. If you are interested, take a look at the patch in the linux-patches directory, which describes the problem and our 'fix'.
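For readers unfamiliar with super pages, the following is a minimal sketch of how a user-level allocator might request super-page-backed memory from Linux. It is not the flow allocator's code. The helper names (round_up_super, super_alloc) and the 2 MB super page size are assumptions made for illustration; only mmap and its MAP_HUGETLB flag are real Linux interfaces.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed super page size: 2 MB, the common x86-64 default. */
#define SUPER_PAGE_SIZE (2UL * 1024 * 1024)

/* Round len up to a multiple of the super page size. */
static size_t round_up_super(size_t len)
{
    return (len + SUPER_PAGE_SIZE - 1) & ~(SUPER_PAGE_SIZE - 1);
}

/*
 * Hypothetical helper (not part of Metis or flow): map an anonymous
 * region, preferring super pages. On success, *mapped_len holds the
 * rounded-up length and *huge is 1 if the MAP_HUGETLB mapping
 * succeeded, 0 if we fell back to regular pages.
 * Returns NULL if both attempts fail (including when len is 0).
 */
static void *super_alloc(size_t len, size_t *mapped_len, int *huge)
{
    assert(mapped_len != NULL && huge != NULL);

    size_t n = round_up_super(len);
    void *p = MAP_FAILED;

#ifdef MAP_HUGETLB
    /* Ask the kernel for super pages from its hugepage pool. */
    p = mmap(NULL, n, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
#endif
    *huge = (p != MAP_FAILED);

    if (p == MAP_FAILED) {
        /* Pool empty or flag unsupported: fall back to regular pages. */
        p = mmap(NULL, n, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }
    if (p == MAP_FAILED)
        return NULL;

    *mapped_len = n;
    return p;
}
```

A caller would check the return value for NULL, use the region, and release it with munmap(p, mapped_len). The MAP_HUGETLB request only succeeds when the system has super pages reserved (for example through /proc/sys/vm/nr_hugepages), so the fallback path is the one most unconfigured machines will take.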
As described in the Metis technical report, Metis can be configured to use different data structures to organize the intermediate key/value pairs, although we believe the default configuration is generally efficient across all workloads. See ./configure --help for details.