This is a collection of benchmarks using the OmpSs programming model. Each benchmark verifies its results, either algorithmically or by comparison with a reference output. These codes are provided without guarantee of any kind.
Each directory provides the build targets `perf`, `instr`, `debug`, `seq`, `install`, and `uninstall`.
The first three are parallel builds, linked against the performance, instrumentation, and debug libraries of the Nanox++ runtime, respectively. `seq` builds a sequential binary.
All builds require a built `libcatchroi` at `$CATCHROI_HOME` (defaults to `<repo_root>/libcatchroi`). The `install` and `uninstall` targets respect the `$DESTDIR` variable.
This repository contains two libraries:
- `libcatchroi`, which provides timing information and some error injection capabilities. Packaging it as a library makes it easy to interpose the Region Of Interest (ROI) start and end calls, as well as any desired memory allocation calls.
- `nx_catch_tdg`, a Nanox++ instrumentation plugin that reports scheduling information.
The rest of the subdirectories are benchmarks.
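
As a rough illustration of what timing a Region Of Interest involves, the sketch below records a timestamp at the start of the region and reports the elapsed time at its end. The names `sketch_roi_start` and `sketch_roi_end` are hypothetical and are not the `libcatchroi` API; the use of `clock_gettime` is likewise an assumption made for the example.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Hypothetical names for illustration only; NOT the libcatchroi API. */
static struct timespec sketch_roi_begin;

/* Record a monotonic timestamp at the start of the region of interest. */
void sketch_roi_start(void)
{
    clock_gettime(CLOCK_MONOTONIC, &sketch_roi_begin);
}

/* Print and return the seconds elapsed since sketch_roi_start(). */
double sketch_roi_end(void)
{
    struct timespec end;
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (double)(end.tv_sec - sketch_roi_begin.tv_sec)
                   + (double)(end.tv_nsec - sketch_roi_begin.tv_nsec) * 1e-9;

    fprintf(stderr, "ROI time: %.6f s\n", elapsed);
    return elapsed;
}
```

In this sketch a benchmark would call `sketch_roi_start()` just before its kernel and `sketch_roi_end()` right after it, so that setup and verification are excluded from the measurement.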
The table below lists all the benchmarks in this repository. The "Verif." column indicates how the benchmark's output verification is performed.
Name | Benchmark description | Category | Verif. | Origin |
---|---|---|---|---|
Blackscholes | Option pricing | Partial Differential Equation | built-in | PARSEC benchmarks [4, 5] |
Cholesky | Cholesky factorization | Dense linear algebra | built-in | BSC Application Repository |
CG | Conjugate Gradient | Sparse linear algebra | built-in | matrices from SuiteSparse [2] |
DGEMM | Matrix multiplication | Dense linear algebra | built-in | BSC Application Repository |
FFT | Fast Fourier Transform | Spectral method | ref. run | Wang Jian-Sheng |
Gauss-Seidel | Heat diffusion, Gauss-Seidel solver | Structured grid | ref. run | BSC Application Repository |
Jacobi | Heat diffusion, Jacobi solver | Structured grid | ref. run | BSC Application Repository |
KNN | K-nearest neighbours | Machine learning | ref. run | Heterogeneous Computer Architecture (HCA) group at BSC |
K Means | K-means clustering | Machine learning | ref. run | Heterogeneous Computer Architecture (HCA) group at BSC |
N-body | Astrophysical simulation | N-body method | built-in | BSC Application Repository |
PRK2 stencil | Parallel Research Kernels stencil | Stencil operation | built-in | Parallel Research Kernels [3] |
Red-black | Heat diffusion, red-black solver | Structured grid | ref. run | BSC Application Repository |
SMI | Symmetric matrix inverse | Dense linear algebra | built-in | Guillermo Miranda |
Stream | Stream Triad | Memory bandwidth benchmark | built-in | John D. McCalpin [1] |
The BSC Application Repository was previously hosted at https://pm.bsc.es/projects/bar. SuiteSparse was previously the UF Sparse Matrix Collection, hosted at https://www.cise.ufl.edu/research/sparse/matrices/list_by_id.html.
1. J. D. McCalpin, “Memory Bandwidth and Machine Balance in Current High Performance Computers,” IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, pp. 19–25, 1995.
2. T. A. Davis and Y. Hu, “The University of Florida Sparse Matrix Collection,” ACM Transactions on Mathematical Software, vol. 38, no. 1, pp. 1:1–1:25, Dec. 2011.
3. R. F. Van der Wijngaart and T. G. Mattson, “The Parallel Research Kernels,” in IEEE High Performance Extreme Computing Conference, 2014, pp. 1–6.
4. C. Bienia, “Benchmarking Modern Multiprocessors,” PhD thesis, Princeton University, 2011.
5. D. Chasapis et al., “PARSECSs: Evaluating the Impact of Task Parallelism in the PARSEC Benchmark Suite,” ACM Transactions on Architecture and Code Optimization, vol. 12, no. 4, pp. 41:1–41:22, Dec. 2015.
All codes remain under their original licenses. This repository claims no ownership of the benchmark code and provides no warranties.
Code that is not part of any benchmark (under `nx_catch_tdg/` and `libcatchroi/`) is licensed under LGPL v3.