Hopscotch v1.0
==============
Author: Alif Ahmed
email: alihahmed@virginia.edu
I. Overview
Hopscotch is a micro-benchmark suite for memory performance evaluation. It currently supports empirical roofline measurement, bandwidth measurement with different access patterns, and latency measurement. The roofline benchmark also supports GPUs. A memory access pattern visualizer (MAPProfiler) is included with Hopscotch; it currently supports only x86_64 executables.
Hopscotch is under active development. The latest source code can be found at:
https://github.com/alifahmed/hopscotch
Details about most of the kernels and benchmarks can be found in the paper:
A. Ahmed, K. Skadron, "Hopscotch: A Micro-benchmark Suite for Memory Performance Evaluation", MEMSYS, 2019.
II. Directory Structure
hopscotch/
|
|---cpu/                 Directory containing benchmarks for memory connected to CPU.
|    |
|    |---1_roofline/     Roofline benchmark (CPU version).
|    |---2_bandwidth/    Benchmarks memory with different access patterns.
|    |---3_latency/      Latency benchmark.
|    |---4_cache/        Benchmark for evaluating caching.
|    |---common/         Source code common to all benchmarks.
|    |---include/        Common header files.
|    |---kernels/        Common kernels. Used by different benchmarks.
|
|---gpu/
|    |
|    |---1_roofline/     Roofline benchmark (CUDA version).
|    |---common/         Source code common to all benchmarks.
|    |---include/        Common header files.
|
|---MAPProfiler/         A tool for memory access pattern visualization.
|
|---Makefile             Top level build file.
|---README               This file.
More details can be found in the README files inside these directories.
III. Prerequisites
a) Python 3
b) MAPProfiler requires the Intel Pin tool. Check the README inside MAPProfiler for details.
IV. Installation
Running make from the top-level directory builds all sub-directories. Binaries are created inside each benchmark's own directory. make can also be run inside a benchmark's directory to build only that benchmark. Some benchmarks use scripts to rebuild their binaries with different configurations.
V. CPU Benchmarks
1_roofline
==========
Measures the maximum attainable performance over a range of arithmetic intensities, along with the machine balance.
To run: ./roofline.py
The Python script generates a PDF of the roofline plot. Run ./roofline.py --help to list the
available options.
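For orientation, the standard roofline model that this benchmark measures empirically can be sketched as below. This is a minimal illustration of the textbook model, not code from roofline.py. The function names and the machine numbers are assumptions made up for the example, and machine balance is taken here as the ridge point (peak compute divided by peak bandwidth).

```python
# Sketch of the textbook roofline model. Function names and the example
# machine numbers are illustrative assumptions, not taken from roofline.py.

def machine_balance(peak_gflops, peak_bw_gbs):
    """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / peak_bw_gbs

def attainable_gflops(ai, peak_gflops, peak_bw_gbs):
    """Attainable performance at arithmetic intensity `ai` (FLOP/byte)."""
    # Below the ridge point the kernel is bandwidth bound, above it
    # the kernel is compute bound.
    return min(peak_gflops, ai * peak_bw_gbs)

if __name__ == "__main__":
    peak, bw = 100.0, 25.0  # hypothetical machine: 100 GFLOP/s, 25 GB/s
    print("machine balance:", machine_balance(peak, bw), "FLOP/byte")
    for ai in (0.5, 4.0, 16.0):
        print(ai, "FLOP/byte ->", attainable_gflops(ai, peak, bw), "GFLOP/s")
```

The measured plot may fall below this idealized shape, since the benchmark reports what the machine actually sustains rather than datasheet peaks.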
2_bandwidth
===========
Measures bandwidth with different types of access patterns.
To run: a) make
b) ./bandwidth
The working set size can be changed by defining WSS_EXP. The number of elements in the working set is (2 ^ WSS_EXP). WSS_EXP can be defined directly when compiling manually, or passed through USER_DEFS.
Example: a) make USER_DEFS="-DWSS_EXP=32"
b) ./bandwidth
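Because the element count grows as a power of two, it can help to estimate the memory footprint before picking WSS_EXP. The sketch below does that arithmetic. The 8-byte element size is an assumption (for example a double), so check the benchmark source for the actual data type; the function names are hypothetical.

```python
# Estimate the working set footprint for a given WSS_EXP.
# ASSUMPTION: 8-byte elements. The real element type is defined in the
# benchmark source and may differ.

def working_set_elements(wss_exp):
    """Number of elements in the working set: 2 ^ WSS_EXP."""
    return 1 << wss_exp

def working_set_bytes(wss_exp, elem_bytes=8):
    """Footprint in bytes of one working set array."""
    return working_set_elements(wss_exp) * elem_bytes

if __name__ == "__main__":
    for exp in (20, 27, 32):
        gib = working_set_bytes(exp) / 2**30
        print(f"WSS_EXP={exp}: {working_set_elements(exp)} elements, {gib:g} GiB")
```

Under the 8-byte assumption, WSS_EXP=32 corresponds to 32 GiB per array, so a smaller exponent may be needed on machines with less memory.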
3_latency
=========
Measures memory latency with a single-threaded pointer-chasing kernel over a range of working set sizes.
To run: ./latency.py
The Python script generates a PDF of the latency plot. Run ./latency.py --help to list the
available options.
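The idea behind a pointer-chasing kernel can be sketched as follows. This is an illustration of the general technique, not the kernel used by the benchmark: the names make_chain and chase are hypothetical, and the single-cycle chain is built with Sattolo's algorithm as one possible choice. Timings from Python mostly reflect interpreter overhead, so the printed number is not a memory latency measurement.

```python
import random
import time

def make_chain(n, seed=0):
    """Build next_idx so that following it from slot 0 visits all n slots
    exactly once before returning to 0 (Sattolo's algorithm)."""
    rng = random.Random(seed)
    next_idx = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, which guarantees a single cycle
        next_idx[i], next_idx[j] = next_idx[j], next_idx[i]
    return next_idx

def chase(next_idx, steps):
    """Follow the chain; every load depends on the result of the previous one."""
    p = 0
    for _ in range(steps):
        p = next_idx[p]
    return p

if __name__ == "__main__":
    n = 1 << 16
    chain = make_chain(n)
    start = time.perf_counter()
    chase(chain, n)
    elapsed = time.perf_counter() - start
    print(f"{elapsed / n * 1e9:.1f} ns per hop (includes interpreter overhead)")
```

The random single-cycle order is what matters: each access depends on the previous one and the next address is hard to prefetch, so the time per hop approaches the access latency of whichever memory level holds the working set.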
4_cache
=======
Measures cache efficiency by running workloads with different combinations of spatial and temporal locality: {low,low}, {low,high}, {high,low}, {high,high}.
To run: a) make
b) ./cache
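To make the four locality combinations concrete, the sketch below generates index streams with high or low spatial and temporal locality. It is only a conceptual illustration and does not reproduce the benchmark's workloads: the function access_stream, its parameters, and the array and hot-region sizes are all assumptions.

```python
import random

def access_stream(n_accesses, spatial_high, temporal_high,
                  array_len=1 << 20, hot_len=1 << 8, seed=0):
    """Return a list of array indices with the requested locality.

    temporal_high: accesses stay inside a small hot region, so elements
                   are reused often.
    spatial_high:  consecutive accesses touch neighbouring elements.
    All sizes are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    region = hot_len if temporal_high else array_len
    if spatial_high:
        return [i % region for i in range(n_accesses)]
    return [rng.randrange(region) for _ in range(n_accesses)]

if __name__ == "__main__":
    for spatial in (False, True):
        for temporal in (False, True):
            stream = access_stream(10000, spatial, temporal)
            print(f"spatial={'high' if spatial else 'low'}, "
                  f"temporal={'high' if temporal else 'low'}: "
                  f"{len(set(stream))} distinct elements touched")
```

Fewer distinct elements for the same number of accesses means more reuse (temporal locality), while sequential index order means neighbouring accesses share cache lines (spatial locality).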
VI. GPU Benchmarks
1_roofline
==========
Measures the maximum attainable performance over a range of arithmetic intensities, along with the machine balance.
Supports single- and double-precision floating-point operations.
To run: ./roofline.py
The Python script generates a PDF of the roofline plot. Run ./roofline.py --help to list the
available options.
VII. Acknowledgement
This work was supported by CRISP, one of six centers in JUMP, a Semiconductor Research Corporation (SRC)
program sponsored by DARPA, and Brookhaven National Laboratory.