
Mee #304

Open

wants to merge 155 commits into base: master

155 commits
b7b9dc0
Fix for bugs in lazy write handling
gvoskuilen Oct 26, 2020
2d73b61
Merge pull request #3 from gvoskuilen/dev
mkhairy Oct 26, 2020
950464e
change address type into ull
allencho1222 Nov 9, 2020
07f77e1
do not truncate 32 MSB bits of the memory address
allencho1222 Nov 9, 2020
132c2ce
added MSHR_HIT
JRPan Nov 15, 2020
85e36b9
Merge pull request #6 from JRPan/add_mshr
mkhairy Nov 22, 2020
29cce50
Merge pull request #4 from allencho1222/patch-1
tgrogers Jan 27, 2021
e6b0608
Merge pull request #5 from allencho1222/patch-2
tgrogers Jan 27, 2021
5ac0b60
Merge branch 'dev' of https://github.com/accel-sim/gpgpu-sim_distribu…
mkhairy Jan 28, 2021
f3a0077
bug fix was_writeback_sent
JRPan Feb 12, 2021
67f89ab
Merge pull request #7 from JRPan/fix-was_writeback_sent
mkhairy Feb 16, 2021
51d9925
fix hash funciton
JRPan Feb 15, 2021
2f96645
Merge pull request #9 from JRPan/fix-cache-hash
mkhairy Feb 19, 2021
b430b36
adding new RTX 3070 config
mkhairy Feb 25, 2021
deb5eb5
Merge branch 'dev' of https://github.com/accel-sim/gpgpu-sim_distribu…
mkhairy Feb 25, 2021
09f10eb
change the L1 cache policy to be on-miss based on recent ubench
mkhairy Mar 25, 2021
1ee03f0
change the L1 cache policy based on recent ubench
mkhairy Mar 25, 2021
5533464
parition CU allocation, add prints
barnes88 May 9, 2021
645a0ea
minor fixes
barnes88 May 9, 2021
46423a2
useful print statement
barnes88 May 9, 2021
b672880
validated collector unit partitioning based on scheduler
barnes88 May 9, 2021
fa76ab4
sub core model dispatches only to assigned exec pipelines
barnes88 May 10, 2021
c905726
minor fix accessing du
barnes88 May 10, 2021
a72b84e
fix find_ready reg_id
barnes88 May 10, 2021
6ad5bad
dont need du id
barnes88 May 10, 2021
9219236
remove prints
barnes88 May 10, 2021
52a890c
need at least 1 cu per sched for sub_core model, fix find_ready() reg_id
barnes88 May 11, 2021
2db9120
move reg_id calc to cu object init
barnes88 May 11, 2021
4825a1d
fix assert
barnes88 May 11, 2021
e2b410d
clean up redundant method args
barnes88 May 11, 2021
9c0156b
more cleanup
barnes88 May 11, 2021
28c3c94
cleanup find_ready
barnes88 May 11, 2021
28d0565
partition issue() in the shader execute stage
barnes88 May 11, 2021
08ad045
Merge branch 'sub_core_devel' of github.com:barnes88/gpgpu-sim_distri…
barnes88 May 11, 2021
ec55c68
minor fixes, pure virtual calls
barnes88 May 11, 2021
71455d8
add prints for ex issue validation
barnes88 May 12, 2021
640674b
issue function needed to be constrained
barnes88 May 12, 2021
9b6af84
fix print, move simd::issue() impl to .cc file
barnes88 May 12, 2021
6ae2391
fix prints / segfault
barnes88 May 12, 2021
a450d74
remove prints
barnes88 May 12, 2021
6a09900
rm unnecessary instr get
barnes88 May 12, 2021
5945d70
specialized unit should be partitioned too
barnes88 May 13, 2021
92c814a
run changes through clang-format
barnes88 May 13, 2021
db10197
rm old dirs in format-code.sh
barnes88 May 13, 2021
c526262
fix adaptive cache cfg option parsing data type
JRPan May 13, 2021
c51350d
Merge pull request #13 from JRPan/fix-config-parser
tgrogers May 13, 2021
f2a7d9c
fixing streaming cache based on recent ubench
mkhairy May 15, 2021
1347395
adding the missing xoring hashing
mkhairy May 15, 2021
6319e31
moving reg file read to read_operands function as before
mkhairy May 15, 2021
d89f9f7
Merge branch 'dev' of https://github.com/accel-sim/gpgpu-sim_distribu…
mkhairy May 17, 2021
c94b883
code refactoring cycle()
mkhairy May 17, 2021
7d9a12f
specialized unit get_ready() was missing subcore
barnes88 May 17, 2021
585dcf5
Merge pull request #12 from barnes88/sub_core_devel
mkhairy May 18, 2021
6121a88
Merge branch 'dev' of https://github.com/accel-sim/gpgpu-sim_distribu…
mkhairy May 18, 2021
0f30305
dirty counter added. NO increamenting yet
JRPan Feb 15, 2021
615f173
store ack for new waps
JRPan Feb 20, 2021
ad72041
sending cache block byte mask
JRPan Mar 2, 2021
bb19c0c
update mf breakdown at L2
JRPan Mar 2, 2021
e05fa4a
little bug fix - flush()
JRPan Mar 2, 2021
804ee90
sending byte mask for all policies
JRPan Mar 8, 2021
b3dab5e
set byte mask on fill
JRPan Mar 8, 2021
40077df
solve deadlock for non-sectored cache configs
JRPan Mar 8, 2021
64bf6fd
dirty counter not resetting after kernel finish
JRPan Mar 18, 2021
a374b33
remove MSHR_HIT from cache total access
JRPan Mar 26, 2021
f6fb56b
check sector readable only on reads
JRPan Apr 6, 2021
994fb19
reset dirty counter
JRPan May 4, 2021
7306930
remove runtime check of dirty counter
JRPan May 12, 2021
0601354
Add WT to lazy_fetch_on_read
JRPan May 18, 2021
f783351
new configs - adaptive cache and cache write ratio
JRPan May 17, 2021
a2b1b1c
adaptive cache - update
JRPan May 17, 2021
f70f5d6
re-wording/formatting
JRPan May 19, 2021
4a762a9
formatting again
JRPan May 19, 2021
4c354eb
minor improvements
JRPan May 19, 2021
f27da22
Use cache config multipilier when possible
JRPan May 19, 2021
0e4f12a
Merge pull request #14 from JRPan/spring-2021-all
mkhairy May 19, 2021
1875132
Merge branch 'dev' into adaptive-cache
JRPan May 19, 2021
2b2b6a2
Merge pull request #15 from JRPan/adaptive-cache
mkhairy May 19, 2021
14f22bc
add checking on spec unit in subcore
mkhairy May 19, 2021
3363536
Merge branch 'dev' of https://github.com/accel-sim/gpgpu-sim_distribu…
mkhairy May 19, 2021
604baaf
fixing the failing of merging
mkhairy May 19, 2021
a2ba2f5
updating config files with right adaptive cache parameters
mkhairy May 19, 2021
b63d19a
updating config files
mkhairy May 19, 2021
e3d186b
chaning @sets to 4 based on recent ubenchs
mkhairy May 19, 2021
24ffab2
moving shmem option to the base class and change the code to accept t…
mkhairy May 20, 2021
fedcde3
moving the unified size from the base class config to l1 config
mkhairy May 20, 2021
8aee56d
rename set_dirty_byte_mask
mkhairy May 20, 2021
b466afe
eliminate redundant code in gpu-cache.h
mkhairy May 20, 2021
7fac247
change L1 cache config in Volta+ to be write-through and write-alloca…
mkhairy May 20, 2021
0d33266
oops delete this config, it should not be pushed
mkhairy May 20, 2021
2aef4e3
Merge pull request #16 from mkhairy/dev
mkhairy May 20, 2021
c8eca04
fix merge conflict
JRPan May 17, 2021
f665ad5
L2 breakdown - reuse mf allocator
JRPan May 21, 2021
b814c52
cast to float - dirty line percentage
JRPan May 21, 2021
ce4f20f
Merge pull request #17 from JRPan/rewrite-l2-breakdown
mkhairy May 21, 2021
3b75d8f
Update version
mkhairy May 22, 2021
7e48560
Update CHANGES
mkhairy May 22, 2021
b6409b4
Update README.md
mkhairy May 22, 2021
6c9e13d
format code
mkhairy May 23, 2021
778962e
updating the configs based on the tuner output
mkhairy May 26, 2021
3eea014
changing kernel latency
mkhairy May 26, 2021
6ad461a
fixing configs
mkhairy May 27, 2021
110aeb1
rewrite shmem_option parsing
JRPan May 31, 2021
04462cb
update readable
JRPan Jun 3, 2021
e9d781a
minor improvements
JRPan Jun 3, 2021
0f088dc
correct dirty counter
JRPan Jun 16, 2021
3cf24b8
WT in lazy fetch on read
JRPan Jun 23, 2021
b1befa8
Adding restricted round robin scheduler
JRPan Aug 16, 2021
b658147
better oc selecting when sub core enabled
JRPan Aug 16, 2021
a8256e5
Update volta to use lrr scheduler
JRPan Aug 23, 2021
84c4f46
Ampere and Turing also lrr scheduler
JRPan Aug 23, 2021
4a4fc87
Merge pull request #5 from accel-sim/dev
VijayKandiah Oct 17, 2021
84c6cf4
AccelWattch dev Integration
VijayKandiah Oct 17, 2021
da0aef2
Merge pull request #236 from accel-sim/dev
aamodt Oct 18, 2021
6b244a5
Merge pull request #237 from VijayKandiah/dev
aamodt Oct 18, 2021
f9b39ee
mee sub partition v0.1
FarmerJooe Aug 9, 2024
5f559b1
mee v0.2
FarmerJooe Aug 17, 2024
0e16e85
mee v0.3
FarmerJooe Aug 17, 2024
fe66d67
mee v0.3
FarmerJooe Aug 20, 2024
845a4c2
mee v0.4
FarmerJooe Aug 20, 2024
7799163
mee v0.4
FarmerJooe Aug 20, 2024
2161822
mee v1.0
FarmerJooe Aug 21, 2024
a475d07
mee v1.0
FarmerJooe Aug 21, 2024
9016f50
mee v1.0
FarmerJooe Aug 22, 2024
77fa0ec
mee v1.0.1
FarmerJooe Aug 24, 2024
3a3b2d2
mee v1.1
FarmerJooe Aug 24, 2024
6069f36
mee v1.1.1
FarmerJooe Aug 24, 2024
12fcca7
mee v1.1.2
FarmerJooe Aug 24, 2024
a89f06c
mee v1.1.2
FarmerJooe Aug 25, 2024
ac6c328
mee v1.2
FarmerJooe Aug 25, 2024
f918f1c
mee v1.2
FarmerJooe Aug 27, 2024
1c7c4bb
mee v1.2.1
FarmerJooe Aug 29, 2024
d0d0be4
mee v1.2.2
FarmerJooe Aug 30, 2024
d241f04
mee v1.2.2
FarmerJooe Aug 30, 2024
ba10d4b
mee v1.2.5
FarmerJooe Aug 31, 2024
662e4a7
mee v1.2.5
FarmerJooe Aug 31, 2024
7a2dc01
mee v1.2.6
FarmerJooe Sep 3, 2024
1c168bb
mee v1.2.6 fix:fill deadlock
FarmerJooe Sep 4, 2024
fb3eed9
mee v1.2.7
FarmerJooe Sep 4, 2024
d8b8007
mee v1.2.8
FarmerJooe Sep 5, 2024
a2051f9
mee v1.2.9
FarmerJooe Sep 10, 2024
4c281e2
mee v1.2.9
FarmerJooe Sep 10, 2024
5f78321
mee v1.2.9
FarmerJooe Sep 12, 2024
5a05cac
mee v1.2.9
FarmerJooe Sep 14, 2024
96eedb0
mee v1.2.9
FarmerJooe Sep 14, 2024
6ce08bf
mee v1.3.0
FarmerJooe Sep 29, 2024
6f5f2ef
mee v1.3.0
FarmerJooe Sep 29, 2024
a6f9953
mee v1.3.1
FarmerJooe Oct 6, 2024
0241f19
mee v1.3.2
FarmerJooe Oct 7, 2024
4353173
mee v1.3.3
FarmerJooe Oct 8, 2024
5a65d18
mee v1.3.3
FarmerJooe Oct 8, 2024
7604529
mee v1.3.3
FarmerJooe Oct 8, 2024
5e01b12
mee v1.3.4
FarmerJooe Oct 24, 2024
5425350
mee v1.3.9
FarmerJooe Nov 1, 2024
338da15
mee v1.3.9.1
FarmerJooe Nov 23, 2024
2fb389f
mee v1.4.0
FarmerJooe Dec 26, 2024
16 changes: 16 additions & 0 deletions CHANGES
@@ -1,4 +1,20 @@
LOG:
Version 4.2.0 vs 4.1.0
- Added AccelWattch power model v1.0 which replaces GPUWattch.
- Added AccelWattch XML configuration files for SM7_QV100, SM7_TITANV, SM75_RTX2060_S, SM6_TITANX. Note that all these AccelWattch XML configuration files are tuned only for SM7_QV100.

Version 4.1.0 versus 4.0.0
-Features:
1- Supporting L1 write-allocate with sub-sector writing policy as in Volta+ hardware, and changing the Volta+ cards config to make L1 write-allocate with write-through
2- Making the L1 adaptive cache policy to be configurable
3- Adding Ampere RTX 3060 config files
-Bugs:
1- Fixing L1 bank hash function bug
2- Fixing L1 read hit counters in gpgpu-sim to match nvprof, to achieve more accurate L1 correlation with the HW
3- Fixing bugs in lazy write handling, thanks to Gwendolyn Voskuilen from Sandia labs for this fix
4- Fixing the backend pipeline for sub_core model
5- Fixing Memory stomp bug at the shader_config
6- Some code refactoring:
Version 4.0.0 (development branch) versus 3.2.3
-Front-End:
1- Support .nc cache modifier and __ldg function to access the read-only L1D cache
Expand Down
30 changes: 30 additions & 0 deletions COPYRIGHT
@@ -44,3 +44,33 @@ per UBC policy 88, item 2.3 on literary works) these students names appear in
the copyright notices of the respective files. UBC is also mentioned in the
copyright notice to highlight that was the author's affiliation when the work
was performed.

NOTE 3: AccelWattch and all its components are covered by the following license and copyright.
Copyright (c) 2018-2021, Vijay Kandiah, Junrui Pan, Mahmoud Khairy, Scott Peverelle, Timothy Rogers, Tor M. Aamodt, Nikos Hardavellas
Northwestern University, Purdue University, The University of British Columbia
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer;
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution;
3. Neither the names of Northwestern University, Purdue University,
The University of British Columbia nor the names of their contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
16 changes: 8 additions & 8 deletions Makefile
@@ -87,7 +87,7 @@ ifneq ($(GPGPUSIM_POWER_MODEL),)
MCPAT_DBG_FLAG = dbg
endif

-MCPAT_OBJ_DIR = $(SIM_OBJ_FILES_DIR)/gpuwattch
+MCPAT_OBJ_DIR = $(SIM_OBJ_FILES_DIR)/accelwattch

MCPAT = $(MCPAT_OBJ_DIR)/*.o
endif
@@ -117,24 +117,24 @@ check_setup_environment:
fi

check_power:
-	@if [ -d "$(GPGPUSIM_ROOT)/src/gpuwattch/" -a ! -n "$(GPGPUSIM_POWER_MODEL)" ]; then \
+	@if [ -d "$(GPGPUSIM_ROOT)/src/accelwattch/" -a ! -n "$(GPGPUSIM_POWER_MODEL)" ]; then \
echo ""; \
-	echo " Power model detected in default directory ($(GPGPUSIM_ROOT)/src/gpuwattch) but GPGPUSIM_POWER_MODEL not set."; \
-	echo " Please re-run setup_environment or manually set GPGPUSIM_POWER_MODEL to the gpuwattch directory if you would like to include the GPGPU-Sim Power Model."; \
+	echo " Power model detected in default directory ($(GPGPUSIM_ROOT)/src/accelwattch) but GPGPUSIM_POWER_MODEL not set."; \
+	echo " Please re-run setup_environment or manually set GPGPUSIM_POWER_MODEL to the accelwattch directory if you would like to include the GPGPU-Sim Power Model."; \
echo ""; \
true; \
elif [ ! -d "$(GPGPUSIM_POWER_MODEL)" ]; then \
echo ""; \
echo "ERROR ** Power model directory invalid."; \
echo "($(GPGPUSIM_POWER_MODEL)) is not a valid directory."; \
-	echo "Please set GPGPUSIM_POWER_MODEL to the GPGPU-Sim gpuwattch directory."; \
+	echo "Please set GPGPUSIM_POWER_MODEL to the GPGPU-Sim accelwattch directory."; \
echo ""; \
exit 101; \
elif [ -n "$(GPGPUSIM_POWER_MODEL)" -a ! -f "$(GPGPUSIM_POWER_MODEL)/gpgpu_sim.verify" ]; then \
echo ""; \
echo "ERROR ** Power model directory invalid."; \
echo "gpgpu_sim.verify not found in $(GPGPUSIM_POWER_MODEL)."; \
-	echo "Please ensure that GPGPUSIM_POWER_MODEL points to a valid gpuwattch directory and that you have the correct GPGPU-Sim mcpat distribution."; \
+	echo "Please ensure that GPGPUSIM_POWER_MODEL points to a valid accelwattch directory and that you have the correct GPGPU-Sim mcpat distribution."; \
echo ""; \
exit 102; \
fi
@@ -243,8 +243,8 @@ makedirs:
if [ ! -d $(SIM_OBJ_FILES_DIR)/libopencl/bin ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/libopencl/bin; fi;
if [ ! -d $(SIM_OBJ_FILES_DIR)/$(INTERSIM) ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/$(INTERSIM); fi;
if [ ! -d $(SIM_OBJ_FILES_DIR)/cuobjdump_to_ptxplus ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/cuobjdump_to_ptxplus; fi;
-	if [ ! -d $(SIM_OBJ_FILES_DIR)/gpuwattch ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/gpuwattch; fi;
-	if [ ! -d $(SIM_OBJ_FILES_DIR)/gpuwattch/cacti ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/gpuwattch/cacti; fi;
+	if [ ! -d $(SIM_OBJ_FILES_DIR)/accelwattch ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/accelwattch; fi;
+	if [ ! -d $(SIM_OBJ_FILES_DIR)/accelwattch/cacti ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/accelwattch/cacti; fi;

all:
$(MAKE) gpgpusim
58 changes: 31 additions & 27 deletions README.md
@@ -1,8 +1,8 @@
Welcome to GPGPU-Sim, a cycle-level simulator modeling contemporary graphics
processing units (GPUs) running GPU computing workloads written in CUDA or
OpenCL. Also included in GPGPU-Sim is a performance visualization tool called
-AerialVision and a configurable and extensible energy model called GPUWattch.
-GPGPU-Sim and GPUWattch have been rigorously validated with performance and
+AerialVision and a configurable and extensible power model called AccelWattch.
+GPGPU-Sim and AccelWattch have been rigorously validated with performance and
power measurements of real hardware GPUs.

This version of GPGPU-Sim has been tested with a subset of CUDA version 4.2,
@@ -11,35 +11,38 @@ This version of GPGPU-Sim has been tested with a subset of CUDA version 4.2,
Please see the copyright notice in the file COPYRIGHT distributed with this
release in the same directory as this file.

GPGPU-Sim 4.0 is compatible with Accel-Sim simulation framework. With the support
of Accel-Sim, GPGPU-Sim 4.0 can run NVIDIA SASS traces (trace-based simulation)
generated by NVIDIA's dynamic binary instrumentation tool (NVBit). For more information
about Accel-Sim, see [https://accel-sim.github.io/](https://accel-sim.github.io/)

If you use GPGPU-Sim 4.0 in your research, please cite:

Mahmoud Khairy, Zhesheng Shen, Tor M. Aamodt, Timothy G Rogers.
Accel-Sim: An Extensible Simulation Framework for Validated GPU Modeling.
In proceedings of the 47th IEEE/ACM International Symposium on Computer Architecture (ISCA),
May 29 - June 3, 2020.

-If you use CuDNN or PyTorch support, checkpointing or our new debugging tool for functional
+If you use CuDNN or PyTorch support (execution-driven simulation), checkpointing or our new debugging tool for functional
simulation errors in GPGPU-Sim for your research, please cite:

Jonathan Lew, Deval Shah, Suchita Pati, Shaylin Cattell, Mengchi Zhang, Amruth Sandhupatla,
Christopher Ng, Negar Goli, Matthew D. Sinclair, Timothy G. Rogers, Tor M. Aamodt
Analyzing Machine Learning Workloads Using a Detailed GPU Simulator, arXiv:1811.08933,
https://arxiv.org/abs/1811.08933


If you use the Tensor Core model in GPGPU-Sim or GPGPU-Sim's CUTLASS Library
for your research please cite:

Md Aamir Raihan, Negar Goli, Tor Aamodt,
Modeling Deep Learning Accelerator Enabled GPUs, arXiv:1811.08309,
https://arxiv.org/abs/1811.08309

-If you use the GPUWattch energy model in your research, please cite:
+If you use the AccelWattch power model in your research, please cite:

-Jingwen Leng, Tayler Hetherington, Ahmed ElTantawy, Syed Gilani, Nam Sung Kim,
-Tor M. Aamodt, Vijay Janapa Reddi, GPUWattch: Enabling Energy Optimizations in
-GPGPUs, In proceedings of the ACM/IEEE International Symposium on Computer
-Architecture (ISCA 2013), Tel-Aviv, Israel, June 23-27, 2013.
+Vijay Kandiah, Scott Peverelle, Mahmoud Khairy, Junrui Pan, Amogh Manjunath, Timothy G. Rogers, Tor M. Aamodt, and Nikos Hardavellas. 2021.
+AccelWattch: A Power Modeling Framework for Modern GPUs. In MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture
+(MICRO ’21), October 18–22, 2021, Virtual Event, Greece.

If you use the support for CUDA dynamic parallelism in your research, please cite:

@@ -58,8 +61,8 @@ This file contains instructions on installing, building and running GPGPU-Sim.
Detailed documentation on what GPGPU-Sim models, how to configure it, and a
guide to the source code can be found here: <http://gpgpu-sim.org/manual/>.
Instructions for building doxygen source code documentation are included below.
-Detailed documentation on GPUWattch including how to configure it and a guide
-to the source code can be found here: <http://gpgpu-sim.org/gpuwattch/>.
+Previous versions of GPGPU-Sim (3.2.0 to 4.1.0) included the [GPUWattch Energy model](http://gpgpu-sim.org/gpuwattch/) which has been replaced by AccelWattch version 1.0 in GPGPU-Sim version 4.2.0. AccelWattch supports modern GPUs and is validated against a NVIDIA Volta QV100 GPU. Detailed documentation on AccelWattch can be found here: [AccelWattch Overview](https://github.com/VijayKandiah/accel-sim-framework#accelwattch-overview) and [AccelWattch MICRO'21 Artifact Manual](https://github.com/VijayKandiah/accel-sim-framework/blob/release/AccelWattch.md).

If you have questions, please sign up for the google groups page (see
gpgpu-sim.org), but note that use of this simulator does not imply any level of
@@ -104,21 +107,20 @@ library (part of the CUDA toolkit). Code to interface with the CUDA Math
library is contained in cuda-math.h, which also includes several structures
derived from vector_types.h (one of the CUDA header files).

-## GPUWattch Energy Model
+## AccelWattch Power Model

-GPUWattch (introduced in GPGPU-Sim 3.2.0) was developed by researchers at the
-University of British Columbia, the University of Texas at Austin, and the
-University of Wisconsin-Madison. Contributors to GPUWattch include Tor
-Aamodt's research group at the University of British Columbia: Tayler
-Hetherington and Ahmed ElTantawy; Vijay Reddi's research group at the
-University of Texas at Austin: Jingwen Leng; and Nam Sung Kim's research group
-at the University of Wisconsin-Madison: Syed Gilani.
+AccelWattch (introduced in GPGPU-Sim 4.2.0) was developed by researchers at
+Northwestern University, Purdue University, and the University of British Columbia.
+Contributors to AccelWattch include Nikos Hardavellas's research group at Northwestern University:
+Vijay Kandiah; Tor Aamodt's research group at the University of British Columbia: Scott Peverelle;
+and Timothy Rogers's research group at Purdue University: Mahmoud Khairy, Junrui Pan, and Amogh Manjunath.

-GPUWattch leverages McPAT, which was developed by Sheng Li et al. at the
+AccelWattch leverages McPAT, which was developed by Sheng Li et al. at the
University of Notre Dame, Hewlett-Packard Labs, Seoul National University, and
-the University of California, San Diego. The paper can be found at
+the University of California, San Diego. The McPAT paper can be found at
http://www.hpl.hp.com/research/mcpat/micro09.pdf.


# INSTALLING, BUILDING and RUNNING GPGPU-Sim

Assuming all dependencies required by GPGPU-Sim are installed on your system,
@@ -261,6 +263,7 @@ To clean the docs run
The documentation resides at doc/doxygen/html.

To run Pytorch applications with the simulator, install the modified Pytorch library as well by following instructions [here](https://github.com/gpgpu-sim/pytorch-gpgpu-sim).

## Step 3: Run

Before we run, we need to make sure the application's executable file is dynamically linked to CUDA runtime library. This can be done during compilation of your program by introducing the nvcc flag "--cudart shared" in makefile (quotes should be excluded).
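The "--cudart shared" requirement above can be captured directly in the application's makefile. The fragment below is a hedged example: the target name, source file, and flags other than `--cudart shared` are invented for illustration.

```makefile
# Illustrative fragment: link against the shared CUDA runtime so that
# GPGPU-Sim's libcudart can be substituted via LD_LIBRARY_PATH at run time.
NVCC      = nvcc
NVCCFLAGS = -O2 --cudart shared

vectorAdd: vectorAdd.cu
	$(NVCC) $(NVCCFLAGS) -o $@ $<
```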
@@ -311,15 +314,16 @@ need to re-compile your application simply to run it on GPGPU-Sim.
To revert back to running on the hardware, remove GPGPU-Sim from your
LD_LIBRARY_PATH environment variable.
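The LD_LIBRARY_PATH mechanism described above can be sketched as follows. The install path and the compiler/CUDA subdirectory are assumptions for illustration; the actual subdirectory depends on your build.

```shell
# Illustrative sketch: paths are examples, not the required layout.
GPGPUSIM_ROOT=$HOME/gpgpu-sim_distribution
# Prepend the simulator's libcudart build directory; the dynamic linker then
# resolves the application's libcudart.so against GPGPU-Sim instead of the
# real CUDA runtime. Remove this entry again to revert to hardware execution.
export LD_LIBRARY_PATH="$GPGPUSIM_ROOT/lib/gcc-9.4.0/cuda-11000/release:$LD_LIBRARY_PATH"
echo "$LD_LIBRARY_PATH"
```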

-The following GPGPU-Sim configuration options are used to enable GPUWattch
+The following GPGPU-Sim configuration options are used to enable AccelWattch

-power_simulation_enabled 1 (1=Enabled, 0=Not enabled)
-   -gpuwattch_xml_file <filename>.xml
+   -power_simulation_mode 0 (0=AccelWattch_SASS_SIM or AccelWattch_PTX_SIM, 1=AccelWattch_SASS_HW, 2=AccelWattch_SASS_HYBRID)
+   -accelwattch_xml_file <filename>.xml

-The GPUWattch XML configuration file name is set to gpuwattch.xml by default and
-currently only supplied for GTX480 (default=gpuwattch_gtx480.xml). Please refer to
-<http://gpgpu-sim.org/gpuwattch/> for more information.
+The AccelWattch XML configuration file name is set to accelwattch_sass_sim.xml by default and is
+currently provided for SM7_QV100, SM7_TITANV, SM75_RTX2060_S, and SM6_TITANX.
+Note that all these AccelWattch XML configuration files are tuned only for SM7_QV100. Please refer to
+<https://github.com/VijayKandiah/accel-sim-framework#accelwattch-overview> for more information.
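Putting the options above together, a power-simulation section of a gpgpusim.config might look like the fragment below. The option names and the default XML filename come from this README; the exact values depend on your card configuration.

```
# Enable AccelWattch in SASS_SIM/PTX_SIM mode (illustrative fragment)
-power_simulation_enabled 1
-power_simulation_mode 0
-accelwattch_xml_file accelwattch_sass_sim.xml
```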

Running OpenCL applications is identical to running CUDA applications. However,
OpenCL applications need to communicate with the NVIDIA driver in order to