generator v2021.05.25
carlushuang released this on 25 May, 14:12
This generator release is mainly for a feature named "global memory access pattern", or gmap for short.
As the name suggests, gmap dumps the memory access pattern of the input, weight, and output tensors, organized per individual block and per individual read/write request.
This feature is controlled by the environment variable IGEMM_DUMP_GMAP. Example usage:
python3 igemm_codegen.py config/igemm_bwd_gtc_gfx908_nhwc_fp16.config
cd out/
IGEMM_DUMP_GMAP=1 ./conv_driver.exe convfp16 -n 2 -c 1024 -H 40 -W 52 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -g 1 -t 1 -F 2 --in_layout NHWC --fil_layout NHWC --out_layout NHWC
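If you want to trigger the same dump from a script, the invocation above can be wrapped as in the following minimal sketch. The helper names gmap_invocation and run_gmap_dump are hypothetical (they are not part of this repository), the command line is copied verbatim from the example above, and the sketch assumes the driver was already generated and built in out/.

```python
import os
import shlex
import subprocess

# Driver command line copied verbatim from the example above.
DRIVER_CMD = (
    "./conv_driver.exe convfp16 -n 2 -c 1024 -H 40 -W 52 -k 512 "
    "-y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -g 1 -t 1 -F 2 "
    "--in_layout NHWC --fil_layout NHWC --out_layout NHWC"
)


def gmap_invocation(base_env=None):
    """Return (argv, env) for a driver run with the gmap dump enabled.

    Hypothetical helper: it only prepares the call and does not run anything.
    """
    argv = shlex.split(DRIVER_CMD)
    # Start from the caller's environment (or the current one) and switch gmap on.
    env = dict(os.environ if base_env is None else base_env)
    env["IGEMM_DUMP_GMAP"] = "1"
    return argv, env


def run_gmap_dump(out_dir="out"):
    """Run the driver inside out_dir (assumes conv_driver.exe already exists there)."""
    argv, env = gmap_invocation()
    return subprocess.run(argv, env=env, cwd=out_dir, check=True)
```

Calling run_gmap_dump() from the repository root should be equivalent to the cd out/ step followed by the IGEMM_DUMP_GMAP=1 command line above. Keeping the preparation in gmap_invocation makes it possible to inspect the arguments and environment without launching the driver.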
Currently only NHWC fwd/bwd with fp32/fp16 is supported. Support for more layouts and precisions is to be added.