orpheezt/graphy

graphy

CUDA

Binary analysis on CUDA

Dump the PTX embedded in a binary and demangle the C++ symbol names:

cuobjdump -ptx <file> | cu++filt

TODO: write a short binary analysis section on the kernels

TODO: cudaGraphDebugDotPrint()

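Use __noinline__ to perform binary analysis on __device__ functions: it keeps them from being inlined into the calling kernel, so they stay visible as separate functions in the dump.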
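A minimal sketch of that idea follows. The macro GRAPHY_DEVICE, the helper find_root, and the kernel find_kernel are hypothetical names chosen for illustration and do not come from this repository. The guards let the same file build with a plain host compiler (GCC or Clang assumed for the fallback attribute) as well as with nvcc.

```cpp
#include <cstdint>

#ifdef __CUDACC__
// Under nvcc: keep the function out of line so it shows up on its own in the PTX.
#define GRAPHY_DEVICE __host__ __device__ __noinline__  // hypothetical macro name
#else
// Host fallback (GCC/Clang attribute), only so the sketch can be tested without a GPU.
#define GRAPHY_DEVICE __attribute__((noinline))
#endif

// Hypothetical helper: follow parent links until a root (parent[v] == v) is reached.
GRAPHY_DEVICE std::uint32_t find_root(const std::uint32_t* parent, std::uint32_t v) {
    while (parent[v] != v) {
        v = parent[v];
    }
    return v;
}

#ifdef __CUDACC__
// A kernel has to call the helper; otherwise the compiler may drop it from the PTX.
__global__ void find_kernel(const std::uint32_t* parent, std::uint32_t* out,
                            std::uint32_t n) {
    std::uint32_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = find_root(parent, i);
    }
}
#endif
```

When built with nvcc, the dump should then list find_root as a separate function that the kernel calls, instead of its body being merged into find_kernel.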

Binary analysis on DS::join

See the dump header for the compilation details.

$L__BB3_4:
max.u32 %r19, %r22, %r21;
min.u32 %r22, %r22, %r21;
mul.wide.u32 %rd20, %r19, 4;
add.s64 %rd19, %rd6, %rd20;
//
fence.sc.gpu;
//
//
atom.cas.acquire.gpu.b32 %r21,[%rd19],%r19,%r22;
//
setp.ne.s32 %p4, %r19, %r21;
@%p4 bra $L__BB3_4;
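Read at source level, the loop at $L__BB3_4 looks like a compare-and-swap retry loop that links the larger of two representatives under the smaller one. The sketch below is a host-side C++ model of that reading, not the repository's code: join_model and the parent array are hypothetical names, and the array is assumed to use the usual disjoint-set layout where parent[i] == i marks a root. The pair fence.sc.gpu plus atom.cas.acquire.gpu is consistent with a sequentially consistent device-scope CAS, which is modelled here with std::memory_order_seq_cst; that mapping is an assumption as well.

```cpp
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of the PTX loop. Register names in the
// comments refer to the dump: %r22 ~ lo, %r21 ~ seen, %r19 ~ hi.
inline void join_model(std::vector<std::atomic<std::uint32_t>>& parent,
                       std::uint32_t a, std::uint32_t b) {
    std::uint32_t lo = a;
    std::uint32_t seen = b;
    std::uint32_t hi;
    do {
        hi = std::max(lo, seen);  // max.u32 %r19, %r22, %r21
        lo = std::min(lo, seen);  // min.u32 %r22, %r22, %r21
        seen = hi;                // expected value: hi must still be a root
        // If parent[hi] == hi, store lo there. Either way, seen ends up holding
        // the value that was found in parent[hi], like %r21 after atom.cas.
        parent[hi].compare_exchange_strong(seen, lo, std::memory_order_seq_cst);
    } while (seen != hi);         // setp.ne.s32 + bra: retry with the observed value
}
```

If the CAS fails, hi was no longer a root, so the loop retries with the value it observed there and the smaller representative. Since larger indices are always linked under smaller ones, each retry moves toward a root.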

References

  • Fallin, A., Gonzalez, A., Seo, J., & Burtscher, M. (2023, November). A High-Performance MST Implementation for GPUs. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (pp. 1-13).
