Hi TileRT team!
I'm researching TileRT's performance optimizations. Would it be possible to share either of the following?
- Nsight Systems / Nsight Compute profiling data (.nsys-rep / .ncu-rep files)
- Performance analysis screenshots or reports
This would greatly help me understand how the tile-level scheduling and overlap mechanisms behave in practice.