split ThunderFunction to deallocate grad_outs while computing backward #5399
| Job | Run time |
|---|---|
| | 6s |
| | 15s |
| | 36s |
| | 19s |
| | 2m 47s |
| | 2m 30s |
| | 1m 5s |
| | 49s |
| | 3m 22s |
| | 3m 15s |
| | 1s |
| Total | 15m 5s |