https://www.youtube.com/watch?v=V_xro1bcAuA&t=105s
https://github.com/mrdbourke/pytorch-deep-learning
LaLaRAND implemented only about allocating cpu/gpu layer by layer. (not about quantization/scheduling (later...))
https://www.youtube.com/watch?v=V_xro1bcAuA&t=105s
https://github.com/mrdbourke/pytorch-deep-learning