I am currently pursuing a Master’s degree in Artificial Intelligence Innovation at National Yang Ming Chiao Tung University. I have received multiple honors and scholarships, and I am committed to contributing to advances in ML/AI systems.
📄 Learn more about my achievements and experience on LinkedIn.
Contributed to Liger Kernel, a library of efficient Triton kernels for LLM training that significantly reduce GPU memory usage and improve performance.
- Fixed a dtype mismatch that arose in AMP (automatic mixed precision) training and added test coverage for previously uncovered scenarios in the Liger Kernel's core Fused Linear Cross Entropy functionality. (#501)
- Analyzed the trade-offs of the candidate fixes for the dtype mismatch (#502), comparing their memory usage and latency to select the most efficient solution; a minimal sketch of the underlying issue follows this list.
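
The sketch below illustrates the kind of AMP dtype mismatch described above: under autocast, activations are produced in bf16 while the classifier weight stays in fp32, and a custom fused linear + cross-entropy path that bypasses autocast's implicit casting must reconcile the two dtypes itself. The function name and the cast-down fix shown here are illustrative assumptions only, not the Liger Kernel's actual API or the solution adopted in #501/#502.

```python
# Hypothetical, minimal illustration of an AMP dtype mismatch (not the
# actual Liger Kernel implementation). The hidden states are bf16 while
# the lm_head weight remains fp32, as AMP typically keeps master weights.
import torch
import torch.nn.functional as F

def fused_linear_cross_entropy(hidden, weight, target):
    # One candidate fix: cast the fp32 weight down to the activation dtype
    # so the logits matmul runs in a single, consistent precision.
    # An alternative is upcasting the activations to fp32, which is more
    # accurate but costs extra memory -- the kind of trade-off weighed in #502.
    logits = hidden @ weight.to(hidden.dtype).T
    # Compute the loss reduction in fp32 for numerical stability.
    return F.cross_entropy(logits.float(), target)

hidden_dim, vocab_size, batch = 256, 1000, 8
weight = torch.randn(vocab_size, hidden_dim)                 # fp32 master weight
hidden = torch.randn(batch, hidden_dim).to(torch.bfloat16)   # bf16 activations
target = torch.randint(0, vocab_size, (batch,))

loss = fused_linear_cross_entropy(hidden, weight, target)
print(loss.dtype, loss.item())
```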
A multi-agent LLM workflow that leverages diverse source documents and multiple LLMs to produce comprehensive research reports.
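
As a purely illustrative sketch of this kind of workflow (not the project's actual architecture), one common pattern gives each LLM call a distinct role: per-document summarizer agents feed a writer agent that drafts the final report. The `build_report` and `call_llm` names below are placeholders, with `call_llm` standing in for whatever chat-completion client the project uses.

```python
# Illustrative multi-agent report pipeline: summarizer agents condense each
# source document, then a writer agent merges the summaries into a report.
from typing import Callable, List

def build_report(documents: List[str], call_llm: Callable[[str], str]) -> str:
    # Summarizer agents: condense each source document independently.
    summaries = [
        call_llm(f"Summarize the key findings of this document:\n{doc}")
        for doc in documents
    ]
    # Writer agent: merge the summaries into one comprehensive report.
    merged = "\n\n".join(summaries)
    return call_llm(
        "Write a comprehensive research report based on these summaries, "
        f"noting points of agreement and disagreement:\n{merged}"
    )

if __name__ == "__main__":
    # Stub LLM so the sketch runs without any API key.
    echo_llm = lambda prompt: f"[LLM output for a prompt of {len(prompt)} chars]"
    print(build_report(["document A text", "document B text"], echo_llm))
```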