The final project for the EdgeAI course at NYCU, focused on accelerating Llama-3.2-3B-Instruct inference on a single NVIDIA T4 GPU.
Updated Jun 18, 2025 - Python
An open and practical guide to Edge AI Engineering.