Awesome AI Engineer
A curated roadmap and resource collection for leveling up from AI Engineer to AI Systems Architect, with 13 production-ready portfolio projects to build along the way.
Roadmap • Projects • Learning • Frameworks • MLOps • System Design • Books • Contributing
The AI engineering landscape is evolving fast. New frameworks, models, and patterns emerge weekly. This repo cuts through the noise with:
- A clear career progression from junior AI engineer to systems architect
- 13 hands-on projects covering LLMs, agents, RAG, multi-agent systems, and protocols
- Curated resources: only the best, most relevant material for 2025-2026
- Enterprise focus: production patterns, not just tutorials
AI ENGINEER → SYSTEMS ARCHITECT

LEVEL 1: AI Engineer
├── LLM fundamentals (tokenization, generation, fine-tuning)
├── Prompt engineering & evaluation
├── RAG pipelines & vector databases
├── API design for AI services
└── Projects: #1 #2 #5 #7

LEVEL 2: Senior AI Engineer
├── Agent architectures (ReAct, Plan-and-Execute)
├── Multi-agent orchestration patterns
├── Agentic frameworks (LangGraph, CrewAI, ADK)
├── Streaming & real-time AI systems
└── Projects: #3 #4 #6 #8

LEVEL 3: AI Platform Engineer
├── MLOps & model serving (vLLM, TensorRT, Triton)
├── Kubernetes for AI workloads
├── Observability & evaluation frameworks
├── Protocol design (MCP, A2A, UCP)
└── Projects: #8 #9 #10

LEVEL 4: AI Systems Architect
├── Enterprise agent framework patterns
├── Multi-agent system design at scale
├── Compliance, security & governance for AI
├── Cross-framework interoperability
└── Projects: #11 #12 #13
13 production-ready projects, each in its own repo with Docker setup, tests, CI/CD, and Kubernetes manifests. Clone any project and run it in under 2 minutes.
| # | Project | What You'll Build | Key Skills | Repo |
|---|---------|-------------------|------------|------|
| 1 | LLM Playground | Interactive tokenization & generation explorer | BPE tokenization, sampling strategies, temperature | |
| 2 | Customer Support Chatbot | Production chatbot with fine-tuning | LoRA/PEFT, prompt engineering, evaluation | |
| 3 | Ask-the-Web Agent | Perplexity-like search & synthesis agent | LangGraph, tool use, citation generation | |
| 4 | Deep Research | Multi-strategy reasoning engine | Chain-of-Thought, Tree-of-Thought, inference scaling | |
| 5 | Image Generation | Multi-provider image service | Diffusion models, DALL-E, FLUX, Stable Diffusion | |
| 6 | Capstone Multi-Agent | Supervisor-pattern agent platform | Multi-agent orchestration, task decomposition | |
| 7 | Agent RAG | Hierarchical RAG system | Single/multi/hierarchical retrieval, query decomposition | |
| 8 | MCP & A2A | Protocol implementation | MCP SDK, A2A protocol, AgentCards | |
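Project 1 lists sampling strategies and temperature among its key skills. As a rough illustration of what temperature does to a next-token distribution, here is a minimal standard-library sketch. The function names and the toy logits are assumptions made for this example and are not taken from the project repo.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Toy logits for a three-token vocabulary (illustrative values only).
toy_logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(toy_logits, temperature=0.5)
flat = softmax_with_temperature(toy_logits, temperature=2.0)
```

A lower temperature concentrates probability on the highest-logit token, and a higher temperature spreads it out, so `sharp[0]` is larger than `flat[0]` here.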
Protocol & Commerce (Level 3)
| # | Project | What You'll Build | Key Skills | Repo |
|---|---------|-------------------|------------|------|
| 9 | UCP Merchant Server | Commerce protocol platform | Checkout state machine, AP2 verification, MCP bindings | |
| 10 | UCP Shopping Agent | AI shopping assistant | Multi-merchant discovery, comparison, LangGraph | |
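Project 9 names a checkout state machine as a key skill. The sketch below shows the general pattern: an explicit transition table that rejects illegal moves. The state names and allowed transitions are hypothetical, chosen for illustration, and are not taken from the UCP spec or the project repo.

```python
from enum import Enum


class CheckoutState(Enum):
    # Hypothetical states; the real protocol may define different ones.
    CART = "cart"
    PAYMENT_PENDING = "payment_pending"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


# Allowed transitions (an assumption for this sketch).
TRANSITIONS = {
    CheckoutState.CART: {CheckoutState.PAYMENT_PENDING, CheckoutState.CANCELLED},
    CheckoutState.PAYMENT_PENDING: {CheckoutState.COMPLETED, CheckoutState.CANCELLED},
    CheckoutState.COMPLETED: set(),   # terminal state
    CheckoutState.CANCELLED: set(),   # terminal state
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in the transition table."""


class Checkout:
    def __init__(self):
        self.state = CheckoutState.CART
        self.history = [self.state]

    def transition(self, new_state):
        """Move to new_state if the table allows it, otherwise raise."""
        if new_state not in TRANSITIONS[self.state]:
            raise InvalidTransition(
                f"{self.state.name} -> {new_state.name} is not allowed"
            )
        self.state = new_state
        self.history.append(new_state)
        return self.state


# Usage: walk one checkout through the happy path.
demo = Checkout()
demo.transition(CheckoutState.PAYMENT_PENDING)
demo.transition(CheckoutState.COMPLETED)
```

Keeping the transitions in one table makes the legal paths easy to audit and to test, which matters when payment verification hangs off specific states.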
Enterprise Frameworks (Level 4)
| # | Project | What You'll Build | Key Skills | Repo |
|---|---------|-------------------|------------|------|
| 11 | Compliance Audit Agents | Regulatory compliance system | Microsoft Agent Framework, graph workflows, middleware | |
| 12 | Incident Response ADK | IT incident orchestrator | Google ADK, Sequential/Parallel/Loop agents | |
| 13 | Contract Lifecycle Crew | Contract management platform | CrewAI, role-based agents, Flow orchestration | |
Quick Start (Any Project)
# Clone any project
git clone https://github.com/samuelvinay91/<project-name>.git
cd <project-name>
# Option 1: Docker (recommended)
docker compose up --build
# Option 2: Local
pip install -e ".[dev]"
uvicorn src.<package>.main:app --reload --port <port>
# Run tests
pytest tests/ -v
Foundational Learning
Agent Frameworks & Tools

| Framework | Best For | Maintainer | Link |
|-----------|----------|------------|------|
| LangGraph | Complex stateful agent workflows with cycles | LangChain | Docs |
| CrewAI | Role-based multi-agent collaboration | CrewAI | Docs |
| Google ADK | Agent Development Kit with sequential/parallel patterns | Google | Docs |
| Microsoft Agent Framework | Enterprise agents with middleware pipelines | Microsoft | GitHub |
| OpenAI Agents SDK | Simple agent loops with handoffs | OpenAI | Docs |
| LlamaIndex | Data-centric RAG and agent workflows | LlamaIndex | Docs |
| AutoGen | Multi-agent conversation patterns | Microsoft | Docs |
| Semantic Kernel | Enterprise AI orchestration for .NET/Python/Java | Microsoft | Docs |
Agent Protocols & Standards
| Protocol | Purpose | Link |
|----------|---------|------|
| MCP (Model Context Protocol) | Standardized tool/resource access for LLMs | Spec |
| A2A (Agent-to-Agent) | Inter-agent communication protocol by Google | Spec |
| UCP (Universal Commerce Protocol) | Agentic commerce transactions | Spec |
| OpenAI Function Calling | Structured tool use for LLMs | Docs |
Orchestration & Deployment
| Tool | Purpose | Link |
|------|---------|------|
| LangSmith | LLM observability, tracing, and evaluation | Docs |
| Weights & Biases Weave | AI application tracing and evaluation | Docs |
| Arize Phoenix | Open-source LLM observability | GitHub |
| Braintrust | Eval, logging, and prompt playground | Docs |
MLOps & Infrastructure

| Tool | Purpose | Link |
|------|---------|------|
| vLLM | High-throughput LLM serving with PagedAttention | GitHub |
| TensorRT-LLM | NVIDIA optimized LLM inference | GitHub |
| Triton Inference Server | Multi-framework model serving at scale | GitHub |
| Ollama | Run LLMs locally with one command | GitHub |
| llama.cpp | CPU/GPU inference for LLMs in C++ | GitHub |
| SGLang | Fast serving framework for LLMs and VLMs | GitHub |
ML Platforms & Experiment Tracking
| Tool | Purpose | Link |
|------|---------|------|
| MLflow | Experiment tracking, model registry, deployment | Docs |
| Weights & Biases | Experiment tracking, dataset versioning | Docs |
| Ray | Distributed compute for ML training and serving | Docs |
| DVC | Data version control and ML pipelines | Docs |
| Kubeflow | ML workflows on Kubernetes | Docs |
| Database | Strengths | Link |
|----------|-----------|------|
| Qdrant | Rust-based, fast, rich filtering | Docs |
| Pinecone | Fully managed, serverless option | Docs |
| Weaviate | Multi-modal, GraphQL API | Docs |
| ChromaDB | Simple, lightweight, great for prototyping | Docs |
| Milvus | Scalable, GPU-accelerated similarity search | Docs |
| pgvector | PostgreSQL extension: use your existing DB | GitHub |
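All of the databases above answer the same core query: given an embedding, return the stored vectors most similar to it. As a point of reference before picking one, here is a brute-force sketch of that operation in plain Python. The function names and the two-dimensional toy vectors are assumptions for illustration; production systems use approximate indexes rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Score every stored vector against the query and return the best k."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index with made-up 2-D "embeddings".
toy_index = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.7, 0.7],
    "doc_c": [0.0, 1.0],
}
results = top_k([1.0, 0.1], toy_index, k=2)
```

With this toy data the query points mostly along the first axis, so `doc_a` ranks first and `doc_b` second.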
| Tool | Purpose | Link |
|------|---------|------|
| Hugging Face TRL | RLHF, DPO, SFT training | Docs |
| Unsloth | 2-5x faster fine-tuning with 80% less memory | GitHub |
| Axolotl | Streamlined fine-tuning with YAML configs | GitHub |
| PEFT | Parameter-efficient fine-tuning (LoRA, QLoRA) | Docs |
| LitGPT | Pretrain, fine-tune, deploy LLMs | GitHub |
System Design for AI

| Resource | Description | Link |
|----------|-------------|------|
| Chip Huyen: AI Engineering | Definitive guide to building AI applications | Book |
| Designing Machine Learning Systems | ML system design end-to-end (Chip Huyen) | Book |
| Eugene Yan: Applied ML | Practical patterns for production ML | Blog |
| The AI Engineer's Handbook | Architecture patterns for LLM applications | Newsletter |
| System Design for ML | Interview prep meets real-world ML design | GitHub |
| Tool | Purpose | Link |
|------|---------|------|
| RAGAS | RAG evaluation framework | Docs |
| DeepEval | LLM evaluation with 14+ metrics | GitHub |
| Promptfoo | LLM output testing and red-teaming | Docs |
| Inspect AI | AI safety evaluations by AISI | GitHub |
| Giskard | ML model testing and vulnerability scanning | GitHub |
Cloud & Kubernetes for AI

| Provider | Key Services | Link |
|----------|--------------|------|
| AWS | Bedrock (managed LLMs), SageMaker (training/serving), Inferentia | Docs |
| Google Cloud | Vertex AI, Gemini API, Cloud TPUs, GKE for ML | Docs |
| Azure | Azure OpenAI Service, Azure ML, AKS for AI | Docs |
Kubernetes for AI Workloads
| Resource | Description | Link |
|----------|-------------|------|
| Kubeflow | End-to-end ML platform on Kubernetes | Docs |
| KServe | Standardized ML model serving on K8s | Docs |
| NVIDIA GPU Operator | Automated GPU management in K8s | Docs |
| Ray on Kubernetes | Distributed ML compute on K8s with KubeRay | Docs |
| Kustomize | Template-free K8s configuration management | Docs |
| Book | Author | Why It Matters |
|------|--------|----------------|
| AI Engineering | Chip Huyen | The book for building LLM applications in production (2025) |
| Designing Machine Learning Systems | Chip Huyen | End-to-end ML system design: the architect's bible |
| Build a Large Language Model (From Scratch) | Sebastian Raschka | Deep understanding of transformer internals |
| Hands-On Large Language Models | Jay Alammar, Maarten Grootendorst | Practical LLM patterns with code |
| Natural Language Processing with Transformers | Lewis Tunstall et al. | HuggingFace-centric NLP from the team that built it |
| Book | Author | Why It Matters |
|------|--------|----------------|
| Generative Deep Learning (2nd Ed) | David Foster | Comprehensive generative AI: VAEs, GANs, diffusion, transformers |
| Deep Learning | Ian Goodfellow et al. | The foundational theory reference |
| Designing Data-Intensive Applications | Martin Kleppmann | System design fundamentals, essential for AI architects |
| Building Microservices (2nd Ed) | Sam Newman | Service architecture patterns used in AI platforms |
| The Staff Engineer's Path | Tanya Reilly | Leadership and influence for senior technical roles |
Blogs, Newsletters & Podcasts

| Blog | Author | Focus |
|------|--------|-------|
| Lil'Log | Lilian Weng (OpenAI) | Deep technical surveys on AI topics |
| Chip Huyen's Blog | Chip Huyen | ML systems, AI engineering, industry trends |
| Eugene Yan | Eugene Yan (Amazon) | Applied ML, RecSys, production patterns |
| Jay Alammar | Jay Alammar | Visual explanations of transformers and LLMs |
| Sebastian Raschka | Sebastian Raschka | LLM research, fine-tuning, practical AI |
| Simon Willison | Simon Willison | LLM tools, prompt engineering, practical AI |
| Hamel Husain | Hamel Husain | MLOps, LLM evaluation, practical engineering |
| Newsletter | Description | Link |
|------------|-------------|------|
| The Batch | Andrew Ng's weekly AI news digest | Subscribe |
| Ahead of AI | Sebastian Raschka's research roundup | Subscribe |
| The AI Engineer | Swyx's newsletter on AI engineering | Subscribe |
| Interconnects | Nathan Lambert on RLHF, alignment, and LLMs | Subscribe |
| AI Tidbits | Sahar Mor's weekly AI news | Subscribe |
| Podcast | Description | Link |
|---------|-------------|------|
| Latent Space | The AI engineer podcast: deep technical interviews | Listen |
| Practical AI | Real-world AI/ML applications and tools | Listen |
| TWIML AI | This Week in Machine Learning & AI | Listen |
| Gradient Dissent | Weights & Biases podcast on ML engineering | Listen |
| Lex Fridman Podcast | Long-form interviews with AI researchers | Listen |
Open Source Models & Datasets

| Model | Developer | Strengths | Link |
|-------|-----------|-----------|------|
| Llama 3/4 | Meta | Best open-weight general-purpose LLMs | HuggingFace |
| Mistral / Mixtral | Mistral AI | Excellent efficiency, MoE architecture | HuggingFace |
| Gemma 2/3 | Google | Strong small models (2B-27B) | HuggingFace |
| Qwen 2.5/3 | Alibaba | Competitive multilingual models | HuggingFace |
| DeepSeek V3/R1 | DeepSeek | Strong reasoning, open-weight | HuggingFace |
| Phi-3/4 | Microsoft | Best-in-class small language models | HuggingFace |
| FLUX | Black Forest Labs | State-of-the-art open image generation | HuggingFace |
| Resource | Purpose | Link |
|----------|---------|------|
| Hugging Face Hub | Largest open model & dataset repository | Hub |
| MMLU / MMLU-Pro | Massive multitask language understanding benchmark | Paper |
| HumanEval / SWE-bench | Code generation benchmarks | GitHub |
| LMSYS Chatbot Arena | Crowdsourced LLM comparison leaderboard | Leaderboard |
| Open LLM Leaderboard | HuggingFace's open model rankings | Leaderboard |
AI Safety & Governance

| Resource | Description | Link |
|----------|-------------|------|
| OWASP Top 10 for LLMs | Security risks in LLM applications | Docs |
| NIST AI Risk Management Framework | Government framework for AI risk management | Docs |
| Anthropic Research | AI safety research and responsible scaling | Blog |
| EU AI Act | European regulation for AI systems | Overview |
| AI Alignment Forum | Technical AI safety research discussion | Forum |
Communities & Conferences

| Community | Platform | Link |
|-----------|----------|------|
| Hugging Face | Discord + Forums | Join |
| LangChain | Discord | Join |
| MLOps Community | Slack | Join |
| r/MachineLearning | Reddit | Visit |
| r/LocalLLaMA | Reddit | Visit |
| Latent Space | Discord | Join |
| Conference | Focus | Link |
|------------|-------|------|
| NeurIPS | Top ML research conference | Site |
| ICML | International Conference on ML | Site |
| AI Engineer Summit | Applied AI engineering | Site |
| KubeCon | Cloud-native + AI infrastructure | Site |
| Tool | Purpose | Link |
|------|---------|------|
| Claude Code | AI coding assistant in the terminal | Docs |
| Cursor | AI-first code editor | Site |
| Continue | Open-source AI code assistant | GitHub |
| uv | Fast Python package manager (10-100x faster than pip) | GitHub |
| Ruff | Extremely fast Python linter and formatter | GitHub |
| Docker | Containerization for reproducible AI environments | Docs |
Learning Path (12-Week Plan)
For those who want a structured approach:
| Week | Focus | Projects | Resources |
|------|-------|----------|-----------|
| 1-2 | LLM Fundamentals | #1 LLM Playground | Karpathy's Zero to Hero, HF NLP Course |
| 3-4 | Prompt Engineering & Fine-Tuning | #2 Customer Support Chatbot | Anthropic/OpenAI prompt guides, PEFT docs |
| 5-6 | Agents & Tool Use | #3 Ask-the-Web, #4 Deep Research | LangGraph docs, DeepLearning.AI courses |
| 7-8 | RAG & Multi-Agent Systems | #6 Capstone, #7 Agent RAG | LlamaIndex docs, RAG survey papers |
| 9 | Protocols & Interop | #8 MCP & A2A | MCP spec, A2A spec |
| 10 | Commerce & Real-World AI | #9 UCP Merchant, #10 Shopping Agent | UCP spec, state machine design |
| 11 | Enterprise Frameworks | #11 Compliance, #12 Incident Response | MS Agent Framework, Google ADK docs |
| 12 | Advanced Patterns & Portfolio | #13 Contract Lifecycle, polish portfolio | CrewAI docs, system design resources |
Contributions welcome! This list is community-maintained.
1. Fork this repository
2. Add your resource in the appropriate category
3. Submit a pull request with a clear description

- Resources must be high-quality and actively maintained
- Prefer free/open-source resources, but paid resources are OK if they're exceptional
- Each entry needs a working link and brief description
- Follow the existing table format
- No duplicates: check existing entries first
See CONTRIBUTING.md for detailed guidelines.
If you find this useful, give it a star! It helps others discover these resources.
This work is licensed under CC0 1.0 Universal. To the extent possible under law, the author has waived all copyright and related rights to this work.
Built by samuelvinay91, from the AI Engineer Portfolio project