This notebook demonstrates how to build a simple function-calling AI agent from scratch using the OpenAI-compatible API format and a local vLLM inference server. It shows how to structure prompts, parse model outputs, and route function calls.
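As a preview of the core pattern, here is a minimal sketch of a single agent turn against an OpenAI-compatible endpoint. The base URL, the model name, and the `get_weather` tool are illustrative assumptions, not fixed by the notebook; the full agent built later may structure these differently.

```python
import json
from openai import OpenAI

# Point the OpenAI client at the local vLLM server (assumed default port 8000).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# A single illustrative tool schema; the notebook may define different functions.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    return f"It is sunny in {city}."  # stub implementation for the sketch

FUNCTIONS = {"get_weather": get_weather}  # routing table: tool name -> callable

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
response = client.chat.completions.create(
    model="Salesforce/xLAM-2-3b-fc-r",  # assumes the launch command below
    messages=messages,
    tools=tools,
)
msg = response.choices[0].message

# If the model requested a tool call, parse its JSON arguments and route it.
if msg.tool_calls:
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        result = FUNCTIONS[call.function.name](**args)
        messages.append(
            {"role": "tool", "tool_call_id": call.id, "content": result}
        )
    # Send the tool result back so the model can produce a final answer.
    final = client.chat.completions.create(
        model="Salesforce/xLAM-2-3b-fc-r",
        messages=messages,
    )
    print(final.choices[0].message.content)
else:
    print(msg.content)
```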
NOTE: Before running the notebook, launch a vLLM server in a separate terminal using your model of choice:
vllm serve <model>
I am using Salesforce/xLAM-2-3b-fc-r because of its small size and its high ranking on the Berkeley Function-Calling Leaderboard. The model runs locally on my NVIDIA RTX 3060.
vLLM launch command:
vllm serve Salesforce/xLAM-2-3b-fc-r --enable-auto-tool-choice --tool-parser-plugin ./xlam_tool_call_parser.py --tool-call-parser xlam --tensor-parallel-size 1 --dtype float16 --gpu-memory-utilization 0.8
The full vLLM launch instructions for this particular model can be found in the Using vLLM for Inference section of the model's Hugging Face page.
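Once the server is up, a quick way to confirm it is reachable before running the notebook is to list the served models. This sketch assumes vLLM's default port 8000; adjust the base URL if you changed it.

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API at /v1 on port 8000 by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
print([m.id for m in client.models.list().data])
# Expected to include "Salesforce/xLAM-2-3b-fc-r" if the launch command above succeeded.
```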