Convert any model into an R1-like reasoning agent. agentgym leverages TRL, Hugging Face, and various other libraries. This is a work in progress; our goal is to make it easy to train any model into a reasoning agent.
```sh
pip3 install -U agentgym
```
```python
from agentgym.r1_pipeline import R1Pipeline, SFTConfig

r1_pipeline = R1Pipeline(
    sft_model="Qwen/Qwen2-0.5B-Instruct",
    tokenizer_name="Qwen/Qwen2-0.5B-Instruct",
    sft_dataset="trl-lib/tldr",
    sft_args=SFTConfig(output_dir="/tmp"),
    only_grpo=True,
    model_name="Qwen/Qwen2-0.5B-Instruct",
)
r1_pipeline.run()
```
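Conceptually, `run()` chains the training stages in order. The sketch below is a hypothetical stand-in, not agentgym's actual implementation, and it assumes `only_grpo=True` means the SFT stage is skipped:

```python
# Hypothetical stand-in for the stage chaining inside R1Pipeline.run().
# Each Stage here is a placeholder for a real trainer (SFT or GRPO).
class Stage:
    def __init__(self, name):
        self.name = name

    def train(self, model):
        # A real stage would fine-tune and return updated weights;
        # here we just record that the stage ran.
        return f"{model}+{self.name}"

def run_pipeline(model, only_grpo=False):
    stages = [Stage("sft"), Stage("grpo")]
    if only_grpo:
        # Assumption: only_grpo skips supervised fine-tuning entirely.
        stages = stages[1:]
    for stage in stages:
        model = stage.train(model)
    return model

print(run_pipeline("base"))        # base+sft+grpo
print(run_pipeline("base", True))  # base+grpo
```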
The architecture is as follows:
- SFT: Supervised Fine-Tuning
- GRPO: Group Relative Policy Optimization

model -> sft -> grpo -> reasoning model
```mermaid
graph TD;
    A[model] --> B[sft]
    B --> C[grpo]
    C --> D[reasoning model]
```
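The core idea behind GRPO can be sketched without any training code: several completions are sampled per prompt, scored by a reward function, and each completion's advantage is its reward standardized against the other completions in the same group. This is a minimal illustrative sketch; the function name is not part of agentgym's API:

```python
# Minimal sketch of the group-relative advantage GRPO is built on.
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against the other rewards in its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions for one prompt, scored by some reward function:
# the best completion gets a positive advantage, the worst a negative one.
advantages = group_relative_advantages([1.0, 0.5, 0.0, 0.5])
```

Completions with above-average reward are reinforced and below-average ones are penalized, without needing a separate value model.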
Licensed under the MIT License.