AgileCoder is a novel multi-agent framework for software development that draws inspiration from the widely-used Agile Methodology in professional software engineering. The key innovation lies in its task-oriented approach, where instead of assigning fixed roles to agents, AgileCoder mimics real-world software development by creating a backlog of tasks and dividing the development process into sprints, with the backlog being dynamically updated at each sprint.
Website: https://fsoft-ai4code.github.io/agilecoder/
AgileCoder can be installed easily through pip:
pip install agilecoder
If you want to clone the repository, be sure to switch to the new-flow branch to access the latest source code.
AgileCoder currently supports various models, including OpenAI, Azure OpenAI, Anthropic, and your self-hosted models, which can be hosted by Ollama. For instance, to configure the necessary environment variables for the Azure OpenAI service, please set the following:
- API_KEY: Your Azure OpenAI API key.
- RESOURCE_ENDPOINT: The endpoint URL for your Azure OpenAI resource.
- API_TYPE: The type of Azure OpenAI API you are using (e.g., "azure").
- API_VERSION: The version of the Azure OpenAI API you are using (e.g., "2022-12-01").
- API_ENGINE: The name of the Azure OpenAI engine you want to use (e.g., "text-davinci-002").
You can set these environment variables either in your system settings or by creating a .env
file in the project root directory with the following format:
API_KEY=your_api_key
RESOURCE_ENDPOINT=your_resource_endpoint
API_TYPE=azure
API_VERSION=your_api_version
API_ENGINE=your_api_engine
For other models, please refer to setup_model.md
To generate software using AgileCoder, use the following command:
agilecoder --task "<your software requirements>"
Replace <your software requirements>
with a description of the software you want AgileCoder to create.
For example, to generate a Caro game in Python, run:
agilecoder --task "Create a Caro game in Python"
AgileCoder will process your requirements and generate the corresponding software based on the provided task description.
You can specify additional options and flags to customize the behavior of AgileCoder. For more information on the available options, run:
agilecoder --help
This will display the help message with a list of supported options and their descriptions. Feel free to explore different software requirements and experiment with AgileCoder to generate various types of software projects tailored to your needs.
We evaluate the performance of AgileCoder on two types of datasets to assess its effectiveness in generating code for different scenarios:
- Competitive Programming: HumanEval and MBPP
- Complex Software Requirements: We curate a new dataset called ProjectDev, which contains complex software requirements for generating complete software projects. This dataset will be designed to evaluate AgileCoder's ability to handle more intricate and real-world software development scenarios.
Category | Model | HumanEval | MBPP |
---|---|---|---|
LLMs (prompting) | |||
CodeGeeX-13B | 18.9 | 26.9 | |
PaLM Coder-540B | 43.9 | 32.3 | |
DeepSeeker-33B-Inst | 79.3 | 70.0 | |
GPT-3.5 Turbo | 60.3 | 52.2 | |
Claude 3 Haiku | 75.9 | 80.4 | |
GPT 4 | 80.1 | 80.1 | |
LLMs-based Agents | |||
with GPT-3.5 Turbo | ChatDev | 61.79 | 74.80 |
MetaGPT | 62.80 | 74.73 | |
AgileCoder | 70.53 | 80.92 | |
with Claude 3 Haiku | ChatDev | 76.83 | 70.96 |
AgileCoder | 79.27 | 84.31 | |
with GPT 4 | MetaGPT | 85.9 | 87.7 |
AgileCoder | 90.85 | - |
For ProjectDev, we evaluate the practical application of software projects generated by AgileCoder, ChatDev, and MetaGPT. The evaluation will involve human assessment to compare their performance with 3 criterias:
- Human evaluators will assess the executability of the generated software projects against the expected requirements specified in the ProjectDev dataset.
- For each generated software project, the evaluators will determine whether it is executable and meets the specified requirements.
- The success rate will be calculated as the percentage of requirements met by the executable software projects (e.g., if a generated program is executable and meets 4 out of 10 requirements, its executability rate is 40%).
Metric | ChatDev | MetaGPT | AgileCoder |
---|---|---|---|
Executability | 32.79 | 7.73 | 57.79 |
Entire Running Time (s) | 120 | 48 | 444 |
Avg. Time/Sprint (s) | - | - | 306 |
#Sprints | - | - | 1.64 |
Token Usage | 7440 | 3029 | 36818 |
Expenses (USD) | 0.12 | 0.02 | 0.44 |
#Errors | 6 | 32 | 0 |
AgileCoder can generate a wide range of software with high accuracy; a gallery of executable software will be available soon.
More details can be found in screenshots
More details can be found in our paper.
If you're using AgileCoder in your research or applications, please cite using this BibTeX:
@article{nguyen2024agilecoder,
title={AgileCoder: Dynamic Collaborative Agents for Software Development based on Agile Methodology},
author={Nguyen, Minh Huynh and Chau, Thang Phan and Nguyen, Phong X and Bui, Nghi DQ},
journal={arXiv preprint arXiv:2406.11912},
year={2024}
}
If you have any questions, comments or suggestions, please do not hesitate to contact us.
- Website: fpt-aicenter
- Email: bdqnghi@gmail.com