This project is a Django-based web application designed for evaluating and testing AI prompts across different models and APIs, including OpenAI, Fireworks AI, and Google APIs.

Requirements:

- Python 3.12+
- Django 5.1.5
- Various API integrations (see requirements.txt)
Installation:

- Clone the repository:
  ```bash
  git clone https://github.com/yourusername/prompt_eval.git
  cd prompt_eval
  ```
- Create and activate a virtual environment:
  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Set up environment variables:
  Create a `.env` file in the project root with the following variables (a sketch of how the Django settings can read these values follows the setup steps):
  ```
  DEBUG=True

  # API keys for external services
  OPENAI_API_KEY=<Add your Key>
  DEEPSEEK_API_KEY=<Add your Key>
  FIREWORKS_API=<Add your Key>

  # Model endpoint URLs for uniformity
  DEEPSEEK_API_URL=https://api.deepseek.com
  FIREWORKS_API_URL=https://api.fireworks.ai/inference/v1/chat/completions

  # OpenAI API endpoint (default)
  OPENAI_API_URL=https://api.openai.com/v1
  ```
- Create the cache folder used by the Convert_to_json endpoint:
  ```bash
  mkdir ./eval/static/converted_jsons
  ```
- Run migrations:
  ```bash
  python manage.py migrate
  ```
- Start the development server:
  ```bash
  python manage.py runserver
  ```
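
Once the `.env` file from the setup steps is in place, the Django settings need to expose those values to the application. Below is a minimal sketch of how `coreproject/settings.py` might load them, assuming `python-dotenv` is among the dependencies; the exact settings layout is an assumption, not the project's actual code.

```python
# coreproject/settings.py (sketch; assumes python-dotenv is installed)
import os
from pathlib import Path

from dotenv import load_dotenv

BASE_DIR = Path(__file__).resolve().parent.parent

# Load variables from the .env file in the project root
load_dotenv(BASE_DIR / ".env")

DEBUG = os.getenv("DEBUG", "False") == "True"

# API keys and endpoints used by the evaluation views
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
DEEPSEEK_API_KEY = os.getenv("DEEPSEEK_API_KEY")
FIREWORKS_API = os.getenv("FIREWORKS_API")

DEEPSEEK_API_URL = os.getenv("DEEPSEEK_API_URL", "https://api.deepseek.com")
FIREWORKS_API_URL = os.getenv(
    "FIREWORKS_API_URL", "https://api.fireworks.ai/inference/v1/chat/completions"
)
OPENAI_API_URL = os.getenv("OPENAI_API_URL", "https://api.openai.com/v1")
```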

The application integrates with the following AI providers:

- OpenAI Integration: Leverage OpenAI's models for prompt testing and evaluation
- Fireworks AI: Alternative AI model provider for comparative testing
- Google APIs: Integration with Google services for additional functionality

Key features:

- Prompt testing across multiple AI models
- Performance comparison between different AI providers
- Response evaluation metrics
- User management system
- Export and sharing of results

Example: sending a prompt to OpenAI (the client reads `OPENAI_API_KEY` from the environment):

```python
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a short poem about AI"}],
)
print(response.choices[0].message.content)
```

Example: the same prompt via Fireworks AI (this assumes the `fireworks-ai` SDK and one of its hosted Mixtral models; adjust the model ID to whatever your account exposes, and pass `api_key=...` explicitly if the key is not picked up from your environment):

```python
from fireworks.client import Fireworks

client = Fireworks()
response = client.chat.completions.create(
    model="accounts/fireworks/models/mixtral-8x7b-instruct",
    messages=[{"role": "user", "content": "Write a short poem about AI"}],
)
print(response.choices[0].message.content)
```
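
For the performance-comparison feature, the same prompt can be sent to both providers and timed. A rough sketch follows; the model names and timing approach are illustrative, not the project's actual evaluation code.

```python
# Sketch: send one prompt to both providers and compare latency (illustrative only)
import time

from fireworks.client import Fireworks
from openai import OpenAI

PROMPT = [{"role": "user", "content": "Write a short poem about AI"}]

providers = {
    "openai": (OpenAI(), "gpt-4"),
    "fireworks": (Fireworks(), "accounts/fireworks/models/mixtral-8x7b-instruct"),
}

for name, (client, model) in providers.items():
    start = time.perf_counter()
    response = client.chat.completions.create(model=model, messages=PROMPT)
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.2f}s\n{response.choices[0].message.content}\n")
```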

- Create a superuser:
  ```bash
  python manage.py createsuperuser
  ```
- Access the admin interface at http://localhost:8000/admin/
- Create and manage prompts, evaluations, and results
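
Managing prompts, evaluations, and results through the admin assumes the app's models are registered. A hypothetical `eval/admin.py` sketch is shown below; the model and field names (`Prompt`, `Evaluation`, `Result`, `text`, `created_at`) are placeholders, not necessarily the project's actual models.

```python
# eval/admin.py (sketch; model and field names are placeholders)
from django.contrib import admin

from .models import Prompt, Evaluation, Result  # hypothetical models


@admin.register(Prompt)
class PromptAdmin(admin.ModelAdmin):
    list_display = ("id", "text", "created_at")  # assumed fields
    search_fields = ("text",)


admin.site.register(Evaluation)
admin.site.register(Result)
```
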
The application is configured to be deployed with Gunicorn and can be served behind Nginx or similar web servers.
For production deployment:

```bash
gunicorn coreproject.wsgi:application --bind 0.0.0.0:8000
```
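
Gunicorn can also read its settings from a `gunicorn.conf.py` file instead of command-line flags. A minimal sketch, with illustrative worker and timeout values:

```python
# gunicorn.conf.py (sketch; tune values for your server)
wsgi_app = "coreproject.wsgi:application"
bind = "0.0.0.0:8000"
workers = 3      # rule of thumb: 2 * CPU cores + 1
timeout = 120    # generous timeout for slow model API calls
accesslog = "-"  # log requests to stdout
```

Gunicorn picks up a `gunicorn.conf.py` in the working directory automatically, so running `gunicorn` with no arguments would then serve the app.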

Contributions are welcome! Please feel free to submit a Pull Request.

This project is exclusively licensed to Turing.com. It is NOT an Apache 2.0 licensed project.