This project is a Django-based web application designed for evaluating and testing AI prompts across different models and APIs, including OpenAI, Fireworks AI, and Google APIs.

Requirements:

- Python 3.12+
- Django 5.1.5
- Various API integrations (see requirements.txt)
Installation:

- Clone the repository:
  ```bash
  git clone https://github.com/yourusername/prompt_eval.git
  cd prompt_eval
  ```
- Create and activate a virtual environment:
  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Set up environment variables:
  Create a `.env` file in the project root with the following variables (a sketch of how the Django settings can read these values follows the setup steps):
  ```
  DEBUG=True

  # API keys for external services
  OPENAI_API_KEY=<Add your Key>
  DEEPSEEK_API_KEY=<Add your Key>
  FIREWORKS_API=<Add your Key>

  # Model endpoint URLs for uniformity
  DEEPSEEK_API_URL=https://api.deepseek.com
  FIREWORKS_API_URL=https://api.fireworks.ai/inference/v1/chat/completions

  # OpenAI API endpoint (default)
  OPENAI_API_URL=https://api.openai.com/v1
  ```
- Create the cache folder used by the Convert_to_json endpoint:
  ```bash
  mkdir ./eval/static/converted_jsons
  ```
- Run migrations:
  ```bash
  python manage.py migrate
  ```
- Start the development server:
  ```bash
  python manage.py runserver
  ```
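
Once the `.env` file from the setup steps is in place, the Django settings need to expose those values to the application. Below is a minimal sketch of how `coreproject/settings.py` might load them, assuming `python-dotenv` is among the dependencies; the exact settings layout is an assumption, not the project's actual code.

```python
# coreproject/settings.py (sketch; assumes python-dotenv is installed)
import os
from pathlib import Path

from dotenv import load_dotenv

BASE_DIR = Path(__file__).resolve().parent.parent

# Load variables from the .env file in the project root
load_dotenv(BASE_DIR / ".env")

DEBUG = os.getenv("DEBUG", "False") == "True"

# API keys and endpoints used by the evaluation views
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
DEEPSEEK_API_KEY = os.getenv("DEEPSEEK_API_KEY")
FIREWORKS_API = os.getenv("FIREWORKS_API")

DEEPSEEK_API_URL = os.getenv("DEEPSEEK_API_URL", "https://api.deepseek.com")
FIREWORKS_API_URL = os.getenv(
    "FIREWORKS_API_URL", "https://api.fireworks.ai/inference/v1/chat/completions"
)
OPENAI_API_URL = os.getenv("OPENAI_API_URL", "https://api.openai.com/v1")
```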

The application integrates with the following AI providers:

- OpenAI Integration: Leverage OpenAI's models for prompt testing and evaluation
- Fireworks AI: Alternative AI model provider for comparative testing
- Google APIs: Integration with Google services for additional functionality

Key features:

- Prompt testing across multiple AI models
- Performance comparison between different AI providers
- Response evaluation metrics
- User management system
- Export and sharing of results

Example: sending a prompt to OpenAI (the client reads `OPENAI_API_KEY` from the environment):

```python
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a short poem about AI"}],
)
print(response.choices[0].message.content)
```

Example: the same prompt via Fireworks AI (this assumes the `fireworks-ai` SDK and one of its hosted Mixtral models; adjust the model ID to whatever your account exposes, and pass `api_key=...` explicitly if the key is not picked up from your environment):

```python
from fireworks.client import Fireworks

client = Fireworks()
response = client.chat.completions.create(
    model="accounts/fireworks/models/mixtral-8x7b-instruct",
    messages=[{"role": "user", "content": "Write a short poem about AI"}],
)
print(response.choices[0].message.content)
```
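
For the performance-comparison feature, the same prompt can be sent to both providers and timed. A rough sketch follows; the model names and timing approach are illustrative, not the project's actual evaluation code.

```python
# Sketch: send one prompt to both providers and compare latency (illustrative only)
import time

from fireworks.client import Fireworks
from openai import OpenAI

PROMPT = [{"role": "user", "content": "Write a short poem about AI"}]

providers = {
    "openai": (OpenAI(), "gpt-4"),
    "fireworks": (Fireworks(), "accounts/fireworks/models/mixtral-8x7b-instruct"),
}

for name, (client, model) in providers.items():
    start = time.perf_counter()
    response = client.chat.completions.create(model=model, messages=PROMPT)
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.2f}s\n{response.choices[0].message.content}\n")
```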

- Create a superuser:
  ```bash
  python manage.py createsuperuser
  ```
- Access the admin interface at http://localhost:8000/admin/
- Create and manage prompts, evaluations, and results
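
Managing prompts, evaluations, and results through the admin assumes the app's models are registered. A hypothetical `eval/admin.py` sketch is shown below; the model and field names (`Prompt`, `Evaluation`, `Result`, `text`, `created_at`) are placeholders, not necessarily the project's actual models.

```python
# eval/admin.py (sketch; model and field names are placeholders)
from django.contrib import admin

from .models import Prompt, Evaluation, Result  # hypothetical models


@admin.register(Prompt)
class PromptAdmin(admin.ModelAdmin):
    list_display = ("id", "text", "created_at")  # assumed fields
    search_fields = ("text",)


admin.site.register(Evaluation)
admin.site.register(Result)
```
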
The application is configured to be deployed with Gunicorn and can be served behind Nginx or similar web servers.
For production deployment:

```bash
gunicorn coreproject.wsgi:application --bind 0.0.0.0:8000
```
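
Gunicorn can also read its settings from a `gunicorn.conf.py` file instead of command-line flags. A minimal sketch, with illustrative worker and timeout values:

```python
# gunicorn.conf.py (sketch; tune values for your server)
wsgi_app = "coreproject.wsgi:application"
bind = "0.0.0.0:8000"
workers = 3      # rule of thumb: 2 * CPU cores + 1
timeout = 120    # generous timeout for slow model API calls
accesslog = "-"  # log requests to stdout
```

Gunicorn picks up a `gunicorn.conf.py` in the working directory automatically, so running `gunicorn` with no arguments would then serve the app.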

Contributions are welcome! Please feel free to submit a Pull Request.

This project is exclusively licensed to Turing.com. It is NOT an Apache 2.0 licensed project.