Skip to content

Workflow-based SQL generation Tool integration for database operations Memory management with checkpointing

License

Notifications You must be signed in to change notification settings

Joe-ElM/Project-Text2SQL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

2 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Text2SQL Agent

A professional AI-powered application that converts natural language questions into SQL queries and executes them against your database. Built with LangGraph, Streamlit, and OpenAI's GPT models.

Text2SQL Agent Demo Streamlit App Python Streamlit License

πŸ”— Try the Live Demo - Transform your questions into SQL queries instantly!

Note: You'll need your own OpenAI API key to use the application. This ensures data privacy and cost control.

Features

Core Functionality

  • Natural Language to SQL: Convert questions into optimized SQL queries
  • Multi-Database Support: SQLite, PostgreSQL, MySQL with schema selection
  • AI-Powered Analysis: GPT-4o and GPT-4o-mini model selection
  • Memory & Context: Conversational interface with chat history
  • Real-time Execution: Execute queries and display results instantly

User Experience

  • Clean UI: Professional Streamlit interface with sidebar controls
  • API Key Required: Users provide their own OpenAI key for privacy and cost control
  • Database Upload: Upload SQLite files or connect via connection strings
  • SQL Display: View generated SQL queries alongside natural language responses
  • Chat History: Persistent conversation with visual icons

Technical Features

  • LangGraph Workflow: Structured agent with tool integration
  • Database Agnostic Tools: SQLAlchemy-based tools work across all databases
  • Memory Management: Built-in conversation memory with checkpointing
  • Error Handling: Comprehensive error management and user feedback

Installation

Prerequisites

  • Python 3.8 or higher
  • OpenAI API key (optional - app provides fallback)

Local Setup

  1. Clone the repository

    git clone https://github.com/Joe-ElM/Project-Text2SQL-.git
    cd Project-Text2SQL-
  2. Install dependencies

    pip install -r requirements.txt
  3. Configure environment

    cp .env.example .env
    # Edit .env with your API keys (optional)
  4. Run the application

    streamlit run app/app.py

Configuration

Environment Variables (.env)

OPENAI_API_KEY=your_openai_api_key_here
DB_TYPE=sqlite
DB_DSN=sqlite:///data/sakila_master.db

Supported Models

  • GPT-4o: Best performance, higher cost
  • GPT-4o-mini: Balanced performance and cost

Database Support

  • SQLite: File upload or local files
  • PostgreSQL: Connection string with optional schema selection
  • MySQL: Connection string support

Usage

Quick Start

  1. Open the application in your browser
  2. Select your preferred AI model in the sidebar
  3. Choose database type and upload/connect your database
  4. Ask natural language questions about your data

Example Queries

"Show me all customers from California"
"What are the top 5 best-selling products?"
"Find customers who haven't made a purchase in the last 30 days"
"Show revenue by month for the last year"

Database Connection Examples

SQLite (File Upload)

  • Use the file uploader in the sidebar
  • Supports .db and .sqlite files

PostgreSQL

postgresql://username:password@localhost:5432/database_name

PostgreSQL with Schema

Connection: postgresql://username:password@localhost:5432/database_name
Schema: public

MySQL

mysql://username:password@localhost:3306/database_name

Architecture

Project Structure

β”œβ”€β”€ app/
β”‚   └── app.py                 # Streamlit UI application
β”œβ”€β”€ src/
β”‚   └── text2sql/
β”‚       β”œβ”€β”€ agents/
β”‚       β”‚   └── text2sql_agent.py    # LangGraph agent implementation
β”‚       β”œβ”€β”€ database/
β”‚       β”‚   β”œβ”€β”€ factory.py           # Database engine factory
β”‚       β”‚   └── tools.py             # Database interaction tools
β”‚       β”œβ”€β”€ core/
β”‚       β”‚   └── query_engine.py      # Core query processing (placeholder)
β”‚       β”œβ”€β”€ utils/
β”‚       β”‚   └── helpers.py           # Utility functions
β”‚       └── config.py                # Configuration management
β”œβ”€β”€ data/
β”‚   └── sakila_master.db            # Sample SQLite database
β”œβ”€β”€ tests/
β”‚   └── test_smoke.py               # Basic tests
β”œβ”€β”€ requirements.txt                # Python dependencies
β”œβ”€β”€ .env.example                   # Environment template
β”œβ”€β”€ .gitignore                     # Git ignore rules
β”œβ”€β”€ README.md                      # This file
└── CLAUDE.md                      # Development notes

Key Components

LangGraph Agent (src/text2sql/agents/text2sql_agent.py)

  • Workflow-based SQL generation
  • Tool integration for database operations
  • Memory management with checkpointing

Database Tools (src/text2sql/database/tools.py)

  • list_tables_tool: Discover available tables
  • get_table_schema_tool: Inspect table structures
  • execute_sql_tool: Execute generated queries

Streamlit UI (app/app.py)

  • Clean, professional interface
  • Sidebar controls for configuration
  • Chat-style conversation history

Deployment

Streamlit Cloud

  1. Push to GitHub repository
  2. Connect repository to Streamlit Cloud
  3. Set environment variables in Streamlit Cloud dashboard
  4. Deploy automatically

Railway/Render

  1. Connect GitHub repository
  2. Configure build command: pip install -r requirements.txt
  3. Configure start command: streamlit run app/app.py --server.port=$PORT
  4. Set environment variables
  5. Deploy

Docker (Optional)

FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8501
CMD ["streamlit", "run", "app/app.py", "--server.port=8501"]

Security & Privacy

Enterprise Considerations

  • Data Privacy: No data stored permanently, only session-based memory
  • API Key Security: Users can provide their own OpenAI keys
  • Database Security: Supports read-only operations by default
  • Local Deployment: Can be deployed on-premises for sensitive data

For Production Use

  • Consider self-hosted LLM models for complete data privacy
  • Implement user authentication for multi-user environments
  • Add audit logging for compliance requirements
  • Use Azure OpenAI for enterprise-grade security

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • LangChain/LangGraph: For the agent framework
  • OpenAI: For GPT models
  • Streamlit: For the web interface
  • SQLAlchemy: For database abstraction

About

Workflow-based SQL generation Tool integration for database operations Memory management with checkpointing

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages