AloshkaD/leetcode_assistant

This repository is actively maintained and updated. Stay tuned for upcoming features!

LeetCode Coding multi-agent AI Assistant with LangGraph and Graph reasoning - Project README


Table of Contents

  1. Introduction
  2. Project Features
  3. System Architecture
  4. File Structure
  5. Setup Instructions
  6. Running the Application
  7. Key Components
    • Agents
    • Tasks
    • Tools
    • LangGraph Workflow
    • Database
    • RAG Support
  8. Logging with Weave and LangSmith
  9. Dependencies
  10. Future Enhancements
  11. Contributing
  12. License

1. Introduction

The Coding Assistant AI is an intelligent assistant application that captures screenshots, extracts coding questions, and answers them using a multi-agent architecture powered by CrewAI and LangGraph. It uses OpenAI's GPT-3.5 to generate answers to coding questions and supports advanced reasoning with RAG (Retrieval Augmented Generation). The assistant can handle cases where no question is found and can log activities using Weave and LangSmith. All interactions, including questions and answers, are stored in a SQLite database for future reference.


2. Project Features

  • Multi-Agent System: Built using CrewAI, with dedicated agents for screen capture, question extraction, and answering.
  • Conditional Workflow: Managed via LangGraph, with branches for different states (e.g., no question found).
  • Advanced Reasoning: Supports Retrieval Augmented Generation (RAG) for better context-based answers.
  • Screenshot Capture: Takes screenshots when Ctrl+S is pressed.
  • OCR Support: Extracts coding questions from screenshots using Tesseract OCR.
  • Database Storage: Stores questions and answers in a SQLite database for easy retrieval.
  • Logging: Uses Weave and LangSmith for structured logging and event tracing.
  • RAG Knowledge Base: Supports retrieval of relevant information from a knowledge base to augment GPT-3.5 answers.

3. System Architecture

The application is built around a multi-agent system where each agent performs a specific task. The process is orchestrated using LangGraph, which manages the flow from screen capture to question identification and answer generation.

High-Level Workflow:

  1. Ctrl+S Trigger: The user presses Ctrl+S, triggering the screen capture agent.
  2. Screenshot and OCR: The screenshot is processed to extract a coding question using OCR.
  3. Agent Workflow: If a question is found, the answer agent generates a response using GPT-3.5. If no question is found, a handling agent provides feedback.
  4. Database Storage: The question and its corresponding answer are saved in the database.
  5. RAG Search (optional, part of step 3): While generating the answer, the answer agent can augment the GPT-3.5 response with relevant information retrieved from a RAG knowledge base.
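
The branch in the workflow above can be sketched in plain Python. This is only an illustration of the control flow, not the project's LangGraph code: the function name run_once and the three callables passed to it are hypothetical stand-ins for the real agents.

```python
from typing import Callable, Optional


def run_once(
    capture: Callable[[], str],
    answer: Callable[[str], str],
    store: Callable[[str, str], None],
) -> str:
    """Run one pass of the workflow and return the message shown to the user."""
    # Capture the screen and run OCR; blank text is treated as "no question".
    question: Optional[str] = capture().strip() or None

    # Branch on whether a question was found.
    if question is None:
        return "No coding question found in the screenshot."

    reply = answer(question)

    # Persist the question/answer pair before returning.
    store(question, reply)
    return reply


# Usage with stand-in callables (hypothetical; the real agents live in agents.py).
saved = []
result = run_once(
    capture=lambda: "Reverse a linked list.",
    answer=lambda q: f"Stub answer for: {q}",
    store=lambda q, a: saved.append((q, a)),
)
```

Passing the steps in as callables keeps the sketch testable without a screen, an OCR engine, or an API key.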

4. File Structure

/coding-assistant-ai/
│
├── /screenshots/          # Stores screenshots taken by the assistant
├── agents.py              # Defines CrewAI agents for capturing, extracting, and answering questions
├── crew.py                # Manages the multi-agent crew and task orchestration
├── database.py            # Handles SQLite database connections and Q&A storage
├── graph.py               # LangGraph workflow definition
├── main.py                # Main entry point for running the application
├── nodes.py               # Nodes for LangGraph, handling individual steps in the workflow
├── rag.py                 # Implements the RAG (Retrieval Augmented Generation) knowledge base
├── state.py               # Defines the state structure used by the LangGraph workflow
├── tasks.py               # Defines tasks for agents to perform (screen capture, answering questions, etc.)
├── tools.py               # Provides reusable tools for agents (screen capture, OCR, answering, etc.)
├── app.log                # Log file generated by Weave for event tracking
└── .env                   # Environment file for sensitive information (e.g., API keys)

5. Setup Instructions

Prerequisites

  1. Python 3.8+
  2. Tesseract OCR installed on your machine.
  3. An OpenAI API key (for GPT-3.5)

Installation Steps

  1. Clone the Repository:

    git clone https://github.com/yourusername/coding-assistant-ai.git
    cd coding-assistant-ai
  2. Create a Virtual Environment (optional but recommended):

    python -m venv venv
    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  3. Install Required Packages:

    pip install -r requirements.txt
  4. Create a .env file and add your OpenAI API key and any other sensitive information:

    touch .env

    Inside .env:

    OPENAI_API_KEY=your_openai_api_key_here
    
  5. Create the Screenshots Directory:

    mkdir screenshots
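
Once the .env file from step 4 exists, the key has to reach the process environment at startup. The sketch below shows one way to do that with only the standard library; load_env_file and get_openai_key are hypothetical helper names, and the project itself may rely on a package such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> None:
    """Copy KEY=VALUE lines from a .env file into os.environ (existing values win)."""
    env_path = Path(path)
    if not env_path.exists():
        return
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))


def get_openai_key() -> str:
    """Return the API key, or fail early with a readable message."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key or key == "your_openai_api_key_here":
        raise RuntimeError("OPENAI_API_KEY is missing; add it to .env")
    return key
```

Failing early on the placeholder value avoids a confusing authentication error later in the workflow.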

6. Running the Application

  1. Run the application with:

    python main.py
  2. Trigger the Workflow:

    • Press Ctrl+S to capture a screenshot and start the question identification and answering process.
  3. Logging:

    • Logs will be written to app.log by Weave, capturing all important events.
  4. View Questions and Answers:

    • The SQLite database qa.db will store the extracted coding questions and their corresponding answers.

7. Key Components

Agents

  • Screen Capture Agent: Captures the screen and extracts text using OCR.
  • Question Not Found Agent: Handles cases where no question is found and provides feedback.
  • Answer Agent: Uses GPT-3.5 to answer coding questions with advanced reasoning and RAG support.

Tasks

  • capture_and_identify_task: Captures the screen and identifies coding questions.
  • handle_no_question_task: Provides feedback if no question is found.
  • answer_question_task: Generates an answer using GPT-3.5 and RAG.

Tools

  • CaptureScreenTool: Captures the screen and saves it as an image file.
  • ExtractQuestionTool: Extracts coding questions from screenshots using Tesseract OCR.
  • AnswerQuestionTool: Sends coding questions to GPT-3.5 for answers.
  • RAGSearchTool: Supports RAG by searching the knowledge base for relevant information.
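
As a rough illustration of what the OCR step behind ExtractQuestionTool might involve, the sketch below reads text from a saved screenshot with pytesseract and tidies the result. The function names clean_ocr_text and extract_text are assumptions made for this example; the actual tool in tools.py may be structured differently.

```python
def clean_ocr_text(raw: str) -> str:
    """Collapse runs of whitespace and drop empty lines from raw OCR output."""
    lines = [" ".join(line.split()) for line in raw.splitlines()]
    return "\n".join(line for line in lines if line)


def extract_text(image_path: str) -> str:
    """Run Tesseract on a screenshot and return cleaned text."""
    # Imported lazily so clean_ocr_text works without the OCR stack installed.
    from PIL import Image
    import pytesseract

    return clean_ocr_text(pytesseract.image_to_string(Image.open(image_path)))
```

extract_text needs the Tesseract binary from the prerequisites to be installed; clean_ocr_text can be used and tested on its own.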

LangGraph Workflow

  • wait_for_next_trigger: Waits for the user to press Ctrl+S.
  • capture_and_identify: Captures the screen and extracts a coding question.
  • check_question_found: Determines whether a question was found in the screenshot.
  • answer_question: Generates an answer using GPT-3.5.
  • store_result: Stores the question and answer in the database.
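
The decision made by check_question_found can be pictured as a small routing function over the workflow state. The sketch below is an assumption about the shape of that state and of the router, not the contents of state.py or nodes.py; the AssistantState fields and the handle_no_question node name (borrowed from the task name) are hypothetical.

```python
from typing import Optional, TypedDict


class AssistantState(TypedDict, total=False):
    """Hypothetical state passed between workflow nodes."""

    screenshot_path: str
    question: Optional[str]
    answer: Optional[str]


def check_question_found(state: AssistantState) -> str:
    """Return the name of the next node, based on whether OCR produced a question."""
    question = (state.get("question") or "").strip()
    return "answer_question" if question else "handle_no_question"
```

In LangGraph, a function of this kind is typically registered as a conditional edge, with each returned string mapped to a node in the graph.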

Database

  • The SQLite database (qa.db) stores all extracted coding questions and their corresponding answers. It is initialized in database.py and is updated with every interaction.
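
A minimal sketch of this kind of storage with Python's built-in sqlite3 module follows. The table name qa, its columns, and the helper names are assumptions for illustration; the real schema is defined in database.py.

```python
import sqlite3
from typing import List, Tuple


def init_db(path: str = "qa.db") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) qa table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS qa (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               question TEXT NOT NULL,
               answer TEXT NOT NULL,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    conn.commit()
    return conn


def store_result(conn: sqlite3.Connection, question: str, answer: str) -> int:
    """Insert one question/answer pair and return its row id."""
    cur = conn.execute(
        "INSERT INTO qa (question, answer) VALUES (?, ?)", (question, answer)
    )
    conn.commit()
    return cur.lastrowid


def recent(conn: sqlite3.Connection, limit: int = 5) -> List[Tuple[str, str]]:
    """Return the most recently stored pairs, newest first."""
    rows = conn.execute(
        "SELECT question, answer FROM qa ORDER BY id DESC LIMIT ?", (limit,)
    )
    return rows.fetchall()
```

Parameterized queries (the ? placeholders) keep OCR text containing quotes from breaking the SQL statement.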

RAG Support

  • RAG (Retrieval Augmented Generation) is implemented using the KnowledgeBase class in rag.py. This supports the retrieval of relevant documents from a knowledge base to provide context-aware answers.
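
To make the retrieval idea concrete, here is a toy knowledge base that ranks stored notes by word overlap with the question and builds a prompt from the top matches. Only the class name KnowledgeBase comes from this README; the add and search methods, the scoring scheme, and build_prompt are assumptions for illustration, and the real rag.py may use embeddings or a vector store instead.

```python
import re
from collections import Counter
from typing import List


def _tokens(text: str) -> Counter:
    """Lowercase word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


class KnowledgeBase:
    """Toy retrieval by word overlap (a stand-in for real semantic search)."""

    def __init__(self) -> None:
        self._docs: List[str] = []

    def add(self, text: str) -> None:
        self._docs.append(text)

    def search(self, query: str, k: int = 2) -> List[str]:
        q = _tokens(query)
        scored = []
        for doc in self._docs:
            # Counter intersection keeps the minimum count of each shared word.
            overlap = sum((q & _tokens(doc)).values())
            if overlap:
                scored.append((overlap, doc))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]


def build_prompt(question: str, context: List[str]) -> str:
    """Prepend retrieved notes to the question before sending it to the model."""
    notes = "\n".join(f"- {c}" for c in context) or "- (no relevant notes found)"
    return f"Context:\n{notes}\n\nQuestion: {question}"
```

The prompt returned by build_prompt is what would be passed to the answering model in place of the bare question.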

8. Logging with Weave and LangSmith

The application uses Weave and LangSmith for logging and event tracing. Key actions such as capturing screenshots, finding questions, and generating answers are logged in app.log. This provides a clear audit trail of the assistant's activities.
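
The sketch below shows the general pattern of writing one line per workflow event to app.log. It uses Python's standard logging module as a stand-in and does not reproduce the Weave or LangSmith APIs; make_logger, log_event, and the logger name are hypothetical.

```python
import logging


def make_logger(path: str = "app.log") -> logging.Logger:
    """Return a logger that appends timestamped events to the given file."""
    logger = logging.getLogger("coding_assistant")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def log_event(logger: logging.Logger, event: str, **fields: object) -> None:
    """Write one event name followed by sorted key=value details."""
    details = " ".join(f"{k}={v!r}" for k, v in sorted(fields.items()))
    logger.info("%s %s", event, details)
```

Calling log_event(logger, "question_found", chars=42) appends a single timestamped line to the file, which keeps the audit trail easy to search.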


9. Dependencies

Install dependencies via requirements.txt or manually as listed below:

  • CrewAI: Multi-agent architecture.
  • LangGraph: State graph management.
  • Weave: Structured logging.
  • LangSmith: Event tracing.
  • OpenAI: GPT-3.5 API for question answering.
  • Pytesseract: OCR library for extracting text from screenshots.
  • Pillow: Image handling and manipulation.
  • SQLite: Database for storing questions and answers.
  • Tesseract: OCR engine for recognizing text from screenshots.

10. Future Enhancements

  • Expand RAG Knowledge Base: Integrate with a larger dataset or knowledge base to improve the quality of retrieved information.
  • Error Handling: Add more robust error handling and retries for tasks such as API calls and image processing.
  • Improved Question Identification: Use advanced NLP techniques to better identify and classify coding questions.
  • User Interface: Add a simple GUI to interact with the assistant instead of relying on Ctrl+S key presses.

11. Contributing

If you'd like to contribute to this project, feel free to open an issue or submit a pull request on the GitHub repository. Please ensure all contributions adhere to the project's code of conduct and follow the contribution guidelines.


12. License

This project is licensed under the MIT License - see the LICENSE file for more details.


Conclusion

The Coding Assistant AI is a powerful multi-agent system for answering coding questions by combining OCR, GPT-3.5, and retrieval-augmented generation (RAG). With its modular design and extensive logging, it provides a strong foundation for automating coding assistance and task management.

Happy Coding!
