cepdnaclk/e19-4yp-Solve-Issues-In-Large-Code-Repositories

Solve Issues in Large Code Repositories

A Novel Approach to SWE-bench Optimization

Introduction

This repository contains the implementation and research artifacts for our final year project, "Solve Issues in Large Code Repositories".

The project addresses limitations in current automated debugging and patch generation methods by introducing a hybrid approach that combines:

  • Iterative reasoning, to mimic real-world developer behavior.
  • Graph-based retrieval, to reduce the search space and improve precision.
  • Retrieval-Augmented Generation (RAG) leveraging Stack Overflow for enhanced context.
  • Multi-LLM-based patch generation and refinement, aiming for higher SWE-bench performance at lower computational cost.

Objectives

General Objective

Enhance the efficiency and accuracy of automated software engineering solutions evaluated on the SWE-bench benchmark.

Specific Objectives

  • Develop an iterative reasoning system for issue resolution.
  • Create a graph-based representation of code repositories for accurate file retrieval.
  • Integrate Stack Overflow knowledge using RAG to improve contextual understanding.
  • Combine multiple LLMs (e.g., Claude, GPT-4, DeepSeek R1) for diverse patch generation.
  • Learn from incorrect patches using iterative refinement and reasoning models.
  • Achieve retrieval accuracy >82% on SWE-bench tasks.
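
The multi-LLM and iterative-refinement objectives above can be illustrated with a small loop. This is only a sketch under stated assumptions, not the project's implementation: `refine_patch`, `model_a`, `model_b`, and `run_tests` are hypothetical names, the two "models" are hard-coded stubs standing in for real LLM calls, and the validator simply executes a one-line function body.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical interfaces: a generator maps (issue, feedback) to a patch,
# and a validator maps a patch to (passed, message).
Generator = Callable[[str, List[str]], str]
Validator = Callable[[str], Tuple[bool, str]]


def refine_patch(
    issue: str,
    generators: Dict[str, Generator],
    validate: Validator,
    max_rounds: int = 3,
) -> Optional[Tuple[str, str, int]]:
    """Ask every generator for a patch each round, feeding failures back in.

    Returns (generator name, patch, round number) for the first patch that
    passes validation, or None if no patch passes within max_rounds.
    """
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        for name, generate in generators.items():
            patch = generate(issue, list(feedback))
            passed, message = validate(patch)
            if passed:
                return name, patch, round_no
            # Record the failure so later attempts can learn from it.
            feedback.append(f"{name}: {patch!r} rejected ({message})")
    return None


# Stub generators (assumptions): real code would call an LLM API here.
def model_a(issue: str, feedback: List[str]) -> str:
    return "return a - b"


def model_b(issue: str, feedback: List[str]) -> str:
    # Changes its answer only after seeing its own earlier failure.
    if any(note.startswith("model_b") for note in feedback):
        return "return a + b"
    return "return a * b"


def run_tests(patch: str) -> Tuple[bool, str]:
    """Toy validator: install the patch as the body of add() and test it."""
    namespace: dict = {}
    exec(f"def add(a, b):\n    {patch}\n", namespace)
    ok = namespace["add"](2, 3) == 5
    return ok, "ok" if ok else "add(2, 3) != 5"


result = refine_patch(
    "add() returns the wrong value",
    {"model_a": model_a, "model_b": model_b},
    run_tests,
)
print(result)
```

In this toy run the first round fails for both stubs, and the second round succeeds once `model_b` sees its own rejected patch in the feedback list. A real validator would run the repository's unit tests rather than `exec` on a string.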

Methodology

  • Graph-Based Repository Modeling: The repository graph is built with NetworkX and visualized with Gephi. It represents inter-file relationships such as imports and function calls.

  • Retrieval-Augmented Generation (RAG): Stack Overflow data is stored in ChromaDB, queried semantically via Sentence-BERT, and processed with LlamaIndex to provide additional context.

  • Iterative Reasoning & Multi-LLM Patch Generation: Reasoning models such as DeepSeek R1 and multiple LLMs generate, compare, and refine patches.

  • Artificial Stack Trace Generation: For difficult cases where standard retrieval fails, execution paths are simulated using graph traversal to identify probable buggy files.
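
As a rough illustration of the graph modeling and artificial stack trace steps, the sketch below builds an import graph from Python sources and walks it breadth-first from an entry module. It rests on several assumptions: it uses only the standard library (plain dictionaries rather than NetworkX), it tracks imports but not function calls, and `build_import_graph`, `artificial_stack_trace`, and the `SOURCES` mapping are hypothetical names and data invented for this example.

```python
import ast
from collections import deque


def build_import_graph(sources):
    """Map each module to the set of repository modules it imports.

    `sources` is a hypothetical {module_name: source_code} mapping.
    """
    graph = {name: set() for name in sources}
    for name, code in sources.items():
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                targets = [node.module]
            else:
                continue
            # Keep only edges that point at modules inside the repository.
            graph[name].update(t for t in targets if t in sources)
    return graph


def artificial_stack_trace(graph, entry, max_depth=3):
    """Breadth-first walk from `entry`, returning (module, depth) pairs."""
    seen = {entry}
    queue = deque([(entry, 0)])
    trace = []
    while queue:
        module, depth = queue.popleft()
        trace.append((module, depth))
        if depth == max_depth:
            continue
        # Sort neighbours so the traversal order is deterministic.
        for dep in sorted(graph.get(module, ())):
            if dep not in seen:
                seen.add(dep)
                queue.append((dep, depth + 1))
    return trace


# Toy repository (assumption): four modules with a few imports.
SOURCES = {
    "app": "import views\nimport os\n",
    "views": "from models import User\nimport utils\n",
    "models": "import utils\n",
    "utils": "import json\n",
}

graph = build_import_graph(SOURCES)
trace = artificial_stack_trace(graph, "app")
print(trace)
```

Modules closer to the entry point appear first, which gives a simple ordering of candidate files to inspect. The adjacency mapping could also be loaded into a `networkx.DiGraph` for richer analysis or for export to Gephi.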


Technologies Used

| Technology | Purpose |
|---|---|
| Python | Primary development language |
| NetworkX | Graph construction |
| Gephi | Graph visualization |
| ChromaDB | Vector database for semantic retrieval |
| OpenAI-embeddings | Embedding generation |
| LangChain | RAG integration and vector search |
| BeautifulSoup | Web scraping (Stack Overflow) |
| StackAPI | API access to Stack Overflow data |
| GPT-4, Claude | LLMs for retrieval and patch generation |
| DeepSeek R1 | Reasoning and decision-making |
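
To make the role of the embedding and vector-search components concrete, here is a minimal, self-contained sketch of semantic retrieval by cosine similarity. It is an illustration only: the word-count `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for a vector database such as ChromaDB, and `embed`, `cosine`, `retrieve`, and the `POSTS` data are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no vocabulary with the query.
    return [doc for score, doc in scored[:k] if score > 0]


# Made-up Stack Overflow style titles (assumption, not scraped data).
POSTS = [
    "KeyError when accessing a missing dictionary key in Python",
    "How to center a div with CSS flexbox",
    "Use dict.get to avoid KeyError on a missing key",
]

hits = retrieve("KeyError raised for missing key in dict", POSTS, k=2)
for post in hits:
    print(post)
```

The retrieved posts would then be appended to the LLM prompt as extra context. A learned embedding model would also match paraphrases that share no exact words with the query, which this toy version cannot do.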

Experiment Setup

  • Graph-based repository model construction.

  • SWE-bench dataset preprocessing.

  • Retrieval techniques benchmarked:

    • LLM-based
    • Embedding-based
    • LLM + RAG
  • Artificial stack traces generated when direct retrieval fails.

  • Stack Overflow context integration using vector search.

  • Evaluation metrics:

    • Retrieval Accuracy (target: >82%)
    • Patch Validity (unit test pass/fail)
    • Cost-efficiency (LLM token usage and execution time)
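
The retrieval accuracy metric could be computed along the following lines. The source does not define the metric precisely, so this sketch assumes one common definition: an instance counts as a hit when at least one ground-truth file appears among the top-k retrieved files. The function name, the instance IDs, and the file paths are hypothetical.

```python
def retrieval_accuracy(predictions, gold, k=3):
    """Fraction of instances whose top-k retrieved files include a gold file.

    predictions: {instance_id: ranked list of file paths}
    gold:        {instance_id: list of files changed by the reference patch}
    """
    if not gold:
        return 0.0
    hits = 0
    for instance_id, gold_files in gold.items():
        top_k = predictions.get(instance_id, [])[:k]
        if set(top_k) & set(gold_files):
            hits += 1
    return hits / len(gold)


# Made-up example data (assumption): five instances, four of them hits.
GOLD = {
    "task-1": ["pkg/core.py"],
    "task-2": ["pkg/io.py", "pkg/utils.py"],
    "task-3": ["pkg/cli.py"],
    "task-4": ["pkg/models.py"],
    "task-5": ["pkg/views.py"],
}
PREDICTIONS = {
    "task-1": ["pkg/core.py", "pkg/io.py"],
    "task-2": ["pkg/cli.py", "pkg/utils.py"],
    "task-3": ["pkg/cli.py"],
    "task-4": ["pkg/io.py", "pkg/core.py", "pkg/cli.py", "pkg/models.py"],
    "task-5": ["pkg/views.py"],
}

TARGET = 0.82  # target threshold stated in this README
accuracy = retrieval_accuracy(PREDICTIONS, GOLD, k=3)
print(f"accuracy={accuracy:.2f}, meets target: {accuracy > TARGET}")
```

Note that `task-4` counts as a miss here because the correct file is ranked fourth, outside the top three, so the choice of k directly affects the reported figure.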

Results and Analysis (To be updated after implementation)

Expected Outcomes:

  • Improved retrieval accuracy compared to baseline agentless models.
  • More accurate and contextually relevant patch generation.
  • Reduced computation cost due to graph-pruned search space.
  • Iterative learning model able to refine patches across runs.

References


Contributors


License

This repository is for academic and non-commercial research use only. Licensing terms are to be determined based on publication and university policy.
