Exploring and improving the quality of ChatGPT-generated code for LeetCode programming tasks.

yueyueL/ChatGPT-CodeGenAnalysis

ChatGPT-CodeGen-Analysis

This repository contains the code and data for the paper Refining ChatGPT-Generated Code: Characterizing and Mitigating Code Quality Issues, published in ACM Transactions on Software Engineering and Methodology (TOSEM).

Our experiments focus on assessing the quality and correctness of the code generated by ChatGPT. We investigate the factors that influence the effectiveness of ChatGPT and explore its self-repairing capabilities with different types of feedback. This repository includes 2033 programming tasks, scripts to test the generated code, and automated scripts to submit the code to LeetCode and collect results.

Repository Structure

.
├── data
│   ├── chatgpt_generated_code
│   ├── leetcode_tasks
│   ├── rq3
│   └── results (the folder to store the results of the experiments)
├── src
│   ├── evaluation
│   ├── debugging
│   └── leetcode_auto_submit
└── README.md

Data

The data directory contains three main folders, leetcode_tasks, chatgpt_generated_code, and rq3, plus a results folder that stores the output of the experiments.

Leetcode Tasks

The leetcode_tasks folder contains the 2,033 programming tasks from the LeetCode platform. The tasks are stored in JSON format, with the following fields:

  • id: A unique identifier for the task (e.g., "001").
  • name: The name of the task (e.g., "two-sum").
  • difficulty: The difficulty level of the task, which can be "easy", "medium", or "hard".
  • link: The URL to the original task on the LeetCode platform (e.g., "https://leetcode.com/problems/two-sum/").
  • task_description: The complete description of the task, including problem statements, input format, constraints, and examples.
  • test_cases: A set of test cases with input and expected output that can be used to verify the correctness of the generated code.
  • python_template: A Python code template for the task, containing the class and method signature.
  • java_template: A Java code template for the task, containing the class and method signature.
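As a minimal sketch of how a task record could be checked after loading, the snippet below validates one record against the fields listed above. The helper name `validate_task`, the sample record, and the assumption that `test_cases` is a list are illustrative only; the actual files in data/leetcode_tasks may be organized differently.

```python
import json

# Field names taken from the list above.
REQUIRED_FIELDS = (
    "id", "name", "difficulty", "link", "task_description",
    "test_cases", "python_template", "java_template",
)
DIFFICULTIES = {"easy", "medium", "hard"}


def validate_task(task):
    """Return a list of problems found in one task record (empty if none)."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in task]
    difficulty = task.get("difficulty")
    if difficulty is not None and difficulty not in DIFFICULTIES:
        problems.append(f"unexpected difficulty: {difficulty!r}")
    return problems


# Hypothetical record for illustration; real records live in data/leetcode_tasks.
sample = json.loads("""
{
  "id": "001",
  "name": "two-sum",
  "difficulty": "easy",
  "link": "https://leetcode.com/problems/two-sum/",
  "task_description": "placeholder description",
  "test_cases": [],
  "python_template": "class Solution:\\n    def twoSum(self, nums, target):\\n        pass",
  "java_template": "class Solution { }"
}
""")

print(validate_task(sample))  # an empty list means no problems were found
```

A check like this is cheap to run over every record before starting an experiment, so that a malformed task is reported up front rather than in the middle of a run.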

chatgpt_generated_code

The chatgpt_generated_code folder consists of two JSON files: python.json and java.json. Each file contains the code snippets that ChatGPT generated in that language for the programming tasks in the leetcode_tasks folder, together with the results of static analysis and runtime error detection.

Here's a detailed description of the fields:

  • name: The name of the task.
  • is_pass: A binary indicator (1 or 0) representing whether the solution passed the test cases.
  • test_cases: A description of the input and expected output for each test case.
  • error: A string describing the type of error.
  • error_info: Additional information regarding runtime errors.
  • is_quality_issue: A binary indicator (1 or 0) representing whether the solution has quality issues.
  • quality_info: A description of the quality issues generated by static analysis tools, if any.
  • generated_code: The code snippet generated by ChatGPT for the task.
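The fields above are enough to compute simple summary statistics. The sketch below assumes that python.json or java.json loads (for example via `json.load`) into a list of records with these fields; that layout, the function name `summarize`, and the sample records (including their error strings) are assumptions made for illustration.

```python
from collections import Counter


def summarize(records):
    """Compute pass rate, quality-issue rate, and error counts for a list of records."""
    total = len(records)
    passed = sum(1 for r in records if r.get("is_pass") == 1)
    with_issues = sum(1 for r in records if r.get("is_quality_issue") == 1)
    # Count error types only for solutions that did not pass.
    errors = Counter(
        r["error"] for r in records if r.get("is_pass") != 1 and r.get("error")
    )
    return {
        "total": total,
        "pass_rate": passed / total if total else 0.0,
        "quality_issue_rate": with_issues / total if total else 0.0,
        "errors": dict(errors),
    }


# Hypothetical records; the error strings here are invented placeholders.
records = [
    {"name": "two-sum", "is_pass": 1, "error": "", "is_quality_issue": 0},
    {"name": "add-two-numbers", "is_pass": 0, "error": "Wrong Answer", "is_quality_issue": 1},
    {"name": "lru-cache", "is_pass": 0, "error": "Runtime Error", "is_quality_issue": 0},
    {"name": "valid-parentheses", "is_pass": 1, "error": "", "is_quality_issue": 0},
]

print(summarize(records))
```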

rq3

The rq3 folder contains the results of the RQ3 experiments in two subfolders: simple_feedback, which holds the results obtained with simple feedback, and feedback_with_static_analysis_runtime, which holds the results obtained with feedback from static analysis and runtime errors. The results are stored in JSON format.

Source Code

The src directory contains three folders: evaluation, debugging, and leetcode_auto_submit.

chatgpt_code_generation.py is the main script for generating code using ChatGPT.

Evaluation

The evaluation subdirectory contains scripts for evaluating the quality of generated code, using static analysis tools to detect quality issues. See the README.md file in the src/evaluation directory for more information.

Debugging

The debugging subdirectory includes scripts for exploring ChatGPT's self-debugging capabilities. The simple_feedback.py script implements experiments using simple feedback, while the static_analysis_feedback.py script incorporates static analysis tools and runtime errors to provide more detailed feedback to ChatGPT.
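To illustrate the difference between the two feedback styles, here is a rough sketch of how such prompts might be assembled. The function names and the prompt wording are invented for this example and are not the prompts used by simple_feedback.py or static_analysis_feedback.py; the `quality_info` and `error_info` arguments mirror the fields of the same name in the generated-code JSON files.

```python
def build_simple_feedback(code):
    """Feedback that only says the solution is wrong, with no diagnostics."""
    return (
        "The following solution is not correct. Please fix it.\n\n"
        f"{code}"
    )


def build_detailed_feedback(code, quality_info="", error_info=""):
    """Feedback that also carries static analysis findings and runtime errors."""
    parts = ["The following solution has problems. Please fix it.", "", code]
    if quality_info:
        parts += ["", "Static analysis findings:", quality_info]
    if error_info:
        parts += ["", "Runtime error:", error_info]
    return "\n".join(parts)


# Hypothetical inputs for illustration.
code = "def add(a, b):\n    return a - b"
print(build_simple_feedback(code))
print(build_detailed_feedback(code, quality_info="unused variable 'x'"))
```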

Auto Submit to LeetCode

The leetcode_auto_submit folder contains a script for automatically submitting the generated code to the LeetCode platform. The script uses Selenium to automate the process of logging in to the LeetCode platform and submitting the code.

Parameters

  • username: (Type: String) - The email address used for logging in to the LeetCode platform.

  • password: (Type: String) - The password associated with that email address. Keep this secure.

  • skip_login: (Type: Boolean) - If set to True, the client will attempt to use a saved session for login, avoiding the need for credentials. Useful for repeated runs.

  • headless: (Type: Boolean) - Determines whether the browser runs in headless mode (no GUI). Set to False to see the browser UI.

  • incognito: (Type: Boolean) - If True, the browser launches in incognito mode, ensuring no cookies or history from previous sessions are used.

  • user_data_dir: (Type: String) - Path to the directory where user data (like cookies and login sessions) is stored, allowing for session persistence between runs.

  • login_type (Optional): (Type: String) - Specifies the type of login to be used (normal, manually). Determines how the automated login process is handled.

You can use the following code to submit the code to LeetCode and collect the results.

from leetcode_auto_submit import AutoLeetCode
import time

autoleet = AutoLeetCode(
    headless = False,
    username = "your_username",
    password = "your_password",
    verbose = False,
    incognito = False,
    skip_login = False,
    user_data_dir = "data/profile/",
    login_type = 'manually'
)
time.sleep(2)

chatgpt_generated_code = "xxxxxxxxxxxx"
task_name = "Two Sum"
language = "Python3"

# To only run the code and collect the result:
result_status, result_details = autoleet.run_and_collect(task_name, language, chatgpt_generated_code)
# To submit the code and collect the result:
result_status, result_details = autoleet.submit_and_collect(task_name, language, chatgpt_generated_code)
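For more than one task, the same two calls can be wrapped in a loop. The sketch below is one possible way to do that, assuming only the call shape shown above (`run_and_collect` and `submit_and_collect` taking a task name, a language, and the code, and returning a status and details). The helper `collect_results`, the `delay` parameter, and the `FakeClient` stand-in with its placeholder return values are hypothetical and exist only so the example runs without a browser.

```python
import time


def collect_results(client, solutions, language, submit=False, delay=0.0):
    """Run or submit each solution and return {task_name: (status, details)}.

    `client` is any object exposing run_and_collect / submit_and_collect
    with the call shape used in the snippet above.
    """
    results = {}
    for task_name, code in solutions.items():
        action = client.submit_and_collect if submit else client.run_and_collect
        results[task_name] = action(task_name, language, code)
        if delay:
            time.sleep(delay)  # pause between requests
    return results


class FakeClient:
    """Stand-in for AutoLeetCode; the return values are placeholders."""

    def run_and_collect(self, task_name, language, code):
        return "ran", {"task": task_name, "language": language}

    def submit_and_collect(self, task_name, language, code):
        return "submitted", {"task": task_name, "language": language}


solutions = {"Two Sum": "class Solution: ..."}
print(collect_results(FakeClient(), solutions, "Python3"))
```

With the real client, an `AutoLeetCode` instance would be passed in place of `FakeClient()`.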

Dependencies

The evaluation scripts require the following dependencies:

python==3.8.5
openai==0.10.2
selenium==4.9.1
undetected_chromedriver==3.5.4

Citation

If you find this repo or our paper helpful, please consider citing us:

@article{liu2023refining,
  title={Refining ChatGPT-generated code: Characterizing and mitigating code quality issues},
  author={Liu, Yue and Le-Cong, Thanh and Widyasari, Ratnadira and Tantithamthavorn, Chakkrit and Li, Li and Le, Xuan-Bach D and Lo, David},
  journal={arXiv preprint arXiv:2307.12596},
  year={2023}
}

License

This repository is licensed under the MIT License.
