This repository contains the code and data for the paper Refining ChatGPT-Generated Code: Characterizing and Mitigating Code Quality Issues, published in ACM Transactions on Software Engineering and Methodology (TOSEM).
Our experiments focus on assessing the quality and correctness of the code generated by ChatGPT. We investigate the factors that influence the effectiveness of ChatGPT and explore its self-repairing capabilities with different types of feedback. This repository includes 2,033 programming tasks, scripts to test the generated code, and automated scripts to submit the code to LeetCode and collect results.
.
├── data
│ ├── chatgpt_generated_code
│ ├── leetcode_tasks
│ ├── rq3
│ ├── results (the folder to store the results of the experiments)
├── src
│ ├── evaluation
│ ├── debugging
│ ├── leetcode_auto_submit
├── README.md
The data directory contains three main folders: leetcode_tasks, chatgpt_generated_code, and rq3.
The leetcode_tasks folder contains the 2,033 programming tasks from the LeetCode platform. The tasks are stored in JSON format.
Each task record has the following fields:
- id: A unique identifier for the task (e.g., "001").
- name: The name of the task (e.g., "two-sum").
- difficulty: The difficulty level of the task, which can be "easy", "medium", or "hard".
- link: The URL to the original task on the LeetCode platform (e.g., "https://leetcode.com/problems/two-sum/").
- task_description: The complete description of the task, including problem statements, input format, constraints, and examples.
- test_cases: A set of test cases with input and expected output that can be used to verify the correctness of the generated code.
- python_template: A Python code template for the task, containing the class and method signature.
- java_template: A Java code template for the task, containing the class and method signature.
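As a rough illustration of the schema above, the sketch below checks that a task record carries every documented field. The helper names and the sample record are hypothetical and not part of this repository; the exact file layout inside leetcode_tasks and the value types of each field are assumptions.

```python
# Field names as documented above.
TASK_FIELDS = (
    "id", "name", "difficulty", "link", "task_description",
    "test_cases", "python_template", "java_template",
)
DIFFICULTIES = {"easy", "medium", "hard"}


def missing_fields(task):
    """Return the documented fields that are absent from a task record."""
    return [field for field in TASK_FIELDS if field not in task]


def is_valid_task(task):
    """A record is valid if no field is missing and the difficulty is known."""
    return not missing_fields(task) and task["difficulty"] in DIFFICULTIES


# Hypothetical record for illustration only (values are placeholders,
# not copied from the dataset).
sample_task = {
    "id": "001",
    "name": "two-sum",
    "difficulty": "easy",
    "link": "https://leetcode.com/problems/two-sum/",
    "task_description": "placeholder description",
    "test_cases": "placeholder test cases",
    "python_template": "class Solution:\n    def twoSum(self, nums, target):\n        pass",
    "java_template": "class Solution { }",
}

print(is_valid_task(sample_task))
print(missing_fields({"id": "001"}))
```

In practice, a record loaded with `json.load` from one of the task files could be passed to `is_valid_task` in the same way.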
The chatgpt_generated_code folder consists of two JSON files: python.json and java.json. Each file contains the code snippets that ChatGPT generated in the respective programming language for the programming tasks in the leetcode_tasks folder. The JSON files also include the results of static analysis and runtime error detection.
Here's a detailed description of the fields:
- name: The name of the task.
- is_pass: A binary indicator (1 or 0) representing whether the solution passed the test cases.
- test_cases: A description of the input and expected output for each test case.
- error: A string describing the type of error.
- error_info: Additional information regarding runtime errors.
- is_quality_issue: A binary indicator (1 or 0) representing whether the solution has quality issues.
- quality_info: A description of the quality issues generated by static analysis tools, if any.
- generated_code: The code snippet generated by ChatGPT for the task.
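The fields above lend themselves to simple aggregate statistics. The following sketch computes a pass rate and a quality-issue rate over a list of records. It assumes that each JSON file decodes to a list of records and that is_pass and is_quality_issue are stored as the integers 1 or 0; the function name and the sample records are hypothetical.

```python
def summarize(records):
    """Compute the pass rate and quality-issue rate over a list of records."""
    total = len(records)
    if total == 0:
        return {"total": 0, "pass_rate": 0.0, "quality_issue_rate": 0.0}
    passed = sum(1 for record in records if record["is_pass"] == 1)
    flagged = sum(1 for record in records if record["is_quality_issue"] == 1)
    return {
        "total": total,
        "pass_rate": passed / total,
        "quality_issue_rate": flagged / total,
    }


# Hypothetical records for illustration only (not taken from the dataset).
sample_records = [
    {"name": "task-a", "is_pass": 1, "is_quality_issue": 0},
    {"name": "task-b", "is_pass": 1, "is_quality_issue": 1},
    {"name": "task-c", "is_pass": 0, "is_quality_issue": 0},
    {"name": "task-d", "is_pass": 0, "is_quality_issue": 0},
]

print(summarize(sample_records))
```

To apply this to the real data, the records could be loaded with `json.load` from python.json or java.json, provided the assumed layout holds.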
The rq3 folder contains the results of the RQ3 experiments. It consists of two subfolders: simple_feedback and feedback_with_static_analysis_runtime, which hold the results of the experiments using simple feedback and feedback with static analysis and runtime errors, respectively. The results are stored in JSON format.
The src directory contains three folders: evaluation, debugging, and leetcode_auto_submit.
chatgpt_code_generation.py is the main script for generating code using ChatGPT.
The evaluation subdirectory contains scripts for evaluating the quality of the generated code, using static analysis tools to detect quality issues. Please read the README.md file in the src/evaluation directory for more information.
The debugging subdirectory includes scripts for exploring ChatGPT's self-debugging capabilities. The simple_feedback.py script implements experiments using simple feedback, while the static_analysis_feedback.py script incorporates static analysis tools and runtime errors to provide more detailed feedback to ChatGPT.
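To make the difference between the two feedback styles concrete, here is a minimal sketch of how a feedback message might be assembled from a record's error_info and quality_info fields. The function name and the prompt wording are hypothetical; they are not the prompts used in simple_feedback.py or static_analysis_feedback.py, and the sketch assumes both fields are plain strings.

```python
def build_feedback(record, detailed=False):
    """Assemble a feedback message asking for a corrected solution.

    With detailed=False only a generic statement is sent; with detailed=True
    the runtime error and static analysis findings are included when present.
    """
    # Hypothetical wording, for illustration only.
    parts = ["The generated code does not work correctly. Please fix it."]
    if detailed:
        if record.get("error_info"):
            parts.append("Runtime error: " + record["error_info"])
        if record.get("quality_info"):
            parts.append("Static analysis findings: " + record["quality_info"])
    parts.append("Code:\n" + record["generated_code"])
    return "\n\n".join(parts)


# Hypothetical record for illustration only.
sample_record = {
    "error_info": "IndexError: list index out of range",
    "quality_info": "unused variable 'tmp'",
    "generated_code": "def solve(nums):\n    return nums[len(nums)]",
}

print(build_feedback(sample_record))
print(build_feedback(sample_record, detailed=True))
```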
The leetcode_auto_submit folder contains a script for automatically submitting the generated code to the LeetCode platform. The script uses Selenium to automate the process of logging in to the LeetCode platform and submitting the code.
Parameters
- username: (Type: String) - The email address used for logging in to the LeetCode platform.
- password: (Type: String) - The password associated with that email address. Keep this secure.
- skip_login: (Type: Boolean) - If set to True, the client attempts to use a saved session for login, avoiding the need for credentials. Useful for repeated runs.
- headless: (Type: Boolean) - Determines whether the browser runs in headless mode (no GUI). Set to False to see the browser UI.
- incognito: (Type: Boolean) - If True, the browser launches in incognito mode, ensuring no cookies or history from previous sessions are used.
- user_data_dir: (Type: String) - Path to the directory where user data (such as cookies and login sessions) is stored, allowing sessions to persist between runs.
- login_type (Optional): (Type: String) - Specifies the type of login to use (normal, manually). Determines how the automated login process is handled.
The following code submits generated code to LeetCode and collects the results.
from leetcode_auto_submit import AutoLeetCode
import time

autoleet = AutoLeetCode(
    headless=False,
    username="your_username",
    password="your_password",
    verbose=False,
    incognito=False,
    skip_login=False,
    user_data_dir="data/profile/",
    login_type="manually",
)
time.sleep(2)  # brief pause before sending the first request

chatgpt_generated_code = "xxxxxxxxxxxx"  # placeholder for the generated solution
task_name = "Two Sum"
language = "Python3"

# Option 1: only run the code and collect the result
result_status, result_details = autoleet.run_and_collect(task_name, language, chatgpt_generated_code)

# Option 2: submit the code and collect the result
result_status, result_details = autoleet.submit_and_collect(task_name, language, chatgpt_generated_code)
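When many tasks need to be submitted, a small wrapper can loop over them and keep going if one submission fails. The sketch below is one possible approach, not part of this repository: `collect_results` and the `"client_error"` label are hypothetical, and the only assumption about the submit function is that it takes (task_name, language, code) and returns a (status, details) pair, as in the example above. A stand-in function replaces the real client so the sketch runs without a browser.

```python
import time


def collect_results(tasks, submit_fn, language="Python3", delay_seconds=0.0):
    """Submit each task's code via submit_fn and gather the outcomes.

    tasks maps a task name to its generated code. A failure in one
    submission is recorded and does not stop the remaining ones.
    """
    results = {}
    for task_name, code in tasks.items():
        try:
            status, details = submit_fn(task_name, language, code)
        except Exception as exc:
            # Hypothetical label for failures raised by the client itself.
            status, details = "client_error", str(exc)
        results[task_name] = {"status": status, "details": details}
        time.sleep(delay_seconds)  # pause between submissions
    return results


def fake_submit(task_name, language, code):
    """Stand-in for a real submit function, for illustration only."""
    if not code:
        raise ValueError("empty code for " + task_name)
    return "ok", language + " submission of " + task_name


demo_tasks = {"Two Sum": "class Solution: pass", "Add Two Numbers": ""}
print(collect_results(demo_tasks, fake_submit))
```

With a real client, `autoleet.submit_and_collect` or `autoleet.run_and_collect` could be passed as `submit_fn`, with a non-zero `delay_seconds` to space out requests.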
The evaluation scripts require the following dependencies:
python==3.8.5
openai==0.10.2
selenium==4.9.1
undetected_chromedriver==3.5.4
If you find this repo or our paper helpful, please consider citing us:
@article{liu2023refining,
  title={Refining ChatGPT-generated code: Characterizing and mitigating code quality issues},
  author={Liu, Yue and Le-Cong, Thanh and Widyasari, Ratnadira and Tantithamthavorn, Chakkrit and Li, Li and Le, Xuan-Bach D and Lo, David},
  journal={arXiv preprint arXiv:2307.12596},
  year={2023}
}
This repository is licensed under the MIT License.