tomasmajercik/ai-code-completion-evaluation

Evaluation of AI code completion on my own projects

📝 Project description

This project was created as an application task for an internship at JetBrains. The task was to find a model that generates a hidden part of code taken from my own project(s). I was asked to create my own dataset, which I did with the help of an automated Python script, as doing it manually was not feasible. After that, I used a model from the Hugging Face Hub, prompted to generate the hidden middle part based on the content before and after it. To finish the task, I manually went through the generated code and checked whether each completion was an exact match (generated code identical to the hidden part), different but still working, or garbage code unable to work properly.
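The dataset-creation idea described above can be sketched as follows. This is a minimal illustration, not the actual generate_dataset.py: it assumes a random span of fixed length is hidden from each source file, and the function name and parameters are hypothetical.

```python
import random

def split_code(source: str, middle_len: int = 20, seed: int = 0) -> dict:
    """Split a source file into prefix / hidden middle / suffix.

    Sketch only: the real generate_dataset.py may choose spans differently.
    """
    rng = random.Random(seed)
    start = rng.randrange(0, max(1, len(source) - middle_len))
    return {
        "prefix": source[:start],
        "middle": source[start:start + middle_len],
        "suffix": source[start + middle_len:],
    }

record = split_code("<?php\nsession_start();\n$conn = mysqli_connect('localhost');\n")
# The three pieces always reconstruct the original file.
assert record["prefix"] + record["middle"] + record["suffix"] == \
    "<?php\nsession_start();\n$conn = mysqli_connect('localhost');\n"
```

The model is then shown only `prefix` and `suffix`, and its output is compared against `middle`.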

🛠️ Tech stack

  • Python
  • Huggingface

🌱 Skills gained & problems overcome

I gained hands-on experience with Hugging Face: finding and using suitable models, running simple evaluations, and preparing a dataset.

📊 Shortened example results

[
    // Correct output
    { 
        "prefix": "...cation/json; charset=UTF-8\");\n    session_start",
        "suffix": " \"root\", \"\", \"webTask\" ...",
        "real_middle": "();\n\n    $connection = mysqli_connect(\"localhost\",",
        "generated_middle": "();\n\n    $connection = mysqli_connect(\"localhost\",",
        "exact_match": true,
        "chrf": 100.0,
        "levenshtein_distance": 0,
        "label": "is_correct"
    },
    // Incorrect output
    {
        "prefix": "...$query = \"SELECT `ta",
        "suffix": "result = mysqli_query($connection, $query);\n ...",
        "real_middle": "g` FROM `tags` WHERE `username` = '$username'\";\n        $",
        "generated_middle": "g` FROM `tags` WHERE `fileName` = '$fileName'\";\n        $",
        "exact_match": false,
        "chrf": 62.37266202285458,
        "levenshtein_distance": 10,
        "label": "will_not_work"
    }
]
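The `levenshtein_distance` field in these records counts single-character edits between the real and generated middle. A pure-Python sketch of the metric (the project itself uses the python-Levenshtein package):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b (Wagner-Fischer, two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

# "username" -> "fileName" takes 5 substitutions; the incorrect record
# above scores 10 because the identifier appears twice in the query.
assert levenshtein("username", "fileName") == 5
```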

⚙️ How to install

  1. Clone the repo.
  2. Create your own dataset(s) using the provided Python scripts, or feel free to use mine.
  3. Run the main script, save the results and look at the generated output.

Project dependencies:

  • pip install transformers
  • pip install evaluate
  • pip install python-Levenshtein
  • pip install sacrebleu

📂 File structure:

  • /data_files_vI/II/III contain files from three of my recent projects. These are the data used for dataset creation.
  • /datasets contains randomly selected parts of the code to be completed by the model. Each dataset holds 20 to 50 records with a prefix, middle and suffix (the code before the hidden part, the hidden code itself, and the remaining code).
  • /evaluation contains main.py, a script that runs the AI model (bigcode/starcoder2-3b) from the Hugging Face Hub to generate the hidden part and produces a .json file containing:
    • prefix
    • suffix
    • original middle (the hidden part)
    • generated middle (the model's completion of the part hidden from it)
  • and the computed metrics:
    • exact_match
    • chrf
    • levenshtein_distance
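For context, fill-in-the-middle models like the StarCoder family are typically prompted with sentinel tokens that mark where the prefix and suffix end, and the completion is generated after the middle sentinel. The exact token strings below are an assumption; check the bigcode/starcoder2-3b tokenizer before relying on them.

```python
# Sentinel tokens assumed for the StarCoder family (verify against the
# actual tokenizer's special tokens).
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt: the model sees the code
    before and after the gap, then generates the hidden middle."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

prompt = build_fim_prompt("session_start", ' "root", "", "webTask"')
```

The generated text that follows the middle sentinel is what gets compared against the real hidden middle.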

The remaining files are:

  • generate_dataset.py is the script that generates the datasets
  • load_dataset.py is a script that loads a dataset (for testing purposes only)
  • report.pdf is a report describing my thought process, findings and learnings

Model results are accessible in resultingDataset-wAnotations.json.
The "label" field says whether the generated code is an exact match (and therefore certainly correct), changed but still capable of working, or wrong and bound to lead to an error.
