Treefix: Enabling Execution with a Tree of Prefixes

This repository contains the implementation of Treefix and supplementary material for the paper "Treefix: Enabling Execution with a Tree of Prefixes" (ICSE'25).

Paper: https://arxiv.org/pdf/2501.12339

Getting Started Guide

Follow the installation instructions in INSTALL.md.
Follow the usage instructions in USE.md.

Replication Guide

To reproduce the results from the paper, follow the instructions below. The results they produce are also provided in the Available Data section below, so you can inspect them there instead of running some of the steps.

First, install Treefix using the instructions above.

Effectiveness at Covering Code Overall and per Step (RQ1 and RQ2)

We evaluate Treefix on two datasets containing Python code snippets. These datasets are available at ./so_snippets and ./popular_projects_snippets_dataset. The files used are listed in ./so_snippets_dataset.txt and ./popular_projects_snippets_dataset.txt.
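Before starting a long run, it can help to confirm that every snippet named in a listing file is present on disk. The following is a minimal sketch, not part of Treefix: the helper name missing_snippets is hypothetical, and it assumes that each listing file contains one snippet path per line, relative to the repository root.

```python
from pathlib import Path


def missing_snippets(list_file, root="."):
    """Return the entries of list_file that do not exist on disk.

    Assumption: one snippet path per line, relative to root.
    Blank lines are skipped.
    """
    missing = []
    for line in Path(list_file).read_text().splitlines():
        entry = line.strip()
        if entry and not (Path(root) / entry).exists():
            missing.append(entry)
    return missing
```

For example, missing_snippets("./so_snippets_dataset.txt") would return an empty list if all listed snippets are in place.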

Create a folder to store the metrics for each code snippet:

mkdir metrics

Run Treefix on the chosen dataset:

For the Stack Overflow snippets, run:

python -m l3.Run --files ./so_snippets_dataset.txt --openai_api_key $OPENAI_API_KEY

For the open-source functions, run:

python -m l3.Run --files ./popular_projects_snippets_dataset.txt --openai_api_key $OPENAI_API_KEY
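To run both datasets back to back, the two commands above can be wrapped in a small driver. This is only a sketch: build_run_command and run_all are hypothetical helper names, and the script uses nothing beyond the --files and --openai_api_key flags shown above. It assumes that OPENAI_API_KEY is set in the environment and that Treefix is installed in the current Python environment.

```python
import os
import subprocess
import sys

# The two listing files named in this guide.
DATASET_LISTS = (
    "./so_snippets_dataset.txt",
    "./popular_projects_snippets_dataset.txt",
)


def build_run_command(files_list, api_key):
    """Build the argument list for one l3.Run invocation."""
    return [
        sys.executable, "-m", "l3.Run",
        "--files", files_list,
        "--openai_api_key", api_key,
    ]


def run_all(dataset_lists=DATASET_LISTS):
    """Run Treefix on each listing in turn, stopping at the first failure."""
    api_key = os.environ["OPENAI_API_KEY"]  # raises KeyError if unset
    for files_list in dataset_lists:
        subprocess.run(build_run_command(files_list, api_key), check=True)
```

Calling run_all() processes the Stack Overflow snippets first and then the open-source functions.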

The coverage achieved by Treefix on each snippet will be available in a raw folder.

Get the average coverage achieved by Treefix on the snippets at each step:

python -m l3.evaluation.CombineData

The average results, such as the ones presented in Tables II and III of the paper, will be available in a *_grouped.csv file, where the coverage achieved by the set of prefixes P is in column coverage_percentage and the coverage achieved by p_best is in column max_coverage_prediction_value.
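The two columns named above can be pulled out of a grouped file with the standard csv module. The sketch below assumes that the file has a header row containing both column names. The function name read_coverage_rows is hypothetical, and any other columns in the file are ignored.

```python
import csv

# Column names as given in this guide.
COLUMNS = ("coverage_percentage", "max_coverage_prediction_value")


def read_coverage_rows(csv_path):
    """Return one dict per CSV row, keeping only the two coverage columns.

    Values are returned as strings, exactly as stored in the file.
    """
    with open(csv_path, newline="") as handle:
        return [
            {name: row[name] for name in COLUMNS}
            for row in csv.DictReader(handle)
        ]
```

Pass the path of the generated *_grouped.csv file to read_coverage_rows to compare the values for P and p_best row by row.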

Diversity of Values (RQ4)

Collect the .csv files that Treefix generated in the same folders as the original code snippets, i.e., ./so_snippets and ./popular_projects_snippets_dataset. Write the paths of these .csv files to a .txt file, e.g., model_predictions.txt.
Calculate the diversity of the predictions:

python -m l3.evaluation.CalculatePredictionsDiversity --files model_predictions.txt

The output of this is a raw JSON file: treefix_types_and_values.json.
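The model_predictions.txt listing passed to --files can be generated with a short script instead of being written by hand. This is a sketch under assumptions: write_csv_listing is a hypothetical helper, the listing is assumed to hold one path per line, and the snippet folders are searched recursively for .csv files.

```python
from pathlib import Path

# The two snippet folders named in this guide.
SNIPPET_DIRS = ("./so_snippets", "./popular_projects_snippets_dataset")


def write_csv_listing(folders=SNIPPET_DIRS, out_file="model_predictions.txt"):
    """Write the path of every .csv file under folders to out_file.

    Assumption: the --files option expects one path per line.
    Folders that do not exist are skipped.
    """
    paths = sorted(
        str(path)
        for folder in folders
        if Path(folder).is_dir()
        for path in Path(folder).rglob("*.csv")
    )
    Path(out_file).write_text("".join(f"{path}\n" for path in paths))
    return paths
```

The same listing can be reused for the price calculation in the Efficiency and Costs section.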

Summarize the values, similarly to Table IV in the paper:

python -m l3.evaluation.SummarizePredictionsDiversity

Efficiency and Costs (RQ5)

Collect the .csv files that Treefix generated in the same folders as the original code snippets, i.e., ./so_snippets and ./popular_projects_snippets_dataset. Write the paths of these .csv files to a .txt file, e.g., model_predictions.txt.
Calculate the price:

python -m l3.evaluation.CalculatePrice --files model_predictions.txt

Available Data

The data from our experiments can be found in the following folders:

The prompts and model responses for each snippet are available in .csv files at:

./so_snippets_full_with_gpt4o

./so_snippets_full_with_gpt4omini

./popular_projects_snippets_dataset_full_with_gpt4o

./popular_projects_snippets_dataset_full_with_gpt4omini

The overall metrics and metrics for each snippet and model are available at:

./metrics_full_datasets_with_GPT4o

./metrics_full_datasets_with_GPT4o_mini

The case studies are available at:

./case_studies
