Scripts and data for the manuscript titled "Evaluating Quantized Large Language Models for Code Generation on Low-Resource Language Benchmarks".
- ".env" contains variables used by the python scripts. Make sure to set the "MODEL_REP_PATH" variable with a path to a directory that contains the locally stored code LLMs.
- "config.json" provides the list of used code LLMs and benchmarks.
- "hfDatasetDownloader.py" downloads and formats the MultiPL-HumanEval, MultiPL-MBPP, and MCEVAL benchmarks from HuggingFace. The download benchmarks are stored inside the "benchmarks" directory.
- "genPipe.py" script that loads code LLMs one by one and applies code generation tasks to them. All generated code is stored inside the "genOutput" directory.
- "evalPipe.py" evaluates the Lua code generated by the code LLMs using several metrics mention in the manuscript. The evaluation results are stored inside the "evalOutput" directory.
- "analysis.R" to analyze the content of the "evalOutput" directory
Each model entry in "config.json" has the following fields:
{
"name": "Unique name for the model.",
"family": "Model family name. Same at all quantization precisions.",
"id": "HuggingFace URI of the model.",
"max_tokens": "Maximum number of tokens to generate.",
"temp": "Temperature at which the model is run.",
"top_k": "Next token sampling rate.",
"eos": "End-Of-Sequence tokens",
"qBits": "Precision. 2, 4, 8 for integer quantization precision and 16 for half-precision floating point.",
"skip": "0 or 1. If 1 the model will be ignored by genPipe.py."
}
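For illustration, a model entry could look like the following (all values are invented examples, not taken from the actual "config.json"):

```json
{
  "name": "codellama-7b-q4",
  "family": "codellama-7b",
  "id": "TheBloke/CodeLlama-7B-GGUF",
  "max_tokens": 512,
  "temp": 0.2,
  "top_k": 50,
  "eos": ["<EOT>"],
  "qBits": 4,
  "skip": 0
}
```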
Each benchmark entry in "config.json" has the following fields:
{
"name": "benchmark name",
"id": "jsonl file with the benchmark",
"sample": "None or integer number. If integer number N then a random sample with N tasks will be used to evaluate the models",
"skip": "0 or 1. If 1 the benchmark will be ignored by genPipe.py."
}
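For illustration, a benchmark entry restricted to a random sample of 50 tasks could look like this (the file name is invented):

```json
{
  "name": "multipl-humaneval-lua",
  "id": "benchmarks/multipl-humaneval-lua.jsonl",
  "sample": 50,
  "skip": 0
}
```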
The Python scripts require the following packages:
- python-dotenv (imported as "dotenv")
- pathlib (part of the Python standard library since 3.4; no separate install needed)
- airium
- pandas
- llama-cpp-python
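Assuming a standard pip setup, the third-party dependencies can be installed with:

```
pip install python-dotenv airium pandas llama-cpp-python
```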
Download the code LLMs and store them locally. The list of code LLMs is available inside the config.json file. Inside the .env file, make sure to set the "MODEL_REP_PATH" variable to the directory path that contains the locally stored code LLMs.
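A minimal ".env" could therefore contain a single line (the path is a placeholder):

```
MODEL_REP_PATH=/path/to/local/model/repository
```

As a rough sketch of how such a locally stored model can be loaded and queried with llama-cpp-python (the GGUF file name, prompt, and parameter values below are assumptions for illustration, not the actual "genPipe.py" logic):

```python
import os

from dotenv import load_dotenv
from llama_cpp import Llama

load_dotenv()  # reads MODEL_REP_PATH from .env
# Hypothetical GGUF file name; real names come from the models listed in config.json.
model_file = os.path.join(os.environ["MODEL_REP_PATH"], "codellama-7b.Q4_K_M.gguf")

llm = Llama(model_path=model_file, n_ctx=2048, verbose=False)
out = llm(
    "-- Lua: complete the function body\nlocal function factorial(n)",
    max_tokens=512,   # corresponds to "max_tokens" in config.json
    temperature=0.2,  # "temp"
    top_k=50,         # "top_k"
    stop=["<EOT>"],   # "eos"
)
print(out["choices"][0]["text"])
```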
Before running this script, ensure that:
- Lua 5.3 or higher is installed
- the path to the Lua executable is added to the PATH environment variable
- the LuaUnit package is installed
- the CLOC tool is installed to count lines of code (e.g., "winget install AlDanial.Cloc" on Windows): https://github.com/AlDanial/cloc
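Assuming LuaRocks is used to manage Lua packages, the prerequisites can be checked and installed from a shell as follows:

```
lua -v                    # confirm the Lua interpreter is on the PATH
luarocks install luaunit  # install the LuaUnit test framework
cloc --version            # confirm CLOC is installed
```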
Run the scripts in the following order:
- hfDatasetDownloader.py
- genPipe.py
- evalPipe.py
- analysis.R
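Assuming the scripts take no required command-line arguments and R is on the PATH, a full run of the pipeline looks like:

```
python hfDatasetDownloader.py   # download and format the benchmarks
python genPipe.py               # generate Lua code with each model
python evalPipe.py              # evaluate the generated code
Rscript analysis.R              # analyze the "evalOutput" directory
```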