Elevate your language models with insightful diversity metrics.
Paper: https://arxiv.org/abs/2308.11189
Video: https://www.youtube.com/watch?v=BekDOLm6qBI&t=10s&ab_channel=NeuroSymbolic
Check out LangDiversity Hello World if you're new.
LangDiversity is a package that provides tools to calculate diversity measures for a given set of data. Specifically, it can compute measures like Shannon's entropy and Gini impurity. It also offers utilities to select prompts based on their diversity scores when interacting with models like OpenAI's GPT-3.5 Turbo.
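As a rough illustration of the two measures named above, here is a minimal standalone sketch in plain Python. It does not use LangDiversity's own API. The function names `shannon_entropy`, `gini_impurity`, and `pick_prompt` are hypothetical, and so is the assumption that responses are compared by exact string match. Whether a low or a high diversity score is preferable depends on your use case, so the selection helper exposes that choice as a parameter.

```python
import math
from collections import Counter


def shannon_entropy(responses):
    """Shannon entropy, in bits, of the empirical distribution of responses."""
    if not responses:
        raise ValueError("responses must not be empty")
    total = len(responses)
    # Sum p * log2(1 / p) over each distinct response, with p = count / total.
    return sum((c / total) * math.log2(total / c)
               for c in Counter(responses).values())


def gini_impurity(responses):
    """Gini impurity: 1 minus the sum of squared response frequencies."""
    if not responses:
        raise ValueError("responses must not be empty")
    total = len(responses)
    return 1.0 - sum((c / total) ** 2 for c in Counter(responses).values())


def pick_prompt(responses_by_prompt, measure=shannon_entropy, prefer_low=True):
    """Hypothetical helper: return the prompt whose responses score lowest
    (or highest, if prefer_low is False) under the given diversity measure."""
    chooser = min if prefer_low else max
    return chooser(responses_by_prompt,
                   key=lambda prompt: measure(responses_by_prompt[prompt]))


if __name__ == "__main__":
    # Made-up responses sampled for two made-up prompts.
    samples = {
        "prompt_a": ["4", "4", "4", "4"],
        "prompt_b": ["4", "5", "6", "4"],
    }
    for prompt, responses in samples.items():
        print(prompt, shannon_entropy(responses), gini_impurity(responses))
    print("least diverse:", pick_prompt(samples))
```

Both measures are 0 when every response is identical and grow as the responses spread across more distinct values.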
The primary goal of this project is to assist researchers and developers in analyzing the diversity of responses generated by language models, thereby aiding in the evaluation and fine-tuning of such models.
pip install langdiversity
Detailed documentation is available here.
If you use this software in your work, please cite our paper:
@misc{ngu2023diversity,
title={Diversity Measures: Domain-Independent Proxies for Failure in Language Model Queries},
author={Noel Ngu and Nathaniel Lee and Paulo Shakarian},
year={2023},
eprint={2308.11189},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
This repository is licensed under the BSD-3-Clause license.
For any inquiries or feedback, please contact:
- Noel Ngu: nngu2@asu.edu
- Nathaniel Lee: nlee51@asu.edu
- Paulo Shakarian: pshak02@asu.edu