LangDiversity

Elevate your language models with insightful diversity metrics.

Links

Paper: https://arxiv.org/abs/2308.11189

Video: https://www.youtube.com/watch?v=BekDOLm6qBI&t=10s&ab_channel=NeuroSymbolic

Check out LangDiversity Hello World if you're new.

Introduction

LangDiversity is a package for calculating diversity measures over a set of data, such as the responses a language model returns for a prompt. It currently computes Shannon's entropy and Gini impurity. It also offers utilities for selecting prompts by their diversity scores when interacting with models such as OpenAI's GPT-3.5 Turbo.
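To make the two measures concrete, here is a minimal sketch in plain Python that computes them over a list of responses and then picks a prompt by its score. It does not use or reproduce the LangDiversity API: the function names (`shannon_entropy`, `gini_impurity`, `select_prompt`) and the choice to prefer the lowest-scoring prompt by default are assumptions made for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(responses):
    """Shannon entropy (in bits) of the empirical distribution of responses."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    total = len(responses)
    # H = sum over distinct responses of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def gini_impurity(responses):
    """Gini impurity of the empirical distribution of responses."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    total = len(responses)
    # G = 1 - sum over distinct responses of p^2
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def select_prompt(responses_by_prompt, measure=shannon_entropy, pick=min):
    """Hypothetical helper: return the prompt whose responses score lowest
    (pick=min) or highest (pick=max) under the given diversity measure."""
    return pick(
        responses_by_prompt,
        key=lambda prompt: measure(responses_by_prompt[prompt]),
    )


if __name__ == "__main__":
    # Illustrative data only: repeated answers per prompt, not real model output.
    samples = {
        "prompt A": ["4", "4", "4", "4"],
        "prompt B": ["4", "5", "4", "6"],
    }
    for prompt, answers in samples.items():
        print(prompt, shannon_entropy(answers), gini_impurity(answers))
    print("selected:", select_prompt(samples))
```

Both measures are zero when every response is identical and grow as the responses spread across more distinct values, so either one can serve as the ranking key in `select_prompt`.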

The primary goal of this project is to assist researchers and developers in analyzing the diversity of responses generated by language models, thereby aiding in the evaluation and fine-tuning of such models.

Installation

pip install langdiversity

Usage

Detailed documentation is available here.

Bibtex

If you use this software in your work, please cite our paper:

@misc{ngu2023diversity,
      title={Diversity Measures: Domain-Independent Proxies for Failure in Language Model Queries},
      author={Noel Ngu and Nathaniel Lee and Paulo Shakarian},
      year={2023},
      eprint={2308.11189},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

This repository is licensed under the BSD-3-Clause license.

Contacts

For any inquiries or feedback, please contact: