
Commit

fixed readme
Mcilie committed Jun 9, 2024
1 parent abbfb17 commit 984e0d8
Showing 1 changed file (README.md) with 34 additions and 2 deletions.
@@ -1,4 +1,23 @@
# Prompt Engineering Survey
Generative Artificial Intelligence (GenAI) systems are increasingly being deployed across industry and research settings. Developers and end users interact with these systems through prompting, or prompt engineering. While prompting is a widespread and highly researched concept, conflicting terminology and a poor ontological understanding of what constitutes a prompt persist due to the area's nascency. This repository contains the code for The Prompt Report, our research establishing a structured understanding of prompts by assembling a taxonomy of prompting techniques and analyzing their use. The code supports the automated review of papers, the collection of data, and the running of experiments. Our dataset is available on [Hugging Face](https://huggingface.co/datasets/PromptSystematicReview/ThePromptReport).

## Table of Contents
- [Prompt Engineering Survey](#prompt-engineering-survey)
  - [Table of Contents](#table-of-contents)
  - [Install requirements](#install-requirements)
  - [Setting up API keys](#setting-up-api-keys)
    - [Setting up keys for running tests](#setting-up-keys-for-running-tests)
  - [Structure of the Repository](#structure-of-the-repository)
  - [Running the code](#running-the-code)
    - [TL;DR](#tldr)
  - [Notes](#notes)

## Install requirements

@@ -40,13 +59,26 @@ The core of the repository is in `src/prompt_systematic_review`. The `config_dat

The source folder is divided into four main sections: three scripts (`automated_review.py`, `collect_papers.py`, `config_data.py`) that handle collecting the data and running the automated review; the `utils` folder, which contains utility functions used throughout the repository; the `get_papers` folder, which contains the scripts to download the papers; and the `experiments` folder, which contains the scripts to run the experiments.

At the root, there is a `data` folder. It comes pre-loaded with some data used in the experiments; however, the bulk of the dataset can either be generated by running `main.py` or downloaded from Hugging Face. The results of the experiments are saved in `data/experiments_output`.

Notably, the keywords used in the automated review/scraping process are in `src/prompt_systematic_review/utils/keywords.py`. Anyone who wishes to run the automated review can adjust these keywords to their liking in that file.
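
For illustration, here is a minimal sketch of what such an adjustment could look like, assuming `keywords.py` simply defines Python lists of search terms; the variable name and terms below are hypothetical placeholders, not the file's actual contents:

```python
# src/prompt_systematic_review/utils/keywords.py (hypothetical contents)
# Search terms used when querying paper sources during the automated review.
KEYWORDS = [
    "prompt engineering",
    "in-context learning",
    "chain-of-thought",
]
# Add or remove terms here to broaden or narrow the scope of the scraping/review.
```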

## Running the code

### TL;DR
```bash
git clone https://github.com/trigaten/Prompt_Systematic_Review.git && cd Prompt_Systematic_Review
pip install -r requirements.txt
# create a .env file with your API keys (example below)
nano .env
git lfs install
git clone https://huggingface.co/datasets/PromptSystematicReview/ThePromptReport
mv ThePromptReport/* data/
python main.py
```
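
The `.env` step in the TL;DR holds the API keys that the pipeline reads at runtime. A minimal sketch of such a file is shown below; the variable names are placeholders, so check the "Setting up API keys" section for the exact keys your setup requires:

```bash
# .env (placeholder key names; see "Setting up API keys" for the exact variables required)
OPENAI_API_KEY="sk-..."
SEMANTIC_SCHOLAR_API_KEY="..."
HF_TOKEN="..."
```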

Running `main.py` will download the papers, run the automated review, and run the experiments.
However, if you wish to save time and only run the experiments, you can download the data from [Hugging Face](https://huggingface.co/datasets/PromptSystematicReview/ThePromptReport), then move the papers folder and all the CSV files in the dataset into the `data` folder (the layout should look like `data/papers/*.pdf`, `data/master_papers.csv`, etc.). Adjust `main.py` accordingly.
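
For example, mirroring the dataset steps from the TL;DR while skipping the paper download and automated review (a sketch only; the exact file names inside the dataset may differ):

```bash
git lfs install
git clone https://huggingface.co/datasets/PromptSystematicReview/ThePromptReport
# move the papers folder and the CSV files into the data folder
mv ThePromptReport/papers data/
mv ThePromptReport/*.csv data/
# the layout should now include data/papers/*.pdf and data/master_papers.csv
```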

Every experiment script has a `run_experiment` function that is called in `main.py`; this function is responsible for running the experiment and saving the results. However, each script can also be run individually by running `python src/prompt_systematic_review/experiments/<experiment_name>.py` from the repository root.
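
As an illustration, here is a minimal sketch of the shape such an experiment module might take; the module name and function body are hypothetical, and only the `run_experiment` entry point and the output location are implied by this README:

```python
# src/prompt_systematic_review/experiments/example_experiment.py (hypothetical module)
import os

OUTPUT_DIR = os.path.join("data", "experiments_output")  # where experiment results are saved


def run_experiment():
    """Run the experiment and save its results; main.py calls this entry point."""
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    # Placeholder output: a real experiment would compute results from the dataset.
    with open(os.path.join(OUTPUT_DIR, "example_experiment.csv"), "w") as f:
        f.write("metric,value\nexample,0.0\n")


if __name__ == "__main__":
    # Allows running this experiment individually from the repository root.
    run_experiment()
```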

