This is the repository with the planning-related experiments presented in the paper. For experiments on other benchmarks, such as Last Letter Concatenation, check out this repo.
- Linux
- Python 3.6+
- Install required packages with `pip install -r requirements.txt`
- Fast Downward
  - Use the version in planner_tools or download from here
  - Assign the path of the folder to the environment variable FAST_DOWNWARD: `FAST_DOWNWARD=/path/to/fast_downward`
- VAL
  - Use the version in planner_tools or download from here
  - Assign the path of the folder to the environment variable VAL: `VAL=/path/to/val`
- PR2Plan
  - Use the version in planner_tools or download and compile obs-compiler from here
  - Assign the path of the folder to the environment variable PR2: `PR2=/path/to/pr2plan`
- LLM access/setup (currently OpenAI/BLOOM)
Library requirements are provided in requirements.txt
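As a convenience, the setup can be scripted; the following is a minimal sketch that assumes a bash-like shell and uses the same placeholder paths as above (adjust them to your checkout):

```
# Point the pipeline at the planner tools (placeholder paths; adjust to your environment)
export FAST_DOWNWARD=/path/to/fast_downward
export VAL=/path/to/val
export PR2=/path/to/pr2plan

# Install the Python dependencies listed in requirements.txt
pip install -r requirements.txt
```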
`python3 prompt_generation.py -t TASK -c CONFIG [-ct COT_TYPE] [-si SPECIFIC-INSTANCES] [-re RANDOM-EXAMPLE] [-v VERBOSE] [-s SEED] [-ie] [-br BLOCKS_RANGE_START BLOCKS_RANGE_END] [-oe OVERRIDE_EXAMPLE]`
- --task: The task to run: "standard" or "cot"
- --config: The name of the config file to use. The config file must be a YAML file present in the configs folder. These configs decide the test problem distribution.
- -ie: If added as part of the command, the pipeline will ignore the already completed instances and rerun the entire pipeline. If not added, the pipeline will not redo already completed instances. Default is False.
- -si: If a list of instance ids is provided, the pipeline will only run the task on those instances. If not provided, the pipeline will run the task on all instances between the start and end provided in the config file. Default is None. For example, -si 1 2 3 4 5
- -re: If set to True, the example instance for each task will be randomly chosen from the set of instances. If set to False, the previous instance id will be used for the example prompt. Default is False.
- -v: If set to True, the pipeline will print the prompts, responses and evaluation. Default is False.
- -s: The seed to use for randomization. Default is 42.
- -ct: The type of chain of thought. Provide this if the task is "cot". For now the only option is "upb", which stands for Universal Plan Breakdown. Default is "none".
- -br: Range of blocks for Blocksworld. Default is 3 to 20.
- -oe: Override the current examples with examples from a different problem distribution: "st" - Progression Proof, "ds" - Domain Specific (Stacking), "lex" - Lexicographic Stacking.
This will generate the prompts for the given task and store them in the prompts folder as json files.
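For example, a chain-of-thought prompt generation run might look like the following; CONFIG_NAME is a placeholder for one of the YAML files in the configs folder, and the instance ids are arbitrary:

```
# Hypothetical invocation: CONFIG_NAME stands for a config file in the configs folder
python3 prompt_generation.py -t cot -ct upb -c CONFIG_NAME -si 1 2 3 -s 42 -v True
```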
`python3 response_generation.py -t TASK -c CONFIG --engine ENGINE [-ct COT_TYPE] [-temp TEMPERATURE] [-si SPECIFIC-INSTANCES] [-re RANDOM-EXAMPLE] [-v VERBOSE] [-s SEED] [-ie] [-oe OVERRIDE_EXAMPLE]`
This will generate the responses for the given task using the generated prompts. The generated responses are appended to the prompt jsons and are stored in the responses folder.
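A hedged example, mirroring the prompt generation call above; the engine name and temperature shown here are assumptions, so check the script for the engines it actually supports (currently OpenAI/BLOOM models):

```
# Hypothetical invocation: gpt-4 and temperature 0.0 are assumed values, not repo defaults
python3 response_generation.py -t cot -ct upb -c CONFIG_NAME --engine gpt-4 -temp 0.0
```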
`python3 response_evaluation.py -t TASK -c CONFIG --engine ENGINE [-ct COT_TYPE] [-temp TEMPERATURE] [-si SPECIFIC-INSTANCES] [-re RANDOM-EXAMPLE] [-v VERBOSE] [-s SEED] [-ie] [-oe OVERRIDE_EXAMPLE]`
This will evaluate the raw responses generated by the model. The evaluation is appended to the response jsons and the final results are stored in the results folder.
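A hedged example, reusing the same flags as the response generation step so that the evaluation picks up the matching response jsons:

```
# Hypothetical invocation: use the same task/config/engine as the response generation run
python3 response_evaluation.py -t cot -ct upb -c CONFIG_NAME --engine gpt-4 -temp 0.0 -v True
```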
@inproceedings{
stechly2024chain,
title={Chain of Thoughtlessness? An Analysis of CoT in Planning},
author={Kaya Stechly and Karthik Valmeekam and Subbarao Kambhampati},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=kPBEAZU5Nm}
}