reasoning_world_model

Step 1: Prefilter on The Pile

cd sampling/prefilter_code

sh start_multiple_server.sh

cd sampling/sampling_code

Run the sampling script for the data you are using (either of the following):

python sampling_rationales_all_datasets.py

python sampling_rationales_c4.py

python calculate_perplexity_sampled_rationale_new_method.py

python filter_rationale.py

python parse_sampled_rationale.py
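The sampling commands above can be chained so that a failure in one step stops the rest. The following is only a minimal sketch, not part of the repository: the function names `run` and `sampling_pipeline` and the `DRY_RUN` variable are hypothetical helpers introduced here, and it assumes you start from the repository root with `python` on your PATH.

```sh
#!/bin/sh
# Sketch only: chains the sampling steps listed above.
# `run`, `sampling_pipeline`, and DRY_RUN are hypothetical names, not part of the repo.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Optional first argument: which sampling script to use.
# Defaults to sampling_rationales_all_datasets.py.
sampling_pipeline() {
    script=${1:-sampling_rationales_all_datasets.py}
    # Each step runs only if the previous one succeeded.
    run cd sampling/sampling_code &&
    run python "$script" &&
    run python calculate_perplexity_sampled_rationale_new_method.py &&
    run python filter_rationale.py &&
    run python parse_sampled_rationale.py
}
```

After sourcing the file, `DRY_RUN=1 sampling_pipeline` prints the commands without running them, and `sampling_pipeline sampling_rationales_c4.py` runs the chain with the C4 sampling script instead of the default.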

sbatch sbatch_llama_finetune.sh

The Rationalyst fine-tuned with rationales sampled from GSM8K and ECQA can be found here.

With world model:

sbatch sbatch_llama_inference_world_model_single.sh

Without world model:

sbatch sbatch_llama_inference_single.sh