From e2a8b4b1c12631cb737f9ffea4a694347d5da71b Mon Sep 17 00:00:00 2001
From: ZubinGou
Date: Sun, 8 Oct 2023 16:47:30 +0800
Subject: [PATCH] :rocket: release ToRA model outputs

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index f8bb223..2f6eded 100644
--- a/README.md
+++ b/README.md
@@ -55,7 +55,7 @@ Please visit our [website](https://microsoft.github.io/ToRA/) for more details.
 
 ### Tool-Integrated Reasoning

- +
Figure 2: A basic example of single-round tool interaction, which interleaves rationales with program-based tool use.

@@ -84,7 +84,7 @@ pip install -r requirements.txt
 
 ### 🪁 Inference
 
-We provide a script for inference, simply config the `MODEL_NAME_OR_PATH` and `DATA` in `[src/scripts/infer.sh](/src/scripts/infer.sh)` and run the following command:
+We provide a script for inference, simply config the `MODEL_NAME_OR_PATH` and `DATA` in [src/scripts/infer.sh](/src/scripts/infer.sh) and run the following command:
 
 ```sh
 bash scritps/infer.sh
@@ -94,7 +94,7 @@ We also open-source the [model outputs](/src/outputs/llm-agents/) from our best
 
 ### ⚖️ Evaluation
 
-The `[src/eval/grader.py](src/eval/grader.py)` file contains the grading logic that assesses the accuracy of the predicted answer by comparing it to the ground truth. This logic is developed based on the Hendrycks' MATH grading system, which we have manually verified on the MATH dataset to minimize false positives and false negatives.
+The [src/eval/grader.py](/src/eval/) file contains the grading logic that assesses the accuracy of the predicted answer by comparing it to the ground truth. This logic is developed based on the Hendrycks' MATH grading system, which we have manually verified on the MATH dataset to minimize false positives and false negatives.
 
 To evaluate the predicted answer, run the following command: