🧠 Project Description This project builds an Agentic Reasoning System that can autonomously break down complex logic problems into smaller subtasks, select the right tools, and verify each step for accuracy. It goes beyond traditional LLM inference by performing multi-step, transparent reasoning with human-readable traces. The system combines sym

Adars2005/AgenticReasoner

🧠 Agentic Reasoning System

This project implements an Agentic AI Reasoning System designed for structured, multi-step reasoning tasks such as logic-based question answering.
It autonomously decomposes problems, selects appropriate tools, executes subtasks, and generates transparent reasoning traces.


🚀 Project Overview

Large Language Models (LLMs) often hallucinate intermediate steps or skip verification during logical reasoning.
To address this, this system implements an agentic reasoning framework that:

  • Decomposes logic problems into smaller subtasks.
  • Selects tools (symbolic solver, calculator, or code execution).
  • Executes and verifies sub-results to ensure reliability.
  • Generates step-by-step reasoning traces along with the final answer.

The system is designed to run on smaller LLMs or base models, without depending on heavy proprietary reasoning models such as GPT-4, GPT-5, Claude 3, or Gemini Ultra.


📁 Repository Structure

Agentic-Reasoner/
│
├── data/
│   ├── train.csv
│   ├── test.csv
│
├── src/
│   ├── __init__.py
│   ├── main.py
│   ├── data_loader.py
│   ├── reasoning_agent.py
│   ├── tool_selector.py
│   ├── solver.py
│   ├── verifier.py
│   ├── utils.py
│
├── outputs/
│   ├── output.csv
│
├── eval_runner.py
├── README.md
└── requirements.txt

🧩 Core Components

1. reasoning_agent.py

Implements the agentic controller that:

  • Decomposes the main problem into subtasks.
  • Chooses appropriate tools for each subtask.
  • Integrates all results into a coherent reasoning chain.
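The controller loop described above can be sketched as follows. This is a minimal illustration, not the code in reasoning_agent.py: the `Step` and `ReasoningAgent` names and the four injected callables (`decompose`, `select_tool`, `tools`, `verify`) are all assumptions, and the demo at the bottom uses toy stand-ins for them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One entry in the reasoning trace."""
    subtask: str
    tool: str
    result: str
    verified: bool


@dataclass
class ReasoningAgent:
    # Hypothetical stand-ins for the decomposer, tool selector, tools and verifier.
    decompose: Callable[[str], List[str]]
    select_tool: Callable[[str], str]
    tools: Dict[str, Callable[[str], str]]
    verify: Callable[[str, str], bool]
    trace: List[Step] = field(default_factory=list)

    def run(self, problem: str) -> str:
        """Solve each subtask in order, record it, and return the last result."""
        self.trace.clear()
        result = ""
        for subtask in self.decompose(problem):
            tool = self.select_tool(subtask)
            result = self.tools[tool](subtask)
            ok = self.verify(subtask, result)
            self.trace.append(Step(subtask, tool, result, ok))
        return result


# Toy wiring: split on ';', count words, accept any numeric result.
agent = ReasoningAgent(
    decompose=lambda p: [s.strip() for s in p.split(";") if s.strip()],
    select_tool=lambda s: "word_count",
    tools={"word_count": lambda s: str(len(s.split()))},
    verify=lambda s, r: r.isdigit(),
)
answer = agent.run("count these words; and these")
```

Passing the components in as callables keeps the controller independent of any particular model or solver, which fits the modular pipeline the README describes.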

2. tool_selector.py

Chooses tools like:

  • Symbolic Solver (for algebra, logic)
  • Arithmetic Calculator
  • Code Execution Module (for programmable subtasks)
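One simple way to route a subtask to one of these three tools is a rule-based check on its text. The sketch below is an assumption about how such routing could look, not the logic in tool_selector.py; the tool labels and keyword list are hypothetical.

```python
import re

# Pure arithmetic: digits, whitespace, decimal points, operators, parentheses.
_ARITHMETIC = re.compile(r"[\d\s.+\-*/()×÷]+")
# Hypothetical cues for algebra or logic: a keyword, or a variable assignment.
_SYMBOLIC_HINTS = re.compile(
    r"\b(solve|equation|simplify|implies)\b|[a-z]\s*=", re.IGNORECASE
)


def select_tool(subtask: str) -> str:
    """Return 'calculator', 'symbolic_solver', or 'code_execution'."""
    text = subtask.strip()
    if text and _ARITHMETIC.fullmatch(text):
        return "calculator"
    if _SYMBOLIC_HINTS.search(text):
        return "symbolic_solver"
    # Anything else falls through to the most general tool.
    return "code_execution"
```

For example, `select_tool("2 + 2 * 3")` returns `"calculator"`, while `select_tool("Solve x + 2 = 5 for x")` returns `"symbolic_solver"`.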

3. solver.py

Handles execution of mathematical or logical subtasks.
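As an illustration of the arithmetic side of such a solver, the sketch below evaluates an expression by walking Python's syntax tree instead of calling `eval()`. It uses only the standard library and is an assumption about one possible approach; the repository lists sympy as a dependency, which would be the natural choice for algebraic subtasks. The `evaluate` name is hypothetical.

```python
import ast
import operator

_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(expression: str):
    """Evaluate +, -, *, / and parentheses; reject everything else."""

    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    # Accept the × and ÷ signs that appear in problem statements.
    normalized = expression.replace("×", "*").replace("÷", "/").strip()
    return walk(ast.parse(normalized, mode="eval").body)
```

Because only whitelisted node types are evaluated, input such as a function call raises `ValueError` rather than being executed.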

4. verifier.py

Checks subtask outputs for consistency and correctness.
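A consistency check of this kind can be as simple as recomputing each arithmetic claim found in a step. The following sketch is a hypothetical example, not the contents of verifier.py; it only understands single binary claims of the form `a op b = c` and treats a step with no such claim as unverified.

```python
import re
from fractions import Fraction

_NUM = r"-?\d+(?:\.\d+)?"
_CLAIM = re.compile(rf"({_NUM})\s*([+\-*/×÷])\s*({_NUM})\s*=\s*({_NUM})")

_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "×": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "÷": lambda a, b: a / b,
}


def verify_step(step: str) -> bool:
    """Return True only if every 'a op b = c' claim in the step is correct."""
    claims = _CLAIM.findall(step)
    if not claims:
        return False  # nothing checkable, so the step stays unverified
    for left, op, right, claimed in claims:
        a, b = Fraction(left), Fraction(right)
        if op in "/÷" and b == 0:
            return False
        if _OPS[op](a, b) != Fraction(claimed):
            return False
    return True
```

Using `Fraction` keeps decimal claims such as `0.1 + 0.2 = 0.3` exact, which a float comparison would reject. Chained expressions such as `2 + 2 × 3 = 8` are outside the scope of this sketch and would need a full expression parser.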

5. utils.py

Helper functions for logging, formatting reasoning traces, and CSV export.
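A trace-formatting helper might look like the sketch below. The `format_trace` name and signature are assumptions; the output shape follows the "Step 1: ... Final Answer: ..." pattern used in this README's example row.

```python
from typing import Iterable


def format_trace(steps: Iterable[str], final_answer: str) -> str:
    """Join step descriptions into 'Step 1: ... Step 2: ... Final Answer: ...'."""
    parts = [
        f"Step {i}: {text.strip().rstrip('.')}."
        for i, text in enumerate(steps, start=1)
    ]
    parts.append(f"Final Answer: {final_answer}.")
    return " ".join(parts)
```

Keeping the whole trace on one line makes it straightforward to store in a single CSV cell.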



⚙️ Installation

# Clone the repository
git clone https://github.com/<your-username>/Agentic-Reasoner.git
cd Agentic-Reasoner

# Install dependencies
pip install -r requirements.txt

🧠 Running the System

Step 1: Train / Fine-tune (Optional)

You can use train.csv to fine-tune a small model or validate the reasoning pipeline.

Step 2: Inference

Run inference on the test dataset:

python src/main.py

The system will:

  1. Read test.csv
  2. Decompose each problem
  3. Solve step-by-step
  4. Output reasoning traces and predictions to outputs/output.csv

Output format:

topic | problem_statement | solution | correct_option
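The read, solve, and write loop above could be sketched with the standard csv module as follows. This is an illustration under assumptions: it presumes test.csv has `topic` and `problem_statement` columns, and `solve` is a hypothetical callable that returns a reasoning trace and an option. The actual src/main.py may differ, for example by using pandas.

```python
import csv
import io

OUTPUT_COLUMNS = ["topic", "problem_statement", "solution", "correct_option"]


def run_inference(reader, writer, solve):
    """Stream problems from `reader` to `writer`; return the number of rows written.

    `solve` is a hypothetical callable: problem text -> (trace, option).
    """
    out = csv.DictWriter(writer, fieldnames=OUTPUT_COLUMNS)
    out.writeheader()
    count = 0
    for row in csv.DictReader(reader):
        trace, option = solve(row["problem_statement"])
        out.writerow({
            "topic": row["topic"],
            "problem_statement": row["problem_statement"],
            "solution": trace,
            "correct_option": option,
        })
        count += 1
    return count


# In-memory demo; with real files, open them with newline="" as the csv docs advise.
source = io.StringIO("topic,problem_statement\nArithmetic,What is 2 + 2 × 3?\n")
target = io.StringIO()
written = run_inference(source, target, lambda problem: ("Final Answer: 8.", 2))
```

Processing one row at a time means a failure on a single problem can be caught and logged without losing the rows already written.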

🧪 Evaluation

To generate predictions and evaluate them, run:

python eval_runner.py

This script compares predicted answers with ground truth (if available) and computes metrics like Macro F1 Score.
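For reference, macro F1 is the unweighted mean of the per-class F1 scores. The sketch below computes it in plain Python so the definition is explicit. It is not the code in eval_runner.py, and the `macro_f1` name is an assumption; with scikit-learn installed, `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` is the usual way to get this metric.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = set(y_true) | set(y_pred)
    if not labels:
        return 0.0
    total = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        # F1 = 2TP / (2TP + FP + FN); the denominator is never zero here
        # because each label occurs at least once in y_true or y_pred.
        total += 2 * tp / (2 * tp + fp + fn)
    return total / len(labels)
```

Because every class contributes equally, a rarely chosen option that is always predicted wrongly lowers the score as much as a common one.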


🧰 Requirements

The requirements.txt file should list the following dependencies:

pandas
numpy
scikit-learn
sympy

📊 Output Example

Example row in outputs/output.csv:

topic | problem_statement | solution | correct_option
Arithmetic | What is 2 + 2 × 3? | Step 1: Multiply 2×3=6. Step 2: Add 2+6=8. Final Answer: 8. | 2

🧾 Evaluation Metrics

  • Macro F1 Score (50%)
  • Approach Creativity & Originality (35%)
  • Report Quality (10%)
  • Code Quality (5%)

🧠 Key Design Principles

✅ Transparent reasoning with trace logs
✅ Modular, reusable pipeline
✅ Verification for correctness
✅ Interpretable output for human validation


👨‍💻 Authors

Developed by J. Adarsh and contributors for the Agentic Reasoning Challenge.
