# multi-step-reasoning

Here are 16 public repositories matching this topic...

This course teaches how to fine-tune LLMs using Group Relative Policy Optimization (GRPO), a reinforcement learning method that improves model reasoning with minimal data. Learn RFT (reinforcement fine-tuning) concepts, reward design, and LLM-as-a-judge evaluation, and deploy training jobs on the Predibase platform.

  • Updated Jun 13, 2025
  • Jupyter Notebook
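The core idea behind GRPO's advantage estimation can be sketched in a few lines: sample a group of completions per prompt, score each with a reward function, and normalize each reward against the group's mean and standard deviation, so no separate value model is needed. This is a minimal illustration, not the course's code:

```python
import statistics

def group_relative_advantages(rewards):
    """Normalize each completion's reward against its group.

    In GRPO, several completions are sampled per prompt; each one's
    advantage is its reward relative to the group mean, scaled by the
    group's standard deviation.
    """
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # All completions scored the same: no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Four sampled completions for one prompt, scored by some reward function:
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

These advantages then weight the policy-gradient update for each completion's tokens; the group normalization is what makes the method work with minimal data.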

🧠 Project Description: This project builds an Agentic Reasoning System that autonomously breaks complex logic problems into smaller subtasks, selects the right tools, and verifies each step for accuracy. It goes beyond traditional LLM inference by performing multi-step, transparent reasoning with human-readable traces. The system combines sym…

  • Updated Oct 8, 2025
  • Python
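The decompose-select-verify loop this description names can be sketched as a short agentic skeleton. All names here (`decompose`, `tools`, `verify`) are hypothetical stand-ins, not the project's actual API:

```python
from dataclasses import dataclass

@dataclass
class Step:
    description: str
    result: object = None
    verified: bool = False

def solve(problem, decompose, tools, verify):
    """Break `problem` into steps, run a tool for each step, verify it,
    and return a human-readable trace; stop at the first failed check."""
    trace = []
    for desc, tool_name, args in decompose(problem):
        step = Step(desc)
        step.result = tools[tool_name](*args)
        step.verified = verify(step)
        trace.append(step)
        if not step.verified:
            break  # don't compound errors past a failed verification
    return trace

# Toy instance: a two-step arithmetic "problem" (all names hypothetical).
tools = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
decompose = lambda p: [("add 2 and 3", "add", (2, 3)),
                       ("multiply the sum by 4", "mul", (5, 4))]
verify = lambda step: step.result is not None
trace = solve("(2 + 3) * 4", decompose, tools, verify)
```

The returned `trace` is the human-readable record the description mentions: each `Step` carries its description, result, and verification status.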
