# test-time-scaling

Here are 20 public repositories matching this topic...

Leezekun / MassGen

🚀 MassGen: An Open-Source Multi-Agent Scaling System for Collaborative AI with the Goal of Continuous Self-Improvement. Featuring parallel agent orchestration across frontier open and closed weight models, MCP integration, code execution, and intelligent consensus building for collective intelligence. Docs: docs.massgen.ai

agent multi-agent llm test-time-scaling

Updated Oct 23, 2025
Python

haizelabs / verdict

Inference-time scaling for LLMs-as-a-judge.

reward-shaping llm llm-as-a-judge test-time-compute inference-time-compute llm-judge test-time-scaling

Updated Sep 30, 2025
Jupyter Notebook

liuff19 / Video-T1

[ICCV 2025] Video-T1: Test-Time Scaling for Video Generation

video video-generation aigc chain-of-thought test-time-scaling

Updated Jun 29, 2025
Python

RyanLiu112 / compute-optimal-tts

Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".

r1 o1 large-language-model process-reward-model test-time-scaling

Updated Feb 19, 2025
Python

JARVIS-Xs / SE-Agent

SE-Agent is a self-evolution framework for LLM code agents. It enables trajectory-level evolution to exchange information across reasoning paths via Revision, Recombination, and Refinement, expanding the search space and escaping local optima. On SWE-bench Verified, it achieves SOTA performance.

mcts code-fix swe-agent test-time-scaling claude-code code-agent swe-bench self-evolve

Updated Sep 23, 2025
Python

PRIME-RL / ImplicitPRM

Repo of paper "Free Process Rewards without Process Labels"

rl prm test-time-scaling

Updated Mar 14, 2025
Python

HKUSTDial / Alpha-SQL

🔥[ICML'25] Official repository for the paper "Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search"

icml text-to-sql test-time-scaling icml-2025

Updated Oct 23, 2025
Python

RyanLiu112 / GenPRM

Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".

r1 o1 large-language-model process-reward-model test-time-scaling

Updated Jun 4, 2025
Python

jincan333 / MAS-TTS

Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning (NeurIPS2026-SEA)

multi-agent-systems reasoning test-time-scaling collaborative-reasoning

Updated Apr 19, 2025
Python

testtimescaling / testtimescaling.github.io

Repository for "What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models"

survey awesome-lists large-language-model test-time-scaling

Updated Oct 23, 2025
HTML

bobxwu / learning-from-rewards-llm-papers

A comprehensive collection of papers on learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward models and learning strategies across training, inference, and post-inference stages.

reinforcement-learning post-training self-correction reward-learning large-language-models llm llms reward-models reward-model reward-modeling guided-decoding test-time-scaling

Updated Jun 13, 2025

xuyige / SoftCoT

ACL'2025: "SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs", and the preprint "SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning"

natural-language-processing large-language-models chain-of-thought test-time-scaling continuous-chain-of-thought

Updated May 30, 2025
Python

UCSC-VLAA / m1

m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models

time test medical scaling reasoning r1 llm test-time-scaling

Updated Apr 14, 2025
Jupyter Notebook

PPPP-kaqiu / Awesome-Parallel-Reasoning

Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Survey.

reinforcement-learning-algorithms r1 large-language-models reasoning-models test-time-scaling awesome-parallel-reasoning

Updated Oct 15, 2025
HTML

monurcan / efficient_test_time_scaling

Official implementation of the paper "Efficient Test-Time Scaling for Small Vision-Language Models": test-time scaling via test-time augmentation

vlm test-time-augmentation test-time-adaptation llm vision-language-model multimodal-llm test-time-scaling inference-time-scaling

Updated Oct 7, 2025
Python

Amirhosein-gh98 / Guided-by-Gut

The official PyTorch implementation of "Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence"

efficient tree-search gg prm self-consistency confidence dvts rl-training llm inference-time-compute grpo test-time-scaling guided-by-gut

Updated Jun 9, 2025
Python

kang-ml / LogicTree

[EMNLP 2025 Main] LogicTree: Structured Proof Exploration for Coherent and Rigorous Logical Reasoning with Large Language Models

tree-search logical-reasoning large-language-models llm-memory llm-reasoning test-time-scaling

Updated Sep 10, 2025
Python

appier-research / language-matters

In which language do these models reason when solving problems presented in different languages? Our findings reveal that, despite multilingual training, LRMs tend to default to reasoning in high-resource languages (e.g., English) at test time.

multilingual alignment appier reasoning deepseek test-time-scaling

Updated Jun 3, 2025
Python

Sairamg18814 / GUBBALA-V3-TRUE

Revolutionary Self-Evolving Language Model - 100% self-contained AI trained on 40M Wikipedia articles

machine-learning deep-learning wikipedia transformers pytorch mixture-of-experts apple-silicon llm test-time-scaling self-improving-ai

Updated Aug 2, 2025
Python

Aradhye2002 / reasoning_exps

Official implementation of the paper "First Finish Search: Efficient Test-Time Scaling in Large Language Models"

reasoning-language-models reasoning-models test-time-scaling

Updated May 27, 2025
