TextualVerifier is a self-verification framework that leverages Large Language Models (LLMs) to provide step-by-step verification for textual optimization processes. Built as a modular verification system, TextualVerifier fills a critical gap: the absence of self-verification mechanisms in textual gradient optimization frameworks.
TextualVerifier is designed for robust verification across multiple optimization scenarios:
- Verify mathematical reasoning, problem-solving steps, and computational solutions to ensure accuracy and correctness in multi-step problem solving.
- Validate code refinements, algorithm improvements, and implementation correctness during iterative code optimization processes.
- Verify prompt engineering improvements and ensure optimization steps maintain semantic coherence and effectiveness.
- Systematically verify loss value calculations and optimization results to prevent error propagation in gradient-based optimization.
- Enhance accuracy in academic domains including mathematics, machine learning, and physics through process-supervised verification.
TextualVerifier implements a systematic four-stage verification methodology that transforms raw reasoning chains into verified, high-quality outputs through process supervision principles.
The verification process follows the general formula:
```
instance + instruction → calculation
instance + calculation + verification_prompt → verified_calculation
```
Where:
- instance: Input data or problem context
- instruction: Task-specific optimization directive
- calculation: Generated solution or reasoning chain
- verified_calculation: Validated and potentially corrected output
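The two-stage flow above can be sketched as plain functions over an LLM engine. This is a minimal illustration, not the library's internals; `llm` is a hypothetical callable standing in for the verifier engine:

```python
def generate_calculation(llm, instance, instruction):
    # Stage 1: instance + instruction -> calculation
    return llm(f"{instruction}\n\nProblem: {instance}")

def verify_calculation(llm, instance, calculation, verification_prompt):
    # Stage 2: instance + calculation + verification_prompt -> verified_calculation
    return llm(
        f"Problem: {instance}\n"
        f"Proposed solution: {calculation}\n"
        f"{verification_prompt}"
    )

# Stub engine for illustration: wraps the last prompt line in <VERIFIED> tags
def llm(prompt):
    return f"<VERIFIED>{prompt.splitlines()[-1]}</VERIFIED>"

calc = generate_calculation(llm, "2 + 2", "Compute the sum")
checked = verify_calculation(llm, "2 + 2", calc, "Check arithmetic correctness")
```

With a real engine, `checked` would hold the validated (and possibly corrected) calculation.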
TextualVerifier supports four distinct verification configurations:
- Basic Verification (`cot=False, breakdown=False`): Single-unit verification for prompt optimization
- Step-by-Step Verification (`cot=False, breakdown=True`): Granular verification of pre-structured reasoning
- CoT Generation (`cot=True, breakdown=False`): Structured reasoning generation without granular verification
- Full Verification (`cot=True, breakdown=True`): Maximum verification capability for complex solutions
Decomposes complex reasoning chains into individual logical steps using chain-of-thought prompting techniques. Extracts steps using regex patterns for <STEP>...</STEP> tags with fallback mechanisms for unstructured text.
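A minimal sketch of this extraction logic, assuming `<STEP>` tags with a line-based fallback (the function name is illustrative, not the library's API):

```python
import re

def extract_steps(text: str) -> list[str]:
    # Primary path: pull out <STEP>...</STEP> blocks
    steps = re.findall(r"<STEP>(.*?)</STEP>", text, flags=re.DOTALL)
    if steps:
        return [s.strip() for s in steps]
    # Fallback for unstructured text: treat each non-empty line as a step
    return [line.strip() for line in text.splitlines() if line.strip()]

structured = "<STEP>Apply the quadratic formula</STEP><STEP>Simplify the discriminant</STEP>"
steps = extract_steps(structured)
```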
Creates multiple alternative formulations of each reasoning step using different verification perspectives:
- Rule-Based Verification: Mathematical correctness and procedural adherence
- Pedagogical Verification: Clarity and educational value assessment
- Domain-Specific Verification: Specialized knowledge application
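Variant generation can be sketched by prompting one reformulation per perspective. The prompt strings mirror those shown later in this README; the helper function and stub engine are illustrative assumptions:

```python
PERSPECTIVES = [
    "Evaluate mathematical correctness and procedural rules",  # rule-based
    "Review from teaching assistant perspective",              # pedagogical
    "Check for domain-specific accuracy",                      # domain-specific
]

def generate_variants(llm, step):
    # One alternative formulation of the step per verification perspective
    return [llm(f"{perspective}\n\nStep: {step}") for perspective in PERSPECTIVES]

# Stub engine: echoes the step back in upper case
def llm(prompt):
    return prompt.splitlines()[-1].removeprefix("Step: ").upper()

variants = generate_variants(llm, "factor the quadratic")
```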
Evaluates and selects the best variant through consensus-based decision making, ensuring robust verification through multiple perspectives.
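One simple way to realize consensus-based selection is majority voting over the variants, with ties broken in favor of the earliest perspective. This is a sketch of the idea, not the library's actual voting implementation:

```python
from collections import Counter

def vote(variants: list[str]) -> str:
    # Consensus: pick the variant produced most often across perspectives;
    # ties fall back to the earliest perspective's output
    counts = Counter(variants)
    return max(counts, key=lambda v: (counts[v], -variants.index(v)))

winner = vote(["x = 2", "x = 2", "x = 3"])  # majority agrees on "x = 2"
```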
Consolidates verified individual steps into a coherent, validated reasoning chain with explicit <VERIFIED>...</VERIFIED> tags for downstream processing.
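The consolidation step can be sketched as joining the verified steps inside the `<VERIFIED>` tags the README describes (the function name and numbering format are assumptions):

```python
def merge_steps(verified_steps: list[str]) -> str:
    # Consolidate verified steps into one chain wrapped in <VERIFIED> tags
    body = "\n".join(f"Step {i + 1}: {s}" for i, s in enumerate(verified_steps))
    return f"<VERIFIED>\n{body}\n</VERIFIED>"

chain = merge_steps(["Apply the quadratic formula", "Simplify the discriminant"])
```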
```
pip install textualverifier
```

```python
from textgrad import Variable
from textgrad.engine import get_engine
from textualverifier import TextualVerifier

# Initialize the verifier
engine = get_engine("gpt-4o")
verifier = TextualVerifier(
    verifier_engine=engine,
    use_cot_generation=True,
    use_step_breakdown=True
)

# Verify a calculation
verified_result = verifier.verify(
    instance=Variable("Solve x² - 7x + 2 = 0"),
    instruction=Variable("Solve the quadratic equation step by step"),
    calculation=Variable("Using quadratic formula: x = (7 ± √(49-8))/2")
)
print(verified_result.value)
```

Full verification with custom prompts:

```python
full_verifier = TextualVerifier(
    verifier_engine=engine,
    use_cot_generation=True,
    use_step_breakdown=True,
    verification_task_prompts=[
        "Evaluate mathematical correctness and procedural rules",
        "Review from teaching assistant perspective",
        "Check for domain-specific accuracy"
    ]
)
```

TextualVerifier is designed to integrate seamlessly with TextGrad optimization workflows:
```python
import textgrad as tg
from textualverifier import TextualVerifier

# Standard TextGrad optimization
optimizer = tg.TGD(parameters=[solution])
loss = tg.Variable("Evaluation of solution accuracy")

# With TextualVerifier integration
verifier = TextualVerifier(engine, use_cot_generation=True, use_step_breakdown=True)

# Verify during the optimization process
for step in range(optimization_steps):
    optimizer.zero_grad()
    loss.backward()
    # Verify before applying the optimization step
    verified_solution = verifier.verify(
        instance=problem,
        instruction=optimization_instruction,
        calculation=solution
    )
    optimizer.step()
```

- Loss Value Verification: Verify loss calculations before backpropagation
- Optimizer Result Verification: Validate optimization results before acceptance
- Combined Verification: Comprehensive verification of both loss values and optimization outcomes
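The loss-value pattern can be sketched with a small wrapper. `verify` mirrors the entry point shown in the Quick Start; the stub verifier and wrapper function are illustrative stand-ins so the pattern is self-contained:

```python
class StubVerifier:
    # Stand-in for TextualVerifier, so the pattern can be shown without an engine
    def verify(self, instance, instruction, calculation):
        return f"<VERIFIED>{calculation}</VERIFIED>"

def verified_loss(verifier, problem, loss_text):
    # Loss Value Verification: check the loss evaluation before backpropagation
    return verifier.verify(
        instance=problem,
        instruction="Verify this loss evaluation before backpropagation",
        calculation=loss_text,
    )

loss = verified_loss(StubVerifier(), "Solve x² - 7x + 2 = 0", "Solution is mostly correct")
```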
TextualVerifier features a modular architecture with four core components:
```
Input → TextualVerifier()
         ├── Step Extractor (CoT decomposition)
         ├── Variant Generator (Multi-perspective analysis)
         ├── Voting Mechanism (Consensus selection)
         └── Step Merger (Verification consolidation)
→ Verified Output
```
Experimental results on academic benchmarks:
| Dataset | Base Accuracy | With TextualVerifier | Improvement |
|---|---|---|---|
| GPQA-Diamond | - | - | +5.56pp |
| MMLU-ML | - | - | +7.14pp |
| MMLU-CP | - | - | +2.94pp |
Results show percentage point improvements over baseline TextGrad optimization
```python
# Computational efficiency vs. accuracy trade-off
verifier_configs = {
    "fast": TextualVerifier(engine, use_cot_generation=False, use_step_breakdown=False),
    "balanced": TextualVerifier(engine, use_cot_generation=True, use_step_breakdown=False),
    "comprehensive": TextualVerifier(engine, use_cot_generation=True, use_step_breakdown=True)
}
```

Configure multiple verification perspectives for robust consensus:
```python
verification_prompts = [
    "Evaluate mathematical correctness and procedural rules",
    "Review from teaching assistant perspective",
    "Check for logical consistency and completeness",
    "Assess clarity and explanatory quality"
]
```

We welcome contributions! Please see our Contributing Guidelines for details.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use TextualVerifier in your research, please cite:
```bibtex
@misc{textualverifier2024,
  author = {Eugenius Mario Situmorang},
  title = {TextualVerifier: Self-Verification Framework for Textual Optimization},
  year = {2024},
  howpublished = {IR-NLP Lab, Department of Computer Science, Universitas Indonesia},
  note = {Manuscript in preparation}
}
```

- TextGrad - Automatic "Differentiation" via Text
- DSPy - Programming Foundation Models
- Process Supervision Papers
- DSPy - Programming Foundation Models
- Process Supervision Papers
