Semantic Evidence Construction with Optimal Transport
Reference implementation: eviot
MakeSense is a research framework for constructing evidence sets rather than merely ranking sentences.
Its Python implementation, eviot, uses Optimal Transport (OT) to select a set of sentences that are together sufficient to answer complex queries.
Traditional retrieval answers:
“Which sentences are relevant?”
MakeSense answers:
“Which set of sentences together is sufficient?”
This distinction matters for:
- multi-hop questions
- redundancy among candidate sentences
- reasoning across multiple facts
- deciding joint sufficiency
MakeSense works in three steps:
- Embed, using bge-base-en-v1.5:
  - the query (optionally decomposed into semantic supports)
  - the candidate sentences
- Use Optimal Transport to measure coverage between:
  - the query representation
  - the candidate evidence set
- Construct evidence as a set, not as a list produced by relevance ranking alone
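The coverage measurement described above can be illustrated with a small, dependency-free sketch. It assumes uniform mass on the query supports and on the evidence sentences, a cosine-distance cost, and entropic (Sinkhorn) regularization; none of these choices is confirmed by the source, and the helper names `cosine_cost` and `coverage_cost` are hypothetical. The toy 2-D vectors stand in for bge-base-en-v1.5 embeddings. A lower cost means the evidence set covers the query supports more completely.

```python
import math


def cosine_cost(u, v):
    """Cosine distance 1 - cos(u, v) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def coverage_cost(supports, evidence, reg=0.1, n_iter=200):
    """Entropy-regularized OT cost between query supports and an evidence set.

    Assumption: both sides carry uniform mass. Lower cost means better coverage.
    """
    n, m = len(supports), len(evidence)
    a = [1.0 / n] * n  # mass on each query support
    b = [1.0 / m] * m  # mass on each evidence sentence
    cost = [[cosine_cost(s, e) for e in evidence] for s in supports]
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]

    # Sinkhorn iterations: rescale rows and columns to match the marginals.
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]

    # Transport plan is P[i][j] = u[i] * kernel[i][j] * v[j]; return <P, cost>.
    return sum(
        u[i] * kernel[i][j] * v[j] * cost[i][j]
        for i in range(n)
        for j in range(m)
    )


# Two query supports; one evidence set covers both, the other repeats one fact.
supports = [[1.0, 0.0], [0.0, 1.0]]
covering = [[1.0, 0.0], [0.0, 1.0]]
redundant = [[1.0, 0.0], [1.0, 0.0]]

full_cost = coverage_cost(supports, covering)
partial_cost = coverage_cost(supports, redundant)
print(full_cost < partial_cost)
```

The redundant set leaves the second support uncovered, so its transport cost is higher even though both of its sentences are individually relevant. The installation instructions include the POT package, which provides solvers such as `ot.emd2` and `ot.sinkhorn2`; whether and how eviot calls them is not stated here.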
eviot supports two modes.
With query decomposition (use_query_decomposition: True):
- Query is split into semantic supports
- OT enforces coverage across supports
- Produces smaller, inference-based contexts
Best for:
- semantic sufficiency
- redundancy suppression
- theory-driven evaluation
With a single query vector (use_query_decomposition: False):
- Query is embedded as a single vector
- OT behaves closer to dense retrieval
- Favors explicit answer sentences
Works for:
- real-world implementations
- evidence recall
- optimizing the trade-off between context retrieval and minimizing set size
Adaptive mode:
- Greedy selection
- Stops when marginal gain saturates
- Produces minimal sufficient context
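The greedy, saturation-based selection described above can be sketched as follows. This is a minimal illustration, not eviot's code: the function name `adaptive_select`, the toy `coverage` function, and the choice to count consecutive low-gain steps and then discard them are all assumptions. The parameter names `epsilon`, `patience`, and `k_max` mirror the configuration keys.

```python
def adaptive_select(candidates, coverage, epsilon=0.01, patience=2, k_max=10):
    """Greedily grow an evidence set until the marginal gain saturates."""
    selected, remaining = [], list(candidates)
    current = coverage(selected)
    stalled = 0  # consecutive steps whose gain was at most epsilon (assumption)

    while remaining and len(selected) < k_max:
        # Pick the candidate that yields the highest coverage when added.
        best = max(remaining, key=lambda c: coverage(selected + [c]))
        new = coverage(selected + [best])
        gain = new - current

        remaining.remove(best)
        selected.append(best)
        current = new

        stalled = stalled + 1 if gain <= epsilon else 0
        if stalled >= patience:
            break

    # Drop trailing picks that added no meaningful coverage.
    if stalled:
        del selected[-stalled:]
    return selected


# Toy data: each sentence covers some facets of a three-facet query.
FACETS = {"who", "where", "when"}
SENTENCES = {
    "s1": {"who"},
    "s2": {"who"},  # redundant with s1
    "s3": {"where", "when"},
    "s4": set(),  # irrelevant
}


def coverage(selected):
    """Fraction of query facets covered by the selected sentences."""
    covered = set().union(*(SENTENCES[s] for s in selected))
    return len(covered & FACETS) / len(FACETS)


result = adaptive_select(list(SENTENCES), coverage)
print(result)
```

In this toy run the redundant and irrelevant sentences are tried but contribute no gain, so they are discarded once the patience budget is exhausted. In eviot the coverage signal would come from the OT measurement rather than a facet count.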
Fixed mode:
- Selects exactly k sentences
- Inflates context beyond sufficiency for large k
- Not recommended for practical purposes
Temporal mode:
- Models progressive evidence discovery
- Evidence appears over time
- Penalizes abrupt semantic shifts
- States are not answer-complete individually
Used for:
- analysis
- evidence evolution
- semantic drift
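One plausible reading of progressive evidence discovery with a drift penalty is sketched below. It is an illustration under stated assumptions, not eviot's algorithm: the function `temporal_select`, the cumulative uncovering of candidates, the one-sentence-per-stage rule, and the score `relevance - alpha_temporal * drift` are all hypothetical. Only the parameter names `temporal_slices` and `alpha_temporal` come from the configuration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


def temporal_select(query, candidates, temporal_slices=3, alpha_temporal=0.3):
    """Pick one sentence per stage as candidates are progressively uncovered.

    candidates: list of (name, vector) pairs in order of appearance.
    """
    stage_size = math.ceil(len(candidates) / temporal_slices)
    states, previous = [], None

    for stage in range(1, temporal_slices + 1):
        visible = candidates[: stage * stage_size]  # evidence uncovered so far
        pool = [c for c in visible if c not in states]
        if not pool:
            break

        def score(candidate):
            relevance = cosine(query, candidate[1])
            # Drift: semantic distance from the previously chosen state.
            drift = 0.0 if previous is None else 1.0 - cosine(previous[1], candidate[1])
            return relevance - alpha_temporal * drift

        previous = max(pool, key=score)
        states.append(previous)

    return [name for name, _ in states]


query = [0.0, 1.0]
candidates = [
    ("s1", [1.0, 0.0]),
    ("s2", [0.8, 0.6]),
    ("s3", [-0.6, 0.8]),  # more relevant, but far from s2
    ("s4", [0.7071, 0.7071]),  # slightly less relevant, close to s2
]

with_penalty = temporal_select(query, candidates, temporal_slices=2, alpha_temporal=0.3)
without_penalty = temporal_select(query, candidates, temporal_slices=2, alpha_temporal=0.0)
print(with_penalty, without_penalty)
```

With the penalty active, the second stage prefers the sentence that stays semantically close to the first pick; with `alpha_temporal` set to zero it falls back to pure relevance.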
python -m venv .venv
source .venv/bin/activate
pip install torch transformers pot spacy
python -m spacy download en_core_web_sm

Install uv package manager based on your OS (Windows/MacOS/Linux) - https://docs.astral.sh/uv/getting-started/installation/

uv init
uv venv
uv sync

Or manually add dependencies

uv add torch pot spacy transformers

python -m eviot.runners.single_query

CONFIG = {
"mode": "adaptive",
"use_query_decomposition": True,
"epsilon": 0.01,
"patience": 2,
"k_max": 10,
"temporal_slices": 3,
"alpha_temporal": 0.3,
}

Configuration details:
- mode: adaptive, fixed, or temporal
- use_query_decomposition: whether the query is decomposed or embedded as a single vector
- epsilon: gain threshold; stopping is triggered once a gain equivalent to this threshold has been observed patience times
- patience: number of times a gain equivalent to epsilon must be observed before stopping is triggered
- k_max: upper bound on the size of any set (applies only if saturation has not been observed by the time the set reaches k_max)
- temporal_slices: number of stages before all candidates are uncovered
- alpha_temporal: controls semantic drift
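A small sanity check can catch misspelled keys or out-of-range values before a run. The sketch below is hypothetical: `validate_config` is not part of eviot, and the numeric constraints (non-negative thresholds, positive integer counts) are assumptions inferred from the descriptions above rather than documented limits.

```python
VALID_MODES = {"adaptive", "fixed", "temporal"}

CONFIG = {
    "mode": "adaptive",
    "use_query_decomposition": True,
    "epsilon": 0.01,
    "patience": 2,
    "k_max": 10,
    "temporal_slices": 3,
    "alpha_temporal": 0.3,
}


def validate_config(config):
    """Return a list of problems found in a config dict (empty if none)."""
    problems = []
    if config.get("mode") not in VALID_MODES:
        problems.append(f"mode must be one of {sorted(VALID_MODES)}")
    if not isinstance(config.get("use_query_decomposition"), bool):
        problems.append("use_query_decomposition must be True or False")
    # Assumption: thresholds and penalties are non-negative numbers.
    for key in ("epsilon", "alpha_temporal"):
        value = config.get(key)
        if not isinstance(value, (int, float)) or value < 0:
            problems.append(f"{key} must be a non-negative number")
    # Assumption: counts are positive integers (bool is excluded explicitly).
    for key in ("patience", "k_max", "temporal_slices"):
        value = config.get(key)
        if not isinstance(value, int) or isinstance(value, bool) or value < 1:
            problems.append(f"{key} must be a positive integer")
    return problems


print(validate_config(CONFIG))
```

A missing or misspelled key is reported the same way as an invalid value, because `dict.get` returns `None` for it.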