🎯 Goal
Add confidence scores to LLM-generated comments and filter low-confidence ones.
📊 Complexity
Medium Term (1 day)
🔍 Problem
Not all LLM-reported issues are equally certain. Some are speculative while others are clear bugs. We need a way to filter speculative reports.
✅ Solution
1. Update JSON Schema
Modify the inline review JSON to include a confidence score:
{
"summary": "Found 2 issues",
"comments": [
{
"file": "auth.py",
"line": 64,
"severity": "performance",
"confidence": 0.3, // ← NEW: 0.0 to 1.0
"message": "Consider optimizing os.chmod call"
},
{
"file": "reviewer.py",
"line": 25,
"severity": "bug",
"confidence": 0.9, // ← High confidence
"message": "Potential null pointer exception"
}
]
}
2. Update Parser
Modify iara/parsers/inline_parser.py:
- Add confidence to the required fields
- Validate 0.0 <= confidence <= 1.0
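The parser change could look like the sketch below. Since the actual structure of inline_parser.py is not shown here, `validate_comment` is a hypothetical helper; only the field names come from the JSON schema above.

```python
def validate_comment(comment: dict) -> dict:
    """Validate one parsed review comment (hypothetical helper;
    field names follow the inline review JSON schema)."""
    required = {"file", "line", "severity", "confidence", "message"}
    missing = required - comment.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    confidence = comment["confidence"]
    # Reject non-numeric values and anything outside [0.0, 1.0]
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0.0, 1.0], got {confidence!r}")
    return comment
```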
3. Update Prompt
Instruct the LLM to assess its confidence in each finding:
Rate your confidence in each issue (0.0 to 1.0):
- 0.9-1.0: Definite bug/issue
- 0.7-0.9: Highly likely issue
- 0.5-0.7: Possible issue worth reviewing
- 0.3-0.5: Speculative suggestion
- 0.0-0.3: Low confidence observation
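The rubric above could be embedded in the prompt roughly as follows. This is a minimal sketch: `build_review_prompt` and the surrounding instruction text are assumptions, since the contents of iara/prompt.py are not shown here.

```python
# Confidence rubric taken verbatim from the proposal above
CONFIDENCE_RUBRIC = (
    "Rate your confidence in each issue (0.0 to 1.0):\n"
    "- 0.9-1.0: Definite bug/issue\n"
    "- 0.7-0.9: Highly likely issue\n"
    "- 0.5-0.7: Possible issue worth reviewing\n"
    "- 0.3-0.5: Speculative suggestion\n"
    "- 0.0-0.3: Low confidence observation\n"
)

def build_review_prompt(diff: str) -> str:
    """Append the confidence rubric to the base review instructions
    (hypothetical function name and wrapper text)."""
    return (
        "Review the following diff and report issues as JSON, "
        "including a confidence field on every comment.\n\n"
        f"{diff}\n\n{CONFIDENCE_RUBRIC}"
    )
```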
4. Filter by Threshold
Add a configuration option in .iara.json:
{
"review": {
"min_confidence": 0.7 // Only post >= 0.7 confidence
}
}
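Reading the threshold and applying it could be sketched as below. The config shape follows the .iara.json example above; `load_min_confidence` and `filter_by_confidence` are assumed names for the logic that would live in iara/post_comment.py.

```python
import json

def load_min_confidence(path: str = ".iara.json", default: float = 0.7) -> float:
    """Read review.min_confidence from the config file, falling back to a
    default when the file or key is absent (config shape per .iara.json)."""
    try:
        with open(path) as f:
            cfg = json.load(f)
    except FileNotFoundError:
        return default
    return cfg.get("review", {}).get("min_confidence", default)

def filter_by_confidence(comments: list[dict], min_confidence: float) -> list[dict]:
    """Drop comments whose confidence is below the threshold; comments
    missing the field are treated as lowest confidence."""
    return [c for c in comments if c.get("confidence", 0.0) >= min_confidence]
```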
📝 Implementation Steps
- Update iara/prompt.py to request confidence scores
- Update iara/parsers/inline_parser.py to validate the confidence field
- Add filtering logic in iara/post_comment.py
- Add the min_confidence config option
- Update documentation and examples
- Test with various confidence thresholds
🎁 Expected Impact
- Users can tune sensitivity vs. precision
- Estimated 40-60% reduction in low-quality comments
- Better signal-to-noise ratio
🔗 Related
Medium complexity due to schema changes and prompt tuning.