Add confidence score to inline comments #72

@felipefernandes

🎯 Goal

Add confidence scores to LLM-generated comments and filter low-confidence ones.

📊 Complexity

Medium Term (1 day)

🔍 Problem

Not all LLM-reported issues are equally certain. Some are speculative while others are clear bugs. We need a way to filter speculative reports.

✅ Solution

1. Update JSON Schema

Add a confidence field to each comment in the inline review JSON. The value is a float from 0.0 (pure speculation) to 1.0 (certain). In the example below, the first comment is low confidence and the second is high confidence:

{
  "summary": "Found 2 issues",
  "comments": [
    {
      "file": "auth.py",
      "line": 64,
      "severity": "performance",
      "confidence": 0.3,
      "message": "Consider optimizing os.chmod call"
    },
    {
      "file": "reviewer.py",
      "line": 25,
      "severity": "bug",
      "confidence": 0.9,
      "message": "Potential null pointer exception"
    }
  ]
}

2. Update Parser

Modify iara/parsers/inline_parser.py:

  • Add confidence to required fields
  • Validate 0.0 <= confidence <= 1.0
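The two parser changes above could look roughly like the sketch below. It is not the existing code in iara/parsers/inline_parser.py: the function name `validate_comment` and the `REQUIRED_FIELDS` tuple are hypothetical, and the field set is taken from the example payload in step 1.

```python
from numbers import Real

# Assumed required fields, based on the example payload in this issue.
REQUIRED_FIELDS = ("file", "line", "severity", "confidence", "message")


def validate_comment(comment: dict) -> dict:
    """Return the comment unchanged if it is valid, otherwise raise ValueError."""
    missing = [name for name in REQUIRED_FIELDS if name not in comment]
    if missing:
        raise ValueError(f"missing required fields: {missing}")

    confidence = comment["confidence"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, Real):
        raise ValueError(
            f"confidence must be a number, got {type(confidence).__name__}"
        )
    # NaN fails this comparison as well, so it is rejected here.
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be within [0.0, 1.0], got {confidence}")
    return comment
```

Raising on a bad value is one option; dropping the offending comment and keeping the rest of the review is another, and may be friendlier when the LLM output is only partly malformed.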

3. Update Prompt

Instruct LLM to assess confidence:

Rate your confidence in each issue (0.0 to 1.0). A score that falls on a boundary belongs to the higher band:
- 0.9-1.0: Definite bug/issue
- 0.7-0.9: Highly likely issue
- 0.5-0.7: Possible issue worth reviewing
- 0.3-0.5: Speculative suggestion
- 0.0-0.3: Low confidence observation
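For logging or for labelling posted comments, it may help to map a numeric score back to its rubric band. The sketch below is one way to do that, treating each band's lower bound as inclusive. The function name `confidence_band` is hypothetical and not part of the proposal.

```python
import bisect

# Lower bounds of the upper four bands, ascending, followed by the labels
# of all five bands from the rubric (lowest band first).
_BOUNDS = [0.3, 0.5, 0.7, 0.9]
_LABELS = [
    "Low confidence observation",
    "Speculative suggestion",
    "Possible issue worth reviewing",
    "Highly likely issue",
    "Definite bug/issue",
]


def confidence_band(score: float) -> str:
    """Return the rubric label for a score in [0.0, 1.0]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0.0, 1.0], got {score}")
    # bisect_right counts how many lower bounds are <= score, which is the
    # index of the band the score falls into.
    return _LABELS[bisect.bisect_right(_BOUNDS, score)]
```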

4. Filter by Threshold

Add a min_confidence option to .iara.json. Only comments with confidence >= min_confidence are posted:

{
  "review": {
    "min_confidence": 0.7
  }
}
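A minimal sketch of the filtering step, assuming .iara.json is strict JSON (`json.loads` rejects comments) and that comments are plain dicts shaped like the example in step 1. The helper names and the default of 0.0 (post everything when the option is unset) are assumptions, not existing iara code.

```python
import json

# Assumed default: keep current behaviour when min_confidence is not set.
DEFAULT_MIN_CONFIDENCE = 0.0


def load_min_confidence(config_text: str) -> float:
    """Read review.min_confidence from the text of .iara.json."""
    config = json.loads(config_text)
    value = config.get("review", {}).get("min_confidence", DEFAULT_MIN_CONFIDENCE)
    return float(value)


def filter_comments(comments: list[dict], min_confidence: float) -> list[dict]:
    """Keep only comments whose confidence is at or above the threshold."""
    return [c for c in comments if c["confidence"] >= min_confidence]


if __name__ == "__main__":
    threshold = load_min_confidence('{"review": {"min_confidence": 0.7}}')
    comments = [
        {"file": "auth.py", "line": 64, "confidence": 0.3},
        {"file": "reviewer.py", "line": 25, "confidence": 0.9},
    ]
    for kept in filter_comments(comments, threshold):
        print(kept["file"], kept["confidence"])
```

With the example payload and a threshold of 0.7, only the reviewer.py comment survives. The comparison is inclusive, so a comment scored exactly 0.7 is still posted.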

📝 Implementation Steps

  1. Update iara/prompt.py to request confidence scores
  2. Update iara/parsers/inline_parser.py to validate confidence field
  3. Add filtering logic in iara/post_comment.py
  4. Add min_confidence config option
  5. Update documentation and examples
  6. Test with various confidence thresholds
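For step 6, one simple check is to count how many comments survive at each candidate threshold. The sketch below does that. The function name `sweep_thresholds` and the sample scores are made up for illustration and are not measurements from real reviews.

```python
def sweep_thresholds(comments: list[dict], thresholds: list[float]) -> dict[float, int]:
    """Map each threshold to the number of comments that would be posted."""
    return {
        t: sum(1 for c in comments if c["confidence"] >= t)
        for t in thresholds
    }


if __name__ == "__main__":
    # Illustrative scores only.
    sample = [{"confidence": s} for s in (0.3, 0.9, 0.7, 0.55)]
    for threshold, kept in sweep_thresholds(sample, [0.5, 0.7, 0.9]).items():
        print(f"min_confidence={threshold}: {kept} of {len(sample)} comments posted")
```

Running the same sweep over comments from real reviews, with each comment hand-labelled as useful or not, would give a basis for picking the default threshold.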

🎁 Expected Impact

  • Users can tune min_confidence to trade sensitivity (fewer missed issues) against precision (fewer false alarms)
  • 40-60% reduction in low-quality comments
  • Better signal-to-noise ratio

🔗 Related


Medium complexity due to schema changes and prompt tuning.
