fix: prevent over-scoped text edits from silently deleting content by buger · Pull Request #466 · probelabs/probe

buger · 2026-03-03T11:09:48Z

Summary

Adds a mechanical guard in the edit tool (edit.js) that rejects text-mode edits where old_string spans 20+ lines and new_string is less than 50% of that line count — catches the "accidental deletion" pattern where an AI replaces a large block with a tiny replacement
Adds prompt reinforcement in the engineer prompt (prompts.js) and system message guidance (ProbeAgent.js) to steer toward line-targeted editing (start_line/end_line) for large blocks
Adds 2 unit tests verifying the guard triggers correctly and doesn't false-positive on proportional edits

Context

In trace 3afc73dad05339c0a4c6e49cc55f5d63, the AI used text-mode editing with a ~47-line old_string covering both the target section AND an unrelated section. The 5-line new_string silently deleted content the user didn't ask to remove. The edit tool worked correctly — the AI just over-scoped.

Test plan

New test: rejects 25→2 line replacement with correct error message
New test: allows 25→20 line proportional replacement (no false positive)
All existing edit-create-tools tests pass (1 pre-existing failure in symbol disambiguation unrelated to this change)

🤖 Generated with Claude Code

Add a size-ratio guard in the edit tool that rejects text-mode edits where old_string spans 20+ lines and new_string is less than 50% of that line count, preventing accidental content deletion. Also add prompt reinforcement in engineer and system message guidance to steer toward line-targeted editing for large blocks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

probelabs · 2026-03-03T11:13:55Z

Summary

This PR adds safeguards to prevent AI agents from accidentally deleting content when using text-mode editing with over-scoped old_string values. The fix addresses a real incident where an AI replaced a ~47-line block with a 5-line replacement, silently deleting unrelated content.

Files Changed Analysis

File	Changes	Purpose
`npm/src/tools/edit.js`	+7 lines	Mechanical guard that rejects edits where `old_string` ≥20 lines and `new_string` <50% of that count
`npm/src/agent/shared/prompts.js`	+1 line	Prompt reinforcement for minimal, focused edits
`npm/src/agent/ProbeAgent.js`	+1 line	System message guidance preferring line-targeted editing for large blocks
`npm/tests/unit/edit-create-tools.test.js`	+53 lines	Two unit tests: guard trigger + no false positive on proportional edits

Total: 4 files, 62 additions, 0 deletions

Architecture & Impact Assessment

What This PR Accomplishes

Adds a mechanical validation guard in the edit tool that detects suspicious edit patterns before execution
Provides clear error messages directing users to line-targeted editing (start_line/end_line) or smaller scoped edits
Reinforces prompt guidance to steer AI agents toward safer editing patterns proactively

Key Technical Changes

flowchart TD
    A[Edit Request] --> B{Has old_string?}
    B -->|No| C[Use line-targeted mode]
    B -->|Yes| D[Count lines in old_string]
    D --> E{Lines ≥ 20?}
    E -->|No| F[Proceed with edit]
    E -->|Yes| G{new_string ≥ 50% of old?}
    G -->|Yes| F
    G -->|No| H[BLOCK: Return error]
    H --> I[Suggest start_line/end_line]

Affected System Components

Edit Tool (npm/src/tools/edit.js) - Core validation logic
Agent Prompts (npm/src/agent/shared/prompts.js) - Behavioral guidance
System Messages (npm/src/agent/ProbeAgent.js) - Tool usage instructions

Threshold Rationale

20 lines minimum: Catches genuinely large edits, avoids false positives on small corrections
50% ratio: Allows legitimate refactoring that condenses code, blocks likely accidental deletions

Scope Discovery & Context Expansion

Related Files (not modified but relevant)

npm/src/tools/lineEditHeuristics.js - Handles line-targeted edit corrections (echo stripping, indent restoration)
npm/src/tools/symbolEdit.js - Symbol-based editing alternative
npm/src/tools/hashline.js - Line hash verification for integrity

Edit Tool Alternatives

The PR promotes these safer alternatives for large edits:

Line-targeted editing (start_line/end_line) - Constrains scope to exact line range
Symbol editing - Targets named functions/classes without exact text matching
Smaller focused edits - Split large changes into sequential smaller operations

Test Coverage

Test Case	Expected Result
25→2 line replacement	❌ Blocked with error message
25→20 line replacement	✅ Allowed (proportional)

Both tests verify the guard logic and ensure no false positives on legitimate refactoring operations.

Tags: tags.review-effort: 2, tags.label: bug

Metadata

Review Effort: 2 / 5
Primary Label: bug

Powered by Visor from Probelabs

Last updated: 2026-03-03T11:32:08.412Z | Triggered by: pr_opened | Commit: a732886

💡 TIP: You can chat with Visor using /visor ask <your question>

probelabs · 2026-03-03T11:19:28Z

Architecture Issues (4)

Severity	Location	Issue
🟢 Info	`npm/src/tools/edit.js:451-458`	The over-scope validation logic is embedded inline in the main edit flow. The codebase already has validation functions extracted (isPathAllowed, handleSymbolEdit, handleLineEdit). This guard could follow the same pattern for consistency and testability. 💡 Suggestion Consider extracting the validation into a dedicated function like `validateEditScope(oldLines, newLines)` that returns `{ valid: boolean, reason?: string }`. This would make the validation logic reusable, testable in isolation, and consistent with other validation patterns in the file.
🟢 Info	`npm/src/tools/edit.js:451`	The guard is placed after the uniqueness check but before the replacement. Consider whether this should be earlier in the validation chain (e.g., right after parsing old_string) to fail fast and provide clearer error ordering. 💡 Suggestion The current placement is acceptable, but consider documenting the validation order: (1) path validation, (2) file existence, (3) read-before-write check, (4) mode routing, (5) content validation (uniqueness, over-scope). This makes the validation flow explicit.
🟢 Info	`npm/src/agent/shared/prompts.js:90`	The guidance about preferring line-targeted editing for large blocks is duplicated across three locations: ProbeAgent.js (system message), prompts.js (engineer prompt), and edit.js (error message). While this reinforcement is intentional, ensure all three stay in sync. 💡 Suggestion Consider defining the guidance text in one place and referencing it, or add a comment in each location noting the other locations that need to stay in sync. This reduces risk of drift where one location gets updated but others don't.
🟡 Warning	`npm/src/tools/edit.js:455-457`	The guard thresholds (20 lines, 50% ratio) are hard-coded magic numbers without justification. These values determine when an edit is considered 'over-scoped' but lack documentation explaining why these specific thresholds were chosen or whether they're configurable. 💡 Suggestion Extract these thresholds to named constants at the top of the file with comments explaining the rationale. Consider making them configurable via options if different projects might need different thresholds. Example: // Thresholds for detecting potentially accidental large-scale deletions // 20 lines: large enough to catch significant over-scoping but not small edits // 50% ratio: catches replacement that dramatically shrinks content const OVER_SCOPE_LINE_THRESHOLD = 20; const OVER_SCOPE_RATIO_THRESHOLD = 0.5;

✅ Performance Check Passed

No performance issues found – changes LGTM.

\n\n

Architecture Issues (4)

Severity	Location	Issue
🟢 Info	`npm/src/tools/edit.js:451-458`	The over-scope validation logic is embedded inline in the main edit flow. The codebase already has validation functions extracted (isPathAllowed, handleSymbolEdit, handleLineEdit). This guard could follow the same pattern for consistency and testability. 💡 Suggestion Consider extracting the validation into a dedicated function like `validateEditScope(oldLines, newLines)` that returns `{ valid: boolean, reason?: string }`. This would make the validation logic reusable, testable in isolation, and consistent with other validation patterns in the file.
🟢 Info	`npm/src/tools/edit.js:451`	The guard is placed after the uniqueness check but before the replacement. Consider whether this should be earlier in the validation chain (e.g., right after parsing old_string) to fail fast and provide clearer error ordering. 💡 Suggestion The current placement is acceptable, but consider documenting the validation order: (1) path validation, (2) file existence, (3) read-before-write check, (4) mode routing, (5) content validation (uniqueness, over-scope). This makes the validation flow explicit.
🟢 Info	`npm/src/agent/shared/prompts.js:90`	The guidance about preferring line-targeted editing for large blocks is duplicated across three locations: ProbeAgent.js (system message), prompts.js (engineer prompt), and edit.js (error message). While this reinforcement is intentional, ensure all three stay in sync. 💡 Suggestion Consider defining the guidance text in one place and referencing it, or add a comment in each location noting the other locations that need to stay in sync. This reduces risk of drift where one location gets updated but others don't.
🟡 Warning	`npm/src/tools/edit.js:455-457`	The guard thresholds (20 lines, 50% ratio) are hard-coded magic numbers without justification. These values determine when an edit is considered 'over-scoped' but lack documentation explaining why these specific thresholds were chosen or whether they're configurable. 💡 Suggestion Extract these thresholds to named constants at the top of the file with comments explaining the rationale. Consider making them configurable via options if different projects might need different thresholds. Example: // Thresholds for detecting potentially accidental large-scale deletions // 20 lines: large enough to catch significant over-scoping but not small edits // 50% ratio: catches replacement that dramatically shrinks content const OVER_SCOPE_LINE_THRESHOLD = 20; const OVER_SCOPE_RATIO_THRESHOLD = 0.5;

\n\n ### ✅ Performance Check Passed

No performance issues found – changes LGTM.

\n\n

Quality Issues (3)

Severity	Location	Issue
🟢 Info	`npm/tests/unit/edit-create-tools.test.js:187`	Missing edge case test at exactly 20 lines with exactly 10 lines replacement (50% ratio). The condition uses `newLines < oldLines * 0.5`, so 20→10 should pass but 20→9 should fail. Testing the boundary ensures the inequality is correct. 💡 Suggestion Add a test case verifying the boundary: 20 old lines with 10 new lines (exactly 50%) should be allowed, and optionally 20 old lines with 9 new lines should be rejected.
🟢 Info	`npm/src/tools/edit.js:451`	The scope check uses `matchTarget` (which may be fuzzy-matched text) rather than the original `old_string`. If fuzzy matching expands the match significantly, the line count could differ from what the user intended. Consider whether the check should use `old_string` or `matchTarget`. 💡 Suggestion Consider whether the line count should be based on the user's original `old_string` or the fuzzy-matched `matchTarget`. Using `matchTarget` is reasonable since that's what will actually be replaced, but document this decision.
🟡 Warning	`npm/src/tools/edit.js:451`	Magic numbers 20 and 0.5 are hardcoded without explanation. These threshold values determine when an edit is considered 'over-scoped' but lack self-documenting names or comments explaining the rationale. 💡 Suggestion Extract to named constants with documentation: const MIN_LINES_FOR_SCOPE_CHECK = 20; const MIN_REPLACEMENT_RATIO = 0.5; // Reject edits replacing 20+ lines with less than 50% of original line count

Powered by Visor from Probelabs

Last updated: 2026-03-03T11:21:23.242Z | Triggered by: pr_opened | Commit: a732886

💡 TIP: You can chat with Visor using /visor ask <your question>

buger merged commit 89ca673 into main Mar 3, 2026
18 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: prevent over-scoped text edits from silently deleting content#466

fix: prevent over-scoped text edits from silently deleting content#466
buger merged 1 commit intomainfrom
fix/prevent-over-scoped-text-edits

buger commented Mar 3, 2026

Uh oh!

probelabs bot commented Mar 3, 2026 •

edited

Loading

Uh oh!

probelabs bot commented Mar 3, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

buger commented Mar 3, 2026

Summary

Context

Test plan

Uh oh!

probelabs bot commented Mar 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Files Changed Analysis

Architecture & Impact Assessment

What This PR Accomplishes

Key Technical Changes

Affected System Components

Threshold Rationale

Scope Discovery & Context Expansion

Related Files (not modified but relevant)

Edit Tool Alternatives

Test Coverage

Uh oh!

probelabs bot commented Mar 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Architecture Issues (4)

✅ Performance Check Passed

Architecture Issues (4)

Quality Issues (3)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

probelabs bot commented Mar 3, 2026 •

edited

Loading

probelabs bot commented Mar 3, 2026 •

edited

Loading