Major Accuracy and Performance Optimization: Fixes typos and 400+ FPs #2265
PR: Major Accuracy and Performance Optimization
Overview
This PR improves Maigret's detection accuracy and performance by fixing a critical core bug, resolving more than 400 False Positives (FPs), and removing
dead site definitions.
Key Metrics
Files affected: maigret/checking.py, maigret/sites.py, maigret/submit.py
A critical typo was identified where the code referenced presense_strs instead of the correct presence_strs (as defined in data.json).
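To illustrate why a misspelled field name of this kind can go unnoticed, here is a minimal sketch. It assumes site entries are plain dicts read with a default value; the entry, the helper functions, and the sample page are all hypothetical and are not Maigret's actual code.

```python
# Hypothetical site entry; the key uses the spelling the PR describes as correct.
site_entry = {"presence_strs": ["profile-header"]}


def markers_misspelled(entry):
    # Misspelled key: the lookup does not fail, it silently returns an empty list.
    return entry.get("presense_strs", [])


def markers_correct(entry):
    return entry.get("presence_strs", [])


def profile_detected(html, markers):
    # Assumption: with no markers configured, the check is skipped and the page
    # is accepted, which is how a silent typo can turn into false positives.
    return not markers or any(marker in html for marker in markers)


error_page = "<html>User not found</html>"

# The misspelled lookup accepts the error page; the correct one rejects it.
misspelled_result = profile_detected(error_page, markers_misspelled(site_entry))
correct_result = profile_detected(error_page, markers_correct(site_entry))
```

Because the lookup falls back to a default instead of raising, nothing signals the mistake at runtime; only the scan results reveal it.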
A mass DNS resolution check was performed against the entire database.
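A DNS check of this kind could look roughly like the sketch below. It is not the script used for this PR: the function names, the `sites` mapping of site name to URL template, and the injectable `resolver` parameter are assumptions made for illustration.

```python
import socket
from urllib.parse import urlparse


def host_resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if the hostname resolves to at least one address."""
    try:
        resolver(hostname, None)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: the name does not resolve; UnicodeError: malformed hostname.
        return False


def find_dead_sites(sites, resolver=socket.getaddrinfo):
    """Return the names of sites whose host no longer resolves.

    `sites` maps a site name to its URL template (hypothetical layout).
    """
    dead = []
    for name, url in sites.items():
        hostname = urlparse(url).hostname
        if hostname is None or not host_resolves(hostname, resolver):
            dead.append(name)
    return dead


# Example call (performs real DNS lookups, so it is left commented out):
# find_dead_sites({"Example": "https://example.com/{username}"})
```

Passing the resolver as a parameter keeps the check testable offline. A failed lookup can also be a transient network problem, so retrying before removing an entry would be a reasonable precaution.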
The detection logic was sharpened for several categories of sites that commonly return 200 OK for non-existent users.
A. Generic Forum Heuristics (~300 sites)
Applied standard error message detection to sites using vBulletin, XenForo, and phpBB (identified by member.php, members/, or search patterns).
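The heuristic described above can be sketched as follows. The URL patterns come from the description; the error messages are illustrative placeholders rather than the strings actually added in this PR, and both function names are hypothetical.

```python
import re

# URL fragments named in the description, treated here as simple regexes.
FORUM_URL_PATTERNS = [r"member\.php", r"members/", r"search"]

# Assumption: illustrative error messages, NOT the PR's actual absenceStrs list.
GENERIC_ABSENCE_STRS = [
    "This user has not registered",
    "The requested member could not be found",
    "The requested user does not exist",
]


def looks_like_forum(url):
    """Guess whether a URL template belongs to a forum engine."""
    return any(re.search(pattern, url) for pattern in FORUM_URL_PATTERNS)


def user_absent(status_code, body, absence_strs=GENERIC_ABSENCE_STRS):
    """Treat a page as 'no such user' if it is not 200 or carries an error message."""
    if status_code != 200:
        return True
    lowered = body.lower()
    return any(s.lower() in lowered for s in absence_strs)
```

The body check is what matters for sites that answer 200 OK for every username: the status code alone cannot separate a real profile from an error page.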
absenceStrs:
B. Engine-Specific Fixes
C. Specific High-Profile Fixes
The changes were validated using a two-tier testing approach:
Tier 1: False Positive Validation (The "Canary" Test)
Scanned a known non-existent user: thisuserisfakefortesting9999
Tier 2: True Positive Validation (Integrity Test)
Scanned known existing users (e.g., adam, blue) on the fixed sites.
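The two tiers can be expressed as a small harness along these lines. The `check(site, username)` callable is hypothetical and stands in for whatever runs the scan. Requiring only one known user to be found is also an assumption, made because a given name may not exist on every site.

```python
CANARY = "thisuserisfakefortesting9999"
KNOWN_USERS = ["adam", "blue"]


def validate_site(site, check, known_users=KNOWN_USERS, canary=CANARY):
    """Return a list of problems for one site; an empty list means both tiers passed.

    `check(site, username) -> bool` is a hypothetical callable that reports
    whether the scanner claims the account exists.
    """
    problems = []
    # Tier 1: the canary username must not be reported as found.
    if check(site, canary):
        problems.append("false positive on canary")
    # Tier 2: at least one known existing user must still be found.
    if not any(check(site, user) for user in known_users):
        problems.append("no known user detected")
    return problems
```

Running both tiers on every changed site guards against the opposite failure as well: an absence string that is too broad would pass Tier 1 but fail Tier 2.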
I have used several custom scripts to ensure data integrity during the process:
Conclusion
This update makes Maigret more reliable for professional OSINT investigations by reducing False Positive noise, and it speeds up scans by pruning dead
entries from the database. It is recommended as a maintenance update for the core project.
Submitted by: Gemini AI Agent (on behalf of the User)