Open
- Remove duplicate main() function (dead code)
- Fix JSON output format for backward compatibility with benchmark history
- Document valid benchmark state with verified metrics
  - Random split: Key-LOO BAcc=0.8234, Dummy BAcc=0.8048
  - Scaffold split: Key-LOO BAcc=0.8380, Dummy BAcc=0.8252

This commit establishes a known-good state for benchmark metrics. DO NOT modify benchmark code without explicit approval and verification.
CRITICAL FIX: Running both random and scaffold splits in the same execution caused state leakage, resulting in incorrect metrics (e.g., 84% BAcc).

Now runs only ONE split at a time:
- --split random: runs random split only
- --split scaffold: runs scaffold split only

To compare: run separately and compare outputs manually. This prevents any state leakage between runs and ensures accurate metrics.
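The one-split-per-execution rule described above can be enforced at the argument-parsing level. The following is a minimal sketch, not the benchmark's actual code: only the `--split` flag and its two values come from the commit message, while `parse_args`, `run_split`, `main`, and the shape of the returned dictionary are hypothetical names chosen for illustration.

```python
import argparse
import json


def parse_args(argv=None):
    """Accept exactly one split per process; 'both' is deliberately not offered."""
    parser = argparse.ArgumentParser(
        description="Run a single benchmark split per execution."
    )
    parser.add_argument(
        "--split",
        choices=["random", "scaffold"],  # values taken from the commit message
        required=True,
        help="which split to run (one per execution)",
    )
    return parser.parse_args(argv)


def run_split(split):
    """Hypothetical stand-in for the real benchmark routine.

    All state is created inside this function, so nothing can carry over
    from a previous split within the same process.
    """
    return {"split": split}


def main(argv=None):
    args = parse_args(argv)
    result = run_split(args.split)
    print(json.dumps(result))
    return result
```

Because `choices` rejects any other value and the flag takes a single argument, a caller cannot request both splits in one run; a comparison then means two separate invocations, e.g. `main(["--split", "random"])` in one process and `main(["--split", "scaffold"])` in another.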
- Update version to 1.8.0 in setup.py, pyproject.toml, __init__.py, and build script
- Remove DEBUG messages from Key-LOO rescaling (unconditional cout statements)
- Clean up unused debug_count variable

Changes:
- Version bump: 1.7.0 -> 1.8.0
- Removed DEBUG 1D key messages that were printing unconditionally
- Code cleanup: removed unused debug_count variable
- Add set_proximity_mode() for NCM mode selection
- Add set_proximity_params() for hierarchical proximity configuration
- Add set_proximity_amplitude() for target-aware amplitude scaling
- Add set_proximity_amp_components_policy() for 2D/3D component handling
- Add set_proximity_amp_distance_beta() for distance decay
- Add set_statistical_backoff() for rare key handling
- Add set_verbose() for debug output control
- Add get_n_tasks() method
- Add comprehensive NCM performance report with CV5 std results

NCM provides:
- 66.9% valid models (vs 23.3% for dummy_masking)
- Competitive AUC: 0.931-0.936 (CV5: 0.9319±0.0134)
- No data leakage (uses only training data)
- Production-ready performance with low variance
Adding a new method based on the proximity hierarchy key.