@@ -7,21 +7,21 @@ The log-parser uses a sophisticated multi-factor scoring algorithm to calculate
 ## Overall Scoring Formula
 
 ```
-Final Score = Base Confidence
-            × Severity Multiplier
-            × Chronological Factor
-            × Proximity Factor
-            × Temporal Factor
-            × Context Factor
+Final Score = Base Confidence
+            × Severity Multiplier
+            × Chronological Factor
+            × Proximity Factor
+            × Temporal Factor
+            × Context Factor
             × (1.0 - Frequency Penalty)
 ```
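The formula above can be sketched in Python as a plain product. This is a minimal illustration, not the log-parser's actual implementation: the function name `final_score` and its parameter names are assumptions chosen to mirror the terms in the formula.

```python
def final_score(
    base_confidence: float,
    severity_multiplier: float,
    chronological_factor: float,
    proximity_factor: float,
    temporal_factor: float,
    context_factor: float,
    frequency_penalty: float,
) -> float:
    """Multiply the six factors, then scale by (1.0 - frequency_penalty).

    Hypothetical helper: names mirror the documented formula, not real code.
    """
    return (
        base_confidence
        * severity_multiplier
        * chronological_factor
        * proximity_factor
        * temporal_factor
        * context_factor
        * (1.0 - frequency_penalty)
    )


if __name__ == "__main__":
    # Factor values taken from the worked example later in this document.
    print(round(final_score(0.8, 3.0, 2.1, 1.4, 1.0, 1.5, 0.0), 3))
```

Because every term is multiplicative, a frequency penalty at its default maximum of 0.8 leaves only 20% of the score, whatever the other factors are.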
 
 ## Scoring Factors
 
 ### 1. Base Confidence
 
-**Source**: Pattern definition in YAML files
-**Range**: 0.0 to 1.0
+**Source**: Pattern definition in YAML files
+**Range**: 0.0 to 1.0
 **Purpose**: Pattern-specific confidence level defined by pattern authors
 
 This is the starting confidence score defined in the pattern YAML file:
@@ -34,8 +34,8 @@ primary_pattern:
 
 ### 2. Severity Multiplier
 
-**Source**: Pattern definition in YAML files (severity field)
-**Range**: 1.0 to 5.0
+**Source**: Pattern definition in YAML files (severity field)
+**Range**: 1.0 to 5.0
 **Purpose**: Amplify scores for more severe failure types
 
 | Severity | Multiplier |
@@ -48,8 +48,8 @@ primary_pattern:
 
 ### 3. Chronological Factor
 
-**Source**: Calculated from log line position
-**Range**: 0.5 to configurable max (default: 2.5)
+**Source**: Calculated from log line position
+**Range**: 0.5 to configurable max (default: 2.5)
 **Purpose**: Prioritize earlier errors as they're more likely to be root causes
 
 The algorithm divides the log into three zones based on configurable thresholds:
@@ -83,8 +83,8 @@ factor = 0.5 + (1.0 - position)
 
 ### 4. Proximity Factor
 
-**Source**: Secondary patterns defined in YAML files + line distance calculation
-**Range**: 1.0 to unlimited (practical max ~3.0)
+**Source**: Secondary patterns defined in YAML files + line distance calculation
+**Range**: 1.0 to unlimited (practical max ~3.0)
 **Purpose**: Boost scores when secondary patterns are found nearby
 
 Uses exponential decay to calculate proximity bonus:
@@ -115,8 +115,8 @@ proximity_factor = 1.0 + 0.485 = 1.485
 
 ### 5. Temporal Factor
 
-**Source**: Sequence patterns defined in YAML files + chronological event matching
-**Range**: 1.0 to unlimited
+**Source**: Sequence patterns defined in YAML files + chronological event matching
+**Range**: 1.0 to unlimited
 **Purpose**: Bonus for matching event sequences that indicate cascading failures
 
 ```
@@ -130,17 +130,17 @@ Where sequence matching:
 
 ### 6. Context Factor
 
-**Source**: Surrounding log lines analysis using regex patterns
-**Range**: 1.0 to configurable max (default: 2.5)
+**Source**: Surrounding log lines analysis using regex patterns
+**Range**: 1.0 to configurable max (default: 2.5)
 **Purpose**: Boost scores based on error-rich surrounding context
 
 The context analysis examines lines before, at, and after the match:
 
 ```
-context_score = 0.4 × error_lines
-              + 0.2 × warning_lines
-              + 0.1 × stack_trace_lines
-              + 0.3 × exception_lines
+context_score = 0.4 × error_lines
+              + 0.2 × warning_lines
+              + 0.1 × stack_trace_lines
+              + 0.3 × exception_lines
               + min(stack_trace_lines × 0.1, 0.5)
 ```
 
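As a rough illustration of the context formula above, the following Python sketch computes the score and then clamps the resulting factor. It assumes that each `*_lines` term is a count of matching lines in the examined window, which the document does not state explicitly; the function names are hypothetical, and the clamp uses the documented default maximum of 2.5.

```python
def context_score(
    error_lines: int,
    warning_lines: int,
    stack_trace_lines: int,
    exception_lines: int,
) -> float:
    """Weighted sum of context line counts, as written in the formula.

    Assumption: each argument is a count of lines of that kind.
    """
    return (
        0.4 * error_lines
        + 0.2 * warning_lines
        + 0.1 * stack_trace_lines
        + 0.3 * exception_lines
        + min(stack_trace_lines * 0.1, 0.5)  # capped stack-trace bonus
    )


def context_factor(score: float, max_context_factor: float = 2.5) -> float:
    """Turn a context score into a multiplier capped at max_context_factor."""
    return min(1.0 + score, max_context_factor)


if __name__ == "__main__":
    score = context_score(
        error_lines=1, warning_lines=1, stack_trace_lines=2, exception_lines=0
    )
    print(context_factor(score))
```

Under this reading, a handful of error lines is enough to reach the cap: five error lines alone give a score of 2.0, which would produce a factor of 3.0 before clamping to 2.5.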
@@ -159,8 +159,8 @@ context_factor = min(1.0 + context_score, max_context_factor)
 
 ### 7. Frequency Penalty
 
-**Source**: Pattern match frequency tracking over time window
-**Range**: 0.0 to configurable max (default: 0.8)
+**Source**: Pattern match frequency tracking over time window
+**Range**: 0.0 to configurable max (default: 0.8)
 **Purpose**: Reduce scores for frequently occurring patterns (noise reduction)
 
 ```
@@ -240,4 +240,4 @@ Final Score = 0.8 × 3.0 × 2.1 × 1.4 × 1.0 × 1.5 × (1.0 - 0.0)
 ### Sequence Pattern Design
 - Define cascading failure sequences (connection → timeout → retry → failure)
 - Use moderate bonus multipliers (0.2-0.5) to avoid overwhelming primary scores
-- Consider temporal relationships in container startup sequences
+- Consider temporal relationships in container startup sequences