javrr-ui · Copilot · Jan 13, 2026 · Jan 13, 2026 · Jan 13, 2026
diff --git a/CODE_SMELLS.md b/CODE_SMELLS.md
diff --git a/docs/issues/01-duplicate-month-mappings.md b/docs/issues/01-duplicate-month-mappings.md
@@ -0,0 +1,73 @@
+# Code Smell: Duplicate Month Mapping Dictionaries
+
+**Labels:** `code-quality`, `refactoring`, `medium-priority`  
+**Type:** Code Duplication
+
+## Description
+
+The Spanish-to-English month mapping dictionary is duplicated across multiple parser files, violating the DRY (Don't Repeat Yourself) principle. This creates maintenance burden and risk of inconsistency.
+
+## Locations
+
+The same or similar month mapping appears in:
+- `core/parsers/hey_banco.py` (lines 143-156)
+- `core/parsers/rappi.py` (lines 22-35)
+- `core/parsers/banorte.py` (lines 36-49)
+- `core/parsers/nubank.py` (lines 23-36)
+
+## Current Code
+
+Each parser has variations of:
+```python
+SPANISH_TO_ENGLISH_MONTH = {
+    "ene": "Jan",
+    "feb": "Feb",
+    "mar": "Mar",
+    # ... etc
+}
+```
+
+## Impact
+
+- **Maintenance Burden:** Changes must be replicated in 4 locations
+- **Risk of Inconsistency:** If one mapping is updated and others aren't
+- **Code Bloat:** Unnecessary duplication across files
+- **Testing Overhead:** Same logic needs to be tested multiple times
+
+## Proposed Solution
+
+1. Create a shared date utilities module:
+   ```python
+   # core/parsers/date_utils.py
+   SPANISH_TO_ENGLISH_MONTH = {
+       "ene": "Jan",
+       "enero": "January",
+       "feb": "Feb",
+       "febrero": "February",
+       # ... complete mapping
+   }
+
+   def normalize_spanish_date(date_str: str) -> str:
+       """Normalize Spanish date string to English format."""
+       # Common date normalization logic
+       pass
+   ```
+
+2. Update all parsers to import from the shared module:
+   ```python
+   from core.parsers.date_utils import SPANISH_TO_ENGLISH_MONTH
+   ```
+
+3. Consider extracting common date parsing logic into reusable functions
+
+## Acceptance Criteria
+
+- [ ] Create `core/parsers/date_utils.py` with comprehensive month mappings
+- [ ] Update all 4 parser files to import the shared constant
+- [ ] Remove duplicate definitions from parser files
+- [ ] Verify all parsers still work correctly with existing test data
+- [ ] Add unit tests for the shared date utilities
+
+## Related Issues
+
+- #2 (Duplicate Date Normalization Logic)
diff --git a/docs/issues/02-duplicate-date-normalization.md b/docs/issues/02-duplicate-date-normalization.md
@@ -0,0 +1,119 @@
+# Code Smell: Duplicate Date Normalization Logic
+
+**Labels:** `code-quality`, `refactoring`, `medium-priority`  
+**Type:** Code Duplication
+
+## Description
+
+The date normalization logic (Spanish months to English, AM/PM conversion) is duplicated within the `HeyBancoParser` class. The same ~27 lines of code appear in two different methods.
+
+## Locations
+
+- `core/parsers/hey_banco.py` - `_parse_outgoing_transfer()` (lines 143-169)
+- `core/parsers/hey_banco.py` - `_parse_credit_card_payment()` (lines 226-253)
+
+## Current Code
+
+Both methods contain identical normalization logic:
+```python
+# Normalize Spanish months and AM/PM to English equivalents
+month_map = {
+    "enero": "January",
+    "febrero": "February",
+    # ... etc
+}
+
+normalized_date_str = date_str
+for spanish, english in month_map.items():
+    normalized_date_str = normalized_date_str.replace(spanish, english)
+    normalized_date_str = normalized_date_str.replace(
+        spanish.capitalize(), english
+    )
+
+# Normalize "a. m." and "p. m." to "AM" / "PM"
+normalized_date_str = normalized_date_str.replace("a. m.", "AM").replace(
+    "p. m.", "PM"
+)
+
+try:
+    datetime_obj = date_parser(normalized_date_str, dayfirst=True)
+except (ValueError, TypeError, OverflowError) as e:
+    logger.error(...)
+```
+
+## Impact
+
+- **Code Duplication:** ~27 lines duplicated exactly
+- **Bug Risk:** Bug fixes need to be applied in multiple places
+- **Readability:** Makes methods longer and harder to understand
+- **Maintenance:** Changes to date parsing logic require updates in 2 places
+
+## Proposed Solution
+
+Extract the normalization logic into a private helper method:
+
+```python
+def _normalize_spanish_date(self, date_str: str) -> str:
+    """
+    Normalize Spanish date string to English format.
+
+    Converts:
+    - Spanish month names to English
+    - "a. m." / "p. m." to "AM" / "PM"
+
+    Args:
+        date_str: Raw date string in Spanish format
+
+    Returns:
+        Normalized date string ready for parsing
+    """
+    month_map = {
+        "enero": "January",
+        "febrero": "February",
+        "marzo": "March",
+        "abril": "April",
+        "mayo": "May",
+        "junio": "June",
+        "julio": "July",
+        "agosto": "August",
+        "septiembre": "September",
+        "octubre": "October",
+        "noviembre": "November",
+        "diciembre": "December",
+    }
+
+    normalized = date_str
+    for spanish, english in month_map.items():
+        normalized = normalized.replace(spanish, english)
+        normalized = normalized.replace(spanish.capitalize(), english)
+
+    normalized = normalized.replace("a. m.", "AM").replace("p. m.", "PM")
+    return normalized
+
+def _parse_date(self, date_str: str) -> datetime | None:
+    """Parse normalized date string with error handling."""
+    try:
+        normalized = self._normalize_spanish_date(date_str)
+        return date_parser(normalized, dayfirst=True)
+    except (ValueError, TypeError, OverflowError) as e:
+        logger.error("Failed to parse date: %s -> %s", date_str, e)
+        return None
+```
+
+Then update both methods to use:
+```python
+datetime_obj = self._parse_date(date_str.strip())
+```
+
+## Acceptance Criteria
+
+- [ ] Create `_normalize_spanish_date()` helper method
+- [ ] Create `_parse_date()` helper method with error handling
+- [ ] Update `_parse_outgoing_transfer()` to use helper methods
+- [ ] Update `_parse_credit_card_payment()` to use helper methods
+- [ ] Verify both methods produce identical results with test data
+- [ ] Reduce method lengths by ~20 lines each
+
+## Related Issues
+
+- #1 (Duplicate Month Mappings) - Should be resolved together
diff --git a/docs/issues/03-magic-number-max-results.md b/docs/issues/03-magic-number-max-results.md
@@ -0,0 +1,90 @@
+# Code Smell: Magic Number (maxResults=500)
+
+**Labels:** `code-quality`, `refactoring`, `low-priority`  
+**Type:** Magic Number
+
+## Description
+
+The Gmail API call in `fetch_emails.py` uses a hardcoded value `maxResults=500` without explanation or named constant.
+
+## Location
+
+- `core/fetch_emails.py` (line 23)
+
+## Current Code
+
+```python
+response = (
+    service.users()
+    .messages()
+    .list(userId="me", q=query, maxResults=500, pageToken=page_token)
+    .execute()
+)
+```
+
+## Impact
+
+- **Unclear Intent:** Why 500? Is it the API limit? Performance tuning?
+- **Maintainability:** Harder to adjust if Gmail API limits change
+- **Not Configurable:** Cannot be changed without code modification
+- **Documentation:** No comment explaining the choice
+
+## Proposed Solution
+
+Define as a named constant at the module level with documentation:
+
+```python
+# Gmail API allows up to 500 messages per page
+# Using maximum to minimize API calls
+MAX_MESSAGES_PER_PAGE = 500
+
+def list_messages(service, query: str = " ", page_token: str | None = None):
+    """Yield all message metadata matching the query, handling pagination automatically."""
+    while True:
+        response = (
+            service.users()
+            .messages()
+            .list(userId="me", q=query, maxResults=MAX_MESSAGES_PER_PAGE, pageToken=page_token)
+            .execute()
+        )
+        # ... rest of code
+```
+
+Or make it configurable:
+
+```python
+DEFAULT_PAGE_SIZE = 500
+
+def list_messages(
+    service, 
+    query: str = " ", 
+    page_token: str | None = None,
+    page_size: int = DEFAULT_PAGE_SIZE
+):
+    """
+    Yield all message metadata matching the query, handling pagination automatically.
+
+    Args:
+        service: Authenticated Gmail API service
+        query: Gmail search query
+        page_token: Pagination token for continuing previous search
+        page_size: Number of messages per page (default: 500, max: 500)
+    """
+    if page_size > 500:
+        logger.warning("Gmail API maximum is 500 messages per page, using 500")
+        page_size = 500
+
+    # ... rest of code with maxResults=page_size
+```
+
+## Acceptance Criteria
+
+- [ ] Define `MAX_MESSAGES_PER_PAGE` constant at module level
+- [ ] Add comment explaining the value (Gmail API limit)
+- [ ] Replace magic number with named constant
+- [ ] Consider making it a parameter for testing flexibility
+- [ ] Update any related documentation
+
+## Additional Notes
+
+According to [Gmail API documentation](https://developers.google.com/gmail/api/reference/rest/v1/users.messages/list), the maximum value for `maxResults` is 500, so this is already using the optimal value. The issue is purely about code clarity and maintainability.