@badri-singhal
Contributor

@badri-singhal badri-singhal commented Dec 27, 2025

Summary by CodeRabbit

  • New Features

    • Added automatic cleanup mechanism for stuck calls
    • Enhanced retry scheduling for failed call attempts
    • Call recording upload to cloud storage
    • Webhook notifications for call outcomes and failures
  • Refactor

    • Improved internal call management architecture


@coderabbitai

coderabbitai bot commented Dec 27, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR refactors the Breeze Buddy call management system by extracting lead processing, configuration management, outbound number handling, retry scheduling, call recordings, webhook delivery, and stuck-lead cleanup logic into seven dedicated manager modules, updating the main calls.py to depend on these new public interfaces.

Changes

Cohort / File(s) Summary
New Manager Modules
app/ai/voice/agents/breeze_buddy/managers/cleanup.py, app/ai/voice/agents/breeze_buddy/managers/config.py, app/ai/voice/agents/breeze_buddy/managers/outbound_numbers.py, app/ai/voice/agents/breeze_buddy/managers/processor.py, app/ai/voice/agents/breeze_buddy/managers/recordings.py, app/ai/voice/agents/breeze_buddy/managers/retry.py, app/ai/voice/agents/breeze_buddy/managers/webhooks.py
Seven new modules introduced:
  • cleanup adds cleanup_stuck_leads() for processing stale PROCESSING leads.
  • config adds get_lead_config() and is_within_calling_hours() for configuration and time validation.
  • outbound_numbers adds get_available_number(), get_available_number_by_provider(), acquire_number(), and release_number() for provider-aware number management.
  • processor adds process_single_lead(), orchestrating end-to-end lead processing with template support, multi-provider retry logic, and call initiation.
  • recordings adds update_call_recording() for provider-aware recording download/upload.
  • retry adds schedule_retry() for retry scheduling with max-attempt checks and no-answer webhook dispatch.
  • webhooks adds send_call_failure_webhook() and send_no_answer_webhook() for event notifications.
Refactored Main Handler
app/ai/voice/agents/breeze_buddy/managers/calls.py
Updated to wire new manager APIs; replaced internal helpers with public functions (cleanup_stuck_leads, get_lead_config, process_single_lead, update_recording, schedule_retry, release_number); refactored process_backlog_leads to use cleanup and per-lead processing; updated handle_call_completion signature to accept outcome, call_end_time, meta_data and return Optional[LeadCallTracker]; refactored handle_unanswered_calls to use config and retry helpers; replaced internal locking/processing inline logic with process_single_lead calls.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • swaroopvarma1

Poem

🐰 A hop, skip, and refactor through the code,
Seven new paths down the manager road,
Lead tracking cleaner, configs aligned,
Webhooks and retries, all redesigned,
The Breeze Buddy hops with joy today!

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name | Status | Explanation | Resolution
Title check | ❓ Inconclusive | The title 'Refactor call manager in buddy' is vague and does not clearly convey the specific changes made. While it references refactoring the call manager, it lacks specificity about what was actually refactored or why, making it difficult for reviewers to understand the primary change from the title alone. | Consider using a more descriptive title that highlights the main refactoring objective, such as 'Refactor calls manager to use public API helpers' or 'Extract call manager logic into reusable manager modules', to better communicate the scope and intent of the changes.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.

@badri-singhal badri-singhal changed the title from "Refactor call manager file" to "Refactor call manager in buddy" Dec 27, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (5)
app/ai/voice/agents/breeze_buddy/managers/config.py (1)

32-47: Replace EN DASH with HYPHEN-MINUS in comments.

The static analysis tool flagged ambiguous Unicode characters. Use standard ASCII hyphens for consistency.

🔎 Proposed fix
     if config.call_start_time <= config.call_end_time:
-        # Normal case (e.g., 09:00–17:00)
+        # Normal case (e.g., 09:00-17:00)
         return config.call_start_time <= current_time <= config.call_end_time
     else:
-        # Overnight case (e.g., 22:00–06:00)
+        # Overnight case (e.g., 22:00-06:00)
         return (
             current_time >= config.call_start_time
             or current_time <= config.call_end_time
         )
app/ai/voice/agents/breeze_buddy/managers/recordings.py (1)

39-43: Move provider.lower() before the lead lookup.

The provider normalization at line 40 happens after the async lead lookup but before the if not lead check. This is functionally correct, but normalizing at the start of the function keeps input handling in one place and makes the intent clearer. The early return comes after the normalization in both versions, so the change is about readability rather than saved work.

🔎 Proposed fix
     logger.info(
         f"Processing call recording for call_id: {call_id} from provider: {provider}"
     )
+    provider = provider.lower()
     lead = await get_lead_by_call_id(call_id)
-    provider = provider.lower()
     if not lead:
         logger.error(f"Could not find lead for call_id: {call_id}")
         return
app/ai/voice/agents/breeze_buddy/managers/outbound_numbers.py (1)

144-148: Potential negative channel count.

If release_number is called when channels is already 0 (or due to a race condition), the channel count could become negative. Consider adding a floor check.

🔎 Proposed fix
     elif provider == CallProvider.EXOTEL:
         outbound_number = await get_outbound_number_by_id(number_id)
         if outbound_number:
+            new_channels = max(0, outbound_number.channels - 1) if outbound_number.channels else 0
             await update_outbound_number_channels(
-                number_id, outbound_number.channels - 1
+                number_id, new_channels
             )
app/ai/voice/agents/breeze_buddy/managers/calls.py (1)

63-65: Explicit return None for clarity.

The function signature declares -> Optional[LeadCallTracker], but line 65 uses bare return. While functionally equivalent, explicit return None improves readability.

🔎 Proposed fix
     lead = await get_lead_by_call_id(call_id)
     if not lead:
         logger.error(f"Could not find lead for call_id: {call_id}")
-        return
+        return None
app/ai/voice/agents/breeze_buddy/managers/processor.py (1)

291-296: Remove unused retry_success variable.

The variable is assigned but never used, as flagged by static analysis. If the retry outcome doesn't affect subsequent logic, remove the assignment.

🔎 Proposed fix
             # Attempt retry with alternate provider
-            retry_success = await _handle_call_initiation_failure_with_retry(
+            await _handle_call_initiation_failure_with_retry(
                 session, locked_lead, number_to_use, config, use_template_flow
             )
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef71043 and 81dfd99.

📒 Files selected for processing (8)
  • app/ai/voice/agents/breeze_buddy/managers/calls.py
  • app/ai/voice/agents/breeze_buddy/managers/cleanup.py
  • app/ai/voice/agents/breeze_buddy/managers/config.py
  • app/ai/voice/agents/breeze_buddy/managers/outbound_numbers.py
  • app/ai/voice/agents/breeze_buddy/managers/processor.py
  • app/ai/voice/agents/breeze_buddy/managers/recordings.py
  • app/ai/voice/agents/breeze_buddy/managers/retry.py
  • app/ai/voice/agents/breeze_buddy/managers/webhooks.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-24T07:52:29.756Z
Learnt from: badri-singhal
Repo: juspay/clairvoyance PR: 445
File: app/database/migrations/009_add_merchant_template_to_outbound_number.sql:38-38
Timestamp: 2025-12-24T07:52:29.756Z
Learning: In the outbound_number table in app/database/migrations/009_add_merchant_template_to_outbound_number.sql, duplicate phone numbers across merchant_id and shop_identifier combinations are intentionally allowed. No composite unique constraint on (number, merchant_id, shop_identifier) is required.

Applied to files:

  • app/ai/voice/agents/breeze_buddy/managers/outbound_numbers.py
🧬 Code graph analysis (7)
app/ai/voice/agents/breeze_buddy/managers/recordings.py (3)
app/database/accessor/breeze_buddy/lead_call_tracker.py (2)
  • get_lead_by_call_id (193-212)
  • update_lead_call_recording_url (263-286)
app/services/gcp/storage/storage.py (1)
  • upload_file_to_gcs (75-110)
app/ai/voice/agents/breeze_buddy/managers/calls.py (1)
  • update_call_recording (126-145)
app/ai/voice/agents/breeze_buddy/managers/webhooks.py (3)
app/ai/voice/agents/breeze_buddy/utils/common.py (1)
  • send_webhook_with_retry (57-99)
app/schemas/breeze_buddy/core.py (1)
  • LeadCallTracker (34-57)
app/ai/voice/agents/breeze_buddy/template/context.py (1)
  • reporting_webhook_url (97-99)
app/ai/voice/agents/breeze_buddy/managers/config.py (2)
app/database/accessor/breeze_buddy/call_execution_config.py (1)
  • get_call_execution_config_by_merchant_id (91-134)
app/schemas/breeze_buddy/core.py (2)
  • CallExecutionConfig (118-134)
  • LeadCallTracker (34-57)
app/ai/voice/agents/breeze_buddy/managers/cleanup.py (6)
app/ai/voice/agents/breeze_buddy/managers/config.py (1)
  • get_lead_config (13-29)
app/ai/voice/agents/breeze_buddy/managers/outbound_numbers.py (1)
  • release_number (138-149)
app/ai/voice/agents/breeze_buddy/managers/retry.py (1)
  • schedule_retry (17-56)
app/database/accessor/breeze_buddy/lead_call_tracker.py (4)
  • acquire_lock_on_lead_by_id (119-138)
  • get_leads_by_status_and_time_before (361-377)
  • release_lock_on_lead_by_id (141-160)
  • update_lead_call_completion_details (289-319)
app/database/accessor/breeze_buddy/outbound_number.py (1)
  • get_outbound_number_by_id (77-99)
app/schemas/breeze_buddy/core.py (1)
  • LeadCallStatus (25-31)
app/ai/voice/agents/breeze_buddy/managers/retry.py (4)
app/ai/voice/agents/breeze_buddy/managers/webhooks.py (1)
  • send_no_answer_webhook (46-85)
app/core/transport/http_client.py (1)
  • create_aiohttp_session (58-77)
app/database/accessor/breeze_buddy/lead_call_tracker.py (1)
  • create_lead_call_tracker (43-95)
app/schemas/breeze_buddy/core.py (2)
  • CallExecutionConfig (118-134)
  • LeadCallTracker (34-57)
app/ai/voice/agents/breeze_buddy/managers/processor.py (8)
app/ai/voice/agents/breeze_buddy/managers/config.py (2)
  • get_lead_config (13-29)
  • is_within_calling_hours (32-47)
app/ai/voice/agents/breeze_buddy/managers/outbound_numbers.py (4)
  • acquire_number (128-135)
  • get_available_number (23-96)
  • get_available_number_by_provider (99-125)
  • release_number (138-149)
app/ai/voice/agents/breeze_buddy/managers/webhooks.py (1)
  • send_call_failure_webhook (14-43)
app/ai/voice/agents/breeze_buddy/services/telephony/utils.py (1)
  • get_voice_provider (13-20)
app/database/accessor/breeze_buddy/lead_call_tracker.py (4)
  • acquire_lock_on_lead_by_id (119-138)
  • release_lock_on_lead_by_id (141-160)
  • update_lead_call_completion_details (289-319)
  • update_lead_call_details (163-190)
app/database/accessor/breeze_buddy/template.py (1)
  • get_template_by_merchant (35-77)
app/schemas/breeze_buddy/core.py (3)
  • CallProvider (18-22)
  • LeadCallStatus (25-31)
  • LeadCallTracker (34-57)
app/ai/voice/agents/breeze_buddy/template/context.py (1)
  • provider (117-119)
app/ai/voice/agents/breeze_buddy/managers/calls.py (11)
app/helpers/automatic/process_pool.py (1)
  • cleanup (677-707)
app/ai/voice/agents/breeze_buddy/managers/cleanup.py (1)
  • cleanup_stuck_leads (23-72)
app/ai/voice/agents/breeze_buddy/managers/config.py (1)
  • get_lead_config (13-29)
app/ai/voice/agents/breeze_buddy/managers/outbound_numbers.py (1)
  • release_number (138-149)
app/ai/voice/agents/breeze_buddy/managers/processor.py (1)
  • process_single_lead (208-300)
app/ai/voice/agents/breeze_buddy/managers/recordings.py (1)
  • update_call_recording (17-100)
app/ai/voice/agents/breeze_buddy/managers/retry.py (1)
  • schedule_retry (17-56)
app/core/transport/http_client.py (1)
  • create_aiohttp_session (58-77)
app/database/accessor/breeze_buddy/lead_call_tracker.py (3)
  • get_lead_by_call_id (193-212)
  • get_leads_based_on_status_and_next_attempt (98-116)
  • update_lead_call_completion_details (289-319)
app/database/accessor/breeze_buddy/outbound_number.py (1)
  • get_outbound_number_by_id (77-99)
app/schemas/breeze_buddy/core.py (2)
  • LeadCallStatus (25-31)
  • LeadCallTracker (34-57)
🪛 Ruff (0.14.10)
app/ai/voice/agents/breeze_buddy/managers/recordings.py

96-96: Do not catch blind exception: Exception

(BLE001)

app/ai/voice/agents/breeze_buddy/managers/webhooks.py

42-42: Do not catch blind exception: Exception

(BLE001)


84-84: Do not catch blind exception: Exception

(BLE001)

app/ai/voice/agents/breeze_buddy/managers/config.py

40-40: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?

(RUF003)


43-43: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?

(RUF003)

app/ai/voice/agents/breeze_buddy/managers/cleanup.py

67-67: Do not catch blind exception: Exception

(BLE001)

app/ai/voice/agents/breeze_buddy/managers/processor.py

291-291: Local variable retry_success is assigned to but never used

Remove assignment to unused variable retry_success

(F841)


296-296: Consider moving this statement to an else block

(TRY300)


298-298: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (9)
app/ai/voice/agents/breeze_buddy/managers/config.py (1)

13-29: LGTM!

The function correctly fetches configs by merchant/shop and filters by template. The early return pattern with appropriate warning logs is clean.

app/ai/voice/agents/breeze_buddy/managers/recordings.py (1)

45-100: LGTM with note on error handling.

The fallback behavior (storing provider URL on download/upload failure) is a sensible degradation pattern. The broad Exception catch is acceptable here as a top-level handler that logs with stack trace via exc_info=True.

One minor observation: there's no return or database update in the exception handler at lines 96-100, so if an exception occurs mid-processing, the lead's recording URL may remain unset. Consider whether a fallback to the provider URL should be attempted in the exception path as well.
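
A minimal sketch of the fallback suggested above, assuming the download/upload step and the database write can be passed in as async callables. The names store_recording_with_fallback, mirror, and save_url are hypothetical and do not come from recordings.py; they only illustrate persisting the provider URL when the exception path is taken.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)


async def store_recording_with_fallback(
    lead_id: str,
    provider_url: str,
    mirror: Callable[[str], Awaitable[str]],
    save_url: Callable[[str, str], Awaitable[None]],
) -> str:
    """Try to mirror the recording; on any failure, persist the provider URL."""
    try:
        # Hypothetical step: download from the provider and upload to storage.
        final_url = await mirror(provider_url)
    except Exception:  # top-level handler: log with stack trace, then degrade
        logger.error("Mirroring failed for lead %s", lead_id, exc_info=True)
        final_url = provider_url
    # The URL is written in both paths, so the lead never ends up without one.
    await save_url(lead_id, final_url)
    return final_url
```

With this shape, the write happens exactly once regardless of where the failure occurred, at the cost of storing a provider-hosted URL that may expire.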

app/ai/voice/agents/breeze_buddy/managers/cleanup.py (1)

23-72: LGTM overall.

The cleanup logic is well-structured with proper lock management via try/finally. The per-lead error handling ensures one failed cleanup doesn't block others.

app/ai/voice/agents/breeze_buddy/managers/retry.py (1)

17-56: LGTM!

The retry logic correctly handles:

  1. Scheduling retries when attempts remain (using attempt_count < max_retry - 1)
  2. Sending NO_ANSWER webhook when max retries are exhausted
  3. Propagating use_template_flow metadata to the new lead

The session management with async with create_aiohttp_session() ensures proper cleanup.

app/ai/voice/agents/breeze_buddy/managers/outbound_numbers.py (1)

23-96: LGTM on the number selection logic.

The dual-path approach (template-based vs backward-compatible) is well-documented and correctly handles provider-specific constraints (EXOTEL channel limits vs TWILIO availability).

app/ai/voice/agents/breeze_buddy/managers/calls.py (2)

35-49: LGTM!

The backlog processing flow is clean: cleanup stuck leads first, then process each lead through the dedicated processor. The session management via async with ensures proper resource cleanup.


126-145: LGTM!

The recording handler is now a thin wrapper that delegates to the extracted update_recording function, maintaining the same public API.

app/ai/voice/agents/breeze_buddy/managers/processor.py (2)

34-54: LGTM on helper structure.

The helper functions are well-factored with clear single responsibilities. The call initiation logic cleanly separates success and failure handling paths.


107-205: LGTM on retry with alternate provider logic.

The provider flip logic (Twilio ↔ Exotel) with proper international calling constraints and failure webhooks is well-implemented.

@badri-singhal badri-singhal force-pushed the refactor-call-manager-in-buddy branch from 81dfd99 to 0115ebd on December 28, 2025 06:32
@badri-singhal
Contributor Author

@coderabbitai please review thoroughly and check every edge case.

@coderabbitai

coderabbitai bot commented Dec 28, 2025

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Fetch all the manager files to review them thoroughly

echo "=== calls.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/calls.py

echo -e "\n\n=== cleanup.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/cleanup.py

echo -e "\n\n=== config.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/config.py

echo -e "\n\n=== outbound_numbers.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/outbound_numbers.py

echo -e "\n\n=== processor.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/processor.py

echo -e "\n\n=== recordings.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/recordings.py

echo -e "\n\n=== retry.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/retry.py

echo -e "\n\n=== webhooks.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/webhooks.py

Length of output: 20737


🏁 Script executed:

#!/bin/bash
# Fetch the correctly named files (singular forms)

echo "=== outbound_number.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/outbound_number.py

echo -e "\n\n=== recording.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/recording.py

echo -e "\n\n=== webhook.py ==="
cat app/ai/voice/agents/breeze_buddy/managers/webhook.py

Length of output: 11711


@badri-singhal, I've completed a thorough review of all 8 files in this refactoring. Here's my comprehensive edge case analysis organized by severity:

🔴 Critical Issues

1. Non-Atomic Number Acquisition Pattern (All modules using numbers)

Risk: Race condition where multiple processes could acquire the same number simultaneously.

Location: processor.py and throughout

number_to_use = await get_available_number(config, template)  # Check
if not number_to_use:
    return True
await acquire_number(number_to_use)  # Acquire (separate operation)

Issue: Between checking availability and acquiring, another process could grab the same number. This is especially problematic for Twilio numbers where status changes from AVAILABLE → IN_USE, but two processes could both see it as AVAILABLE.

Recommendation: Consider database-level atomic operations (SELECT FOR UPDATE) or optimistic locking with version numbers.
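
One way to close the check-then-acquire gap is a single conditional UPDATE whose affected-row count tells the caller whether it won the number. The sketch below uses an in-memory SQLite table purely for illustration; the table layout and the try_acquire_number helper are assumptions, not the repository's schema or accessor API.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway table with one AVAILABLE number (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE outbound_number (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO outbound_number (id, status) VALUES (1, 'AVAILABLE')")
    conn.commit()
    return conn


def try_acquire_number(conn: sqlite3.Connection, number_id: int) -> bool:
    """Flip AVAILABLE -> IN_USE in one statement; True only if this caller won."""
    cur = conn.execute(
        "UPDATE outbound_number SET status = 'IN_USE' "
        "WHERE id = ? AND status = 'AVAILABLE'",
        (number_id,),
    )
    conn.commit()
    # rowcount is 1 when the row was AVAILABLE, 0 if someone else took it first
    # or the id does not exist.
    return cur.rowcount == 1
```

Because the availability check lives in the WHERE clause, a second caller's UPDATE matches zero rows instead of overwriting the first caller's claim. The same idea could apply to Exotel channels with a condition such as channels < maximum_channels, though that is not verified against the actual accessors.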


2. Exotel Channel Management with None Values (outbound_number.py)

Lines: acquire_number() and release_number()

async def acquire_number(number: OutboundNumber):
    if number.provider == CallProvider.EXOTEL:
        if number.channels is not None:
            await update_outbound_number_channels(number.id, number.channels + 1)
        else:
            logger.warning(
                f"Cannot acquire Exotel number {number.id}: channels is None"
            )
            # ⚠️ Number is NOT actually acquired but no error is raised!

Issue: When channels is None, the warning is logged but the function doesn't fail or prevent the number from being used. The caller assumes the number was successfully acquired, leading to:

  • Inconsistent state
  • Number might be used beyond capacity
  • No failure signal propagated

A similar issue exists in release_number(): if channels is None, the count cannot be decremented.

Recommendation:

async def acquire_number(number: OutboundNumber):
    if number.provider == CallProvider.TWILIO:
        await update_outbound_number_status(number.id, OutboundNumberStatus.IN_USE)
    elif number.provider == CallProvider.EXOTEL:
        if number.channels is None:
            raise ValueError(f"Cannot acquire Exotel number {number.id}: channels is None")
        await update_outbound_number_channels(number.id, number.channels + 1)

🟡 High Priority Issues

3. Exotel Maximum Channels = 0 Edge Case (outbound_number.py)

Location: Multiple places checking channels < maximum_channels

if (
    num.channels is not None
    and num.maximum_channels is not None
    and num.channels < num.maximum_channels  # ⚠️ If maximum_channels=0, always False
):

Issue: If maximum_channels is configured as 0 (meaning "no channels available"), the number would never be selected, but it also wouldn't be explicitly filtered out. This is correct behavior, but it is an edge case worth documenting.

Recommendation: Add validation when creating/updating outbound numbers to ensure maximum_channels > 0 for Exotel numbers, or explicitly document that 0 means "disabled".
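
A minimal sketch of such a validation step, assuming it runs when an Exotel number is created or updated. ExotelChannelSettings and validate_exotel_channels are hypothetical names introduced here for illustration; the real OutboundNumber schema may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExotelChannelSettings:
    """Hypothetical stand-in for the channel fields of an Exotel number."""

    channels: Optional[int]
    maximum_channels: Optional[int]


def validate_exotel_channels(settings: ExotelChannelSettings) -> None:
    """Raise ValueError for channel settings that could never serve a call."""
    if settings.maximum_channels is None or settings.maximum_channels <= 0:
        raise ValueError("maximum_channels must be a positive integer")
    if settings.channels is None or settings.channels < 0:
        raise ValueError("channels must be a non-negative integer")
    if settings.channels > settings.maximum_channels:
        raise ValueError("channels cannot exceed maximum_channels")
```

Rejecting None here would also remove the silent-skip path described for acquire_number(), since a validated Exotel number always has a usable counter. If 0 is meant to signify "disabled", the first check would need to be relaxed and that meaning documented instead.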


4. Unused Return Value from Retry Handler (processor.py)

Line ~160:

# Attempt retry with alternate provider
retry_success = await _handle_call_initiation_failure_with_retry(
    session, locked_lead, number_to_use, config, use_template_flow
)
# ⚠️ retry_success is never used!

Issue: The return value indicates whether the retry succeeded, but it is captured and then ignored. The function returns True regardless.

Recommendation: Consider whether this return value should affect logging or metrics.


5. Template Not Found When Template Flow Enabled (processor.py)

Lines ~130-140:

use_template_flow = (
    lead.metaData.get("use_template_flow", False) if lead.metaData else False
)

template = (
    await get_template_by_merchant(...)
    if use_template_flow
    else None
)
# ⚠️ If use_template_flow=True but template is None (not found), continues silently

Issue: When use_template_flow is True but the template isn't found in the database, the code continues with template=None. This might not be the intended behavior.

Recommendation:

if use_template_flow:
    template = await get_template_by_merchant(...)
    if not template:
        logger.error(f"Template flow enabled for lead {lead.id} but template not found")
        return True  # Skip this lead
else:
    template = None

🟢 Medium Priority Issues

6. Retry Count Logic Could Be Clearer (retry.py)

Line 19:

if lead.attempt_count < config.max_retry - 1:  # Confusing off-by-one
    # ...
    logger.info(
        f"Scheduled retry for lead {lead.id} (attempt {lead.attempt_count + 2}/{config.max_retry})"
    )

Issue: The logic is technically correct but confusing:

  • For max_retry=3 and 0-indexed attempts:
    • attempt 0 → retry (next will be attempt 1)
    • attempt 1 → retry (next will be attempt 2)
    • attempt 2 → no retry (this was the last attempt)

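The log message shows attempt_count + 2 (the 1-based number of the next attempt: attempt_count + 1 for the attempt just made, plus one more), which could confuse readers.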

Recommendation: Add a comment explaining the 0-indexed semantics or refactor to be more explicit:

# attempt_count is 0-indexed, so max_retry=3 means attempts 0, 1, 2
if lead.attempt_count + 1 < config.max_retry:
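
The two forms of the condition can be checked against each other with a small standalone sketch. It assumes, as described above, that attempt_count is 0-indexed; should_schedule_retry and next_attempt_label are illustrative helpers, not functions from retry.py.

```python
def should_schedule_retry(attempt_count: int, max_retry: int) -> bool:
    """Return True when another attempt is allowed after the one just made.

    attempt_count is assumed to be 0-indexed, so the attempt that just
    finished is attempt number attempt_count + 1 in 1-based terms.
    """
    attempts_made = attempt_count + 1
    return attempts_made < max_retry


def next_attempt_label(attempt_count: int, max_retry: int) -> str:
    """Human-readable 1-based label for the attempt being scheduled."""
    return f"attempt {attempt_count + 2}/{max_retry}"


# With max_retry=3, attempts 0 and 1 are retried and attempt 2 is the last.
decisions = [should_schedule_retry(i, 3) for i in range(3)]
```

Naming the intermediate value attempts_made keeps the comparison free of the "- 1" that makes the original condition look like an off-by-one.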

7. Negative Retry Offset Edge Case (retry.py)

Line 20:

next_attempt_at = datetime.now(timezone.utc) + timedelta(
    seconds=config.retry_offset
)

Issue: If retry_offset is zero or negative (misconfiguration), next_attempt_at would be now or in the past, so the retry becomes eligible immediately and can cause a tight retry loop.

Recommendation: Validate retry_offset > 0 at config load time or add defensive check:

if config.retry_offset <= 0:
    logger.warning(f"Invalid retry_offset {config.retry_offset}, using default 300s")
    retry_offset = 300
else:
    retry_offset = config.retry_offset

8. Timezone Assumption in Webhook (webhook.py)

Line 46:

if lead.call_initiated_time:
    call_initiated_time_utc = lead.call_initiated_time.astimezone(timezone.utc)

Issue: If call_initiated_time is a naive datetime (no timezone info), astimezone() does not raise on Python 3.6+; it assumes the value is in the system's local timezone, so the converted UTC time silently depends on the host's timezone setting.

Recommendation: Ensure all datetime fields in LeadCallTracker are timezone-aware, or add defensive handling:

if lead.call_initiated_time:
    if lead.call_initiated_time.tzinfo is None:
        # Assume UTC if naive
        call_initiated_time_utc = lead.call_initiated_time.replace(tzinfo=timezone.utc)
    else:
        call_initiated_time_utc = lead.call_initiated_time.astimezone(timezone.utc)

9. Calling Hours Boundary Edge Case (config.py)

Lines 27-33:

if config.call_start_time <= config.call_end_time:
    return config.call_start_time <= current_time <= config.call_end_time

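Issue: When call_start_time == call_end_time, the first branch applies and the calling window collapses to a single instant: the check returns True only when current_time equals that exact time (e.g., all three are noon). This is an unusual config but mathematically valid.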

Recommendation: Document this behavior or add validation that start_time != end_time.
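
The boundary cases can be made explicit in a standalone sketch. The function below mirrors the normal and overnight branches quoted in this review and adds one assumed policy: a window whose start equals its end is treated as closed. That policy and the name within_window are illustrative choices, not the behavior of config.py.

```python
from datetime import time


def within_window(start: time, end: time, current: time) -> bool:
    """Check whether current falls inside the [start, end] calling window."""
    if start == end:
        # Assumed policy: a zero-length window is a misconfiguration, so no
        # calls are allowed. Treating it as "always open" is equally defensible.
        return False
    if start < end:
        # Normal case (e.g., 09:00-17:00), inclusive at both ends.
        return start <= current <= end
    # Overnight case (e.g., 22:00-06:00): late evening or early morning.
    return current >= start or current <= end
```

For example, within_window(time(22), time(6), time(23)) is True while within_window(time(22), time(6), time(12)) is False. Whichever policy is chosen for start == end, stating it in code avoids relying on the accidental single-instant match.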


10. Early Return on Missing Config Leaves Lead Unupdated (calls.py)

Lines 53-63: In handle_call_completion:

config = await get_lead_config(lead)
if not config:
    return

updated_lead = await update_lead_call_completion_details(...)

Issue: If config is not found, the function returns early without updating the lead. The lead remains in its current state, potentially stuck.

Recommendation: Consider whether the lead should be updated to FINISHED with an error outcome even when config is missing:

config = await get_lead_config(lead)

updated_lead = await update_lead_call_completion_details(
    id=lead.id,
    status=LeadCallStatus.FINISHED,
    outcome=outcome if config else "CONFIG_NOT_FOUND",
    meta_data=meta_data or {},
    call_end_time=call_end_time,
)

if not config:
    logger.error(f"Config not found for lead {lead.id}, marked as finished")
    return updated_lead

✅ Well-Handled Edge Cases

  1. Lock release in finally blocks - Correctly implemented in cleanup.py and processor.py
  2. None checks for metaData, payload, webhook URLs - Properly handled throughout
  3. Lead not found scenarios - Appropriate logging and early returns
  4. Exception handling per lead - Ensures one failure doesn't break batch processing
  5. Provider-specific logic separation - Clean separation between Twilio and Exotel
  6. Fallback to provider URL on upload failure - Good resilience in recording.py
  7. Lock acquisition failure handling - Properly skips leads already being processed

Additional Recommendations

  1. Add integration tests for the retry logic with different max_retry values
  2. Add tests for race conditions using concurrent processing
  3. Consider adding metrics/monitoring for:
    • Number acquisition failures
    • Lock acquisition conflicts
    • Webhook delivery failures
  4. Document the 0-indexed attempt_count semantics in the LeadCallTracker schema
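
A race-condition test along the lines of recommendation 2 could start from a sketch like the following. It models one number in memory and places an await between the check and the write to simulate database latency; NumberPool and count_winners are hypothetical and do not touch the real accessors.

```python
import asyncio


class NumberPool:
    """In-memory stand-in for a single Twilio-style number (hypothetical)."""

    def __init__(self) -> None:
        self.available = True
        self.lock = asyncio.Lock()

    async def acquire_racy(self) -> bool:
        if not self.available:  # check
            return False
        await asyncio.sleep(0)  # simulated I/O gap between check and write
        self.available = False  # acquire
        return True

    async def acquire_atomic(self) -> bool:
        # Serializing check and write means only one caller can win.
        async with self.lock:
            return await self.acquire_racy()


async def count_winners(atomic: bool, workers: int = 5) -> int:
    """Run several concurrent acquisitions and count how many succeeded."""
    pool = NumberPool()
    acquire = pool.acquire_atomic if atomic else pool.acquire_racy
    results = await asyncio.gather(*(acquire() for _ in range(workers)))
    return sum(results)
```

Running asyncio.run(count_winners(atomic=False)) reports more than one winner for the same number, while the locked variant reports exactly one. An asyncio.Lock only protects callers inside one process, so a multi-process deployment would still need the database-level guard discussed under the critical issues.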

Great refactoring overall! The modular structure significantly improves maintainability. The critical issues around number acquisition atomicity and Exotel channel management should be addressed before merging.

Would you like me to elaborate on any of these findings or help with implementation of the fixes?
