@hvignesh18197

…ndling

Copilot AI review requested due to automatic review settings August 8, 2025 17:22

Copilot AI left a comment

Pull Request Overview

This PR adds a Runpod API integration as an alternative to local Whisper processing for media transcription, including asynchronous processing via webhooks. Backend selection is configurable: translation tasks use local Whisper, while transcription uses Runpod when it is configured and falls back to local Whisper otherwise.

  • Adds Runpod API client with support for both sync and async processing
  • Implements webhook handling for async Runpod job results
  • Updates transcription services to conditionally use Runpod or local Whisper based on configuration
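
As a rough illustration of the selection behavior summarized above, the sketch below reduces it to a single function. The name `choose_backend`, the task strings, and the rule that both the `USE_RUNPOD` flag and an API key must be present are assumptions made for this example, not code taken from the PR.

```python
from typing import Optional


def choose_backend(task: str, use_runpod: bool, api_key: Optional[str]) -> str:
    """Return "runpod" or "local" for a given task (hypothetical helper).

    Assumption: translation always stays on local Whisper, and Runpod is
    only used for transcription when it is both enabled and has a key.
    """
    if task == "translate":
        return "local"
    if use_runpod and api_key:
        return "runpod"
    # Fallback: Runpod disabled or not configured.
    return "local"
```

Keeping the decision in one place would also give the three updated services a single function to call instead of repeating the same check.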

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| services/runpod_whisper.py | New Runpod API client with transcription, polling, and webhook support |
| routes/webhook.py | New webhook endpoint for handling async Runpod transcription results |
| services/v1/media/media_transcribe.py | Updated to support Runpod integration with fallback to local Whisper |
| services/transcription.py | Modified to use Runpod when configured, otherwise local Whisper |
| services/ass_toolkit.py | Added Runpod support for transcription generation |
| routes/v1/media/media_transcribe.py | Added initial_prompt parameter support |
| config.py | Added Runpod configuration variables |
| .env.example | Added Runpod API key example |

Dictionary containing transcription results in Whisper format
or job information if webhook is provided
"""
return runpod_client.transcribe_audio(audio_url, model, language, webhook_url, initial_prompt="")

Copilot AI Aug 8, 2025

The initial_prompt argument is hardcoded to an empty string, so the value passed in by the caller is never forwarded to the client. It should be initial_prompt=initial_prompt.

Suggested change
return runpod_client.transcribe_audio(audio_url, model, language, webhook_url, initial_prompt="")
return runpod_client.transcribe_audio(audio_url, model, language, webhook_url, initial_prompt=initial_prompt)
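
A small regression check could guard against this kind of dropped argument. The sketch below is only illustrative: `RecordingClient` is a hypothetical stub, and `transcribe_with_runpod` is an assumed name for the wrapper, whose real name is not visible in this excerpt.

```python
from typing import Any, Dict, Optional


class RecordingClient:
    """Hypothetical stub that records the arguments it was called with."""

    def __init__(self) -> None:
        self.last_call: Dict[str, Any] = {}

    def transcribe_audio(self, audio_url, model, language, webhook_url,
                         initial_prompt=""):
        self.last_call = {
            "audio_url": audio_url,
            "model": model,
            "language": language,
            "webhook_url": webhook_url,
            "initial_prompt": initial_prompt,
        }
        return {"text": ""}


def transcribe_with_runpod(client, audio_url: str, model: str,
                           language: Optional[str] = None,
                           webhook_url: Optional[str] = None,
                           initial_prompt: str = "") -> dict:
    # Forward the caller's prompt instead of a hardcoded empty string.
    return client.transcribe_audio(audio_url, model, language, webhook_url,
                                   initial_prompt=initial_prompt)
```

Calling the wrapper with a non-empty prompt and inspecting `client.last_call["initial_prompt"]` shows whether the value actually reached the client.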

Copilot uses AI. Check for mistakes.
if not self.api_key:
    raise ValueError("RUNPOD_API_KEY environment variable is not set")

self.base_url = "https://api.runpod.ai/v2/n8j2ln49qh2n4x/run"

Copilot AI Aug 8, 2025

The endpoint ID 'n8j2ln49qh2n4x' is hardcoded. This should be configurable via environment variable to support different Runpod endpoints or deployments.

Suggested change
self.base_url = "https://api.runpod.ai/v2/n8j2ln49qh2n4x/run"
self.endpoint_id = os.environ.get('RUNPOD_ENDPOINT_ID')
if not self.endpoint_id:
    raise ValueError("RUNPOD_ENDPOINT_ID environment variable is not set")
self.base_url = f"https://api.runpod.ai/v2/{self.endpoint_id}/run"

Copilot uses AI. Check for mistakes.

logger.info(f"Polling for job completion. Job ID: {job_id}")

status_url = f"https://api.runpod.ai/v2/n8j2ln49qh2n4x/status/{job_id}"

Copilot AI Aug 8, 2025

The endpoint ID is hardcoded again in the status URL. This should use the same configurable base URL pattern to maintain consistency.

Suggested change
status_url = f"https://api.runpod.ai/v2/n8j2ln49qh2n4x/status/{job_id}"
status_url = self.base_url.rsplit('/run', 1)[0] + f"/status/{job_id}"

Copilot uses AI. Check for mistakes.
Dictionary containing job status information
"""
try:
    status_url = f"https://api.runpod.ai/v2/n8j2ln49qh2n4x/status/{job_id}"

Copilot AI Aug 8, 2025

Third occurrence of the hardcoded endpoint ID. Consider extracting this to a method or configuration variable.

Suggested change
status_url = f"https://api.runpod.ai/v2/n8j2ln49qh2n4x/status/{job_id}"
status_url = f"https://api.runpod.ai/v2/{self.endpoint_id}/status/{job_id}"
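
One way to act on the "extract this to a method" suggestion is to build every URL from a single endpoint ID. The sketch below assumes the `RUNPOD_ENDPOINT_ID` variable proposed earlier in this review. The class name `RunpodEndpoint` and its methods are hypothetical, and only the `/run` and `/status/{job_id}` paths that appear in the PR are used.

```python
import os
from typing import Mapping

RUNPOD_API_ROOT = "https://api.runpod.ai/v2"


class RunpodEndpoint:
    """Hypothetical helper that derives all URLs from one endpoint ID."""

    def __init__(self, endpoint_id: str) -> None:
        if not endpoint_id:
            raise ValueError("RUNPOD_ENDPOINT_ID environment variable is not set")
        self.endpoint_id = endpoint_id

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "RunpodEndpoint":
        # A missing variable becomes "" so the constructor reports the error.
        return cls(env.get("RUNPOD_ENDPOINT_ID", ""))

    @property
    def run_url(self) -> str:
        return f"{RUNPOD_API_ROOT}/{self.endpoint_id}/run"

    def status_url(self, job_id: str) -> str:
        return f"{RUNPOD_API_ROOT}/{self.endpoint_id}/status/{job_id}"
```

With a helper like this, the client would no longer need string surgery such as `rsplit('/run', 1)` to derive the status URL from the run URL.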

Copilot uses AI. Check for mistakes.
input_filename = download_file(media_url, os.path.join(LOCAL_STORAGE_PATH, f"{job_id}_input"))
logger.info(f"Downloaded media to local file: {input_filename}")

logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'✅ Set' if RUNPOD_API_KEY else '❌ Not set'}")

Copilot AI Aug 8, 2025

Logging API key presence with emoji characters could cause encoding issues in some logging systems. Consider using plain text like 'SET' and 'NOT_SET'.

Suggested change
logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'✅ Set' if RUNPOD_API_KEY else '❌ Not set'}")
logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'SET' if RUNPOD_API_KEY else 'NOT_SET'}")
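
The encoding concern can be reproduced in isolation. The sketch below uses two hypothetical helpers, `key_status` and `can_encode`, to show that the emoji label cannot be encoded as ASCII while the plain-text label can. It illustrates the failure mode only and makes no claim about any particular logging backend.

```python
from typing import Optional


def key_status(api_key: Optional[str]) -> str:
    """Describe whether a key is present without ever logging its value."""
    return "SET" if api_key else "NOT_SET"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` survives encoding with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False
```

For example, `can_encode("✅ Set", "ascii")` is false, whereas both values returned by `key_status` are pure ASCII.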

Copilot uses AI. Check for mistakes.
input_filename = download_file(media_url, os.path.join(STORAGE_PATH, 'input_media'))
logger.info(f"Downloaded media to local file: {input_filename}")

logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'✅ Set' if RUNPOD_API_KEY else '❌ Not set'}")

Copilot AI Aug 8, 2025

Same emoji encoding issue as in media_transcribe.py. Use plain text for better logging compatibility.

Suggested change
logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'✅ Set' if RUNPOD_API_KEY else '❌ Not set'}")
logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'Set' if RUNPOD_API_KEY else 'Not set'}")

Copilot uses AI. Check for mistakes.
Comment on lines +28 to +29
# In-memory storage for webhook results (in production, use Redis or database)
webhook_results = {}

Copilot AI Aug 8, 2025

Using in-memory storage for webhook results means all results are lost when the application restarts. The comment mentions using Redis or a database in production, but that should either be implemented or at least be documented with a clear warning.

Suggested change
# In-memory storage for webhook results (in production, use Redis or database)
webhook_results = {}
# WARNING: In-memory storage for webhook results.
# All webhook results will be lost on application restart or crash.
# DO NOT USE THIS IN PRODUCTION. Use Redis or a persistent database instead.
webhook_results = {}
logging.warning(
    "USING IN-MEMORY STORAGE FOR WEBHOOK RESULTS. "
    "ALL DATA WILL BE LOST ON APPLICATION RESTART. "
    "DO NOT USE THIS IN PRODUCTION. Use Redis or a persistent database."
)
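
If a persistent store is preferred over a warning, a minimal sketch might look like the following. SQLite from the standard library is used here purely as a stand-in for "Redis or a persistent database". The class name `WebhookResultStore`, the table name, and the default file path are all assumptions for illustration.

```python
import json
import sqlite3
from typing import Optional


class WebhookResultStore:
    """Hypothetical persistent replacement for the `webhook_results` dict."""

    def __init__(self, db_path: str = "webhook_results.db") -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS webhook_results ("
            "job_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )
        self._conn.commit()

    def save(self, job_id: str, payload: dict) -> None:
        # A repeated delivery for the same job overwrites the earlier row.
        self._conn.execute(
            "INSERT OR REPLACE INTO webhook_results (job_id, payload) "
            "VALUES (?, ?)",
            (job_id, json.dumps(payload)),
        )
        self._conn.commit()

    def get(self, job_id: str) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT payload FROM webhook_results WHERE job_id = ?",
            (job_id,),
        ).fetchone()
        return json.loads(row[0]) if row else None
```

Note that a `sqlite3` connection is bound to the thread that created it by default, so a multi-threaded web server would need one connection per request or thread rather than this single shared connection.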

Copilot uses AI. Check for mistakes.
Comment on lines +286 to +288
# Global instance for easy access
runpod_client = RunpodWhisperClient()

Copilot AI Aug 8, 2025

The global instance is created at import time without any error handling. If RUNPOD_API_KEY is not set, the constructor raises an exception during module import, which can break the entire application even when Runpod is not being used.

Suggested change
# Global instance for easy access
runpod_client = RunpodWhisperClient()
# Lazy singleton instance for easy access
_runpod_client_instance = None
def get_runpod_client():
    global _runpod_client_instance
    if _runpod_client_instance is None:
        _runpod_client_instance = RunpodWhisperClient()
    return _runpod_client_instance
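
If the accessor can be called from several threads at once, the lazy singleton may also need a lock so that only one client is ever constructed. The sketch below is one possible variant. The `RunpodWhisperClient` class shown is a stand-in that reproduces only the constructor check visible in this PR, and the lock is an addition that is not part of the suggested change.

```python
import os
import threading
from typing import Optional


class RunpodWhisperClient:
    """Stand-in for the PR's client; only the API key check is reproduced."""

    def __init__(self) -> None:
        self.api_key = os.environ.get("RUNPOD_API_KEY")
        if not self.api_key:
            raise ValueError("RUNPOD_API_KEY environment variable is not set")


_client: Optional[RunpodWhisperClient] = None
_client_lock = threading.Lock()


def get_runpod_client() -> RunpodWhisperClient:
    """Create the client on first use, so importing the module never raises."""
    global _client
    if _client is None:
        with _client_lock:
            # Re-check inside the lock: another thread may have won the race.
            if _client is None:
                _client = RunpodWhisperClient()
    return _client
```

With this shape, a missing key only surfaces as a `ValueError` when Runpod is actually requested, and call sites can catch it and fall back to local Whisper.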

Copilot uses AI. Check for mistakes.