Implement Runpod API integration for transcription and add webhook handling #192

hvignesh18197 · 2025-08-08T17:22:28Z

Copilot

Pull Request Overview

This PR implements Runpod API integration as an alternative to local Whisper processing for media transcription, adding support for async processing via webhooks. The integration provides a configurable fallback mechanism where translation tasks use local Whisper while transcription can use either Runpod or local processing based on configuration.

Adds Runpod API client with support for both sync and async processing
Implements webhook handling for async Runpod job results
Updates transcription services to conditionally use Runpod or local Whisper based on configuration
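The routing rule described in the overview can be sketched as a small pure function. This is only an illustration under stated assumptions: `choose_backend` and the task names `"translate"` and `"transcribe"` are hypothetical, and only the `USE_RUNPOD` and `RUNPOD_API_KEY` settings come from the PR.

```python
def choose_backend(task, use_runpod, api_key):
    """Pick a processing backend for a media task.

    Translation always stays on local Whisper; transcription goes to
    Runpod only when it is enabled and an API key is present.
    """
    if task == "translate":
        return "local"
    if use_runpod and api_key:
        return "runpod"
    return "local"


if __name__ == "__main__":
    print(choose_backend("transcribe", use_runpod=True, api_key="example-key"))
    print(choose_backend("translate", use_runpod=True, api_key="example-key"))
```

Keeping the decision in one function would let `services/transcription.py` and `services/v1/media/media_transcribe.py` share the same rule instead of repeating the check.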

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

File	Description
services/runpod_whisper.py	New Runpod API client with transcription, polling, and webhook support
routes/webhook.py	New webhook endpoint for handling async Runpod transcription results
services/v1/media/media_transcribe.py	Updated to support Runpod integration with fallback to local Whisper
services/transcription.py	Modified to use Runpod when configured, otherwise local Whisper
services/ass_toolkit.py	Added Runpod support for transcription generation
routes/v1/media/media_transcribe.py	Added initial_prompt parameter support
config.py	Added Runpod configuration variables
.env.example	Added Runpod API key example

Copilot · 2025-08-08T17:24:51Z

services/runpod_whisper.py

+        Dictionary containing transcription results in Whisper format
+        or job information if webhook is provided
+    """
+    return runpod_client.transcribe_audio(audio_url, model, language, webhook_url, initial_prompt="")


The initial_prompt parameter is being hardcoded to an empty string instead of using the passed initial_prompt parameter. This should be initial_prompt=initial_prompt.

Suggested change

-    return runpod_client.transcribe_audio(audio_url, model, language, webhook_url, initial_prompt="")
+    return runpod_client.transcribe_audio(audio_url, model, language, webhook_url, initial_prompt=initial_prompt)
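A minimal sketch of why the fix matters, using a stand-in client so it runs without Runpod access. `FakeClient`, the wrapper name `transcribe_with_runpod`, and the default argument values are assumptions for illustration; only the `transcribe_audio` call shape is taken from the diff.

```python
class FakeClient:
    """Stand-in for the real client; echoes back the arguments it receives."""

    def transcribe_audio(self, audio_url, model, language, webhook_url,
                         initial_prompt=""):
        return {
            "audio_url": audio_url,
            "model": model,
            "language": language,
            "webhook_url": webhook_url,
            "initial_prompt": initial_prompt,
        }


runpod_client = FakeClient()


def transcribe_with_runpod(audio_url, model="base", language=None,
                           webhook_url=None, initial_prompt=""):
    # Forward the caller's prompt instead of a hardcoded empty string.
    return runpod_client.transcribe_audio(
        audio_url, model, language, webhook_url, initial_prompt=initial_prompt
    )


if __name__ == "__main__":
    result = transcribe_with_runpod(
        "https://example.com/audio.mp3", initial_prompt="Speaker names: Ada"
    )
    print(result["initial_prompt"])
```

With the original hardcoded value, a caller-supplied prompt would be dropped silently, which is easy to miss because the call still succeeds.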

Copilot · 2025-08-08T17:24:52Z

services/runpod_whisper.py

+        if not self.api_key:
+            raise ValueError("RUNPOD_API_KEY environment variable is not set")
+
+        self.base_url = "https://api.runpod.ai/v2/n8j2ln49qh2n4x/run"


The endpoint ID 'n8j2ln49qh2n4x' is hardcoded. This should be configurable via environment variable to support different Runpod endpoints or deployments.

Suggested change

-        self.base_url = "https://api.runpod.ai/v2/n8j2ln49qh2n4x/run"
+        self.endpoint_id = os.environ.get('RUNPOD_ENDPOINT_ID')
+        if not self.endpoint_id:
+            raise ValueError("RUNPOD_ENDPOINT_ID environment variable is not set")
+        self.base_url = f"https://api.runpod.ai/v2/{self.endpoint_id}/run"
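One way to keep both environment checks in a single place is a small loader that the client constructor could call. This is a sketch, not the PR's code: `load_runpod_settings` and its return shape are hypothetical, while the variable names and error messages follow the snippets in this review.

```python
import os


def load_runpod_settings(env=None):
    """Read Runpod settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    api_key = env.get("RUNPOD_API_KEY")
    if not api_key:
        raise ValueError("RUNPOD_API_KEY environment variable is not set")

    endpoint_id = env.get("RUNPOD_ENDPOINT_ID")
    if not endpoint_id:
        raise ValueError("RUNPOD_ENDPOINT_ID environment variable is not set")

    return {
        "api_key": api_key,
        "endpoint_id": endpoint_id,
        "base_url": f"https://api.runpod.ai/v2/{endpoint_id}/run",
    }


if __name__ == "__main__":
    settings = load_runpod_settings(
        {"RUNPOD_API_KEY": "example-key", "RUNPOD_ENDPOINT_ID": "my-endpoint"}
    )
    print(settings["base_url"])
```

Accepting the environment as an argument makes the validation testable without mutating `os.environ`.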

Copilot · 2025-08-08T17:24:52Z

services/runpod_whisper.py

+
+        logger.info(f"Polling for job completion. Job ID: {job_id}")
+
+        status_url = f"https://api.runpod.ai/v2/n8j2ln49qh2n4x/status/{job_id}"


The endpoint ID is hardcoded again in the status URL. This should use the same configurable base URL pattern to maintain consistency.

Suggested change

-        status_url = f"https://api.runpod.ai/v2/n8j2ln49qh2n4x/status/{job_id}"
+        status_url = self.base_url.rsplit('/run', 1)[0] + f"/status/{job_id}"

Copilot · 2025-08-08T17:24:52Z

services/runpod_whisper.py

+            Dictionary containing job status information
+        """
+        try:
+            status_url = f"https://api.runpod.ai/v2/n8j2ln49qh2n4x/status/{job_id}"


Third occurrence of the hardcoded endpoint ID. Consider extracting this to a method or configuration variable.

Suggested change

-            status_url = f"https://api.runpod.ai/v2/n8j2ln49qh2n4x/status/{job_id}"
+            status_url = f"https://api.runpod.ai/v2/{self.ENDPOINT_ID}/status/{job_id}"
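The three comments on the hardcoded endpoint ID suggest slightly different fixes. A sketch of the "extract to a method" option follows, so that the run and status URLs are built from one stored ID; `RunpodUrls` and its member names are hypothetical and not part of the PR.

```python
class RunpodUrls:
    """Builds Runpod URLs for one endpoint from a single stored ID."""

    API_ROOT = "https://api.runpod.ai/v2"

    def __init__(self, endpoint_id):
        if not endpoint_id:
            raise ValueError("endpoint_id must not be empty")
        self.endpoint_id = endpoint_id

    @property
    def run_url(self):
        # URL used to submit a new job.
        return f"{self.API_ROOT}/{self.endpoint_id}/run"

    def status_url(self, job_id):
        # URL used to poll the status of an existing job.
        return f"{self.API_ROOT}/{self.endpoint_id}/status/{job_id}"


if __name__ == "__main__":
    urls = RunpodUrls("my-endpoint")
    print(urls.run_url)
    print(urls.status_url("job-123"))
```

Compared with deriving the status URL from `base_url` via `rsplit`, this avoids string surgery and keeps every URL dependent on the same configured value.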

Copilot · 2025-08-08T17:24:53Z

services/v1/media/media_transcribe.py

-    input_filename = download_file(media_url, os.path.join(LOCAL_STORAGE_PATH, f"{job_id}_input"))
-    logger.info(f"Downloaded media to local file: {input_filename}")
-
+    logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'✅ Set' if RUNPOD_API_KEY else '❌ Not set'}")


Logging API key presence with emoji characters could cause encoding issues in some logging systems. Consider using plain text like 'SET' and 'NOT_SET'.

Suggested change

-    logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'✅ Set' if RUNPOD_API_KEY else '❌ Not set'}")
+    logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'SET' if RUNPOD_API_KEY else 'NOT_SET'}")

Copilot · 2025-08-08T17:24:53Z

services/transcription.py

-    input_filename = download_file(media_url, os.path.join(STORAGE_PATH, 'input_media'))
-    logger.info(f"Downloaded media to local file: {input_filename}")
-
+    logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'✅ Set' if RUNPOD_API_KEY else '❌ Not set'}")


Same emoji encoding issue as in media_transcribe.py. Use plain text for better logging compatibility.

Suggested change

-    logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'✅ Set' if RUNPOD_API_KEY else '❌ Not set'}")
+    logger.info(f"Checking Runpod configuration: USE_RUNPOD={USE_RUNPOD}, RUNPOD_API_KEY={'Set' if RUNPOD_API_KEY else 'Not set'}")
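Since the same log line appears in two services, and the two suggestions use different wording (`SET`/`NOT_SET` versus `Set`/`Not set`), a shared helper would keep the output consistent. The sketch below assumes hypothetical names `presence` and `log_runpod_config`; it logs only whether the key exists, never its value.

```python
import logging

logger = logging.getLogger(__name__)


def presence(value):
    """Describe whether a secret is configured without revealing it."""
    return "SET" if value else "NOT_SET"


def log_runpod_config(use_runpod, api_key):
    """Log the Runpod configuration in plain ASCII and return the message."""
    message = (
        f"Checking Runpod configuration: USE_RUNPOD={use_runpod}, "
        f"RUNPOD_API_KEY={presence(api_key)}"
    )
    logger.info(message)
    return message


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    log_runpod_config(True, "example-key")
```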

Copilot · 2025-08-08T17:24:53Z

routes/webhook.py

+# In-memory storage for webhook results (in production, use Redis or database)
+webhook_results = {}


Using in-memory storage for webhook results will lose data on application restart. The comment mentions using Redis or database in production, but this should be implemented or at least have proper warning documentation.

Suggested change

-# In-memory storage for webhook results (in production, use Redis or database)
-webhook_results = {}
+# WARNING: In-memory storage for webhook results.
+# All webhook results will be lost on application restart or crash.
+# DO NOT USE THIS IN PRODUCTION. Use Redis or a persistent database instead.
+webhook_results = {}
+logging.warning(
+    "USING IN-MEMORY STORAGE FOR WEBHOOK RESULTS. "
+    "ALL DATA WILL BE LOST ON APPLICATION RESTART. "
+    "DO NOT USE THIS IN PRODUCTION. Use Redis or a persistent database."
+)
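If adding Redis is too heavy for this PR, a persistent store can be sketched with the standard library alone. The example below swaps the dictionary for SQLite; `WebhookResultStore`, the table name, and the default file path are assumptions for illustration, not code from the PR.

```python
import json
import sqlite3


class WebhookResultStore:
    """Persists webhook results in SQLite so they survive restarts."""

    def __init__(self, path="webhook_results.db"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS webhook_results ("
            "job_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )
        self._conn.commit()

    def save(self, job_id, result):
        # The connection context manager commits on success, rolls back on error.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO webhook_results (job_id, payload) "
                "VALUES (?, ?)",
                (job_id, json.dumps(result)),
            )

    def get(self, job_id):
        row = self._conn.execute(
            "SELECT payload FROM webhook_results WHERE job_id = ?", (job_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def close(self):
        self._conn.close()


if __name__ == "__main__":
    store = WebhookResultStore(":memory:")
    store.save("job-123", {"status": "COMPLETED"})
    print(store.get("job-123"))
    store.close()
```

A `sqlite3` connection is bound to the thread that created it by default, so a multi-threaded web server would need one connection per request or worker rather than a single shared instance.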

Copilot · 2025-08-08T17:24:54Z

services/runpod_whisper.py

+# Global instance for easy access
+runpod_client = RunpodWhisperClient()
+


Creating a global instance without proper error handling. If RUNPOD_API_KEY is not set, this will raise an exception at module import time, potentially breaking the entire application even when Runpod is not being used.

Suggested change

-# Global instance for easy access
-runpod_client = RunpodWhisperClient()
+# Lazy singleton instance for easy access
+_runpod_client_instance = None
+def get_runpod_client():
+    global _runpod_client_instance
+    if _runpod_client_instance is None:
+        _runpod_client_instance = RunpodWhisperClient()
+    return _runpod_client_instance
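The suggested lazy singleton can create two clients if two threads call it at the same moment. A hedged variant with a lock is sketched below; the `RunpodWhisperClient` here is a stand-in that only counts constructions, and `_runpod_client_lock` is a name introduced for this example.

```python
import threading


class RunpodWhisperClient:
    """Stand-in for the real client; counts how often it is constructed."""

    instances_created = 0

    def __init__(self):
        type(self).instances_created += 1


_runpod_client_instance = None
_runpod_client_lock = threading.Lock()


def get_runpod_client():
    """Return the shared client, creating it on first use."""
    global _runpod_client_instance
    if _runpod_client_instance is None:
        with _runpod_client_lock:
            # Re-check inside the lock: another thread may have won the race.
            if _runpod_client_instance is None:
                _runpod_client_instance = RunpodWhisperClient()
    return _runpod_client_instance


if __name__ == "__main__":
    first = get_runpod_client()
    second = get_runpod_client()
    print(first is second)
```

Because construction is deferred to the first call, a missing `RUNPOD_API_KEY` would surface only when Runpod is actually used, not at import time.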

Implement Runpod API integration for transcription and add webhook handling

42f463e

Copilot AI review requested due to automatic review settings August 8, 2025 17:22

Copilot AI reviewed Aug 8, 2025

hvignesh18197 added 3 commits August 20, 2025 21:21

Remove libsvtav1-dev from Dockerfile dependencies

c04d346

Update environment configuration for S3 and RunPod integration in .env.example and add docker-compose-custom-build.yml

8b71d8d

Remove unused Whisper model configuration from environment files

abe5230
