feat(media): full expired CDN recovery pipeline — retryMediaFromMetadata, fetchGroupHistory, batchRecoverMedia#2465
…st bug
Baileys RC9 checks error.status but Boom sets output.statusCode, causing reuploadRequest to never trigger automatically.
Fix: in the getBase64FromMediaMessage catch block, explicitly call this.client.updateMediaMessage() with a 30s timeout to get a fresh CDN URL, then retry the download. Falls back to downloadContentFromMessage if updateMediaMessage also fails.
Technique validated in OwnPilot retryMediaFromMetadata(): 174/205 SOR files recovered.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
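The field mismatch described in this commit message can be illustrated with a small helper that reads the status from either shape. This is only a sketch: `getHttpStatus` and `isExpiredMediaError` are hypothetical names, not part of Baileys or Boom, and treating 404/410 as the "expired CDN" statuses is an assumption taken from the sequence diagram in this PR.

```typescript
// Hypothetical helpers for illustration; not part of Baileys or @hapi/boom.
type MaybeHttpError = {
  status?: unknown;
  output?: { statusCode?: unknown };
};

/** Returns the HTTP status carried by an error, whichever field holds it. */
function getHttpStatus(err: unknown): number | undefined {
  if (typeof err !== 'object' || err === null) return undefined;
  const e = err as MaybeHttpError;
  // Boom errors carry the status in output.statusCode.
  if (typeof e.output?.statusCode === 'number') return e.output.statusCode;
  // Other HTTP-style errors may carry it directly in status.
  if (typeof e.status === 'number') return e.status;
  return undefined;
}

/** Assumption: 404 and 410 are the statuses that signal an expired CDN URL. */
function isExpiredMediaError(err: unknown): boolean {
  const status = getHttpStatus(err);
  return status === 404 || status === 410;
}
```

A check written this way does not depend on which error class the download layer happens to throw.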
PostgreSQL JSONB stores object keys lexicographically, so a Uint8Array
serialized as {0:b0,1:b1,...,9:b9,10:b10,...} is retrieved with keys in
order "0","1","10","11",...,"19","2","20",...,"9". Using Object.values()
on this gives bytes in the WRONG order, causing Baileys HKDF/AES-GCM to
fail with "Unsupported state or unable to authenticate data".
Fix: sort keys numerically before constructing Uint8Array, and also handle
base64-encoded string mediaKey (from HTTP request bodies).
This matches OwnPilot retryMediaFromMetadata() which uses:
new Uint8Array(Buffer.from(base64mediaKey, 'base64'))
Before: updateMediaMessage → "Unsupported state or unable to authenticate data"
After: updateMediaMessage → re-upload → fresh CDN URL → download ✅
Tested: 2312ZL_2_A_V1.SOR (expired CDN, fromMe=true) → 21058 bytes ✅
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reviewer's Guide
Adds robust media re-download handling for expired WhatsApp CDN URLs and fixes Uint8Array reconstruction from PostgreSQL JSONB to prevent cryptographic failures when downloading media via Baileys RC9.
Sequence diagram for WhatsApp media download with explicit updateMediaMessage
sequenceDiagram
actor Caller
participant BaileysStartupService
participant BaileysClient
participant WhatsAppCDN
participant SenderDevice
Caller->>BaileysStartupService: getBase64FromMediaMessage(msg)
BaileysStartupService->>BaileysClient: downloadMediaMessage(msg, buffer, options, callbacks)
BaileysClient->>WhatsAppCDN: GET media via directPath/url
WhatsAppCDN-->>BaileysClient: 200 OK media bytes
BaileysClient-->>BaileysStartupService: Buffer
BaileysStartupService-->>Caller: base64
alt CDN expired or 404/410
BaileysClient-->>BaileysStartupService: error (Boom with output.statusCode)
BaileysStartupService->>BaileysStartupService: catch downloadMediaMessage error
BaileysStartupService->>BaileysStartupService: log Download Media failed
BaileysStartupService->>BaileysClient: updateMediaMessage(key, message)
BaileysStartupService-->>BaileysStartupService: 30s timeout guard
alt SenderDevice online and has file
BaileysClient->>SenderDevice: Request media reupload
SenderDevice->>WhatsAppCDN: Upload fresh media
BaileysClient-->>BaileysStartupService: updatedMsg with fresh CDN URL
BaileysStartupService->>BaileysClient: downloadMediaMessage(updatedMsg, buffer, options, callbacks)
BaileysClient->>WhatsAppCDN: GET media via new URL
WhatsAppCDN-->>BaileysClient: 200 OK media bytes
BaileysClient-->>BaileysStartupService: Buffer
BaileysStartupService->>BaileysStartupService: log success after updateMediaMessage
BaileysStartupService-->>Caller: base64
else updateMediaMessage fails or times out
BaileysClient-->>BaileysStartupService: error from updateMediaMessage
BaileysStartupService->>BaileysStartupService: log updateMediaMessage failure
BaileysStartupService->>BaileysStartupService: wait 5s
BaileysStartupService->>BaileysClient: downloadContentFromMessage(mediaKey, directPath, url, mediaType)
BaileysClient->>WhatsAppCDN: GET media via constructed URL
alt Fallback succeeds
WhatsAppCDN-->>BaileysClient: media stream
BaileysClient-->>BaileysStartupService: async chunks
BaileysStartupService->>BaileysStartupService: concatenate chunks to Buffer
BaileysStartupService->>BaileysStartupService: log fallback success
BaileysStartupService-->>Caller: base64
else Fallback fails
WhatsAppCDN-->>BaileysClient: NOT_FOUND or error
BaileysClient-->>BaileysStartupService: error
BaileysStartupService->>BaileysStartupService: log fallback failure
BaileysStartupService-->>Caller: error propagated
end
end
end
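The three-stage flow in the sequence diagram above (direct download, explicit re-upload request with a timeout, then a delayed fallback) can be sketched independently of Baileys. All names here are hypothetical: `RecoverySteps` and `downloadWithRecovery` are illustration-only, and the three injected functions stand in for `downloadMediaMessage`, `updateMediaMessage` and `downloadContentFromMessage`, whose real signatures are not reproduced.

```typescript
// Hypothetical names throughout. The injected steps stand in for Baileys'
// downloadMediaMessage, updateMediaMessage and downloadContentFromMessage.
interface RecoverySteps<M> {
  download: (msg: M) => Promise<Uint8Array>;
  requestReupload: (msg: M) => Promise<M>;
  fallbackDownload: (msg: M) => Promise<Uint8Array>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function downloadWithRecovery<M>(
  msg: M,
  steps: RecoverySteps<M>,
  reuploadTimeoutMs = 30_000,
  fallbackDelayMs = 5_000,
): Promise<Uint8Array> {
  // Step 1: plain download; succeeds while the CDN URL is still valid.
  try {
    return await steps.download(msg);
  } catch (err) {
    console.warn('Download Media failed, requesting re-upload', err);
  }

  // Step 2: request a re-upload, bounded by a timeout, then retry the download.
  let timer: ReturnType<typeof setTimeout> | undefined;
  try {
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error('re-upload timed out')), reuploadTimeoutMs);
    });
    const updated = await Promise.race([steps.requestReupload(msg), timeout]);
    return await steps.download(updated);
  } catch (err) {
    console.warn('Re-upload path failed, trying fallback', err);
  } finally {
    clearTimeout(timer); // never leave the guard timer pending
  }

  // Step 3: wait briefly, then try the fallback; its error propagates to the caller.
  await sleep(fallbackDelayMs);
  return steps.fallbackDownload(msg);
}
```

Injecting the steps keeps the control flow testable without a live WhatsApp session; the 30s and 5s defaults mirror the values named in the diagram.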
Class diagram for BaileysStartupService media download and reupload handling
classDiagram
class BaileysStartupService {
- client BaileysClient
- logger Logger
+ getBase64FromMediaMessage(msg)
+ mapMediaType(mediaType)
}
class BaileysClient {
+ updateMediaMessage(key, message)
+ downloadMediaMessage(message, type, options, callbacks)
+ downloadContentFromMessage(mediaDescriptor, mediaType, options)
}
class Logger {
+ info(message)
+ error(message)
}
class WhatsAppMessage {
+ key Key
+ message MediaContainer
}
class MediaContainer {
+ imageMessage MediaMessage
+ videoMessage MediaMessage
+ audioMessage MediaMessage
+ documentMessage MediaMessage
}
class MediaMessage {
+ mediaKey Uint8Array
+ directPath string
+ url string
}
BaileysStartupService --> BaileysClient : uses
BaileysStartupService --> Logger : logs
BaileysStartupService --> WhatsAppMessage : processes
WhatsAppMessage --> MediaContainer : has
MediaContainer --> MediaMessage : contains
Flow diagram for mediaKey Uint8Array reconstruction from JSONB or base64
flowchart TD
A["Start with mediaMessage.mediaKey"] --> B{Type of mediaKey}
B -->|String| C["Treat as base64 string"]
C --> D["Decode with Buffer.from(mediaKey, 'base64')"]
D --> E["Wrap in new Uint8Array(buffer)"]
E --> Z["Assign to msg.message[mediaType].mediaKey"]
B -->|Object and not Buffer and not Uint8Array| F["JSONB plain object {0:b0,1:b1,...}"]
F --> G["Extract keys: Object.keys(keyObj)"]
G --> H["Sort keys numerically: keys.sort((a,b) => parseInt(a) - parseInt(b))"]
H --> I["Map sorted keys to byte values: keys.map(k => keyObj[k])"]
I --> J["Create Uint8Array from ordered byte array"]
J --> Z
B -->|Buffer or Uint8Array or other| K["Assume already valid binary representation"]
K --> Z
Z --> L["Use mediaKey to decrypt and download media via Baileys"]
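The branches of the flowchart above can be written as one normalization helper. This is a sketch under stated assumptions: `normalizeMediaKey` is a hypothetical name, it returns `undefined` for `null` or unrecognized input (the flowchart instead passes "other" values through unchanged), and it ignores keys that are not canonical non-negative integers so that stray properties cannot affect byte order.

```typescript
import { Buffer } from 'node:buffer';

/**
 * Hypothetical helper: turns a stored mediaKey into a Uint8Array.
 * Accepts a base64 string, an index-keyed plain object ({0: b0, 1: b1, ...}),
 * or an existing Buffer/Uint8Array.
 */
function normalizeMediaKey(raw: unknown): Uint8Array | undefined {
  // typeof null === 'object', so rule it out before any object handling.
  if (raw === null || raw === undefined) return undefined;

  // base64 string (e.g. from an HTTP request body)
  if (typeof raw === 'string') {
    return new Uint8Array(Buffer.from(raw, 'base64'));
  }

  // Already binary; Buffer is a Uint8Array subclass, so this covers both.
  if (raw instanceof Uint8Array) return raw;

  if (typeof raw === 'object') {
    const obj = raw as Record<string, unknown>;
    const indices = Object.keys(obj)
      .filter((k) => /^(0|[1-9]\d*)$/.test(k)) // keep canonical integer keys only
      .map(Number)
      .sort((a, b) => a - b); // numeric order, whatever order the keys arrived in
    return Uint8Array.from(indices, (i) => Number(obj[String(i)]));
  }

  return undefined;
}
```

The result can then be assigned to the media message's `mediaKey` before the download is attempted.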
Hey - I've found 2 issues, and left some high level feedback:
- The `mediaKey` object branch should explicitly guard against `null`/`undefined` before calling `Object.keys`, since `typeof mediaMessage['mediaKey'] === 'object'` is true for `null` and will currently throw at runtime.
- In the `Promise.race` around `updateMediaMessage`, consider capturing and clearing the timeout handle once either branch resolves to avoid accumulating stray timers on repeated calls.
- When reconstructing the `Uint8Array` from the JSONB `mediaKey` object, it might be safer to validate or filter keys that do not parse to a finite integer before sorting, so unexpected keys cannot silently affect byte ordering.
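The timer-handling suggestion in the feedback above could look roughly like the following. It is a minimal sketch, not the PR's code: `withTimeout` is a hypothetical helper name, and the error message format is an assumption.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms`, and
// always clears the timer so repeated calls leave no stray timeouts behind.
async function withTimeout<T>(work: Promise<T>, ms: number, label = 'operation'): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  try {
    // Whichever settles first wins the race.
    return await Promise.race([work, timeout]);
  } finally {
    // Runs on success, failure and timeout alike.
    clearTimeout(timer);
  }
}
```

In the service this would wrap the `updateMediaMessage` call with `ms` set to 30000, so a fast re-upload does not leave a 30s timer pending.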
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `mediaKey` object branch should explicitly guard against `null`/`undefined` before calling `Object.keys`, since `typeof mediaMessage['mediaKey'] === 'object'` is true for `null` and will currently throw at runtime.
- In the `Promise.race` around `updateMediaMessage`, consider capturing and clearing the timeout handle once either branch resolves to avoid accumulating stray timers on repeated calls.
- When reconstructing the `Uint8Array` from the JSONB `mediaKey` object, it might be safer to validate or filter keys that do not parse to a finite integer before sorting, so unexpected keys cannot silently affect byte ordering.
## Individual Comments
### Comment 1
<location path="src/api/integrations/channel/whatsapp/whatsapp.baileys.service.ts" line_range="3896-3897" />
<code_context>
+ // base64-encoded string (e.g. from HTTP request body) → Uint8Array
+ // This matches OwnPilot retryMediaFromMetadata: new Uint8Array(Buffer.from(base64, 'base64'))
+ msg.message[mediaType].mediaKey = new Uint8Array(Buffer.from(mediaMessage['mediaKey'], 'base64'));
+ } else if (
+ typeof mediaMessage['mediaKey'] === 'object' &&
+ !Buffer.isBuffer(mediaMessage['mediaKey']) &&
+ !(mediaMessage['mediaKey'] instanceof Uint8Array)
</code_context>
<issue_to_address>
**issue:** Guard against `mediaKey` being `null` before treating it as a plain object.
Since `typeof null === 'object'`, a `mediaKey` of `null` will hit this branch, be cast to `Record<string, number>`, and cause `Object.keys(keyObj)` to throw. If `null` is a possible value (e.g. malformed input or partial DB data), add an explicit non-null check or a more specific guard (e.g. `mediaMessage['mediaKey'] && typeof mediaMessage['mediaKey'] === 'object'`, or an `Array.isArray`/`instanceof` check) before entering this path.
</issue_to_address>
### Comment 2
<location path="src/api/integrations/channel/whatsapp/whatsapp.baileys.service.ts" line_range="3909" />
<code_context>
{ logger: P({ level: 'error' }) as any, reuploadRequest: this.client.updateMediaMessage },
);
} catch {
- this.logger.error('Download Media failed, trying to retry in 5 seconds...');
- await new Promise((resolve) => setTimeout(resolve, 5000));
- const mediaType = Object.keys(msg.message).find((key) => key.endsWith('Message'));
- if (!mediaType) throw new Error('Could not determine mediaType for fallback');
+ this.logger.error('Download Media failed, attempting explicit updateMediaMessage (Baileys RC9 reuploadRequest bug workaround)...');
+ // Baileys RC9 bug: reuploadRequest callback never triggers because downloadMediaMessage
</code_context>
<issue_to_address>
**suggestion:** Capture and log the original error in the outer `catch` for better debuggability.
The empty outer `catch {}` hides the original `downloadMediaMessage` error, so logs will only show failures in the reupload/fallback paths and obscure systemic issues in the initial download (auth/CDN/parsing, etc.). Please change this to `catch (err)` and log `err` (including the stack) with the explanatory message.
```suggestion
let buffer: Buffer;
{ logger: P({ level: 'error' }) as any, reuploadRequest: this.client.updateMediaMessage },
);
} catch (err) {
this.logger.error(
'Download Media failed, attempting explicit updateMediaMessage (Baileys RC9 reuploadRequest bug workaround)...',
err,
);
```
</issue_to_address>
Two new endpoints mirroring OwnPilot's media recovery algorithms:
POST /chat/retryMediaFromMetadata/{instance}
- Accepts mediaKey (base64), directPath, url, participant directly
- No DB lookup needed — works even if message not in EA's Message table
- Same algorithm as OwnPilot retryMediaFromMetadata():
Step 1: direct downloadMediaMessage
Step 2: explicit updateMediaMessage [30s timeout] → retry download
POST /chat/fetchGroupHistory/{instance}
- Triggers sock.fetchMessageHistory() for a group JID
- WhatsApp delivers old message protos (with fresh mediaKey/directPath)
via messaging-history.set event → stored in EA's Message table
- Rate-limited: 1 call/30s (ban risk)
- This is how OwnPilot recovers mediaKeys for messages missing from EA DB
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
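The "1 call/30s" guard mentioned for fetchGroupHistory can be sketched as a small cooldown tracker. Everything here is an assumption for illustration: `CooldownGuard` is a hypothetical class, and the commit message does not say whether the limit is tracked per instance, per group, or globally, so the key is left to the caller.

```typescript
// Hypothetical guard: allows at most one call per key within `windowMs`.
class CooldownGuard {
  private readonly lastCall = new Map<string, number>();
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(windowMs = 30_000, now: () => number = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
  }

  /** Returns 0 if the call may proceed (and records it), else the ms left to wait. */
  tryAcquire(key: string): number {
    const t = this.now();
    const last = this.lastCall.get(key);
    if (last !== undefined && t - last < this.windowMs) {
      return this.windowMs - (t - last); // still cooling down; nothing is recorded
    }
    this.lastCall.set(key, t);
    return 0;
  }
}
```

An endpoint handler could call `tryAcquire` first and reject the request with the remaining wait time when the result is non-zero, before any history request reaches WhatsApp.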
… CDN recovery
POST /chat/batchRecoverMedia/{instance}
- Takes array of messageIds, fetches metadata from EA's own Message table
- Downloads via retryMediaFromMetadata (direct → updateMediaMessage fallback)
- Uploads recovered buffer to MinIO + upserts media record + updates message.mediaUrl
- continueOnError=true for resilient batch processing
- storeToMinIO=true (default) for full end-to-end pipeline
Tested: 310/320 expired SOR files recovered in <5min without OwnPilot.
3 skipped (already stored), 7 irrecoverable (sender permanently offline).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
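The batch behaviour described in this commit message (skip what is already stored, recover, store, keep going on errors) can be sketched as a loop over injected steps. The names `BatchOutcome`, `BatchDeps` and `batchRecover` are hypothetical, sequential processing is an assumption, and the real MinIO and Prisma calls are deliberately not reproduced.

```typescript
// Hypothetical types and dependency names; real storage and DB calls are injected.
type BatchOutcome =
  | { messageId: string; status: 'ok'; bytes: number }
  | { messageId: string; status: 'skipped' }
  | { messageId: string; status: 'failed'; error: string };

interface BatchDeps {
  isAlreadyStored: (messageId: string) => Promise<boolean>;
  recover: (messageId: string) => Promise<Uint8Array>;
  store: (messageId: string, data: Uint8Array) => Promise<void>;
}

async function batchRecover(
  messageIds: string[],
  deps: BatchDeps,
  continueOnError = true,
): Promise<BatchOutcome[]> {
  const outcomes: BatchOutcome[] = [];
  // Assumption: one message at a time, to keep load on the session low.
  for (const messageId of messageIds) {
    try {
      if (await deps.isAlreadyStored(messageId)) {
        outcomes.push({ messageId, status: 'skipped' });
        continue;
      }
      const data = await deps.recover(messageId);
      await deps.store(messageId, data);
      outcomes.push({ messageId, status: 'ok', bytes: data.byteLength });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      outcomes.push({ messageId, status: 'failed', error });
      if (!continueOnError) break; // stop at the first failure when asked to
    }
  }
  return outcomes;
}
```

Returning one outcome per message makes it straightforward to report ok/skip/irrecoverable counts like the ones quoted in the test note above.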
Agent-friendly reference for the 3 new endpoints + 2 bug fixes:
- retryMediaFromMetadata: metadata-based download without DB lookup
- fetchGroupHistory: WhatsApp on-demand history sync trigger
- batchRecoverMedia: end-to-end batch recovery pipeline (download → MinIO → DB)
Includes: algorithm details, edge cases, JSONB mediaKey sort fix, Baileys RC9 reuploadRequest workaround, env prerequisites, iterative fetching pattern, and production results (1132/1137 SOR recovered).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Problem
Three compounding bugs that silently killed media recovery in EA:
- JSONB mediaKey byte order: PostgreSQL JSONB stores object keys lexicographically (`"0","1","10",...,"2",...,"9"`). `Object.values()` returns wrong byte order → Baileys HKDF/AES-GCM → `Unsupported state or unable to authenticate data`
- Baileys RC9 dead code: the `reuploadRequest` callback never fires because the catch block checks `error.status` but Boom errors set `output.statusCode` → sender re-upload never requested for expired CDN URLs
- Missing self-sufficient recovery pipeline: EA had no way to recover media without the message being in its own DB, and no batch pipeline
Solution (4 commits)
Commit `262c9300`: PostgreSQL JSONB key sorting
Also handles base64 string format (not just byte objects).
Commit `f268571b`: Baileys RC9 `reuploadRequest` dead code workaround
Adds an explicit `updateMediaMessage()` call with 30s timeout in the catch block of `getBase64FromMediaMessage`, bypassing the broken callback mechanism.
Commit `5e2f0774`: Two new endpoints for external recovery
- `POST /chat/retryMediaFromMetadata/{instance}`: Download media using caller-supplied metadata (mediaKey, directPath, url) without needing the message in EA's DB. Mirrors OwnPilot's `retryMediaFromMetadata()` algorithm.
- `POST /chat/fetchGroupHistory/{instance}`: Trigger on-demand WhatsApp history sync for a group. WhatsApp responds with old message protos containing fresh `mediaKey + directPath`, enabling recovery of messages EA never stored. 30s rate-limit guard included.
Commit `28fc185f`: End-to-end batch recovery pipeline
`POST /chat/batchRecoverMedia/{instance}`: Full self-sufficient pipeline:
- Fetch `mediaKey + directPath + url` from EA's own `Message` table
- Download via `retryMediaFromMetadata` (direct → `updateMediaMessage` fallback)
- Upload to MinIO (`s3Service.uploadFile`)
- Upsert `prismaRepository.media` record
- Update `message.mediaUrl` in DB so future reads skip re-download
Test Results
- … → `21058 bytes` (was failing)
- `retryMediaFromMetadata`, single → `21288 bytes`
- `fetchGroupHistory` → `sessionId` returned, history sync triggered
- `batchRecoverMedia`, 320 files → `310 ok / 3 skip / 7 irrecoverable`
API Reference