Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds PreRecorded V2: generator emits prerecorded schemas; new PreRecorded V2 clients, core logic, and generated types for JS/TS and Python; E2E Vitest tests and configs; SDK public APIs and versions updated; HTTP client improvements and enriched HttpError diagnostics; README and repo config updates. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant SDK as PreRecordedV2Client
participant HTTP as HttpClient
participant API as Gladia API
Client->>SDK: transcribe(audio, options)
alt Local file
SDK->>SDK: prepare upload
SDK->>HTTP: POST /v2/upload (multipart)
HTTP->>API: upload audio
API-->>HTTP: upload response (audio_url)
HTTP-->>SDK: audio_url
else URL input
SDK->>SDK: use provided audio_url
end
SDK->>HTTP: POST /v2/pre-recorded (create job)
HTTP->>API: create job
API-->>HTTP: init response (job_id)
HTTP-->>SDK: job_id
loop poll
SDK->>HTTP: GET /v2/pre-recorded/{job_id}
HTTP->>API: job status
API-->>HTTP: job response
HTTP-->>SDK: status
alt done
SDK->>Client: return final result
else pending
SDK->>SDK: sleep(interval)
else error
SDK->>Client: raise error
end
end
sequenceDiagram
participant OpenAPI
participant Generator
participant PreProc as PreProcessor
participant BaseGen
participant TSGen
participant PyGen
OpenAPI->>Generator: provide spec
Generator->>BaseGen: generate LiveV2 schemas
BaseGen->>TSGen: emit LiveV2 types
BaseGen->>PyGen: emit LiveV2 types
Generator->>PreProc: preProcessSchemaForPreRecordedV2(spec)
PreProc-->>Generator: PreRecordedV2Schemas
Generator->>BaseGen: generatePreRecordedV2(PreRecordedV2Schemas)
BaseGen->>TSGen: generatePreRecordedV2Types()
TSGen-->>TSGen: apply fieldOverrides
BaseGen->>PyGen: generatePreRecordedV2Types()
PyGen-->>PyGen: apply fieldOverrides
TSGen-->>Repository: write v2/prerecorded types
PyGen-->>Repository: write v2/prerecorded types
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…ecking for file uploads
b3f52dd to
4dc7f8c
Compare
3290ef6 to
2c5f17b
Compare
7bd8e8b to
9bd37af
Compare
…and return Node.js fetch
added object in PreRecordedV2DeleteResponse core.py for the deletion method
added return result for the javascript deletion method
| @@ -0,0 +1,188 @@ | |||
| import { readFileSync } from 'fs' | |||
There was a problem hiding this comment.
Because GladiaClient statically imports this module from the root package entrypoint,
these top-level fs/path imports break browser/IIFE consumers even when they never use
pre-recorded APIs. I.e. you cannot use the streaming interface for the SDK from the browser anymore
There was a problem hiding this comment.
tbh, per my LLM:
"The actual problem is that the root entrypoint also exports GladiaClient, and packages/sdk-js/src/ client.ts has a static value import of PreRecordedV2Client from packages/sdk-js/src/v2/ prerecorded/index.ts, which in turn re-exports packages/sdk-js/src/v2/prerecorded/client.ts. That file now has top-level imports of fs and path.
But this is a classic problem to beware of on typescript packages
| audio: A local file path (str or Path), an open binary file object, or a URL (str). | ||
| options: Optional transcription options (no audio_url). Can be a | ||
| :class:`PreRecordedV2TranscriptionOptions` instance or a dict. | ||
| interval: Seconds between polling attempts. |
There was a problem hiding this comment.
This docstring uses seconds, but the Python top-level client options still expose bare numeric defaults without unit docs; can we document timeout units at the options definition too, especially since the JS SDK documents milliseconds explicitly?
(also why the change seconds / milliseconds between JS and Python? But it doesn't matter that much)
a3490b2 to
10cc9ab
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
README.md (4)
15-27:⚠️ Potential issue | 🟡 MinorAdd language hint to the directory tree code block.
The fenced code block is missing a language specifier, causing
markdownlintMD040 violations. Addingtextwill resolve the linting issue.📝 Proposed fix
-``` +```text ├── packages/ │ ├── generator/ # Fetches Gladia OpenAPI schema and generates types │ ├── sdk-js/ # JavaScript/TypeScript SDK (ESM, CJS, IIFE)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 15 - 27, Add a language hint to the fenced code block that contains the ASCII directory tree so markdownlint MD040 is not triggered; update the opening fence from ``` to ```text (the directory tree code block shown in the README, i.e., the block that starts with the line containing "├── packages/") so the block is explicitly marked as plain text.
212-219:⚠️ Potential issue | 🟡 MinorAdd language hint to the package structure code block.
The fenced code block showing the package directory structure is missing a language specifier, causing
markdownlintMD040 violations. Addingtextwill resolve the linting issue.📝 Proposed fix
-``` +```text packages/sdk-<language>/ ├── src/ │ └── v2/live/ # The generator will write here🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 212 - 219, The fenced code block that shows the package directory structure (starting with the line "packages/sdk-<language>/") lacks a language specifier and triggers markdownlint MD040; update that fenced code block to include a language hint (e.g., add "text" after the opening triple backticks) so the block begins with ```text and the rest of the snippet (the lines under "packages/sdk-<language>/", "├── src/", "│ └── v2/live/", etc.) remains unchanged.
35-46:⚠️ Potential issue | 🟡 MinorAdd language hint to the flow diagram code block.
The fenced code block is missing a language specifier, causing
markdownlintMD040 violations. Addingtextwill resolve the linting issue.📝 Proposed fix
-``` +```text OpenAPI schema (api.gladia.io/openapi.json) │ ▼🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 35 - 46, The fenced diagram block in README.md is missing a language specifier, causing markdownlint MD040; update the code fence that contains the ASCII flow diagram (the triple-backtick block shown starting with "OpenAPI schema (api.gladia.io/openapi.json)") to include a language hint like text (i.e., change ``` to ```text) so the linter recognizes it as a plain text block and the MD040 warning is resolved.
65-74:⚠️ Potential issue | 🟡 MinorAdd language hint to the commit format code block.
The fenced code block showing the commit message template is missing a language specifier, causing
markdownlintMD040 violations. Addingtextwill resolve the linting issue.📝 Proposed fix
This project uses [Conventional Commits](https://www.conventionalcommits.org/) enforced by commitlint. Commits must follow the format: -``` +```text <type>(optional scope): <description></details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@README.mdaround lines 65 - 74, Update the fenced code block under the
"Commit convention" section to include a language hint to satisfy markdownlint
MD040: change the opening fence fromtotext for the block that contains
"(optional scope): " so the commit message template is fenced
astext ...; no other changes required.</details> </blockquote></details> </blockquote></details> <details> <summary>🧹 Nitpick comments (1)</summary><blockquote> <details> <summary>packages/sdk-python/src/gladiaio_sdk/client_options.py (1)</summary><blockquote> `9-12`: **Consider using `None` as the sentinel instead of a magic value.** The comment indicates these constants detect "no timeout configured." If a user explicitly sets `http_timeout=10.0`, the detection logic may incorrectly treat it as "not configured." A cleaner pattern is to default to `Optional[float] = None` and apply the fallback internally: ```python http_timeout: float | None = None # None means "use default for context" ``` Then in the consuming code, fall back to the appropriate default (e.g., 10s for WebSocket, longer for pre-recorded). <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@packages/sdk-python/src/gladiaio_sdk/client_options.py` around lines 9 - 12, Replace the sentinel magic values DEFAULT_HTTP_TIMEOUT and DEFAULT_WS_TIMEOUT with None-based sentinels (use Optional[float] defaults) so "no timeout configured" is represented by None; update the client options declaration (e.g., http_timeout, ws_timeout in client_options.py) to default to None and change any consuming logic (where the code checks DEFAULT_HTTP_TIMEOUT/DEFAULT_WS_TIMEOUT) to explicitly fall back to the actual numeric defaults (e.g., 10s for WS or longer for pre-recorded) when the timeout value is None rather than comparing against the magic constants. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Inline comments:
In@packages/sdk-js/src/v2/prerecorded/client.ts:
- Around line 151-168: The poll() loop can overshoot the caller's timeout when
timeout < interval because it always awaits sleep(interval); modify poll (in
PreRecordedV2 client) to compute remaining = timeout - (Date.now() - start) when
timeout is provided, throw immediately if remaining <= 0, and sleep for
Math.min(interval, remaining) instead of the full interval; keep behavior
unchanged when timeout is undefined. Update references in callers like
createAndPoll()/transcribe() rely on poll() honoring the deadline.
Duplicate comments:
In@README.md:
- Around line 15-27: Add a language hint to the fenced code block that contains
the ASCII directory tree so markdownlint MD040 is not triggered; update the
opening fence fromtotext (the directory tree code block shown in the
README, i.e., the block that starts with the line containing "├── packages/") so
the block is explicitly marked as plain text.- Around line 212-219: The fenced code block that shows the package directory
structure (starting with the line "packages/sdk-/") lacks a language
specifier and triggers markdownlint MD040; update that fenced code block to
include a language hint (e.g., add "text" after the opening triple backticks) so
the block begins with ```text and the rest of the snippet (the lines under
"packages/sdk-/", "├── src/", "│ └── v2/live/", etc.) remains
unchanged.- Around line 35-46: The fenced diagram block in README.md is missing a language
specifier, causing markdownlint MD040; update the code fence that contains the
ASCII flow diagram (the triple-backtick block shown starting with "OpenAPI
schema (api.gladia.io/openapi.json)") to include a language hint like text
(i.e., changetotext) so the linter recognizes it as a plain text block
and the MD040 warning is resolved.- Around line 65-74: Update the fenced code block under the "Commit convention"
section to include a language hint to satisfy markdownlint MD040: change the
opening fence fromtotext for the block that contains "(optional
scope): " so the commit message template is fenced as ```text ...--- Nitpick comments: In `@packages/sdk-python/src/gladiaio_sdk/client_options.py`: - Around line 9-12: Replace the sentinel magic values DEFAULT_HTTP_TIMEOUT and DEFAULT_WS_TIMEOUT with None-based sentinels (use Optional[float] defaults) so "no timeout configured" is represented by None; update the client options declaration (e.g., http_timeout, ws_timeout in client_options.py) to default to None and change any consuming logic (where the code checks DEFAULT_HTTP_TIMEOUT/DEFAULT_WS_TIMEOUT) to explicitly fall back to the actual numeric defaults (e.g., 10s for WS or longer for pre-recorded) when the timeout value is None rather than comparing against the magic constants.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
31c2a3bf-1334-4a43-b4a2-fd9a20f93694📒 Files selected for processing (3)
README.mdpackages/sdk-js/src/v2/prerecorded/client.tspackages/sdk-python/src/gladiaio_sdk/client_options.py
| async poll( | ||
| jobId: string, | ||
| { interval = 3_000, timeout }: { interval?: number; timeout?: number } = {} | ||
| ): Promise<PreRecordedV2Response> { | ||
| const start = Date.now() | ||
| while (true) { | ||
| const result = await this.get(jobId) | ||
| if (result.status === 'done') { | ||
| return result | ||
| } | ||
| if (result.status === 'error') { | ||
| throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`) | ||
| } | ||
| if (timeout !== undefined && Date.now() - start >= timeout) { | ||
| throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) | ||
| } | ||
| await sleep(interval) | ||
| } |
There was a problem hiding this comment.
poll() can exceed the caller’s timeout.
If timeout is smaller than interval, this loop still sleeps the full interval after the timeout check, so createAndPoll() and transcribe() can wait well past the documented deadline. Clamp the sleep to the remaining budget instead of always sleeping interval.
⏱️ Suggested change
async poll(
jobId: string,
{ interval = 3_000, timeout }: { interval?: number; timeout?: number } = {}
): Promise<PreRecordedV2Response> {
- const start = Date.now()
+ const deadline = timeout === undefined ? undefined : Date.now() + timeout
while (true) {
+ if (deadline !== undefined && Date.now() >= deadline) {
+ throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
+ }
const result = await this.get(jobId)
if (result.status === 'done') {
return result
}
if (result.status === 'error') {
throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`)
}
- if (timeout !== undefined && Date.now() - start >= timeout) {
+ if (deadline !== undefined && Date.now() >= deadline) {
throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
}
- await sleep(interval)
+ await sleep(
+ deadline === undefined ? interval : Math.min(interval, Math.max(0, deadline - Date.now()))
+ )
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async poll( | |
| jobId: string, | |
| { interval = 3_000, timeout }: { interval?: number; timeout?: number } = {} | |
| ): Promise<PreRecordedV2Response> { | |
| const start = Date.now() | |
| while (true) { | |
| const result = await this.get(jobId) | |
| if (result.status === 'done') { | |
| return result | |
| } | |
| if (result.status === 'error') { | |
| throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`) | |
| } | |
| if (timeout !== undefined && Date.now() - start >= timeout) { | |
| throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) | |
| } | |
| await sleep(interval) | |
| } | |
| async poll( | |
| jobId: string, | |
| { interval = 3_000, timeout }: { interval?: number; timeout?: number } = {} | |
| ): Promise<PreRecordedV2Response> { | |
| const deadline = timeout === undefined ? undefined : Date.now() + timeout | |
| while (true) { | |
| if (deadline !== undefined && Date.now() >= deadline) { | |
| throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) | |
| } | |
| const result = await this.get(jobId) | |
| if (result.status === 'done') { | |
| return result | |
| } | |
| if (result.status === 'error') { | |
| throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`) | |
| } | |
| if (deadline !== undefined && Date.now() >= deadline) { | |
| throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) | |
| } | |
| await sleep( | |
| deadline === undefined ? interval : Math.min(interval, Math.max(0, deadline - Date.now())) | |
| ) | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 151 - 168, The
poll() loop can overshoot the caller's timeout when timeout < interval because
it always awaits sleep(interval); modify poll (in PreRecordedV2 client) to
compute remaining = timeout - (Date.now() - start) when timeout is provided,
throw immediately if remaining <= 0, and sleep for Math.min(interval, remaining)
instead of the full interval; keep behavior unchanged when timeout is undefined.
Update references in callers like createAndPoll()/transcribe() rely on poll()
honoring the deadline.
There was a problem hiding this comment.
♻️ Duplicate comments (5)
README.md (4)
212-219:⚠️ Potential issue | 🟡 MinorAdd language hint to the fenced code block.
This code block is missing an info string, causing
markdownlintto flag MD040. Addtextas the language identifier.📝 Proposed fix
-``` +```text packages/sdk-<language>/ ├── src/ │ └── v2/live/ # The generator will write here ├── project.json # Nx project configuration ├── README.md └── <language build config> (e.g. go.mod, Gemfile, pom.xml)</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@README.mdaround lines 212 - 219, The fenced code block in README.md lacks
an info string (triggering MD040); update the triple-backtick fence that
precedes the directory tree (the snippet showing "packages/sdk-/") to
include the language hint "text" (i.e., changetotext) so markdownlint
recognizes it as a code block with an explicit language identifier.</details> --- `15-27`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language hint to the fenced code block.** This code block is missing an info string, causing `markdownlint` to flag MD040. Add `text` as the language identifier. <details> <summary>📝 Proposed fix</summary> ```diff -``` +```text ├── packages/ │ ├── generator/ # Fetches Gladia OpenAPI schema and generates types │ ├── sdk-js/ # JavaScript/TypeScript SDK (ESM, CJS, IIFE) │ └── sdk-python/ # Python SDK (sync + async) ├── e2e/ │ ├── e2e-node-esm/ # E2E tests — Node ESM │ ├── e2e-node-cjs/ # E2E tests — Node CJS │ ├── e2e-python/ # E2E tests — Python │ └── javascript-fixtures/ ├── tools/scripts/ # Release & publish automation └── .github/workflows/ # CI, release, publish pipelines ``` ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@README.mdaround lines 15 - 27, The fenced code block in README.md lacks an
info string and triggers markdownlint MD040; update that block by adding the
language hint "text" to the opening backticks so the block starts with ```text,
ensuring the directory tree sample is treated as plain text and the linter
warning is resolved.</details> --- `35-46`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language hint to the fenced code block.** This code block is missing an info string, causing `markdownlint` to flag MD040. Add `text` as the language identifier. <details> <summary>📝 Proposed fix</summary> ```diff -``` +```text OpenAPI schema (api.gladia.io/openapi.json) │ ▼ generator ├── Live V2 │ ├──▶ packages/sdk-js/src/v2/live/generated-types.ts │ └──▶ packages/sdk-python/src/gladiaio_sdk/v2/live/generated_types.py └── PreRecorded V2 ├──▶ packages/sdk-js/src/v2/prerecorded/generated-types.ts └──▶ packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/generated_types.py ``` ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@README.mdaround lines 35 - 46, The fenced code block containing the ASCII
tree starting with "OpenAPI schema (api.gladia.io/openapi.json)" is missing a
language info string and triggers markdownlint MD040; fix it by adding the info
string "text" to the opening backticks of that code block (so the block begins
with ```text) in the README so the snippet is treated as plain text.</details> --- `69-71`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language hint to the fenced code block.** This code block is missing an info string, causing `markdownlint` to flag MD040. Add `text` as the language identifier. <details> <summary>📝 Proposed fix</summary> ```diff -``` +```text <type>(optional scope): <description> ``` ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@README.mdaround lines 69 - 71, The fenced code block showing
"(optional scope): " is missing an info string and triggers
markdownlint MD040; update that code fence by adding the language identifier
"text" to the opening backticks so the block starts with ```text ensuring the
info string is present and MD040 is resolved.</details> </blockquote></details> <details> <summary>packages/sdk-js/src/v2/prerecorded/client.ts (1)</summary><blockquote> `151-168`: _⚠️ Potential issue_ | _🟠 Major_ **`poll()` can exceed the declared timeout budget.** When `timeout < interval`, Line 167 always sleeps the full `interval`, so timeout is exceeded before the next iteration. <details> <summary>Proposed patch</summary> ```diff - const start = Date.now() + const deadline = timeout === undefined ? undefined : Date.now() + timeout while (true) { + if (deadline !== undefined && Date.now() >= deadline) { + throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) + } const result = await this.get(jobId) if (result.status === 'done') { return result } if (result.status === 'error') { throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`) } - if (timeout !== undefined && Date.now() - start >= timeout) { + if (deadline !== undefined && Date.now() >= deadline) { throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) } - await sleep(interval) + await sleep( + deadline === undefined ? interval : Math.min(interval, Math.max(0, deadline - Date.now())) + ) } ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 151 - 168, The poll method can sleep past the timeout when timeout < interval; update PreRecordedV2Client.poll to compute remaining time after checking timeout (use start and Date.now()), and before calling sleep(interval) if timeout is set, throw immediately if remaining <= 0 and otherwise sleep Math.min(interval, remaining); ensure jobId, start, interval, timeout and sleep are the referenced symbols to locate and modify. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In@packages/sdk-js/src/v2/prerecorded/client.ts:
- Around line 151-168: The poll method can sleep past the timeout when timeout <
interval; update PreRecordedV2Client.poll to compute remaining time after
checking timeout (use start and Date.now()), and before calling sleep(interval)
if timeout is set, throw immediately if remaining <= 0 and otherwise sleep
Math.min(interval, remaining); ensure jobId, start, interval, timeout and sleep
are the referenced symbols to locate and modify.In
@README.md:
- Around line 212-219: The fenced code block in README.md lacks an info string
(triggering MD040); update the triple-backtick fence that precedes the directory
tree (the snippet showing "packages/sdk-/") to include the language
hint "text" (i.e., changetotext) so markdownlint recognizes it as a
code block with an explicit language identifier.- Around line 15-27: The fenced code block in README.md lacks an info string and
triggers markdownlint MD040; update that block by adding the language hint
"text" to the opening backticks so the block starts with ```text, ensuring the
directory tree sample is treated as plain text and the linter warning is
resolved.- Around line 35-46: The fenced code block containing the ASCII tree starting
with "OpenAPI schema (api.gladia.io/openapi.json)" is missing a language info
string and triggers markdownlint MD040; fix it by adding the info string "text"
to the opening backticks of that code block (so the block begins with ```text)
in the README so the snippet is treated as plain text.- Around line 69-71: The fenced code block showing "(optional scope):
" is missing an info string and triggers markdownlint MD040; update
that code fence by adding the language identifier "text" to the opening
backticks so the block starts with ```text ensuring the info string is present
and MD040 is resolved.</details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: defaults **Review profile**: CHILL **Plan**: Pro **Run ID**: `d42ee134-7221-41b0-9ea0-154f9cd1400a` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between a3490b2b3b4d9afca853512b5a4351c73c22472e and 10cc9ab67a93f3caeea742ff5b028f5421efff99. </details> <details> <summary>📒 Files selected for processing (3)</summary> * `README.md` * `packages/sdk-js/src/v2/prerecorded/client.ts` * `packages/sdk-python/src/gladiaio_sdk/client_options.py` </details> <details> <summary>🚧 Files skipped from review as they are similar to previous changes (1)</summary> * packages/sdk-python/src/gladiaio_sdk/client_options.py </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
packages/sdk-js/src/v2/prerecorded/client.ts (1)
151-168:⚠️ Potential issue | 🟡 Minor
poll()can exceed the caller's timeout.The timeout check on line 164 occurs after
get()but beforesleep(). Iftimeoutis smaller thaninterval, the fullintervalsleep still executes, causing the actual wait time to exceed the documented deadline. Clamp the sleep duration to the remaining time budget.⏱️ Suggested fix
async poll( jobId: string, { interval = 3_000, timeout }: { interval?: number; timeout?: number } = {} ): Promise<PreRecordedV2Response> { - const start = Date.now() + const deadline = timeout === undefined ? undefined : Date.now() + timeout while (true) { const result = await this.get(jobId) if (result.status === 'done') { return result } if (result.status === 'error') { throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`) } - if (timeout !== undefined && Date.now() - start >= timeout) { + if (deadline !== undefined && Date.now() >= deadline) { throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) } - await sleep(interval) + const remaining = deadline === undefined ? interval : Math.max(0, deadline - Date.now()) + if (remaining <= 0) { + throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) + } + await sleep(Math.min(interval, remaining)) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 151 - 168, In poll(), the current timeout check happens before sleep so a long interval can push the actual wait past the caller's timeout; change the loop to compute remaining = timeout - (Date.now() - start) (when timeout is defined) and, before calling sleep(), if remaining <= 0 throw the timeout Error, otherwise sleep for Math.min(interval, remaining); keep full-interval sleep when timeout is undefined; reference the poll function, variables start, interval, timeout, get(), and sleep() when making the change.e2e/e2e-python/tests/test_prerecorded_v2_async.py (1)
21-24:⚠️ Potential issue | 🟠 MajorSkip this e2e module when
GLADIA_API_KEYis absent.Without a module-level guard, local runs and CI without secrets will fail when
GladiaClient()raisesValueErrordue to missing API key.🧪 Suggested guard
YOUTUBE_VIDEO_URL = "https://www.youtube.com/watch?v=DYyY8Nh3TQE" POLL_TIMEOUT_S = 180.0 YOUTUBE_POLL_TIMEOUT_S = 600.0 + +pytestmark = pytest.mark.skipif( + not os.getenv("GLADIA_API_KEY"), + reason="requires GLADIA_API_KEY", +)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@e2e/e2e-python/tests/test_prerecorded_v2_async.py` around lines 21 - 24, Add a module-level guard that skips the entire e2e module when the GLADIA_API_KEY env var is missing to avoid GladiaClient() raising ValueError during import; at the top of the file (near YOUTUBE_VIDEO_URL / POLL_TIMEOUT_S constants) check os.environ.get("GLADIA_API_KEY") and call pytest.skip("Gladia API key missing, skipping e2e tests", allow_module_level=True) if absent so tests that construct GladiaClient() are not executed.
🧹 Nitpick comments (5)
packages/sdk-js/src/v2/prerecorded/client.ts (1)
60-65: Consider using async file reading to avoid blocking the event loop.
readFileSyncblocks the thread while reading the file. For large audio files, this can degrade server performance. The dynamic import is already async, so switching toreadFilefromfs/promisesis straightforward.♻️ Proposed fix
- const { readFileSync } = await import('fs') + const { readFile } = await import('fs/promises') const { basename } = await import('path') - const fileBuffer = readFileSync(audio_url) + const fileBuffer = await readFile(audio_url) const filename = basename(audio_url)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 60 - 65, Replace the synchronous readFileSync with an async file read: dynamically import or require readFile from 'fs/promises' and await readFile(audio_url) to populate fileBuffer instead of blocking the event loop; keep using basename(audio_url) for filename, create the Blob from the awaited fileBuffer and append to formData as before (variables: audio_url, fileBuffer, filename, blob, formData).packages/sdk-python/src/gladiaio_sdk/network/http_client.py (2)
16-31: Unbounded recursion risk with deeply nested payloads.
_flatten_json_keysuses recursion without a depth limit. Maliciously crafted or excessively nested JSON payloads could cause aRecursionError. Consider adding a depth guard.♻️ Proposed fix with depth limit
-def _flatten_json_keys(obj: Any, out: set[str] | None = None) -> set[str]: +def _flatten_json_keys(obj: Any, out: set[str] | None = None, *, _depth: int = 0, _max_depth: int = 50) -> set[str]: """Collect all string keys from nested dict/list JSON payloads.""" if out is None: out = set() + if _depth >= _max_depth: + return out if isinstance(obj, dict): for k, v in obj.items(): if isinstance(k, str): out.add(k) - _flatten_json_keys(v, out) + _flatten_json_keys(v, out, _depth=_depth + 1, _max_depth=_max_depth) elif isinstance(obj, list): for item in obj: - _flatten_json_keys(item, out) + _flatten_json_keys(item, out, _depth=_depth + 1, _max_depth=_max_depth) return out🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py` around lines 16 - 31, The _flatten_json_keys function can recurse indefinitely on deeply nested payloads; add a max depth guard to avoid RecursionError by introducing an optional depth/current_depth parameter (or a constant MAX_DEPTH) and return early when current_depth >= MAX_DEPTH, ensuring callers still get collected keys up to that depth; update the function signature (e.g., _flatten_json_keys(obj: Any, out: set[str] | None = None, depth: int = 0)) and increment depth on each recursive call, and document/choose a reasonable MAX_DEPTH constant and behavior (skip deeper children) in the function that calls or caches results.
75-105: Lazy imports inside function are acceptable but unconventional.The deferred imports of
PreRecordedV2InitTranscriptionRequestandLiveV2InitRequestavoid circular dependencies. This is a valid pattern, though documenting the reason would help future maintainers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py` around lines 75 - 105, Add a short inline comment at the top of the _schema_field_names_for_url function explaining that the imports of PreRecordedV2InitTranscriptionRequest and LiveV2InitRequest are intentionally deferred (lazy) to avoid circular import issues and to keep startup cost low; keep the current lazy-import placement and behavior, but document why those imports are inside the function and reference the specific types (PreRecordedV2InitTranscriptionRequest and LiveV2InitRequest) so future maintainers understand the rationale.packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py (2)
202-213: Thereturn Falsepath on non-202 may be unreachable.When
HttpClient.delete()receives a 4xx/5xx response, it raisesHttpErrorbefore returning. Theresp.status_code == 202check would only catch non-error status codes like 200/201/204. Consider if this is the intended behavior or if the docstring should clarify.📝 Clarify docstring or adjust logic
If the API returns 200/204 for already-deleted jobs or other success variants:
def delete(self, job_id: str) -> bool: """Delete a pre-recorded transcription job. Args: job_id: The UUID of the transcription job to delete. Returns: - True if the job was deleted successfully (HTTP 202), False otherwise. + True if the job was deleted successfully (HTTP 202). + Returns False for other 2xx responses (e.g., already deleted). + Raises HttpError for 4xx/5xx responses. """🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py` around lines 202 - 213, The delete method currently checks resp.status_code == 202 but HttpClient.delete raises HttpError for 4xx/5xx so the False path may never run; either update the delete docstring (in delete) to state that HttpError is raised on non-2xx responses, or change logic to catch HttpError from self._http_client.delete and return False for error responses while treating 202 (and other success codes like 200/204) as True; locate the method delete, self._http_client.delete, and self._core.build_job_endpoint to implement the preferred behavior.
252-263: Potential off-by-one in timeout check.The condition
(time.time() - start) >= timeoutis checked afterget()returns but before sleeping. Ifget()takes longer than the timeout, the check correctly triggers. However, ifget()completes just under the timeout and then the loop sleeps, total elapsed time could exceedtimeout. This is likely acceptable for polling semantics but worth noting.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py` around lines 252 - 263, The polling loop can sleep past the timeout: after calling get() you should compute elapsed = time.time() - start and if timeout is set raise TimeoutError immediately when elapsed >= timeout (use _core.create_timeout_error_message(job_id, timeout)), and when deciding to sleep use sleep_time = min(interval, max(0, timeout - elapsed)) (or skip sleeping if timeout is exhausted) so the loop never sleeps beyond the configured timeout; adjust the loop around start, get(job_id), timeout, interval and the _core.create_timeout_error_message call accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/sdk-js/src/v2/prerecorded/client.ts`:
- Around line 124-127: The delete(jobId) method incorrectly checks
response.status because this.httpClient.delete<Response> returns a parsed body
(so status is undefined); update delete to either call the client in a way that
returns the raw HTTP response (e.g., change the generic from Response to void or
use the client option that returns the raw Response) so you can test
response.status === 202, or simplify by wrapping the call in try/catch and
return true on successful completion and false on error; specifically modify the
delete method in the class (the delete(jobId: string): Promise<boolean> function
that calls this.httpClient.delete) to use one of those two approaches.
- Around line 135-138: The getFile method currently calls
this.httpClient.get<Response>(...) and then response.arrayBuffer(), but
HttpClient auto-parses JSON responses so the returned value may be a parsed
object (breaking arrayBuffer()); update getFile to request the raw binary
response by using the HttpClient option that prevents JSON auto-parsing (e.g.,
set responseType: 'arraybuffer' or an equivalent flag) or call a low-level/raw
fetch method on the client that returns the native Response; ensure the change
is made in the getFile function and that the returned value is the actual
Response (so response.arrayBuffer() is valid) rather than a parsed JSON object.
In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py`:
- Around line 274-298: The enriched HttpError returned by
enrich_http_error_with_field_suggestions currently discards the original
exception chain; update the HttpError construction in
enrich_http_error_with_field_suggestions to pass cause=err.__cause__ (or the
appropriate constructor parameter for the underlying HttpError type) so the
original __cause__ is preserved when creating the new HttpError instance,
keeping all other fields (message, method, url, status, id, request_id,
response_body, response_headers, validation_errors, invalid_parameters)
unchanged.
---
Duplicate comments:
In `@e2e/e2e-python/tests/test_prerecorded_v2_async.py`:
- Around line 21-24: Add a module-level guard that skips the entire e2e module
when the GLADIA_API_KEY env var is missing to avoid GladiaClient() raising
ValueError during import; at the top of the file (near YOUTUBE_VIDEO_URL /
POLL_TIMEOUT_S constants) check os.environ.get("GLADIA_API_KEY") and call
pytest.skip("Gladia API key missing, skipping e2e tests",
allow_module_level=True) if absent so tests that construct GladiaClient() are
not executed.
In `@packages/sdk-js/src/v2/prerecorded/client.ts`:
- Around line 151-168: In poll(), the current timeout check happens before sleep
so a long interval can push the actual wait past the caller's timeout; change
the loop to compute remaining = timeout - (Date.now() - start) (when timeout is
defined) and, before calling sleep(), if remaining <= 0 throw the timeout Error,
otherwise sleep for Math.min(interval, remaining); keep full-interval sleep when
timeout is undefined; reference the poll function, variables start, interval,
timeout, get(), and sleep() when making the change.
---
Nitpick comments:
In `@packages/sdk-js/src/v2/prerecorded/client.ts`:
- Around line 60-65: Replace the synchronous readFileSync with an async file
read: dynamically import or require readFile from 'fs/promises' and await
readFile(audio_url) to populate fileBuffer instead of blocking the event loop;
keep using basename(audio_url) for filename, create the Blob from the awaited
fileBuffer and append to formData as before (variables: audio_url, fileBuffer,
filename, blob, formData).
In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py`:
- Around line 16-31: The _flatten_json_keys function can recurse indefinitely on
deeply nested payloads; add a max depth guard to avoid RecursionError by
introducing an optional depth/current_depth parameter (or a constant MAX_DEPTH)
and return early when current_depth >= MAX_DEPTH, ensuring callers still get
collected keys up to that depth; update the function signature (e.g.,
_flatten_json_keys(obj: Any, out: set[str] | None = None, depth: int = 0)) and
increment depth on each recursive call, and document/choose a reasonable
MAX_DEPTH constant and behavior (skip deeper children) in the function that
calls or caches results.
- Around line 75-105: Add a short inline comment at the top of the
_schema_field_names_for_url function explaining that the imports of
PreRecordedV2InitTranscriptionRequest and LiveV2InitRequest are intentionally
deferred (lazy) to avoid circular import issues and to keep startup cost low;
keep the current lazy-import placement and behavior, but document why those
imports are inside the function and reference the specific types
(PreRecordedV2InitTranscriptionRequest and LiveV2InitRequest) so future
maintainers understand the rationale.
In `@packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py`:
- Around line 202-213: The delete method currently checks resp.status_code ==
202 but HttpClient.delete raises HttpError for 4xx/5xx so the False path may
never run; either update the delete docstring (in delete) to state that
HttpError is raised on non-2xx responses, or change logic to catch HttpError
from self._http_client.delete and return False for error responses while
treating 202 (and other success codes like 200/204) as True; locate the method
delete, self._http_client.delete, and self._core.build_job_endpoint to implement
the preferred behavior.
- Around line 252-263: The polling loop can sleep past the timeout: after
calling get() you should compute elapsed = time.time() - start and if timeout is
set raise TimeoutError immediately when elapsed >= timeout (use
_core.create_timeout_error_message(job_id, timeout)), and when deciding to sleep
use sleep_time = min(interval, max(0, timeout - elapsed)) (or skip sleeping if
timeout is exhausted) so the loop never sleeps beyond the configured timeout;
adjust the loop around start, get(job_id), timeout, interval and the
_core.create_timeout_error_message call accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bdb4dd83-bfb3-4a80-910e-765c86dd61b1
⛔ Files ignored due to path filters (3)
bun.lockis excluded by!**/*.locke2e/e2e-python/uv.lockis excluded by!**/*.lockpackages/sdk-python/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (12)
e2e/e2e-python/tests/test_prerecorded_v2_async.pypackages/sdk-js/package.jsonpackages/sdk-js/src/v2/prerecorded/client.tspackages/sdk-js/src/version.tspackages/sdk-python/CHANGELOG.mdpackages/sdk-python/pyproject.tomlpackages/sdk-python/src/gladiaio_sdk/network/__init__.pypackages/sdk-python/src/gladiaio_sdk/network/http_client.pypackages/sdk-python/src/gladiaio_sdk/v2/prerecorded/async_client.pypackages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.pypackages/sdk-python/src/gladiaio_sdk/version.pypackages/sdk-python/tests/network/test_async_http_client.py
✅ Files skipped from review due to trivial changes (5)
- packages/sdk-python/CHANGELOG.md
- packages/sdk-python/src/gladiaio_sdk/version.py
- packages/sdk-js/package.json
- packages/sdk-js/src/version.ts
- packages/sdk-python/src/gladiaio_sdk/network/init.py
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/sdk-python/pyproject.toml
- packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/async_client.py
| def enrich_http_error_with_field_suggestions( | ||
| err: HttpError, | ||
| known_fields: Sequence[str], | ||
| *, | ||
| status_codes: tuple[int, ...] = (400, 422), | ||
| ) -> HttpError: | ||
| """Return a copy of ``err`` with difflib-based field hints, or ``err`` if none apply.""" | ||
| if err.status not in status_codes or not err.invalid_parameters: | ||
| return err | ||
| hint = format_invalid_field_suggestions(err.invalid_parameters, known_fields) | ||
| if not hint: | ||
| return err | ||
| return HttpError( | ||
| message=f"{str(err)}\n\n{hint}", | ||
| method=err.method, | ||
| url=err.url, | ||
| status=err.status, | ||
| id=err.id, | ||
| request_id=err.request_id, | ||
| response_body=err.response_body, | ||
| response_headers=err.response_headers, | ||
| validation_errors=err.validation_errors, | ||
| invalid_parameters=err.invalid_parameters, | ||
| ) | ||
|
|
There was a problem hiding this comment.
__cause__ is not preserved when enriching the error.
If the original HttpError has a __cause__ set (from chained exceptions), the enriched copy loses it. Consider passing cause=err.__cause__ to preserve the exception chain.
🛡️ Proposed fix
return HttpError(
message=f"{str(err)}\n\n{hint}",
method=err.method,
url=err.url,
status=err.status,
id=err.id,
request_id=err.request_id,
response_body=err.response_body,
response_headers=err.response_headers,
validation_errors=err.validation_errors,
invalid_parameters=err.invalid_parameters,
+ cause=err.__cause__,
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py` around lines 274
- 298, The enriched HttpError returned by
enrich_http_error_with_field_suggestions currently discards the original
exception chain; update the HttpError construction in
enrich_http_error_with_field_suggestions to pass cause=err.__cause__ (or the
appropriate constructor parameter for the underlying HttpError type) so the
original __cause__ is preserved when creating the new HttpError instance,
keeping all other fields (message, method, url, status, id, request_id,
response_body, response_headers, validation_errors, invalid_parameters)
unchanged.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
packages/sdk-js/src/v2/prerecorded/client.ts (3)
154-171:⚠️ Potential issue | 🟠 Major
poll()can exceed the declared timeout budget.Line 170 always sleeps the full interval; when
timeout < interval, total wait overshoots caller deadline.💡 Proposed fix
- const start = Date.now() + const deadline = timeout === undefined ? undefined : Date.now() + timeout while (true) { + if (deadline !== undefined && Date.now() >= deadline) { + throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) + } const result = await this.get(jobId) if (result.status === 'done') { return result } if (result.status === 'error') { throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`) } - if (timeout !== undefined && Date.now() - start >= timeout) { + if (deadline !== undefined && Date.now() >= deadline) { throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`) } - await sleep(interval) + await sleep( + deadline === undefined ? interval : Math.min(interval, Math.max(0, deadline - Date.now())) + ) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 154 - 171, The poll() loop in PreRecordedV2Client can overshoot the provided timeout because it always awaits the full interval; modify poll (function poll, local variables start, interval, timeout, and the sleep call) to compute remaining = timeout - (Date.now() - start) when timeout is provided, and before sleeping either throw if remaining <= 0 or await sleep(Math.min(interval, remaining)); if timeout is undefined keep sleeping the full interval; ensure the timeout check occurs before sleeping so the method never waits past the caller's deadline.
60-63:⚠️ Potential issue | 🟠 MajorAvoid sync disk I/O in this async request path.
Line 62 uses
readFileSync, which blocks the event loop under load. Prefer asyncnode:fs/promises.readFile.💡 Proposed fix
- const { readFileSync } = await import('fs') - const { basename } = await import('path') - const fileBuffer = readFileSync(audio_url) + const [{ readFile }, { basename }] = await Promise.all([ + import('node:fs/promises'), + import('node:path'), + ]) + const fileBuffer = await readFile(audio_url) const filename = basename(audio_url)#!/bin/bash # Verify sync fs usage in JS SDK sources. rg -n --type=ts -C2 '\breadFileSync\s*\(' packages/sdk-js/src🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 60 - 63, The code is using readFileSync which blocks the event loop; replace the synchronous call with async file I/O by importing and using readFile from 'fs/promises' (or node:fs/promises) and await it instead of readFileSync when reading audio_url; update the import lines (remove the sync import of 'fs') and keep basename from 'path' as before, ensuring the surrounding async function (in client.ts where audio_url and filename are used) awaits the readFile promise so the rest of the logic uses the resolved fileBuffer.
138-140:⚠️ Potential issue | 🟠 MajorForce raw response for file download to keep
arrayBuffer()valid.Line 139 should request the raw
Response; otherwise JSON content-types are auto-parsed byHttpClient, andarrayBuffer()may fail at runtime.💡 Proposed fix
- const response = await this.httpClient.get<Response>(`/v2/pre-recorded/${jobId}/file`) + const response = await this.httpClient.get<Response>(`/v2/pre-recorded/${jobId}/file`, { + rawResponse: true, + }) return response.arrayBuffer()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 138 - 140, The getFile method calls this.httpClient.get which auto-parses JSON; update the get call in getFile(jobId: string) to request the raw/fetch Response (e.g., pass the httpClient option that disables parsing or sets responseType to raw/arraybuffer) so you receive a Response object and can safely call response.arrayBuffer(); modify the call to this.httpClient.get<Response>(`/v2/pre-recorded/${jobId}/file`, { /* raw response option */ }) and then return response.arrayBuffer().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/sdk-js/src/network/httpClient.ts`:
- Around line 107-111: The request methods in httpClient.ts (get, post, put,
delete, request) currently accept a rawResponse?: boolean option but their
TypeScript signatures don't reflect that: add overloads for each method so that
when options includes { rawResponse: true } the return type is Promise<Response>
(the raw fetch Response), and when options omits rawResponse or rawResponse is
false the return type is Promise<T> (the parsed JSON generic). Update the method
signatures for get/post/put/delete/request to declare these two overloads and
remove any unsafe casts (e.g., the cast at the previous unsafe cast location) so
callers like prerecorded/client.ts calling get<Response>(..., { rawResponse:
true }) or get<T>(...) without rawResponse are type-safe.
---
Duplicate comments:
In `@packages/sdk-js/src/v2/prerecorded/client.ts`:
- Around line 154-171: The poll() loop in PreRecordedV2Client can overshoot the
provided timeout because it always awaits the full interval; modify poll
(function poll, local variables start, interval, timeout, and the sleep call) to
compute remaining = timeout - (Date.now() - start) when timeout is provided, and
before sleeping either throw if remaining <= 0 or await sleep(Math.min(interval,
remaining)); if timeout is undefined keep sleeping the full interval; ensure the
timeout check occurs before sleeping so the method never waits past the caller's
deadline.
- Around line 60-63: The code is using readFileSync which blocks the event loop;
replace the synchronous call with async file I/O by importing and using readFile
from 'fs/promises' (or node:fs/promises) and await it instead of readFileSync
when reading audio_url; update the import lines (remove the sync import of 'fs')
and keep basename from 'path' as before, ensuring the surrounding async function
(in client.ts where audio_url and filename are used) awaits the readFile promise
so the rest of the logic uses the resolved fileBuffer.
- Around line 138-140: The getFile method calls this.httpClient.get which
auto-parses JSON; update the get call in getFile(jobId: string) to request the
raw/fetch Response (e.g., pass the httpClient option that disables parsing or
sets responseType to raw/arraybuffer) so you receive a Response object and can
safely call response.arrayBuffer(); modify the call to
this.httpClient.get<Response>(`/v2/pre-recorded/${jobId}/file`, { /* raw
response option */ }) and then return response.arrayBuffer().
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cc8615b4-bc83-4c07-9057-eb48e734c68a
📒 Files selected for processing (3)
packages/sdk-js/src/network/httpClient.test.tspackages/sdk-js/src/network/httpClient.tspackages/sdk-js/src/v2/prerecorded/client.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/sdk-js/src/network/httpClient.test.ts
| type RequestOptions = Omit<RequestInit, 'method' | 'headers'> & { | ||
| headers?: Headers | ||
| /** When true, the successful response is the raw `fetch` `Response` (no JSON parsing). */ | ||
| rawResponse?: boolean | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify current type-contract gap and potentially unsafe call sites.
rg -n "async (get|post|put|delete|request)<" packages/sdk-js/src/network/httpClient.ts
rg -nP --type=ts -C2 'httpClient\.(get|post|put|delete)<Response>\(' packages/sdk-js/srcRepository: gladiaio/sdk
Length of output: 1124
🏁 Script executed:
# Get the full httpClient.ts file to understand the method signatures and implementation
cat -n packages/sdk-js/src/network/httpClient.ts | head -220Repository: gladiaio/sdk
Length of output: 7453
🏁 Script executed:
# Check what's at lines 285-287 mentioned in the review
sed -n '280,290p' packages/sdk-js/src/network/httpClient.tsRepository: gladiaio/sdk
Length of output: 388
🏁 Script executed:
# Search for all usages of httpClient methods with rawResponse to identify type mismatches
rg -nP --type=ts "httpClient\.(get|post|put|delete|request)" packages/sdk-js/src -A1 | grep -A1 "rawResponse"Repository: gladiaio/sdk
Length of output: 132
🏁 Script executed:
# Check lines 285-287 specifically
sed -n '283,290p' packages/sdk-js/src/network/httpClient.tsRepository: gladiaio/sdk
Length of output: 285
🏁 Script executed:
# Comprehensive search for all httpClient method calls to identify unsafe type patterns
rg -nP --type=ts "httpClient\.(get|post|put|delete|request)\<" packages/sdk-js/src -B1 -A2Repository: gladiaio/sdk
Length of output: 2661
Add overloads to enforce rawResponse behavior in method signatures.
The rawResponse flag (line 109) controls runtime behavior but isn't reflected in method types. This allows unsafe type assumptions:
- Line 139 (prerecorded/client.ts): Calls
get<Response>()withoutrawResponse: true, then invokes.arrayBuffer(). Without the flag, the implementation JSON-parses the response instead of returning rawResponse, causing a runtime failure. - Line 126 (prerecorded/client.ts): Calls
delete<Response>(..., { rawResponse: true }). Works at runtime but the type system doesn't enforce thatResponsetype requiresrawResponse: true.
Add overloads so:
rawResponse: truealways returnsResponse- Calls without
rawResponsereturn the genericResponseType(JSON-parsed)
This applies to all methods: get, post, put, delete, and request (lines 174, 181, 188, 195, 202, and the unsafe cast at 285-287).
Proposed typing fix
type RequestOptions = Omit<RequestInit, 'method' | 'headers'> & {
headers?: Headers
/** When true, the successful response is the raw `fetch` `Response` (no JSON parsing). */
rawResponse?: boolean
}
+type RawResponseRequestOptions = RequestOptions & { rawResponse: true }
+type ParsedRequestOptions = RequestOptions & { rawResponse?: false | undefined }
+
export class HttpClient {
+ async get(url: string | URL, init: RawResponseRequestOptions): Promise<Response>
+ async get<ResponseType = Response>(url: string | URL, init?: ParsedRequestOptions): Promise<ResponseType>
async get<ResponseType = Response>(
url: string | URL,
init: RequestOptions = {}
): Promise<ResponseType> {
return this.request('GET', url, init)
}
+ async delete(url: string | URL, init: RawResponseRequestOptions): Promise<Response>
+ async delete<ResponseType = Response>(url: string | URL, init?: ParsedRequestOptions): Promise<ResponseType>
async delete<ResponseType = Response>(
url: string | URL,
init: RequestOptions = {}
): Promise<ResponseType> {
return this.request('DELETE', url, init)
}
+ async request(
+ method: HttpMethod,
+ url: string | URL,
+ init: RawResponseRequestOptions
+ ): Promise<Response>
+ async request<ResponseType>(
+ method: HttpMethod,
+ url: string | URL,
+ init?: ParsedRequestOptions
+ ): Promise<ResponseType>
async request<ResponseType>(
method: HttpMethod,
url: string | URL,
init: RequestOptions = {}
): Promise<ResponseType> {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/sdk-js/src/network/httpClient.ts` around lines 107 - 111, The
request methods in httpClient.ts (get, post, put, delete, request) currently
accept a rawResponse?: boolean option but their TypeScript signatures don't
reflect that: add overloads for each method so that when options includes {
rawResponse: true } the return type is Promise<Response> (the raw fetch
Response), and when options omits rawResponse or rawResponse is false the return
type is Promise<T> (the parsed JSON generic). Update the method signatures for
get/post/put/delete/request to declare these two overloads and remove any unsafe
casts (e.g., the cast at the previous unsafe cast location) so callers like
prerecorded/client.ts calling get<Response>(..., { rawResponse: true }) or
get<T>(...) without rawResponse are type-safe.
| if "http_timeout" not in kwargs and not args: | ||
| merged_options = dataclasses.replace( | ||
| merged_options, | ||
| http_timeout=300, # 5 minutes to handle file uploads |
There was a problem hiding this comment.
Should it be hardcoded ?
| if "http_timeout" not in kwargs and not args: | ||
| merged_options = dataclasses.replace( | ||
| merged_options, | ||
| http_timeout=300, # 5 minutes to handle file uploads |
There was a problem hiding this comment.
Hardcoded timeout
| @@ -1,2 +1,2 @@ | |||
| // This file is auto-generated during build. Do not edit manually. | |||
| export const SDK_VERSION = '0.6.1' as const | |||
There was a problem hiding this comment.
Remove this manual change
| { | ||
| "name": "@gladiaio/sdk", | ||
| "description": "Gladia JavaScript/TypeScript SDK", | ||
| "version": "0.6.1", |
There was a problem hiding this comment.
Remove this manual change
| @@ -1,2 +1,2 @@ | |||
| # This file is auto-generated. Do not edit manually. | |||
There was a problem hiding this comment.
Remove this manual change
| @@ -1,41 +1,37 @@ | |||
| [project] | |||
| name = "gladiaio-sdk" | |||
| version = "0.4.0" | |||
There was a problem hiding this comment.
Remove this manual change
| - **sdk-python:** Improve dataclass usage | ||
| - **sdk-python:** Add version generation | ||
| - **sdk-python:** Use dataclass instead of TypedDict | ||
| - **sdk-python:** Initial asyncio version of Python SDK |
There was a problem hiding this comment.
Remove this manual change
Summary by CodeRabbit
New Features
Documentation
Tests
Bug Fixes / Improvements
Chores