[FDE-52] Add pre-recorded to sdk by Karamouche · Pull Request #8 · gladiaio/sdk

Karamouche · 2026-02-25T20:25:03Z

Summary by CodeRabbit

New Features
- Pre-Recorded V2 added to JS and Python SDKs: sync/async clients, upload, create/transcribe jobs, polling, fetch/delete results, convenience aliases, and comprehensive typed models.
Documentation
- README expanded with monorepo overview, architecture, generator workflow, getting started, contributing, and release guidance.
Tests
- New end-to-end test suites and updated test runner configs for Pre-Recorded V2.
Bug Fixes / Improvements
- Better HTTP handling (redirects, global agent), richer validation error hints, and adjusted default timeouts.
Chores
- Added root EditorConfig and various manifest/formatting cleanups.

…ad time

coderabbitai · 2026-02-25T20:25:12Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

📝 Walkthrough

Adds PreRecorded V2: generator emits prerecorded schemas; new PreRecorded V2 clients, core logic, and generated types for JS/TS and Python; E2E Vitest tests and configs; SDK public APIs and versions updated; HTTP client improvements and enriched HttpError diagnostics; README and repo config updates.

Changes

Cohort	File(s)	Summary
Repo config & docs	`.editorconfig`, `README.md`	Add root EditorConfig; expand README with monorepo architecture, generator workflow, contributor and release guidance.
E2E configs & tests runner	`e2e/e2e-node-cjs/vitest.config.mjs`, `e2e/e2e-node-esm/vitest.config.ts`, `e2e/e2e-node-esm/package.json`, `e2e/e2e-python/pyproject.toml`	Add Vitest configs, switch Node ESM tests to Vitest, add dev deps and normalize Python project config.
E2E tests	`e2e/e2e-node-cjs/test/prerecorded_v2_async.test.cjs`, `e2e/e2e-node-esm/test/prerecorded_v2_async.test.ts`, `e2e/e2e-node-esm/test/live_v2_session.test.ts`, `e2e/e2e-python/tests/test_prerecorded_v2_async.py`	Add comprehensive prerecorded V2 end-to-end tests covering upload/create/poll/get/delete/getFile/transcribe and error scenarios.
Generator core & types	`packages/generator/src/types.ts`, `packages/generator/src/generator.ts`, `packages/generator/src/generators/base.ts`	Introduce `PreRecordedV2Schemas` type, preprocessing step and generator pipeline to emit prerecorded schemas to language generators.
Generator language generators	`packages/generator/src/generators/typescript.ts`, `packages/generator/src/generators/python.ts`	Add `fieldOverrides` to force specific fields optional; adjust ordering/signature to accept type name for override lookup.
SDK JS core & exports	`packages/sdk-js/src/client.ts`, `packages/sdk-js/src/index.ts`, `packages/sdk-js/package.json`, `packages/sdk-js/src/version.ts`	Expose `preRecordedV2()` factory and `preRecorded`/`live` aliases; bump SDK version to 0.7.0 and re-export prerecorded types.
SDK JS prerecorded implementation	`packages/sdk-js/src/v2/prerecorded/client.ts`, `packages/sdk-js/src/v2/prerecorded/generated-types.ts`, `packages/sdk-js/src/v2/prerecorded/index.ts`	Add `PreRecordedV2Client` with upload/create/poll/get/delete/getFile/transcribe flows and large set of generated TypeScript types; index re-exports.
SDK JS networking & tests	`packages/sdk-js/src/network/iso-fetch.ts`, `packages/sdk-js/src/network/httpClient.test.ts`, `packages/sdk-js/src/network/httpClient.ts`	Use undici Agent via setGlobalDispatcher and return global fetch; add `rawResponse` flag to request options; relax some test assertions.
SDK Python public surface & client factories	`packages/sdk-python/src/gladiaio_sdk/__init__.py`, `packages/sdk-python/src/gladiaio_sdk/client.py`, `packages/sdk-python/src/gladiaio_sdk/client_options.py`, `packages/sdk-python/src/gladiaio_sdk/version.py`, `packages/sdk-python/pyproject.toml`	Export prerecorded clients/types; add factory methods/aliases (`pre_recorded_v2`, `pre_recorded_v2_async`, `live_v2`, `live_v2_async`); use env var default for API key; add DEFAULT_* timeouts; bump SDK version to 0.5.0.
SDK Python networking & websocket	`packages/sdk-python/src/gladiaio_sdk/network/http_client.py`, `packages/sdk-python/src/gladiaio_sdk/network/websocket_client.py`, `packages/sdk-python/src/gladiaio_sdk/network/__init__.py`	Enable follow_redirects and `files` support in httpx clients; enrich HttpError with validation_errors and suggestion helpers; simplify websocket attribute handling and re-export helpers.
SDK Python PreRecorded implementation	`packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/core.py`, `.../async_client.py`, `.../client.py`, `.../generated_types.py`, `.../__init__.py`	Add PreRecorded V2 core utilities (pure logic), async and sync PreRecorded V2 clients, comprehensive generated dataclasses/types, and package init re-exports.
Python tests for error enrichment	`packages/sdk-python/tests/network/test_async_http_client.py`	Add tests for HttpError enrichment, validation_errors parsing, invalid parameter extraction and suggestion formatting.
Changelog & minor formatting	`packages/sdk-python/CHANGELOG.md`, other small files	Minor formatting fixes and trailing newline adjustments.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant SDK as PreRecordedV2Client
    participant HTTP as HttpClient
    participant API as Gladia API

    Client->>SDK: transcribe(audio, options)
    alt Local file
        SDK->>SDK: prepare upload
        SDK->>HTTP: POST /v2/upload (multipart)
        HTTP->>API: upload audio
        API-->>HTTP: upload response (audio_url)
        HTTP-->>SDK: audio_url
    else URL input
        SDK->>SDK: use provided audio_url
    end
    SDK->>HTTP: POST /v2/pre-recorded (create job)
    HTTP->>API: create job
    API-->>HTTP: init response (job_id)
    HTTP-->>SDK: job_id
    loop poll
        SDK->>HTTP: GET /v2/pre-recorded/{job_id}
        HTTP->>API: job status
        API-->>HTTP: job response
        HTTP-->>SDK: status
        alt done
            SDK->>Client: return final result
        else pending
            SDK->>SDK: sleep(interval)
        else error
            SDK->>Client: raise error
        end
    end
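
The upload, create, and poll steps in the diagram above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's implementation: the `Transport` interface, the `JobState` shape, the `id` field name, and `transcribeSketch` itself are hypothetical, and raw bytes stand in for a local file. Timeout handling is left out to keep the flow readable.

```typescript
// Hypothetical shapes for illustration; the real SDK's HttpClient and types may differ.
interface JobState {
  status: string
  error_code?: number
}

interface Transport {
  upload(file: Uint8Array): Promise<{ audio_url: string }> // stands in for POST /v2/upload
  createJob(audioUrl: string): Promise<{ id: string }> // stands in for POST /v2/pre-recorded
  getJob(id: string): Promise<JobState> // stands in for GET /v2/pre-recorded/{job_id}
}

// A URL string, or raw bytes standing in for a local file.
type Audio = string | Uint8Array

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

async function transcribeSketch(
  transport: Transport,
  audio: Audio,
  intervalMs = 3000
): Promise<JobState> {
  // Local input is uploaded first; a URL is used as-is.
  const audioUrl =
    typeof audio === 'string' ? audio : (await transport.upload(audio)).audio_url

  const { id } = await transport.createJob(audioUrl)

  // Poll until the job reaches a terminal state.
  while (true) {
    const job = await transport.getJob(id)
    if (job.status === 'done') return job
    if (job.status === 'error') {
      throw new Error(`Job ${id} failed with error code: ${job.error_code}`)
    }
    await sleep(intervalMs)
  }
}
```

Passing the transport in as a parameter keeps the flow testable with a fake, without touching the network.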

sequenceDiagram
    participant OpenAPI
    participant Generator
    participant PreProc as PreProcessor
    participant BaseGen
    participant TSGen
    participant PyGen

    OpenAPI->>Generator: provide spec
    Generator->>BaseGen: generate LiveV2 schemas
    BaseGen->>TSGen: emit LiveV2 types
    BaseGen->>PyGen: emit LiveV2 types
    Generator->>PreProc: preProcessSchemaForPreRecordedV2(spec)
    PreProc-->>Generator: PreRecordedV2Schemas
    Generator->>BaseGen: generatePreRecordedV2(PreRecordedV2Schemas)
    BaseGen->>TSGen: generatePreRecordedV2Types()
    TSGen-->>TSGen: apply fieldOverrides
    BaseGen->>PyGen: generatePreRecordedV2Types()
    PyGen-->>PyGen: apply fieldOverrides
    TSGen-->>Repository: write v2/prerecorded types
    PyGen-->>Repository: write v2/prerecorded types
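
The "apply fieldOverrides" step in the diagram above could look roughly like this. It is only a sketch: `FieldSpec`, `FieldOverrides`, `applyFieldOverrides`, and the `ExampleRequest` type name are hypothetical, since the generator's actual data structures are not shown in this thread. The one detail taken from the walkthrough is that the lookup is keyed by type name and forces selected fields to be optional.

```typescript
// Hypothetical shapes; not the generator's real types.
interface FieldSpec {
  name: string
  type: string
  optional: boolean
}

// Outer key: generated type name. Inner key: field name.
type FieldOverrides = Record<string, Record<string, { optional?: boolean }>>

function applyFieldOverrides(
  typeName: string,
  fields: FieldSpec[],
  overrides: FieldOverrides
): FieldSpec[] {
  const forType = overrides[typeName] ?? {}
  return fields.map((field) => {
    const override = forType[field.name]
    // Leave the field untouched unless an override explicitly sets `optional`.
    return override?.optional === undefined
      ? field
      : { ...field, optional: override.optional }
  })
}
```

Returning new objects instead of mutating the input keeps the schema data reusable across the TypeScript and Python generators.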

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through specs and typed each line,

I uploaded bytes and polled till stars did shine,
Generators hummed, the SDKs grew new,
JS and Python learned the prerecorded view,
A carrot of types — crunchy, fresh, and true!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 51.92% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title '[FDE-52] Add pre-recorded to sdk' is partially related to the changeset; it refers to a real and significant aspect of the change (adding pre-recorded V2 functionality), but is overly broad and does not capture the full scope of the work, which includes generator updates, test coverage, documentation, version bumps, and HTTP error handling improvements across multiple SDKs.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

…ecking for file uploads

…gnment

…ch assertions

…and return Node.js fetch

added object in PreRecordedV2DeleteResponse core.py for the deletion method

added return result for the javascript deletion method

e2e/e2e-node-esm/test/prerecorded_v2_async.test.ts

lrossillon-gladia · 2026-03-13T15:55:50Z

packages/sdk-js/src/v2/prerecorded/client.ts

@@ -0,0 +1,188 @@
+import { readFileSync } from 'fs'


Because GladiaClient statically imports this module from the root package entrypoint,
these top-level fs/path imports break browser/IIFE consumers even when they never use
pre-recorded APIs. In other words, you can no longer use the SDK's streaming interface from the browser.

tbh, per my LLM:

"The actual problem is that the root entrypoint also exports GladiaClient, and packages/sdk-js/src/client.ts has a static value import of PreRecordedV2Client from packages/sdk-js/src/v2/prerecorded/index.ts, which in turn re-exports packages/sdk-js/src/v2/prerecorded/client.ts. That file now has top-level imports of fs and path."

But this is a classic problem to watch out for in TypeScript packages.
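
One common way around this is to load the Node-only module lazily, inside the function that needs it, so nothing Node-specific is imported at module top level. The sketch below is an assumption about how that could look, not what the PR does: `isNodeRuntime` and `readLocalFile` are hypothetical names, and whether a given bundler tolerates the non-literal import specifier depends on its configuration.

```typescript
// Hypothetical helpers for illustration; not part of the SDK.

// Minimal structural type so this sketch does not depend on Node typings.
type FsPromises = { readFile(path: string): Promise<Uint8Array> }

// True when running under Node.js or a Node-compatible runtime.
function isNodeRuntime(): boolean {
  const proc = (globalThis as { process?: { versions?: { node?: string } } }).process
  return typeof proc?.versions?.node === 'string'
}

// Reads a local file, importing fs only when actually called.
async function readLocalFile(path: string): Promise<Uint8Array> {
  if (!isNodeRuntime()) {
    throw new Error('Reading local files requires Node.js; pass a URL instead')
  }
  // Non-literal specifier: keeps the import out of static analysis at module top level.
  const specifier = 'node:fs/promises'
  const fs = (await import(specifier)) as FsPromises
  return fs.readFile(path)
}
```

With this shape, importing the module in a browser has no side effects; only a call to `readLocalFile` touches `fs`, and it fails with a clear message outside Node.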

lrossillon-gladia · 2026-03-13T16:31:20Z

packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py

+      audio: A local file path (str or Path), an open binary file object, or a URL (str).
+      options: Optional transcription options (no audio_url). Can be a
+        :class:`PreRecordedV2TranscriptionOptions` instance or a dict.
+      interval: Seconds between polling attempts.


This docstring uses seconds, but the Python top-level client options still expose bare numeric defaults without unit docs; can we document timeout units at the options definition too, especially since the JS SDK documents milliseconds explicitly?

(Also, why the difference in units, seconds in Python versus milliseconds in JS? It doesn't matter that much, though.)

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (4)

README.md (4)
15-27: ⚠️ Potential issue | 🟡 Minor

Add language hint to the directory tree code block.

The fenced code block is missing a language specifier, causing markdownlint MD040 violations. Adding `text` as the language hint will resolve the linting issue.
📝 Proposed fix
-```
+```text
 ├── packages/
 │   ├── generator/          # Fetches Gladia OpenAPI schema and generates types
 │   ├── sdk-js/             # JavaScript/TypeScript SDK (ESM, CJS, IIFE)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 15 - 27, Add a language hint to the fenced code block
that contains the ASCII directory tree so markdownlint MD040 is not triggered;
update the opening fence from ``` to ```text (the directory tree code block
shown in the README, i.e., the block that starts with the line containing "├──
packages/") so the block is explicitly marked as plain text.
212-219: ⚠️ Potential issue | 🟡 Minor

Add language hint to the package structure code block.

The fenced code block showing the package directory structure is missing a language specifier, causing markdownlint MD040 violations. Adding `text` as the language hint will resolve the linting issue.
📝 Proposed fix
-```
+```text
 packages/sdk-<language>/
 ├── src/
 │   └── v2/live/           # The generator will write here
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 212 - 219, The fenced code block that shows the
package directory structure (starting with the line "packages/sdk-<language>/")
lacks a language specifier and triggers markdownlint MD040; update that fenced
code block to include a language hint (e.g., add "text" after the opening triple
backticks) so the block begins with ```text and the rest of the snippet (the
lines under "packages/sdk-<language>/", "├── src/", "│   └── v2/live/", etc.)
remains unchanged.
35-46: ⚠️ Potential issue | 🟡 Minor

Add language hint to the flow diagram code block.

The fenced code block is missing a language specifier, causing markdownlint MD040 violations. Adding `text` as the language hint will resolve the linting issue.
📝 Proposed fix
-```
+```text
 OpenAPI schema (api.gladia.io/openapi.json)
         │
         ▼
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 35 - 46, The fenced diagram block in README.md is
missing a language specifier, causing markdownlint MD040; update the code fence
that contains the ASCII flow diagram (the triple-backtick block shown starting
with "OpenAPI schema (api.gladia.io/openapi.json)") to include a language hint
like text (i.e., change ``` to ```text) so the linter recognizes it as a plain
text block and the MD040 warning is resolved.
65-74: ⚠️ Potential issue | 🟡 Minor

Add language hint to the commit format code block.

The fenced code block showing the commit message template is missing a language specifier, causing markdownlint MD040 violations. Adding `text` as the language hint will resolve the linting issue.
📝 Proposed fix
 This project uses [Conventional Commits](https://www.conventionalcommits.org/) enforced by commitlint. Commits must follow the format:
 
-```
+```text
 <type>(optional scope): <description>
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.

In @README.md around lines 65 - 74, Update the fenced code block under the
"Commit convention" section to include a language hint to satisfy markdownlint
MD040: change the opening fence from ``` to ```text for the block that contains
`<type>(optional scope): <description>` so the commit message template is fenced
as text; no other changes required.
</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🧹 Nitpick comments (1)</summary><blockquote>

<details>
<summary>packages/sdk-python/src/gladiaio_sdk/client_options.py (1)</summary><blockquote>

`9-12`: **Consider using `None` as the sentinel instead of a magic value.**

The comment indicates these constants detect "no timeout configured." If a user explicitly sets `http_timeout=10.0`, the detection logic may incorrectly treat it as "not configured." A cleaner pattern is to default to `Optional[float] = None` and apply the fallback internally:

```python
http_timeout: float | None = None  # None means "use default for context"
```

Then in the consuming code, fall back to the appropriate default (e.g., 10s for WebSocket, longer for pre-recorded).
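
For comparison, the same "unset means use the context default" idea can be sketched in TypeScript, where `undefined` plays the role of `None`. Everything named here is hypothetical (`ClientOptionsSketch`, `resolveTimeout`, and both default constants); the 60-second pre-recorded value in particular is an assumed placeholder for "longer", not a number from the PR.

```typescript
// Hypothetical defaults, for illustration only.
const DEFAULT_WS_TIMEOUT_MS = 10000
const DEFAULT_PRERECORDED_TIMEOUT_MS = 60000 // assumed value

interface ClientOptionsSketch {
  /** HTTP timeout in milliseconds. Leave undefined to use the default for the calling context. */
  httpTimeout?: number
}

function resolveTimeout(options: ClientOptionsSketch, contextDefaultMs: number): number {
  // `??` falls back only on undefined or null, so an explicit value that
  // happens to equal a default (or even 0) is preserved.
  return options.httpTimeout ?? contextDefaultMs
}
```

With this shape, an explicit `httpTimeout: 10000` passed to a pre-recorded client stays at 10000 instead of being mistaken for "not configured".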

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-python/src/gladiaio_sdk/client_options.py` around lines 9 - 12,
Replace the sentinel magic values DEFAULT_HTTP_TIMEOUT and DEFAULT_WS_TIMEOUT
with None-based sentinels (use Optional[float] defaults) so "no timeout
configured" is represented by None; update the client options declaration (e.g.,
http_timeout, ws_timeout in client_options.py) to default to None and change any
consuming logic (where the code checks DEFAULT_HTTP_TIMEOUT/DEFAULT_WS_TIMEOUT)
to explicitly fall back to the actual numeric defaults (e.g., 10s for WS or
longer for pre-recorded) when the timeout value is None rather than comparing
against the magic constants.
```

</details>

</blockquote></details>

</blockquote></details>

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 31c2a3bf-1334-4a43-b4a2-fd9a20f93694

📥 Commits

Reviewing files that changed from the base of the PR and between 29cd60c and a3490b2.

📒 Files selected for processing (3)

README.md

packages/sdk-js/src/v2/prerecorded/client.ts

packages/sdk-python/src/gladiaio_sdk/client_options.py

coderabbitai · 2026-03-17T16:01:32Z

packages/sdk-js/src/v2/prerecorded/client.ts

+  async poll(
+    jobId: string,
+    { interval = 3_000, timeout }: { interval?: number; timeout?: number } = {}
+  ): Promise<PreRecordedV2Response> {
+    const start = Date.now()
+    while (true) {
+      const result = await this.get(jobId)
+      if (result.status === 'done') {
+        return result
+      }
+      if (result.status === 'error') {
+        throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`)
+      }
+      if (timeout !== undefined && Date.now() - start >= timeout) {
+        throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
+      }
+      await sleep(interval)
+    }


⚠️ Potential issue | 🟠 Major

poll() can exceed the caller’s timeout.

If timeout is smaller than interval, this loop still sleeps the full interval after the timeout check, so createAndPoll() and transcribe() can wait well past the documented deadline. Clamp the sleep to the remaining budget instead of always sleeping interval.
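
The clamping rule can be isolated as a small pure function, which makes the overshoot easy to check with concrete numbers. `nextSleepMs` is a hypothetical helper written for this illustration, not something in the SDK.

```typescript
// Hypothetical helper: how long to sleep before the next poll,
// given an optional absolute deadline (all values in milliseconds).
function nextSleepMs(intervalMs: number, nowMs: number, deadlineMs?: number): number {
  if (deadlineMs === undefined) return intervalMs // no timeout: behavior unchanged
  // Never sleep past the deadline, and never return a negative duration.
  return Math.min(intervalMs, Math.max(0, deadlineMs - nowMs))
}
```

With `interval = 3000` and `timeout = 1000` starting at time 0, the unclamped loop sleeps the full 3000 ms, while `nextSleepMs(3000, 0, 1000)` yields 1000, so the next iteration lands on the deadline rather than 2000 ms after it.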

⏱️ Suggested change

   async poll(
     jobId: string,
     { interval = 3_000, timeout }: { interval?: number; timeout?: number } = {}
   ): Promise<PreRecordedV2Response> {
-    const start = Date.now()
+    const deadline = timeout === undefined ? undefined : Date.now() + timeout
     while (true) {
+      if (deadline !== undefined && Date.now() >= deadline) {
+        throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
+      }
       const result = await this.get(jobId)
       if (result.status === 'done') {
         return result
       }
       if (result.status === 'error') {
         throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`)
       }
-      if (timeout !== undefined && Date.now() - start >= timeout) {
+      if (deadline !== undefined && Date.now() >= deadline) {
         throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
       }
-      await sleep(interval)
+      await sleep(
+        deadline === undefined ? interval : Math.min(interval, Math.max(0, deadline - Date.now()))
+      )
     }
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 151 - 168, The poll() loop can overshoot the caller's timeout when timeout < interval because it always awaits sleep(interval); modify poll (in PreRecordedV2 client) to compute remaining = timeout - (Date.now() - start) when timeout is provided, throw immediately if remaining <= 0, and sleep for Math.min(interval, remaining) instead of the full interval; keep behavior unchanged when timeout is undefined. Callers such as createAndPoll() and transcribe() rely on poll() honoring the deadline.

coderabbitai

♻️ Duplicate comments (5)

README.md (4)
212-219: ⚠️ Potential issue | 🟡 Minor

Add language hint to the fenced code block.

This code block is missing an info string, causing markdownlint to flag MD040. Add `text` as the language identifier.
📝 Proposed fix
-```
+```text
 packages/sdk-<language>/
 ├── src/
 │   └── v2/live/           # The generator will write here
 ├── project.json           # Nx project configuration
 ├── README.md
 └── <language build config> (e.g. go.mod, Gemfile, pom.xml)
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.

In @README.md around lines 212 - 219, The fenced code block in README.md lacks
an info string (triggering MD040); update the triple-backtick fence that
precedes the directory tree (the snippet showing `packages/sdk-<language>/`) to
include the language hint "text" (i.e., change ``` to ```text) so markdownlint
recognizes it as a code block with an explicit language identifier.
</details>

---

`15-27`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language hint to the fenced code block.**

This code block is missing an info string, causing `markdownlint` to flag MD040. Add `text` as the language identifier.


<details>
<summary>📝 Proposed fix</summary>

```diff
-```
+```text
 ├── packages/
 │   ├── generator/          # Fetches Gladia OpenAPI schema and generates types
 │   ├── sdk-js/             # JavaScript/TypeScript SDK (ESM, CJS, IIFE)
 │   └── sdk-python/         # Python SDK (sync + async)
 ├── e2e/
 │   ├── e2e-node-esm/       # E2E tests — Node ESM
 │   ├── e2e-node-cjs/       # E2E tests — Node CJS
 │   ├── e2e-python/         # E2E tests — Python
 │   └── javascript-fixtures/
 ├── tools/scripts/           # Release & publish automation
 └── .github/workflows/       # CI, release, publish pipelines
 ```
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.

In @README.md around lines 15 - 27, The fenced code block in README.md lacks an
info string and triggers markdownlint MD040; update that block by adding the
language hint "text" to the opening backticks so the block starts with ```text,
ensuring the directory tree sample is treated as plain text and the linter
warning is resolved.
</details>

---

`35-46`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language hint to the fenced code block.**

This code block is missing an info string, causing `markdownlint` to flag MD040. Add `text` as the language identifier.


<details>
<summary>📝 Proposed fix</summary>

```diff
-```
+```text
 OpenAPI schema (api.gladia.io/openapi.json)
         │
         ▼
    generator
     ├── Live V2
     │   ├──▶ packages/sdk-js/src/v2/live/generated-types.ts
     │   └──▶ packages/sdk-python/src/gladiaio_sdk/v2/live/generated_types.py
     └── PreRecorded V2
         ├──▶ packages/sdk-js/src/v2/prerecorded/generated-types.ts
         └──▶ packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/generated_types.py
 ```
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.

In @README.md around lines 35 - 46, The fenced code block containing the ASCII
tree starting with "OpenAPI schema (api.gladia.io/openapi.json)" is missing a
language info string and triggers markdownlint MD040; fix it by adding the info
string "text" to the opening backticks of that code block (so the block begins
with ```text) in the README so the snippet is treated as plain text.
</details>

---

`69-71`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language hint to the fenced code block.**

This code block is missing an info string, causing `markdownlint` to flag MD040. Add `text` as the language identifier.


<details>
<summary>📝 Proposed fix</summary>

```diff
-```
+```text
 <type>(optional scope): <description>
 ```
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.

In @README.md around lines 69 - 71, The fenced code block showing
`<type>(optional scope): <description>` is missing an info string and triggers
markdownlint MD040; update that code fence by adding the language identifier
"text" to the opening backticks so the block starts with ```text ensuring the
info string is present and MD040 is resolved.
</details>

</blockquote></details>
<details>
<summary>packages/sdk-js/src/v2/prerecorded/client.ts (1)</summary><blockquote>

`151-168`: _⚠️ Potential issue_ | _🟠 Major_

**`poll()` can exceed the declared timeout budget.**

When `timeout < interval`, Line 167 always sleeps the full `interval`, so timeout is exceeded before the next iteration.

 

<details>
<summary>Proposed patch</summary>

```diff
-    const start = Date.now()
+    const deadline = timeout === undefined ? undefined : Date.now() + timeout
     while (true) {
+      if (deadline !== undefined && Date.now() >= deadline) {
+        throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
+      }
       const result = await this.get(jobId)
       if (result.status === 'done') {
         return result
       }
       if (result.status === 'error') {
         throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`)
       }
-      if (timeout !== undefined && Date.now() - start >= timeout) {
+      if (deadline !== undefined && Date.now() >= deadline) {
         throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
       }
-      await sleep(interval)
+      await sleep(
+        deadline === undefined ? interval : Math.min(interval, Math.max(0, deadline - Date.now()))
+      )
     }
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 151 - 168, The
poll method can sleep past the timeout when timeout < interval; update
PreRecordedV2Client.poll to compute remaining time after checking timeout (use
start and Date.now()), and before calling sleep(interval) if timeout is set,
throw immediately if remaining <= 0 and otherwise sleep Math.min(interval,
remaining); ensure jobId, start, interval, timeout and sleep are the referenced
symbols to locate and modify.
```

</details>

</blockquote></details>

</blockquote></details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: defaults

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `d42ee134-7221-41b0-9ea0-154f9cd1400a`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between a3490b2b3b4d9afca853512b5a4351c73c22472e and 10cc9ab67a93f3caeea742ff5b028f5421efff99.

</details>

<details>
<summary>📒 Files selected for processing (3)</summary>

* `README.md`
* `packages/sdk-js/src/v2/prerecorded/client.ts`
* `packages/sdk-python/src/gladiaio_sdk/client_options.py`

</details>

<details>
<summary>🚧 Files skipped from review as they are similar to previous changes (1)</summary>

* packages/sdk-python/src/gladiaio_sdk/client_options.py

</details>

</details>

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (2)

packages/sdk-js/src/v2/prerecorded/client.ts (1)

151-168: ⚠️ Potential issue | 🟡 Minor

poll() can exceed the caller's timeout.

The timeout check on line 164 occurs after get() but before sleep(). If timeout is smaller than interval, the full interval sleep still executes, causing the actual wait time to exceed the documented deadline. Clamp the sleep duration to the remaining time budget.

⏱️ Suggested fix

   async poll(
     jobId: string,
     { interval = 3_000, timeout }: { interval?: number; timeout?: number } = {}
   ): Promise<PreRecordedV2Response> {
-    const start = Date.now()
+    const deadline = timeout === undefined ? undefined : Date.now() + timeout
     while (true) {
       const result = await this.get(jobId)
       if (result.status === 'done') {
         return result
       }
       if (result.status === 'error') {
         throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`)
       }
-      if (timeout !== undefined && Date.now() - start >= timeout) {
+      if (deadline !== undefined && Date.now() >= deadline) {
         throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
       }
-      await sleep(interval)
+      const remaining = deadline === undefined ? interval : Math.max(0, deadline - Date.now())
+      if (remaining <= 0) {
+        throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
+      }
+      await sleep(Math.min(interval, remaining))
     }
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 151 - 168, In
poll(), the current timeout check happens before sleep so a long interval can
push the actual wait past the caller's timeout; change the loop to compute
remaining = timeout - (Date.now() - start) (when timeout is defined) and, before
calling sleep(), if remaining <= 0 throw the timeout Error, otherwise sleep for
Math.min(interval, remaining); keep full-interval sleep when timeout is
undefined; reference the poll function, variables start, interval, timeout,
get(), and sleep() when making the change.

e2e/e2e-python/tests/test_prerecorded_v2_async.py (1)

21-24: ⚠️ Potential issue | 🟠 Major

Skip this e2e module when GLADIA_API_KEY is absent.

Without a module-level guard, local runs and CI without secrets will fail when GladiaClient() raises ValueError due to missing API key.

🧪 Suggested guard

 YOUTUBE_VIDEO_URL = "https://www.youtube.com/watch?v=DYyY8Nh3TQE"
 POLL_TIMEOUT_S = 180.0
 YOUTUBE_POLL_TIMEOUT_S = 600.0
+
+pytestmark = pytest.mark.skipif(
+  not os.getenv("GLADIA_API_KEY"),
+  reason="requires GLADIA_API_KEY",
+)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@e2e/e2e-python/tests/test_prerecorded_v2_async.py` around lines 21 - 24, Add
a module-level guard that skips the entire e2e module when the GLADIA_API_KEY
env var is missing to avoid GladiaClient() raising ValueError during import; at
the top of the file (near YOUTUBE_VIDEO_URL / POLL_TIMEOUT_S constants) check
os.environ.get("GLADIA_API_KEY") and call pytest.skip("Gladia API key missing,
skipping e2e tests", allow_module_level=True) if absent so tests that construct
GladiaClient() are not executed.

🧹 Nitpick comments (5)

packages/sdk-js/src/v2/prerecorded/client.ts (1)

60-65: Consider using async file reading to avoid blocking the event loop.

readFileSync blocks the thread while reading the file. For large audio files, this can degrade server performance. The dynamic import is already async, so switching to readFile from fs/promises is straightforward.

♻️ Proposed fix

-      const { readFileSync } = await import('fs')
+      const { readFile } = await import('fs/promises')
       const { basename } = await import('path')
-      const fileBuffer = readFileSync(audio_url)
+      const fileBuffer = await readFile(audio_url)
       const filename = basename(audio_url)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 60 - 65, Replace
the synchronous readFileSync with an async file read: dynamically import or
require readFile from 'fs/promises' and await readFile(audio_url) to populate
fileBuffer instead of blocking the event loop; keep using basename(audio_url)
for filename, create the Blob from the awaited fileBuffer and append to formData
as before (variables: audio_url, fileBuffer, filename, blob, formData).

packages/sdk-python/src/gladiaio_sdk/network/http_client.py (2)

16-31: Unbounded recursion risk with deeply nested payloads.

_flatten_json_keys uses recursion without a depth limit. Maliciously crafted or excessively nested JSON payloads could cause a RecursionError. Consider adding a depth guard.

♻️ Proposed fix with depth limit

-def _flatten_json_keys(obj: Any, out: set[str] | None = None) -> set[str]:
+def _flatten_json_keys(obj: Any, out: set[str] | None = None, *, _depth: int = 0, _max_depth: int = 50) -> set[str]:
   """Collect all string keys from nested dict/list JSON payloads."""
   if out is None:
     out = set()
+  if _depth >= _max_depth:
+    return out
   if isinstance(obj, dict):
     for k, v in obj.items():
       if isinstance(k, str):
         out.add(k)
-      _flatten_json_keys(v, out)
+      _flatten_json_keys(v, out, _depth=_depth + 1, _max_depth=_max_depth)
   elif isinstance(obj, list):
     for item in obj:
-      _flatten_json_keys(item, out)
+      _flatten_json_keys(item, out, _depth=_depth + 1, _max_depth=_max_depth)
   return out

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py` around lines 16
- 31, The _flatten_json_keys function can recurse indefinitely on deeply nested
payloads; add a max depth guard to avoid RecursionError by introducing an
optional depth/current_depth parameter (or a constant MAX_DEPTH) and return
early when current_depth >= MAX_DEPTH, ensuring callers still get collected keys
up to that depth; update the function signature (e.g., _flatten_json_keys(obj:
Any, out: set[str] | None = None, depth: int = 0)) and increment depth on each
recursive call, and document/choose a reasonable MAX_DEPTH constant and behavior
(skip deeper children) in the function that calls or caches results.
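
As an alternative to the depth guard proposed above, the recursion can be removed altogether by walking the payload with an explicit stack. The following is only a sketch under assumptions: `flatten_json_keys_iterative` is a hypothetical name, not a function in the SDK, and it has not been checked against the callers of `_flatten_json_keys`.

```python
from typing import Any


def flatten_json_keys_iterative(obj: Any) -> set[str]:
    """Collect all string keys from nested dict/list payloads without recursion."""
    keys: set[str] = set()
    stack: list[Any] = [obj]  # explicit stack replaces the interpreter call stack
    while stack:
        current = stack.pop()
        if isinstance(current, dict):
            for key, value in current.items():
                if isinstance(key, str):
                    keys.add(key)
                stack.append(value)  # values are visited even under non-string keys
        elif isinstance(current, list):
            stack.extend(current)
    return keys


# Build a payload nested far deeper than Python's default recursion limit.
deep: Any = {"leaf": 1}
for _ in range(5_000):
    deep = {"level": [deep]}

print(sorted(flatten_json_keys_iterative(deep)))  # -> ['leaf', 'level']
```

Unlike the depth-limited version, this variant does not silently drop keys below a cutoff; the trade-off is that it will walk arbitrarily large payloads in full.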

75-105: Lazy imports inside function are acceptable but unconventional.

The deferred imports of PreRecordedV2InitTranscriptionRequest and LiveV2InitRequest avoid circular dependencies. This is a valid pattern, though documenting the reason would help future maintainers.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py` around lines 75
- 105, Add a short inline comment at the top of the _schema_field_names_for_url
function explaining that the imports of PreRecordedV2InitTranscriptionRequest
and LiveV2InitRequest are intentionally deferred (lazy) to avoid circular import
issues and to keep startup cost low; keep the current lazy-import placement and
behavior, but document why those imports are inside the function and reference
the specific types (PreRecordedV2InitTranscriptionRequest and LiveV2InitRequest)
so future maintainers understand the rationale.

packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py (2)

202-213: The return False path on non-202 may be unreachable.

When HttpClient.delete() receives a 4xx/5xx response, it raises HttpError before returning, so the return False branch is only reachable for non-error status codes other than 202 (e.g. 200/201/204). Consider whether this is the intended behavior or whether the docstring should clarify it.

📝 Clarify docstring or adjust logic

If the API returns 200/204 for already-deleted jobs or other success variants:

   def delete(self, job_id: str) -> bool:
     """Delete a pre-recorded transcription job.

     Args:
       job_id: The UUID of the transcription job to delete.

     Returns:
-      True if the job was deleted successfully (HTTP 202), False otherwise.
+      True if the job was deleted successfully (HTTP 202).
+      Returns False for other 2xx responses (e.g., already deleted).
+      Raises HttpError for 4xx/5xx responses.
     """

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py` around lines
202 - 213, The delete method currently checks resp.status_code == 202 but
HttpClient.delete raises HttpError for 4xx/5xx so the False path may never run;
either update the delete docstring (in delete) to state that HttpError is raised
on non-2xx responses, or change logic to catch HttpError from
self._http_client.delete and return False for error responses while treating 202
(and other success codes like 200/204) as True; locate the method delete,
self._http_client.delete, and self._core.build_job_endpoint to implement the
preferred behavior.

252-263: Potential off-by-one in timeout check.

The condition (time.time() - start) >= timeout is checked after get() returns but before sleeping. If get() takes longer than the timeout, the check correctly triggers. However, if get() completes just under the timeout and then the loop sleeps, total elapsed time could exceed timeout. This is likely acceptable for polling semantics but worth noting.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py` around lines
252 - 263, The polling loop can sleep past the timeout: after calling get() you
should compute elapsed = time.time() - start and if timeout is set raise
TimeoutError immediately when elapsed >= timeout (use
_core.create_timeout_error_message(job_id, timeout)), and when deciding to sleep
use sleep_time = min(interval, max(0, timeout - elapsed)) (or skip sleeping if
timeout is exhausted) so the loop never sleeps beyond the configured timeout;
adjust the loop around start, get(job_id), timeout, interval and the
_core.create_timeout_error_message call accordingly.
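
The clamping described above can be sketched in isolation. This is a minimal illustration, not the SDK's poll(): `poll_until_done`, `FakeClock`, and the injectable `clock`/`sleep` parameters are hypothetical names introduced here so the behaviour can be exercised without real waiting, and `time.monotonic` is used instead of `time.time()` as a personal choice to avoid wall-clock jumps.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    get_status: Callable[[], str],
    *,
    interval: float = 3.0,
    timeout: Optional[float] = None,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it reports 'done' or 'error', never sleeping past the deadline."""
    deadline = None if timeout is None else clock() + timeout
    while True:
        status = get_status()
        if status in ("done", "error"):
            return status
        if deadline is None:
            sleep(interval)  # no timeout configured: keep the full interval
            continue
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"job did not complete within {timeout}s")
        sleep(min(interval, remaining))  # clamp the sleep to the remaining budget


class FakeClock:
    """Deterministic clock for the demo: sleeping just advances 'now'."""

    def __init__(self) -> None:
        self.now = 0.0
        self.sleeps: list[float] = []

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.sleeps.append(seconds)
        self.now += seconds


fake = FakeClock()
try:
    # timeout (1s) is smaller than interval (3s): the unclamped loop would wait 3s.
    poll_until_done(lambda: "queued", interval=3.0, timeout=1.0, clock=fake.time, sleep=fake.sleep)
except TimeoutError as exc:
    print(f"timed out after sleeping {fake.sleeps}: {exc}")
```

With a 1-second timeout and a 3-second interval, the loop sleeps once for the remaining second and then raises, instead of sleeping the full interval.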

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/sdk-js/src/v2/prerecorded/client.ts`:
- Around line 124-127: The delete(jobId) method incorrectly checks
response.status because this.httpClient.delete<Response> returns a parsed body
(so status is undefined); update delete to either call the client in a way that
returns the raw HTTP response (e.g., change the generic from Response to void or
use the client option that returns the raw Response) so you can test
response.status === 202, or simplify by wrapping the call in try/catch and
return true on successful completion and false on error; specifically modify the
delete method in the class (the delete(jobId: string): Promise<boolean> function
that calls this.httpClient.delete) to use one of those two approaches.
- Around line 135-138: The getFile method currently calls
this.httpClient.get<Response>(...) and then response.arrayBuffer(), but
HttpClient auto-parses JSON responses so the returned value may be a parsed
object (breaking arrayBuffer()); update getFile to request the raw binary
response by using the HttpClient option that prevents JSON auto-parsing (e.g.,
set responseType: 'arraybuffer' or an equivalent flag) or call a low-level/raw
fetch method on the client that returns the native Response; ensure the change
is made in the getFile function and that the returned value is the actual
Response (so response.arrayBuffer() is valid) rather than a parsed JSON object.

In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py`:
- Around line 274-298: The enriched HttpError returned by
enrich_http_error_with_field_suggestions currently discards the original
exception chain; update the HttpError construction in
enrich_http_error_with_field_suggestions to pass cause=err.__cause__ (or the
appropriate constructor parameter for the underlying HttpError type) so the
original __cause__ is preserved when creating the new HttpError instance,
keeping all other fields (message, method, url, status, id, request_id,
response_body, response_headers, validation_errors, invalid_parameters)
unchanged.

---

Duplicate comments:
In `@e2e/e2e-python/tests/test_prerecorded_v2_async.py`:
- Around line 21-24: Add a module-level guard that skips the entire e2e module
when the GLADIA_API_KEY env var is missing to avoid GladiaClient() raising
ValueError during import; at the top of the file (near YOUTUBE_VIDEO_URL /
POLL_TIMEOUT_S constants) check os.environ.get("GLADIA_API_KEY") and call
pytest.skip("Gladia API key missing, skipping e2e tests",
allow_module_level=True) if absent so tests that construct GladiaClient() are
not executed.

In `@packages/sdk-js/src/v2/prerecorded/client.ts`:
- Around line 151-168: In poll(), the current timeout check happens before sleep
so a long interval can push the actual wait past the caller's timeout; change
the loop to compute remaining = timeout - (Date.now() - start) (when timeout is
defined) and, before calling sleep(), if remaining <= 0 throw the timeout Error,
otherwise sleep for Math.min(interval, remaining); keep full-interval sleep when
timeout is undefined; reference the poll function, variables start, interval,
timeout, get(), and sleep() when making the change.

---

Nitpick comments:
In `@packages/sdk-js/src/v2/prerecorded/client.ts`:
- Around line 60-65: Replace the synchronous readFileSync with an async file
read: dynamically import or require readFile from 'fs/promises' and await
readFile(audio_url) to populate fileBuffer instead of blocking the event loop;
keep using basename(audio_url) for filename, create the Blob from the awaited
fileBuffer and append to formData as before (variables: audio_url, fileBuffer,
filename, blob, formData).

In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py`:
- Around line 16-31: The _flatten_json_keys function can recurse indefinitely on
deeply nested payloads; add a max depth guard to avoid RecursionError by
introducing an optional depth/current_depth parameter (or a constant MAX_DEPTH)
and return early when current_depth >= MAX_DEPTH, ensuring callers still get
collected keys up to that depth; update the function signature (e.g.,
_flatten_json_keys(obj: Any, out: set[str] | None = None, depth: int = 0)) and
increment depth on each recursive call, and document/choose a reasonable
MAX_DEPTH constant and behavior (skip deeper children) in the function that
calls or caches results.
- Around line 75-105: Add a short inline comment at the top of the
_schema_field_names_for_url function explaining that the imports of
PreRecordedV2InitTranscriptionRequest and LiveV2InitRequest are intentionally
deferred (lazy) to avoid circular import issues and to keep startup cost low;
keep the current lazy-import placement and behavior, but document why those
imports are inside the function and reference the specific types
(PreRecordedV2InitTranscriptionRequest and LiveV2InitRequest) so future
maintainers understand the rationale.

In `@packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py`:
- Around line 202-213: The delete method currently checks resp.status_code ==
202 but HttpClient.delete raises HttpError for 4xx/5xx so the False path may
never run; either update the delete docstring (in delete) to state that
HttpError is raised on non-2xx responses, or change logic to catch HttpError
from self._http_client.delete and return False for error responses while
treating 202 (and other success codes like 200/204) as True; locate the method
delete, self._http_client.delete, and self._core.build_job_endpoint to implement
the preferred behavior.
- Around line 252-263: The polling loop can sleep past the timeout: after
calling get() you should compute elapsed = time.time() - start and if timeout is
set raise TimeoutError immediately when elapsed >= timeout (use
_core.create_timeout_error_message(job_id, timeout)), and when deciding to sleep
use sleep_time = min(interval, max(0, timeout - elapsed)) (or skip sleeping if
timeout is exhausted) so the loop never sleeps beyond the configured timeout;
adjust the loop around start, get(job_id), timeout, interval and the
_core.create_timeout_error_message call accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bdb4dd83-bfb3-4a80-910e-765c86dd61b1

📥 Commits

Reviewing files that changed from the base of the PR and between 10cc9ab and 49fcc56.

⛔ Files ignored due to path filters (3)

bun.lock is excluded by !**/*.lock
e2e/e2e-python/uv.lock is excluded by !**/*.lock
packages/sdk-python/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (12)

e2e/e2e-python/tests/test_prerecorded_v2_async.py
packages/sdk-js/package.json
packages/sdk-js/src/v2/prerecorded/client.ts
packages/sdk-js/src/version.ts
packages/sdk-python/CHANGELOG.md
packages/sdk-python/pyproject.toml
packages/sdk-python/src/gladiaio_sdk/network/__init__.py
packages/sdk-python/src/gladiaio_sdk/network/http_client.py
packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/async_client.py
packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/client.py
packages/sdk-python/src/gladiaio_sdk/version.py
packages/sdk-python/tests/network/test_async_http_client.py

✅ Files skipped from review due to trivial changes (5)

packages/sdk-python/CHANGELOG.md
packages/sdk-python/src/gladiaio_sdk/version.py
packages/sdk-js/package.json
packages/sdk-js/src/version.ts
packages/sdk-python/src/gladiaio_sdk/network/__init__.py

🚧 Files skipped from review as they are similar to previous changes (2)

packages/sdk-python/pyproject.toml
packages/sdk-python/src/gladiaio_sdk/v2/prerecorded/async_client.py

packages/sdk-js/src/v2/prerecorded/client.ts

coderabbitai · 2026-03-20T08:45:48Z

packages/sdk-python/src/gladiaio_sdk/network/http_client.py

+def enrich_http_error_with_field_suggestions(
+  err: HttpError,
+  known_fields: Sequence[str],
+  *,
+  status_codes: tuple[int, ...] = (400, 422),
+) -> HttpError:
+  """Return a copy of ``err`` with difflib-based field hints, or ``err`` if none apply."""
+  if err.status not in status_codes or not err.invalid_parameters:
+    return err
+  hint = format_invalid_field_suggestions(err.invalid_parameters, known_fields)
+  if not hint:
+    return err
+  return HttpError(
+    message=f"{str(err)}\n\n{hint}",
+    method=err.method,
+    url=err.url,
+    status=err.status,
+    id=err.id,
+    request_id=err.request_id,
+    response_body=err.response_body,
+    response_headers=err.response_headers,
+    validation_errors=err.validation_errors,
+    invalid_parameters=err.invalid_parameters,
+  )



⚠️ Potential issue | 🟡 Minor

__cause__ is not preserved when enriching the error.

If the original HttpError has a __cause__ set (from chained exceptions), the enriched copy loses it. Consider passing cause=err.__cause__ to preserve the exception chain.

🛡️ Proposed fix

   return HttpError(
     message=f"{str(err)}\n\n{hint}",
     method=err.method,
     url=err.url,
     status=err.status,
     id=err.id,
     request_id=err.request_id,
     response_body=err.response_body,
     response_headers=err.response_headers,
     validation_errors=err.validation_errors,
     invalid_parameters=err.invalid_parameters,
+    cause=err.__cause__,
   )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-python/src/gladiaio_sdk/network/http_client.py` around lines 274 - 298, The enriched HttpError returned by enrich_http_error_with_field_suggestions currently discards the original exception chain; update the HttpError construction in enrich_http_error_with_field_suggestions to pass cause=err.__cause__ (or the appropriate constructor parameter for the underlying HttpError type) so the original __cause__ is preserved when creating the new HttpError instance, keeping all other fields (message, method, url, status, id, request_id, response_body, response_headers, validation_errors, invalid_parameters) unchanged.
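
To illustrate why the chain is lost and one way to keep it, here is a small self-contained sketch. `DemoHttpError` and `with_hint` are hypothetical stand-ins, not the SDK's HttpError or enrich_http_error_with_field_suggestions; whether the real constructor accepts a `cause` argument is not shown in this review, so the sketch assigns `__cause__` directly, which works for any Python exception.

```python
class DemoHttpError(Exception):
    """Hypothetical stand-in for the SDK's HttpError; not the real class."""

    def __init__(self, message: str, status: int) -> None:
        super().__init__(message)
        self.status = status


def with_hint(err: DemoHttpError, hint: str) -> DemoHttpError:
    """Return a copy of err with hint appended, keeping the exception chain."""
    enriched = DemoHttpError(f"{err}\n\n{hint}", err.status)
    # A freshly constructed exception has __cause__ = None; copy it explicitly.
    enriched.__cause__ = err.__cause__
    return enriched


try:
    try:
        raise ConnectionError("socket closed")
    except ConnectionError as low_level:
        raise DemoHttpError("bad request", 400) from low_level
except DemoHttpError as original:
    enriched = with_hint(original, "Did you mean 'language'?")

print(repr(enriched.__cause__))  # -> ConnectionError('socket closed')
```

Without the `__cause__` assignment, the final line would print `None`, which is the behaviour the comment above warns about.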

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (3)

packages/sdk-js/src/v2/prerecorded/client.ts (3)

154-171: ⚠️ Potential issue | 🟠 Major

poll() can exceed the declared timeout budget.

Line 170 always sleeps the full interval; when timeout < interval, total wait overshoots caller deadline.

💡 Proposed fix

-    const start = Date.now()
+    const deadline = timeout === undefined ? undefined : Date.now() + timeout
     while (true) {
+      if (deadline !== undefined && Date.now() >= deadline) {
+        throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
+      }
       const result = await this.get(jobId)
       if (result.status === 'done') {
         return result
       }
       if (result.status === 'error') {
         throw new Error(`Pre-recorded job ${jobId} failed with error code: ${result.error_code}`)
       }
-      if (timeout !== undefined && Date.now() - start >= timeout) {
+      if (deadline !== undefined && Date.now() >= deadline) {
         throw new Error(`Pre-recorded job ${jobId} did not complete within ${timeout}ms`)
       }
-      await sleep(interval)
+      await sleep(
+        deadline === undefined ? interval : Math.min(interval, Math.max(0, deadline - Date.now()))
+      )
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 154 - 171, The
poll() loop in PreRecordedV2Client can overshoot the provided timeout because it
always awaits the full interval; modify poll (function poll, local variables
start, interval, timeout, and the sleep call) to compute remaining = timeout -
(Date.now() - start) when timeout is provided, and before sleeping either throw
if remaining <= 0 or await sleep(Math.min(interval, remaining)); if timeout is
undefined keep sleeping the full interval; ensure the timeout check occurs
before sleeping so the method never waits past the caller's deadline.

60-63: ⚠️ Potential issue | 🟠 Major

Avoid sync disk I/O in this async request path.

Line 62 uses readFileSync, which blocks the event loop under load. Prefer async node:fs/promises.readFile.

💡 Proposed fix

-      const { readFileSync } = await import('fs')
-      const { basename } = await import('path')
-      const fileBuffer = readFileSync(audio_url)
+      const [{ readFile }, { basename }] = await Promise.all([
+        import('node:fs/promises'),
+        import('node:path'),
+      ])
+      const fileBuffer = await readFile(audio_url)
       const filename = basename(audio_url)

#!/bin/bash
# Verify sync fs usage in JS SDK sources.
rg -n --type=ts -C2 '\breadFileSync\s*\(' packages/sdk-js/src

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 60 - 63, The code
is using readFileSync which blocks the event loop; replace the synchronous call
with async file I/O by importing and using readFile from 'fs/promises' (or
node:fs/promises) and await it instead of readFileSync when reading audio_url;
update the import lines (remove the sync import of 'fs') and keep basename from
'path' as before, ensuring the surrounding async function (in client.ts where
audio_url and filename are used) awaits the readFile promise so the rest of the
logic uses the resolved fileBuffer.

138-140: ⚠️ Potential issue | 🟠 Major

Force raw response for file download to keep arrayBuffer() valid.

Line 139 should request the raw Response; otherwise JSON content-types are auto-parsed by HttpClient, and arrayBuffer() may fail at runtime.

💡 Proposed fix

-    const response = await this.httpClient.get<Response>(`/v2/pre-recorded/${jobId}/file`)
+    const response = await this.httpClient.get<Response>(`/v2/pre-recorded/${jobId}/file`, {
+      rawResponse: true,
+    })
     return response.arrayBuffer()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/sdk-js/src/v2/prerecorded/client.ts` around lines 138 - 140, The
getFile method calls this.httpClient.get which auto-parses JSON; update the get
call in getFile(jobId: string) to request the raw/fetch Response (e.g., pass the
httpClient option that disables parsing or sets responseType to raw/arraybuffer)
so you receive a Response object and can safely call response.arrayBuffer();
modify the call to
this.httpClient.get<Response>(`/v2/pre-recorded/${jobId}/file`, { /* raw
response option */ }) and then return response.arrayBuffer().

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/sdk-js/src/network/httpClient.ts`:
- Around line 107-111: The request methods in httpClient.ts (get, post, put,
delete, request) currently accept a rawResponse?: boolean option but their
TypeScript signatures don't reflect that: add overloads for each method so that
when options includes { rawResponse: true } the return type is Promise<Response>
(the raw fetch Response), and when options omits rawResponse or rawResponse is
false the return type is Promise<T> (the parsed JSON generic). Update the method
signatures for get/post/put/delete/request to declare these two overloads and
remove any unsafe casts (e.g., the cast at the previous unsafe cast location) so
callers like prerecorded/client.ts calling get<Response>(..., { rawResponse:
true }) or get<T>(...) without rawResponse are type-safe.

---

Duplicate comments:
In `@packages/sdk-js/src/v2/prerecorded/client.ts`:
- Around line 154-171: The poll() loop in PreRecordedV2Client can overshoot the
provided timeout because it always awaits the full interval; modify poll
(function poll, local variables start, interval, timeout, and the sleep call) to
compute remaining = timeout - (Date.now() - start) when timeout is provided, and
before sleeping either throw if remaining <= 0 or await sleep(Math.min(interval,
remaining)); if timeout is undefined keep sleeping the full interval; ensure the
timeout check occurs before sleeping so the method never waits past the caller's
deadline.
- Around line 60-63: The code is using readFileSync which blocks the event loop;
replace the synchronous call with async file I/O by importing and using readFile
from 'fs/promises' (or node:fs/promises) and await it instead of readFileSync
when reading audio_url; update the import lines (remove the sync import of 'fs')
and keep basename from 'path' as before, ensuring the surrounding async function
(in client.ts where audio_url and filename are used) awaits the readFile promise
so the rest of the logic uses the resolved fileBuffer.
- Around line 138-140: The getFile method calls this.httpClient.get which
auto-parses JSON; update the get call in getFile(jobId: string) to request the
raw/fetch Response (e.g., pass the httpClient option that disables parsing or
sets responseType to raw/arraybuffer) so you receive a Response object and can
safely call response.arrayBuffer(); modify the call to
this.httpClient.get<Response>(`/v2/pre-recorded/${jobId}/file`, { /* raw
response option */ }) and then return response.arrayBuffer().

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cc8615b4-bc83-4c07-9057-eb48e734c68a

📥 Commits

Reviewing files that changed from the base of the PR and between 49fcc56 and 6030af9.

📒 Files selected for processing (3)

packages/sdk-js/src/network/httpClient.test.ts
packages/sdk-js/src/network/httpClient.ts
packages/sdk-js/src/v2/prerecorded/client.ts

🚧 Files skipped from review as they are similar to previous changes (1)

packages/sdk-js/src/network/httpClient.test.ts

coderabbitai · 2026-03-20T09:49:34Z

packages/sdk-js/src/network/httpClient.ts

+type RequestOptions = Omit<RequestInit, 'method' | 'headers'> & {
+  headers?: Headers
+  /** When true, the successful response is the raw `fetch` `Response` (no JSON parsing). */
+  rawResponse?: boolean
+}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify current type-contract gap and potentially unsafe call sites.
rg -n "async (get|post|put|delete|request)<" packages/sdk-js/src/network/httpClient.ts
rg -nP --type=ts -C2 'httpClient\.(get|post|put|delete)<Response>\(' packages/sdk-js/src

Repository: gladiaio/sdk

Length of output: 1124

🏁 Script executed:

# Get the full httpClient.ts file to understand the method signatures and implementation
cat -n packages/sdk-js/src/network/httpClient.ts | head -220

Repository: gladiaio/sdk

Length of output: 7453

🏁 Script executed:

# Check what's at lines 285-287 mentioned in the review
sed -n '280,290p' packages/sdk-js/src/network/httpClient.ts

Repository: gladiaio/sdk

Length of output: 388

🏁 Script executed:

# Search for all usages of httpClient methods with rawResponse to identify type mismatches
rg -nP --type=ts "httpClient\.(get|post|put|delete|request)" packages/sdk-js/src -A1 | grep -A1 "rawResponse"

Repository: gladiaio/sdk

Length of output: 132

🏁 Script executed:

# Check lines 285-287 specifically
sed -n '283,290p' packages/sdk-js/src/network/httpClient.ts

Repository: gladiaio/sdk

Length of output: 285

🏁 Script executed:

# Comprehensive search for all httpClient method calls to identify unsafe type patterns
rg -nP --type=ts "httpClient\.(get|post|put|delete|request)\<" packages/sdk-js/src -B1 -A2

Repository: gladiaio/sdk

Length of output: 2661

Add overloads to enforce rawResponse behavior in method signatures.

The rawResponse flag (line 109) controls runtime behavior but isn't reflected in method types. This allows unsafe type assumptions:

Line 139 (prerecorded/client.ts): Calls get<Response>() without rawResponse: true, then invokes .arrayBuffer(). Without the flag, the implementation JSON-parses the response instead of returning raw Response, causing a runtime failure.

Line 126 (prerecorded/client.ts): Calls delete<Response>(..., { rawResponse: true }). Works at runtime but the type system doesn't enforce that Response type requires rawResponse: true.

Add overloads so:

- rawResponse: true always returns Response
- Calls without rawResponse return the generic ResponseType (JSON-parsed)

This applies to all methods: get, post, put, delete, and request (lines 174, 181, 188, 195, 202, and the unsafe cast at 285-287).

Proposed typing fix

 type RequestOptions = Omit<RequestInit, 'method' | 'headers'> & {
   headers?: Headers
   /** When true, the successful response is the raw `fetch` `Response` (no JSON parsing). */
   rawResponse?: boolean
 }
+type RawResponseRequestOptions = RequestOptions & { rawResponse: true }
+type ParsedRequestOptions = RequestOptions & { rawResponse?: false | undefined }
+
 export class HttpClient {
+  async get(url: string | URL, init: RawResponseRequestOptions): Promise<Response>
+  async get<ResponseType = Response>(url: string | URL, init?: ParsedRequestOptions): Promise<ResponseType>
   async get<ResponseType = Response>(
     url: string | URL,
     init: RequestOptions = {}
   ): Promise<ResponseType> {
     return this.request('GET', url, init)
   }

+  async delete(url: string | URL, init: RawResponseRequestOptions): Promise<Response>
+  async delete<ResponseType = Response>(url: string | URL, init?: ParsedRequestOptions): Promise<ResponseType>
   async delete<ResponseType = Response>(
     url: string | URL,
     init: RequestOptions = {}
   ): Promise<ResponseType> {
     return this.request('DELETE', url, init)
   }

+  async request(
+    method: HttpMethod,
+    url: string | URL,
+    init: RawResponseRequestOptions
+  ): Promise<Response>
+  async request<ResponseType>(
+    method: HttpMethod,
+    url: string | URL,
+    init?: ParsedRequestOptions
+  ): Promise<ResponseType>
   async request<ResponseType>(
     method: HttpMethod,
     url: string | URL,
     init: RequestOptions = {}
   ): Promise<ResponseType> {
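For readers who want to try the overload pattern in isolation, here is a minimal, self-contained sketch. It is not the SDK's `HttpClient`: `SketchHttpClient`, `FetchLike`, `RawOptions`, and `ParsedOptions` are hypothetical names, and the fetch implementation is injected so the example can run against a stub. It assumes a runtime with a global `Response` (for example Node 18+).

```typescript
// Hypothetical stand-in for the SDK's HttpClient; all names here are illustrative.
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>

type RequestOptions = Omit<RequestInit, 'method'> & { rawResponse?: boolean }
type RawOptions = RequestOptions & { rawResponse: true }
type ParsedOptions = RequestOptions & { rawResponse?: false }

class SketchHttpClient {
  private readonly fetchImpl: FetchLike

  constructor(fetchImpl: FetchLike) {
    this.fetchImpl = fetchImpl
  }

  // Overload 1: rawResponse: true always yields the raw Response.
  get(url: string, init: RawOptions): Promise<Response>
  // Overload 2: without rawResponse, the body is JSON-parsed and typed as T.
  get<T>(url: string, init?: ParsedOptions): Promise<T>
  // Implementation signature; overload signatures carry no body,
  // so only this one is marked async.
  async get(url: string, init: RequestOptions = {}): Promise<unknown> {
    const { rawResponse, ...fetchInit } = init
    const response = await this.fetchImpl(url, { ...fetchInit, method: 'GET' })
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`)
    }
    return rawResponse ? response : await response.json()
  }
}

// Usage with a stub fetch, so no network access is needed.
async function demo(): Promise<void> {
  const stub: FetchLike = async () =>
    new Response(JSON.stringify({ id: 'abc' }), { status: 200 })
  const client = new SketchHttpClient(stub)

  const job = await client.get<{ id: string }>('https://example.invalid/job')
  console.log(job.id) // typed as string

  const raw = await client.get('https://example.invalid/file', { rawResponse: true })
  console.log((await raw.arrayBuffer()).byteLength) // raw is typed as Response
}
```

With these signatures, calling `.arrayBuffer()` on the result of a `get` that omits `rawResponse: true` becomes a compile-time error unless the caller explicitly asks for `get<Response>`, which is the mismatch the review comment describes.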

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/sdk-js/src/network/httpClient.ts` around lines 107 - 111, The request methods in httpClient.ts (get, post, put, delete, request) currently accept a rawResponse?: boolean option but their TypeScript signatures don't reflect that: add overloads for each method so that when options includes { rawResponse: true } the return type is Promise<Response> (the raw fetch Response), and when options omits rawResponse or rawResponse is false the return type is Promise<T> (the parsed JSON generic). Update the method signatures for get/post/put/delete/request to declare these two overloads and remove any unsafe casts (e.g., the cast at the previous unsafe cast location) so callers like prerecorded/client.ts calling get<Response>(..., { rawResponse: true }) or get<T>(...) without rawResponse are type-safe.

Karamouche · 2026-03-24T16:52:43Z

packages/sdk-python/src/gladiaio_sdk/client.py

+    if "http_timeout" not in kwargs and not args:
+      merged_options = dataclasses.replace(
+        merged_options,
+        http_timeout=300,  # 5 minutes to handle file uploads


Should it be hardcoded?

Karamouche · 2026-03-24T16:52:56Z

packages/sdk-python/src/gladiaio_sdk/client.py

+    if "http_timeout" not in kwargs and not args:
+      merged_options = dataclasses.replace(
+        merged_options,
+        http_timeout=300,  # 5 minutes to handle file uploads


Hardcoded timeout
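One possible way to address the hardcoded value, sketched in TypeScript rather than in the Python under review: keep the defaults as named constants and let an explicit caller value take precedence. Everything here is hypothetical (`ClientOptions`, `httpTimeoutMs`, `resolveTimeoutMs`, and the 10-second general default are assumptions for illustration); only the 5-minute upload figure comes from the diff above.

```typescript
// Hypothetical names throughout; this is not the SDK's actual options type.
const DEFAULT_HTTP_TIMEOUT_MS = 10_000 // assumed general default, for illustration only
const DEFAULT_UPLOAD_TIMEOUT_MS = 300_000 // 5 minutes, the value questioned in the review

interface ClientOptions {
  httpTimeoutMs?: number
}

type Operation = 'upload' | 'default'

function resolveTimeoutMs(options: ClientOptions, operation: Operation): number {
  // An explicit caller value always wins over the built-in defaults.
  if (options.httpTimeoutMs !== undefined) {
    return options.httpTimeoutMs
  }
  // Otherwise pick a default suited to the operation.
  return operation === 'upload' ? DEFAULT_UPLOAD_TIMEOUT_MS : DEFAULT_HTTP_TIMEOUT_MS
}
```

This keeps the long upload timeout as a default while still letting callers shorten or lengthen it per client.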

Karamouche · 2026-03-25T10:51:56Z

packages/sdk-js/src/version.ts

@@ -1,2 +1,2 @@
 // This file is auto-generated during build. Do not edit manually.
-export const SDK_VERSION = '0.6.1' as const


Remove this manual change

Karamouche · 2026-03-25T10:52:01Z

packages/sdk-js/package.json

 {
  "name": "@gladiaio/sdk",
  "description": "Gladia JavaScript/TypeScript SDK",
-  "version": "0.6.1",


Remove this manual change

Karamouche · 2026-03-25T10:52:10Z

packages/sdk-python/src/gladiaio_sdk/version.py

@@ -1,2 +1,2 @@
 # This file is auto-generated. Do not edit manually.


Remove this manual change

Karamouche · 2026-03-25T10:52:16Z

packages/sdk-python/pyproject.toml

@@ -1,41 +1,37 @@
 [project]
 name = "gladiaio-sdk"
-version = "0.4.0"


Remove this manual change

Karamouche · 2026-03-25T10:52:23Z

packages/sdk-python/CHANGELOG.md

 - **sdk-python:** Improve dataclass usage
 - **sdk-python:** Add version generation
 - **sdk-python:** Use dataclass instead of TypedDict
- **sdk-python:** Initial asyncio version of Python SDK


Remove this manual change

corrected

Karamouche added 16 commits February 25, 2026 13:47

feat: Update README to enhance clarity on how to contribute

ed0c184

feat: Add generated types for async transcription

3e0565d

fix: Moved generated pre-recorded types to ./prerecorded

b7858cf

fix: Make post_session_metadata despite schema definition

acca3cf

feat: Implemented prerecorded client mvp

a41a4ab

feat: Add 'transcribe' method in prerecorded python

4e52606

fix: Add editor config for tab size in python

c6b7767

fix: Add higher timeout duration for prerecorded, regarding file upload time

56d6601

feat: Add sync and async client for pre-recorded

9b36232

fix: Add 'follow_redirect" to handle get_file redirection to s3

bd3a394

fix: Detail audio upload response

7cc2a67

feat: Load automatically the api_key from GLADIA_API_KEY env variable

981706d

feat: Add typescript prerecorded sdk

39812c5

feat: Add types to export types for python prerecorded sdk

33edb8b

feat: Generate PII redaction types

43459c7

chore: Refactor pyproject.toml files to ignore E501

01682b3

Karamouche added 4 commits February 25, 2026 15:29

fix: Improve websocket client attribute handling and enhance error checking for file uploads

6441845

fix: Update upload request schema handling to ensure proper type assignment

f8a8b92

chore: Add marshmallow dependency to 3.26.2 to fix security issues

7655371

test: Update HttpClient tests to use toHaveBeenCalledWith for mockFetch assertions

4dc7f8c

Karamouche force-pushed the feat/async-sdk branch from b3f52dd to 4dc7f8c Compare February 26, 2026 18:49

egenthon-cmd force-pushed the feat/async-sdk branch 2 times, most recently from 3290ef6 to 2c5f17b Compare February 27, 2026 13:47

test: add tests for each method of python and javascript sdk

9bd37af

Karamouche force-pushed the feat/async-sdk branch from 7bd8e8b to 9bd37af Compare February 27, 2026 14:59

fix: Update undici integration in iso-fetch to set global dispatcher and return Node.js fetch

3bfd4cc

Karamouche changed the title ~~Add pre-recorded to sdk~~ [FDE-52] Add pre-recorded to sdk Feb 27, 2026

egenthon-cmd added 2 commits March 2, 2026 12:52

fix: add result deletion method

5d31940

added object in PreRecordedV2DeleteResponse core.py for the deletion method

fix: javascript add result deletion method

6b32b0f

added return result for the javascript deletion method

Karamouche requested review from MaximeGaudin, jqueguiner, lrossillon-gladia and tnesztler March 11, 2026 13:57

lrossillon-gladia reviewed Mar 13, 2026

e2e/e2e-node-esm/test/prerecorded_v2_async.test.ts

lrossillon-gladia previously requested changes Mar 16, 2026

fix: remove readfilesync and adjust docs

10cc9ab

egenthon-cmd force-pushed the feat/async-sdk branch from a3490b2 to 10cc9ab Compare March 17, 2026 15:58

coderabbitai bot reviewed Mar 17, 2026

egenthon-cmd added 5 commits March 19, 2026 19:39

feat: add suggestions parameters and explicit validation errors

a375300

fix: renaming all audio files as audio_url in python

265808c

fix: change of versions for sdks

4b72eca

fix: commit uv lock and bun lock version changed

96f900d

fix: prettier javascript test file

49fcc56

coderabbitai bot reviewed Mar 20, 2026

fix: delete method not returning always false javascript

6030af9

coderabbitai bot reviewed Mar 20, 2026

egenthon-cmd requested a review from lrossillon-gladia March 23, 2026 15:20

feat: fitting new openapi

645f7df

gladiaio deleted a comment from coderabbitai bot Mar 24, 2026

Karamouche commented Mar 24, 2026

Karamouche commented Mar 25, 2026

egenthon-cmd added 3 commits March 25, 2026 14:13

fix: set up timeouts for each method js and python

e81d92a

fix: removed manual changes of versions

68f94c8

fix: formatting change of version and locks change

7d42683

lrossillon-gladia approved these changes Mar 26, 2026

egenthon-cmd merged commit 3332a1b into main Mar 26, 2026
6 checks passed

coderabbitai bot mentioned this pull request Apr 9, 2026

fix: added readme for async #16

Closed
