3 changes: 3 additions & 0 deletions fern/docs.yml
@@ -287,6 +287,9 @@ navigation:
- page: Self-hosted streaming
  path: pages/05-guides/self-hosted-streaming.mdx
  hidden: true
- page: Self-hosted async
  path: pages/05-guides/self-hosted-async.mdx
  hidden: true
- page: Webhooks
  path: pages/05-guides/webhooks.mdx
- page: Evaluating STT models
198 changes: 198 additions & 0 deletions fern/pages/05-guides/self-hosted-async.mdx
@@ -0,0 +1,198 @@
---
title: "Self-Hosted Async Transcription"
hide-nav-links: true
description: "Deploy AssemblyAI's async transcription solution within your own infrastructure"
---

The **AssemblyAI Self-Hosted Async Solution** provides secure transcription that can be deployed within your own infrastructure. It is designed for partners who need complete control over their data and infrastructure while maintaining high-quality speech-to-text capabilities.

## Core principle

- **Complete data isolation**: No audio data, transcript data, or personally identifiable information (PII) will ever be sent to AssemblyAI servers. Only usage metadata and licensing information are transmitted.

## System requirements

### Hardware requirements

- **GPU**: NVIDIA GPU support required (any NVIDIA GPU model will work)

### Software requirements

- **Operating System**: Linux
- **Container Runtime**: Docker required
- **AWS Account**: Required for pulling container images from our ECR registry

## Prerequisites

- Active enterprise contract with AssemblyAI
- AWS account credentials for container registry access
- Linux environment with Docker installed
- NVIDIA Container Toolkit for GPU support

## Setup and deployment

### 1. Docker runtime with GPU support

**1.1** Verify NVIDIA drivers are installed:
```bash
nvidia-smi
```

**1.2** Install NVIDIA Container Toolkit:

Follow the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) to set up GPU support for Docker.

**1.3** Verify the Docker runtime has GPU access:
```bash
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```

### 2. Obtain credentials

**AWS ECR Access**: AssemblyAI will manually provision AWS account credentials for your team to pull container images from our private Amazon ECR registry. Contact your AssemblyAI representative to obtain these credentials.

### 3. AWS ECR authentication

Authenticate with AWS ECR using the provided credentials:

```bash
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 344839248844.dkr.ecr.us-west-2.amazonaws.com
```

### 4. Pull the Docker image

Pull the self-hosted ML container image:

```bash
docker pull 344839248844.dkr.ecr.us-west-2.amazonaws.com/self-hosted-ml-prod:release-v0.1
```

### 5. Obtain license file

AssemblyAI will provide a license file (`license.jwt`) that is required to run the container. The license file contains:
- Expiration date
- Usage limits
- Customer identification

<Callout type="warning">
The license file provided for testing is valid for 30 days. For production deployments, contact AssemblyAI to obtain a production license.
</Callout>
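Because an expired license prevents the container from starting, it can be useful to check the expiration date locally before deploying. The sketch below assumes `license.jwt` is a standard JWT and decodes its payload without verifying the signature (the container performs the real validation); the exact claim names AssemblyAI uses are not documented here, so treat any specific field name as an assumption:

```python
import base64
import json

def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (middle segment) of a JWT without verification.

    Only for local inspection, e.g. checking an expiration claim before
    deployment. The signature is ignored entirely.
    """
    payload_b64 = token.split(".")[1]
    # Restore the base64url padding that the JWT format strips.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

For example, `decode_jwt_payload(open("license.jwt").read().strip())` would return the license claims as a dictionary for inspection.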

### 6. Run the container

Start the self-hosted ML container with GPU support:

```bash
docker run --gpus all -p 8000:8000 \
  -e NVIDIA_DRIVER_CAPABILITIES=all \
  -v /absolute/local/path/to/license.jwt:/app/license.jwt \
  344839248844.dkr.ecr.us-west-2.amazonaws.com/self-hosted-ml-prod:release-v0.1
```

**Parameters explained**:
- `--gpus all`: Enables GPU access for the container
- `-p 8000:8000`: Maps port 8000 from the container to the host
- `-e NVIDIA_DRIVER_CAPABILITIES=all`: Enables all NVIDIA driver capabilities
- `-v /absolute/local/path/to/license.jwt:/app/license.jwt`: Mounts the license file into the container

<Callout type="info">
Replace `/absolute/local/path/to/license.jwt` with the actual absolute path to your license file on the host system.
</Callout>

## Using the API

Once the container is running, you can interact with it using HTTP requests.

### Check container health

Verify that the container is ready to accept requests:

```bash
curl "http://localhost:8000/health"
```

A successful response indicates the container is ready to process transcription requests.
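The container can take some time to load models onto the GPU after startup, so a client may want to poll `/health` before sending work. A minimal sketch using only the Python standard library (the port matches the `docker run` command above; the timeout and interval values are illustrative, not recommendations):

```python
import time
import urllib.error
import urllib.request

def wait_for_health(base_url: str, timeout_s: float = 300, interval_s: float = 5) -> bool:
    """Poll the container's /health endpoint until it responds or we time out."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Container not accepting connections yet; retry.
        time.sleep(interval_s)
    return False
```

A deployment script could then call `wait_for_health("http://localhost:8000")` and only start submitting transcription requests once it returns `True`.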

### Transcribe an audio file

Submit an audio file for transcription:

```bash
curl -X POST "http://localhost:8000/predict" \
  -F "file=@/path/to/file.mp3" \
  -F 'payload={"language": "en"}'
```

**Parameters**:
- `file`: The audio file to transcribe (supports common audio formats such as MP3, WAV, and M4A)
- `payload`: JSON object containing transcription parameters
  - `language`: Language code for the audio (e.g., `"en"` for English)

**Example response**:
```json
{
  "text": "This is the transcribed text from your audio file.",
  "words": [
    {
      "text": "This",
      "start": 0,
      "end": 200,
      "confidence": 0.98
    }
  ]
}
```
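A response shaped like the example above is straightforward to consume client-side. In the sketch below, the field meanings are inferred from the sample: `start`/`end` look like millisecond offsets and `confidence` a 0–1 score, both of which are assumptions rather than documented behavior:

```python
import json

# Response shaped like the example above.
response = json.loads("""
{
  "text": "This is the transcribed text from your audio file.",
  "words": [
    {"text": "This", "start": 0, "end": 200, "confidence": 0.98}
  ]
}
""")

transcript = response["text"]

# Flag words the model was less sure about, e.g. for human review.
low_confidence = [w["text"] for w in response["words"] if w["confidence"] < 0.9]

# Convert the first word's end offset to seconds (0.2 if offsets are ms).
first_word_end_s = response["words"][0]["end"] / 1000
```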

## Supported languages

The self-hosted async solution supports multiple languages. Specify the language code in the `payload` parameter when making transcription requests.

Common language codes:
- `en`: English
- `es`: Spanish
- `fr`: French
- `de`: German
- `it`: Italian
- `pt`: Portuguese
- `nl`: Dutch

For a complete list of supported languages, contact your AssemblyAI representative.
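Since an unsupported code will only fail at request time, validating it client-side first can save a round trip. A small sketch using the subset of codes listed above (the complete set must come from your AssemblyAI representative, so this table is intentionally incomplete):

```python
# Illustrative subset of language codes from the list above.
SUPPORTED_LANGUAGES = {
    "en": "English", "es": "Spanish", "fr": "French", "de": "German",
    "it": "Italian", "pt": "Portuguese", "nl": "Dutch",
}

def build_payload(language: str) -> dict:
    """Build the /predict payload, rejecting unknown language codes early."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {language!r}")
    return {"language": language}
```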

## Troubleshooting

### Container fails to start

**Issue**: Container exits immediately after starting.

**Solution**: Verify that:
1. The license file path is correct and the file exists
2. The license file is not expired
3. GPU drivers are properly installed (`nvidia-smi` should work)
4. NVIDIA Container Toolkit is installed

### Health check fails

**Issue**: The `/health` endpoint returns an error or times out.

**Solution**:
1. Wait a few moments for the container to fully initialize
2. Check container logs: `docker logs <container_id>`
3. Verify GPU access: Ensure the container can access the GPU

### Transcription request fails

**Issue**: The `/predict` endpoint returns an error.

**Solution**:
1. Verify the audio file format is supported
2. Check that the `language` parameter is valid
3. Ensure the file path in the curl command is correct
4. Review container logs for detailed error messages

## Support

For technical support or questions about the self-hosted async solution, contact your AssemblyAI representative or reach out to the AssemblyAI support team.

## Simplified installation

AssemblyAI is working on packaging solutions and installation scripts to simplify the deployment process for customers. For the latest information on simplified installation options, contact your AssemblyAI representative.