A searchable audio transcript interface using Wav2Vec2, Elasticsearch, Flask, and React.
URL: http://3.90.182.103/
- asr/: ASR microservice with the wav2vec2 model
- deployment-design/: Architecture design (PDF)
- elastic-backend/: Elasticsearch indexing setup
- search-ui/: Frontend search interface
- Navigate to the ASR directory:
cd asr
- Build the Docker image:
docker build -t asr-api .
- Run the container:
docker run -p 8001:8001 asr-api
- Test the API:
curl -F "file=@/path/to/sample.mp3" http://localhost:8001/asr
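The endpoint can also be called from Python. Below is a minimal client sketch, assuming the /asr endpoint accepts a multipart "file" upload (as in the curl example above) and returns JSON; the "transcription" response key is an assumption about the response shape.

```python
# Minimal client sketch for the ASR API. The "transcription" key is an
# assumption about the response shape; adjust it to the actual API output.
import requests

ASR_URL = "http://localhost:8001/asr"

def transcribe(path: str) -> dict:
    """Send one audio file to the ASR service and return the parsed JSON."""
    with open(path, "rb") as f:
        resp = requests.post(ASR_URL, files={"file": f}, timeout=120)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    result = transcribe("sample.mp3")
    print(result.get("transcription", result))
```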
- Navigate to the backend directory:
cd elastic-backend
- (Optional but recommended) Create a virtual environment:
python3 -m venv venv
source venv/bin/activate
- Install Python dependencies:
pip install -r requirements.txt
- Start the Elasticsearch cluster:
docker compose up
Open http://localhost:9200/_cat/nodes?v in your browser — you should see both es01 and es02 nodes listed.
- Index the data:
python cv-index.py
- Start the backend API server:
python search_api.py
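Once both services are running, the setup can be sanity-checked from Python. This is a sketch only: port 9200 comes from the steps above, while the index name cv-transcriptions is an assumption and should match whatever cv-index.py actually creates.

```python
# Sanity-check sketch for the local Elasticsearch cluster and index.
# The index name "cv-transcriptions" is an assumption; change it to match
# the index that cv-index.py creates.
import requests

ES_URL = "http://localhost:9200"
INDEX = "cv-transcriptions"  # assumption

# Cluster health: expect two nodes (es01 and es02) and status yellow/green.
health = requests.get(f"{ES_URL}/_cluster/health", timeout=5).json()
print("cluster status:", health["status"], "| nodes:", health["number_of_nodes"])

# Document count: confirms that cv-index.py actually wrote documents.
count = requests.get(f"{ES_URL}/{INDEX}/_count", timeout=5).json()
print("indexed documents:", count["count"])
```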
- Navigate to the frontend directory:
cd search-ui
- Install dependencies:
npm install
- Start the development server:
npm start
Open http://localhost:3000 in your browser to see the result.
- The ASR model used (wav2vec2-large-960h) may produce inaccurate transcriptions, especially for non-US accents or noisy audio; for this reason, the search function searches both the transcribed text and the original text of the audio.
- Some metadata fields (e.g., age, gender, accent) may be missing or inconsistent in the source CSV file.
- The search UI currently does not support fuzzy matching or partial phrase queries (see the query sketch after this list).
- Facets are limited to a fixed number of values (e.g., only the top 10 accent types are shown).
- The backend and search functionality assume that the local Elasticsearch and ASR services are running on ports 9200 and 8001, respectively.
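For reference, the fuzzy-matching and facet-size limits above are query-side restrictions rather than Elasticsearch limitations. The sketch below shows how a fuzzy match query and a larger terms aggregation could be expressed against the local cluster; the index and field names (cv-transcriptions, generated_text, accent) are assumptions and would need to match the actual mapping.

```python
# Illustrative only; not part of the current UI. The index and field names
# ("cv-transcriptions", "generated_text", "accent") are assumptions and must
# match the actual Elasticsearch mapping.
import requests

query = {
    "query": {
        "match": {
            "generated_text": {
                "query": "beautiful wether",  # misspelled on purpose
                "fuzziness": "AUTO",          # tolerate small typos
            }
        }
    },
    "aggs": {
        # Return up to 50 accent buckets instead of only the top 10.
        "accents": {"terms": {"field": "accent.keyword", "size": 50}}
    },
    "size": 5,
}

resp = requests.post("http://localhost:9200/cv-transcriptions/_search", json=query, timeout=10)
resp.raise_for_status()
hits = resp.json()["hits"]
print("total matches:", hits["total"])
```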