Streamlit tool for keyword/semantic search, transliteration, and AI translation over a classical-language corpus.

fridalyf412/manwen-viewer

Manwen Laodang Viewer (MVP)

Streamlit tool for keyword & semantic search over a digitized archival corpus in a low-resource classical language, with Möllendorff transliteration and AI-assisted sentence-level translation.

Features

  • BM25 keyword search (script / English / transliteration aware)
  • Optional semantic search (small 384-dim embeddings)
  • Möllendorff transliteration (š/č/ž + ASCII option)
  • Batch translate + CSV export
  • Caching for speed & lower translation cost
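For readers unfamiliar with BM25, here is a minimal sketch of Okapi BM25 scoring in plain Python. It is not the code used in this repository: the whitespace tokenizer, the `k1`/`b` defaults, the function names, and the sample documents (placeholder transliterated tokens, not taken from the corpus) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    # A real script-aware tokenizer would likely be more involved.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    # Guard against a corpus of empty documents (avoids division by zero).
    avg_len = sum(len(t) for t in tokenized) / n_docs or 1.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" keeps the value non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Placeholder documents for illustration only.
docs = ["abka na", "han i bithe", "han i gisun bithe de araha"]
scores = bm25_scores("bithe", docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

With the same term frequency, the shorter matching document ranks above the longer one, which is the length normalization controlled by `b`.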

Quickstart

conda create -n manwen311 python=3.11 -y
conda activate manwen311
python -m pip install -r requirements.txt   # or import environment.yml if you prefer
python preprocess.py                        # builds .cache from your CSV
python -m streamlit run app.py
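
The feature list mentions caching to reduce translation cost. As a rough sketch of how an on-disk translation cache could work, the example below stores each result under a hash of the sentence. Everything here is hypothetical: the function names, the JSON file layout, and the `translate_fn` callback are assumptions, and the repository's actual `.cache` format may differ.

```python
import hashlib
import json
from pathlib import Path


def cache_key(sentence, target_lang="en"):
    # Hash the target language together with the sentence so the same
    # sentence translated into different languages gets separate entries.
    payload = f"{target_lang}\n{sentence}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def cached_translate(sentence, translate_fn, cache_dir, target_lang="en"):
    """Return a cached translation, calling translate_fn only on a miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(sentence, target_lang)}.json"

    if path.exists():
        # Cache hit: no call to the (possibly paid) translation backend.
        return json.loads(path.read_text(encoding="utf-8"))["translation"]

    # Cache miss: translate once and persist the result.
    translation = translate_fn(sentence)
    record = {"source": sentence, "translation": translation}
    path.write_text(json.dumps(record, ensure_ascii=False), encoding="utf-8")
    return translation
```

Here `translate_fn` stands in for whatever translation backend is configured; passing it as an argument keeps the cache logic independent of any particular API.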
