stefantaubert/pinyin-to-ipa

pinyin-to-ipa

License: MIT

A Python library, web application, and command-line tool for transcribing Pinyin to IPA.
Tone markers are attached to the vowel of each syllable.

Getting started

Installation

pip install "pinyin-to-ipa[app]" --user

Usage as library

from pinyin_to_ipa import pinyin_to_ipa

print(pinyin_to_ipa("hang4"))
# OrderedSet([('x', 'a˥˩', 'ŋ'), ('h', 'a˥˩', 'ŋ')])

print(pinyin_to_ipa("ng"))
# OrderedSet([('ŋ',)])
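
Each result is a tuple of phonemes, and the set holds one tuple per possible transcription. The following is a minimal sketch of joining these tuples into strings, similar in spirit to what the CLI prints. The helper name `join_transcriptions` is hypothetical and not part of the library, and the sample data is copied from the output shown above rather than produced by calling `pinyin_to_ipa`:

```python
from typing import Iterable, List, Tuple


def join_transcriptions(
    results: Iterable[Tuple[str, ...]], sep: str = "", first: bool = False
) -> List[str]:
    """Join each phoneme tuple into one string (hypothetical helper)."""
    joined = [sep.join(phonemes) for phonemes in results]
    # Keep only the first transcription if requested.
    return joined[:1] if first else joined


# Sample data copied from the documented output for "hang4".
hang4 = [("x", "a˥˩", "ŋ"), ("h", "a˥˩", "ŋ")]

print(join_transcriptions(hang4))
# ['xa˥˩ŋ', 'ha˥˩ŋ']

print(join_transcriptions(hang4, sep=" ", first=True))
# ['x a˥˩ ŋ']
```

In real use, the value returned by `pinyin_to_ipa(...)` would be passed in place of the literal list.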

Usage as web app

Start the web app from the command line:

$ pinyin-to-ipa-app

Or visit 🤗 Hugging Face for a live demo.

Usage as CLI

$ pinyin-to-ipa-cli
usage: pinyin-to-ipa-cli [-h] [-v] [--sep SEP] [--first] PINYIN

Command-line interface (CLI) to transcribe pinyin to IPA.

positional arguments:
  PINYIN         pinyin

optional arguments:
  -h, --help     show this help message and exit
  -v, --version  show program's version number and exit
  --sep SEP      separator between phonemes (default: )
  --first        return only first result (default: False)

Example

$ pinyin-to-ipa-cli "pang1" 
pʰa˥ŋ
$ pinyin-to-ipa-cli "pang2" 
pʰa˧˥ŋ
$ pinyin-to-ipa-cli "pang3" 
pʰa˧˩˧ŋ
$ pinyin-to-ipa-cli "pang4" 
pʰa˥˩ŋ
$ pinyin-to-ipa-cli "pang5" 
pʰaŋ
$ pinyin-to-ipa-cli "pang" 
pʰaŋ
$ pinyin-to-ipa-cli "hàng" 
xa˥˩ŋ
ha˥˩ŋ
$ pinyin-to-ipa-cli "hàng" --first
xa˥˩ŋ
$ pinyin-to-ipa-cli "hng" 
hŋ
$ pinyin-to-ipa-cli "test" 
No IPA transcription available!
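
If the CLI needs to be called from a script, a thin wrapper around `subprocess` is one option. This is a sketch only: the function names are hypothetical, it uses just the flags listed in the help text above, and it assumes that the "No IPA transcription available!" message arrives on standard output, which the examples suggest but do not confirm.

```python
import subprocess
from typing import List, Optional

# Message shown in the example above when no transcription exists.
NO_RESULT_MESSAGE = "No IPA transcription available!"


def build_command(
    pinyin: str, sep: Optional[str] = None, first: bool = False
) -> List[str]:
    """Build the argument list using only the documented flags."""
    command = ["pinyin-to-ipa-cli", pinyin]
    if sep is not None:
        command += ["--sep", sep]
    if first:
        command.append("--first")
    return command


def parse_output(stdout: str) -> List[str]:
    """Return one transcription per non-empty line; empty list if none."""
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    return [line for line in lines if line != NO_RESULT_MESSAGE]


def transcribe(
    pinyin: str, sep: Optional[str] = None, first: bool = False
) -> List[str]:
    """Run the CLI (must be installed and on PATH) and parse its output."""
    completed = subprocess.run(
        build_command(pinyin, sep, first), capture_output=True, text=True
    )
    return parse_output(completed.stdout)
```

With the package installed, `transcribe("hàng")` would be expected to return the two transcriptions shown above, and `transcribe("hàng", first=True)` only the first.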

Phoneme set

Vowels

a ɛ e ə ɚ ɤ i o ɔ u ʊ y 

Diphthongs

ai̯ au̯ aɚ̯¹ ei̯ ou̯ 

¹ These phonemes are not included if only the first transcription is used.

Consonants

f h¹ j k kʰ l m n ŋ p pʰ ɹ̩² ɻ² ɻ̩² 
s ʂ t tʰ ts tsʰ tɕ tɕʰ ʈʂ ʈʂʰ 
w x ɕ ɥ z̩¹² ʐ¹² ʐ̩¹²

² These consonants also carry tones.

Tones

Vowels and diphthongs contain one of these tones:

˥ (first tone)
˧˥ (second tone)
˧˩˧ (third tone)
˥˩ (fourth tone)
(none) (neutral tone, or no tone given)
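
Because the tone is written directly after the phoneme that carries it, it can be useful to separate the two when post-processing results. The sketch below does this by filtering the Chao tone letters (U+02E5 to U+02E9). The names `split_tone` and `CONTOUR_TO_NUMBER` are hypothetical and not part of the library, and mapping an empty contour to tone number 5 is an assumption, since a syllable given without any tone produces the same output.

```python
from typing import Tuple

# Chao tone letters (U+02E5 to U+02E9) used in the tone list above.
TONE_LETTERS = set("˥˦˧˨˩")

# Tone contours from the list above mapped to pinyin tone numbers.
# Assumption: an empty contour is treated as the neutral tone (5).
CONTOUR_TO_NUMBER = {"˥": 1, "˧˥": 2, "˧˩˧": 3, "˥˩": 4, "": 5}


def split_tone(phoneme: str) -> Tuple[str, str]:
    """Split a phoneme into its base symbols and its tone contour."""
    base = "".join(ch for ch in phoneme if ch not in TONE_LETTERS)
    contour = "".join(ch for ch in phoneme if ch in TONE_LETTERS)
    return base, contour


print(split_tone("a˥˩"))
# ('a', '˥˩')

print(CONTOUR_TO_NUMBER[split_tone("a˧˩˧")[1]])
# 3
```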

Acknowledgments

  • pypinyin
  • Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410

Citation

To cite this repository, use the citation below or the BibTeX entry generated by GitHub (see About, then "Cite this repository").

Taubert, S. (2025). pinyin-to-ipa (Version 1.0.0) [Computer software]. https://doi.org/10.5281/zenodo.15229718
