ewts-rs

Converter from EWTS (Extended Wylie Transliteration Scheme) to Tibetan Unicode symbols.

emaho


Fully compliant with the standard. See the full set of rules on The Tibetan and Himalayan Library's site; tests for them are here in the rules_test.rs file.

Important

Currently, only conversion from EWTS to Tibetan Unicode is implemented. Conversion in the opposite direction is coming soon.


Can be used:

  • as a Rust library (it is written in Rust, after all) - ewts
  • as a command-line tool - ewts-cli
  • in a JS environment (via wasm) - ewts-wasm
  • as a C dynamic library in C/C++/Cython - ewts-c

ewts

Core conversion library.

[rust docs]

Example:

use ewts::EwtsConverter;

let converter = EwtsConverter::create();
let ewts_str = "oM aHhU~M` badz+ra gu ru pad+ma sid+d+hi hU~M`:";

let tib_unicode_str = converter.ewts_to_unicode(ewts_str);

assert_eq!(tib_unicode_str, "ཨོཾ་ཨཿཧཱུྂ་བཛྲ་གུ་རུ་པདྨ་སིདྡྷི་ཧཱུྂ༔");
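
To convert the contents of a file rather than a string literal, the same call can be wrapped in a small helper. The sketch below is only an illustration: `convert_file` is a hypothetical name that is not part of the ewts crate, and the conversion step is passed in as a closure so the helper does not depend on the crate's exact signatures. It assumes `ewts_to_unicode` takes a `&str` and returns a `String`, as the example above suggests.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads a UTF-8 text file and passes its contents through `convert`.
/// `convert_file` is a hypothetical helper, not part of the ewts crate.
fn convert_file<F>(path: &Path, convert: F) -> io::Result<String>
where
    F: Fn(&str) -> String,
{
    // Fails with an io::Error if the file is missing or is not valid UTF-8.
    let input = fs::read_to_string(path)?;
    Ok(convert(&input))
}

// Intended use with the converter from the example above (assumed signature):
//   let converter = ewts::EwtsConverter::create();
//   let text = convert_file(Path::new("/path/to/your/file.txt"),
//                           |s| converter.ewts_to_unicode(s))?;
```

Taking the conversion as a closure also makes the helper easy to test with a stand-in function.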

ewts-cli

Command-line interface for conversion, for use in your favorite terminal.

Example:

$ ewts --input "bkra shis bde legs/"
# བཀྲ་ཤིས་བདེ་ལེགས།

$ ewts --help
# ...
# Usage: ewts [OPTIONS] --input <INPUT>
# 
# Options:
#   -s, --source-type <SOURCE_TYPE>  Type of input symbols [default: ewts] [possible values: ewts, unicode]
#   -i, --input <INPUT>              String to convert
#   -h, --help                       Print help
#   -V, --version                    Print version

# to convert the contents of a file:
$ ewts -i "$(cat /path/to/your/file.txt)"

Demo

ewts-cli.mp4

Installation

For now, only via cargo:

cargo install ewts-cli

ewts-wasm

WASM module for use in the browser, Node.js, or elsewhere.

See details in ewts-wasm/README.md.

Installation

npm install ewts

Usage

import {EwtsConverter} from 'ewts'

const converter = new EwtsConverter()

const ewtsStr = "oM ma Ni pad+me hU~M/"

const tibUnicodeStr = converter.ewtsToUnicode(ewtsStr)

console.log(tibUnicodeStr)
// "ཨོཾ་མ་ཎི་པདྨེ་ཧཱུྃ།"

ewts-c

A small wrapper around the core Rust ewts conversion library, usable from C/C++/Cython code or anywhere else C code can be called. See the example and test code. See also the docs for details here.

Speed comparison with other converters

I am not sure who needs to transliterate large amounts of text, but the difference in speed between implementations is still worth mentioning: the current implementation is several times faster than other converters (from 12 to almost 100 times).

| Tool | Speed (slowdown relative to ewts-rs) | Launch code |
|---|---|---|
| ewts-rs (in rust) | ~26491 Kb/s (1.00x) | bench/main.rs |
| ewts-rs (c++ bindings) | ~25820 Kb/s (1.02x) | cpp_bench.cpp |
| jsewts | ~2141 Kb/s (12.4x) | js_tools_bench.js |
| ewts-js | ~1941 Kb/s (13.6x) | js_tools_bench.js |
| ewts-rs (as wasm) | ~14598 Kb/s (1.81x) | js_tools_bench.js |
| ewts-converter (java) | ~1822 Kb/s (14.54x) | java_bench.java |
| pyewts | ~274 Kb/s (96.7x) | python_tools_bench.py |

More details are in bench/README.
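
The factor in parentheses next to each speed appears to be the baseline throughput of ewts-rs (in rust) divided by the tool's throughput. A minimal sketch of that arithmetic follows; the function names are illustrative and are not taken from the benchmark code.

```rust
/// Slowdown of a tool relative to the baseline: how many times
/// slower it is than the baseline throughput (both in Kb/s).
fn slowdown(baseline_kb_s: f64, tool_kb_s: f64) -> f64 {
    baseline_kb_s / tool_kb_s
}

/// Rounds a factor to one decimal place for display.
fn round1(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}
```

For example, `round1(slowdown(26491.0, 2141.0))` gives 12.4 for jsewts, and `round1(slowdown(26491.0, 274.0))` gives 96.7 for pyewts.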

References

  • Ewts symbols table.
  • The initial character mappings were taken from here. Thanks to that source, they did not have to be typed manually.

Misc

  • This converter does not perform any checks, substitutions, or transformations: if the input is written incorrectly, the result will contain incorrect characters.
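
Since the converter does no checking, any validation has to happen on the caller's side. One possible pre-check, assuming your input is meant to be plain ASCII EWTS, is to flag non-ASCII characters before converting, for instance a typographic apostrophe pasted in place of the ASCII `'`. The helper name `find_non_ascii` is hypothetical and not part of the library.

```rust
/// Returns the byte offset and value of every non-ASCII character in `input`.
/// `find_non_ascii` is a hypothetical helper, not part of the ewts crate.
fn find_non_ascii(input: &str) -> Vec<(usize, char)> {
    input
        .char_indices()
        .filter(|(_, c)| !c.is_ascii())
        .collect()
}
```

An empty result only means that every character is ASCII; it does not mean the input is valid EWTS.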