This .NET project contains several implementations of Thai text romanization.
For example, สวัสดี is "romanized" to sawatdi.
There are currently two romanization algorithms:
- The Thai Language Toolkit (TLTK) algorithm, ported from the original TLTK Python code.
  - This implementation invokes Python code and requires Python to be installed.
- The Thai2Rom algorithm, based on the Python code from the PyThaiNLP project.
  - This implementation runs native code using the Torch machine learning framework, which is available for macOS, Windows, and Linux. The native code for these three platforms is bundled in the project, so no installation is required.
  - It currently does not do word segmentation; the romanized output follows the same spacing as the input Thai text.
using ThaiRomanizationSharp.Tltk;
IThaiRomanizationService romanizer = new ThaiRomanizationService();
string english = romanizer.Romanize("สวัสดี");
// or, as an alternative to the block above, use the Thai2Rom implementation:
using ThaiRomanizationSharp.Thai2Rom;
IThaiRomanizationService romanizer = new Thai2RomService();
string english = romanizer.Romanize("สวัสดี");
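Because the Thai2Rom implementation does not segment words, any word boundaries in the output have to come from spaces already present in the input. The following is a minimal sketch of that behavior, not part of the library itself. It assumes the project references the ThaiRomanizationSharp.Thai2Rom library and uses only the `Thai2RomService` type and `Romanize` method shown above; the sample inputs are illustrative.

```csharp
using System;
using System.Text;
using ThaiRomanizationSharp.Thai2Rom;

// Make sure Thai characters print correctly on consoles that default to a legacy code page.
Console.OutputEncoding = Encoding.UTF8;

// Assumption: the ThaiRomanizationSharp.Thai2Rom library is referenced by this project.
var romanizer = new Thai2RomService();

// Illustrative inputs: one unspaced word, and one phrase the caller has already split with a space.
string[] inputs = { "สวัสดี", "สวัสดี ครับ" };

foreach (string thai in inputs)
{
    // Spaces in the input are expected to carry over into the romanized output.
    Console.WriteLine($"{thai} -> {romanizer.Romanize(thai)}");
}
```

If word-separated output is needed for unspaced text, the input would have to be segmented by some other tool before it is passed to `Romanize`.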
- Credit to Assoc. Prof. Wirote Aroonmanakun (Ph.D.)
  - Director of the Sirindhorn Thai Language Institute, Chulalongkorn University
- Original code is from https://github.com/attapol/tltk/blob/master/tltk/nlp.py.
- Thai Romanization main project page http://pioneer.chula.ac.th/~awirote/resources/thai-romanization.html.
The C# code of the Thai2Rom algorithm is based on the Python code from the PyThaiNLP project.
- To run the ThaiRomanizationSharp.Thai2Rom library, either reference it from your project, or run the unit tests as normal via the dotnet command line, Visual Studio Code, or Visual Studio.
- See the README.md in the ThaiRomanizationSharp.Thai2Rom subdirectory for more information.
- To run the ThaiRomanizationSharp.Tltk library (the Thai Language Toolkit project), you need to complete some setup steps first. The rest of this README is devoted to these steps.
- See the README.md in the ThaiRomanizationSharp.Tltk subdirectory for more information.
- In VS Code, open the integrated terminal by pressing Ctrl+`.
- The terminal should start from the root of the project.
- Run the project with the following command:
$ dotnet run
- Wait a moment; the output message should then appear in the integrated terminal.
- PyThaiNLP/pythainlp#11
- https://github.com/comdevx/thai2karaoke
- Debugging with Code lens option
- Not launching debugger issue
- XUnit.ITestOutputHelper.WriteLine not showing up issue
- More details on what code changed in nlp.py
- Convert project to a class library
- Unit test with xUnit
- GitHub Actions to run a unit test
- GitHub Actions to deploy a library to Nuget and release page
- Custom Docker image
- Deploy example project to Azure App Service container
read_thaidict
reset_thaidict
check_thaidict
edits2
pos_tag
pos_tag_wordlist
pos_load
change_tag
chunk
ner
ner_load
wrd_len
g2p_all
sylparse_all
th2ipa
word_segmentX
wordseg_w2v
word_segment_nbest
wordsegmm_bn
chartparse_mm_bn
word_segment_mm
wordseg_mm