All notable changes to this project will be documented in this file. This project adheres to Semantic Versioning. It follows some conventions.
- convert O and E (not part of the spec but useful to represent loan words)
- new lenient mode in toUnicode() with lower casing and DTS normalization
- fixes for the w character in alalc conversion
- fixes for the . character in alalc conversion
- remove the hyphens in the ewts -> alalc conversion
- fixes in ewts to ala-lc conversion
- handle ewts character &
- fixes in ewts to ala-lc conversion
- support for conversion from ewts to ala-lc
- support for alalc transliteration scheme and DTS (not publicly documented)
- sloppy mode now supports various non-ASCII apostrophes
- 0x2018 and 0x2019 are now invalid in non-sloppy mode
- sloppy mode should be a little bit faster
- reasonable replacement for `x`, `X` and `...` used to denote unreadable part in BDRC data.
- Maven packaging
- complete the list of ambiguous syllables with: `'bs` -> `'bas`, `mgs` -> `mags`
- add the following decompositions when reading Unicode: `\u0F75` -> `\u0F71\u0F74`, `\u0F73` -> `\u0F71\u0F72`
- renaming Class to `io.bdrc.ewtsconverter.EwtsConverter`
- when `fix_spacing` is on, lower case `P`, `K`, `G`, `C`, `B`, `L`, `M`, `S` (but not `Sh`)
- when `fix_spacing` is on, replace common EWTS mistakes: `b` -> `ba`, `m` -> `ma`, `m'i` -> `ma'i`, `b'i` -> `ba'i`
- do not add a warning when writing affixed as `'m` or `'ng` (as in `pa'm`)
- add a warning if the resulting string starts with a combining character (for XML 1.1 validation)
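
For illustration only, the Unicode decompositions and the combining-character warning listed above can be sketched roughly as follows. This is not the code of `io.bdrc.ewtsconverter.EwtsConverter`: the class `ChangelogSketch` and the methods `decompose` and `startsWithCombiningMark` are hypothetical names, and treating "combining character" as any code point in a Unicode mark category is an assumption about what the check covers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; none of these names come from the library itself.
final class ChangelogSketch {

    // The two decompositions named in the changelog entry:
    // U+0F75 -> U+0F71 U+0F74 and U+0F73 -> U+0F71 U+0F72.
    private static final Map<Character, String> DECOMPOSITIONS = new LinkedHashMap<>();
    static {
        DECOMPOSITIONS.put('\u0F75', "\u0F71\u0F74");
        DECOMPOSITIONS.put('\u0F73', "\u0F71\u0F72");
    }

    // Replaces each precomposed vowel sign with its two-character sequence
    // and leaves every other character untouched.
    static String decompose(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String replacement = DECOMPOSITIONS.get(c);
            out.append(replacement != null ? replacement : String.valueOf(c));
        }
        return out.toString();
    }

    // Assumption: a "combining character" is any code point whose general
    // category is a mark (Mn, Mc or Me). The library's real test may differ.
    static boolean startsWithCombiningMark(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int type = Character.getType(s.codePointAt(0));
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    public static void main(String[] args) {
        // KA followed by the precomposed vowel sign U+0F75.
        String decomposed = decompose("\u0F40\u0F75");
        System.out.println(decomposed.length());
        // A string that begins with a vowel sign would trigger the warning.
        System.out.println(startsWithCombiningMark("\u0F71\u0F40"));
    }
}
```

A caller would run `decompose` on the Unicode input before converting it, and would emit a warning on the output string whenever `startsWithCombiningMark` returns true.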