forked from marian-nmt/sentencepiece
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Adding alternative project name for spm latest to prevent lib conflicts
* Update cmake
* Update CMakeFiles to allow for configurable artifact names
* Enables --encode_unicode_case option for case-aware sentence piece (marian-nmt#10)
  * Example: "This IS a TEST OF THE CASING" gets converted internally to "Tthis Uis a Atest of the casing" before segmentation.
  * This is fully reversible.
* Enable toggling Case Encoding flag from C++ Train API (marian-nmt#11)
  * Fixing issue with hardcoding truth value of encode_decode_case flag
* Disable denormalizer flags (marian-nmt#13)

  Co-authored-by: Rohit Jain <Rohit.Jain@microsoft.com>
* Fix Surface String to Token Mappings for Case Encoding (marian-nmt#12)

  Co-authored-by: Marcin Junczys-Dowmunt <marcinjd@microsoft.com>
  Co-authored-by: Rohit Jain <Rohit.Jain@microsoft.com>
* add one header file to installation
* Rename VERSION to VERSION.txt

  Installing python package fails with below error. This change addresses this issue

  ```
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [10 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/home/alferre/code/sentencepiece/python/setup.py", line 111, in <module>
          version=version(),
        File "/home/alferre/code/sentencepiece/python/setup.py", line 36, in version
          with codecs.open('VERSION.txt', 'r', 'utf-8') as f:
        File "/opt/conda/envs/ptca/lib/python3.8/codecs.py", line 905, in open
          file = builtins.open(filename, mode, buffering)
      FileNotFoundError: [Errno 2] No such file or directory: 'VERSION.txt'
      [end of output]
  ```

---------

Co-authored-by: Rohit Jain <rjai@microsoft.com>
Co-authored-by: Rohit Jain <Rohit.Jain@microsoft.com>
Co-authored-by: Marcin Junczys-Dowmunt <marcinjd@microsoft.com>
Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
Co-authored-by: alexandremuzio <ax.muzio@gmail.com>
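
The case-encoding example in the commit message ("This IS a TEST OF THE CASING" becoming "Tthis Uis a Atest of the casing") can be mimicked with a toy round-trip encoder. This is only a sketch inferred from that single example, not the sentencepiece implementation: the reading of `T` as "title-case word", `U` as "single all-caps word" and `A` as "start of an all-caps run" is an assumption, the `L` marker that closes a run is hypothetical (the commit message does not show how a run ends), and the function names `encode_case` / `decode_case` are made up for illustration. The sketch only accepts ASCII text whose space-separated words are lowercase, Title-case, or ALL-CAPS.

```python
# Toy case encoding, guessed from the one example in the commit message.
# Marker meanings (T, U, A) are assumptions; "L" is a hypothetical addition.
MARKERS = frozenset("TUAL")


def _kind(word):
    """Classify a word as 'lower', 'title' or 'upper' (else raise)."""
    if word == word.lower():
        return "lower"  # also covers digits, punctuation and empty words
    if word == word.lower().capitalize():
        return "title"
    if word == word.upper():
        return "upper"
    raise ValueError(f"mixed-case word not supported by this sketch: {word!r}")


def encode_case(text):
    """Lowercase the text and record the original casing with marker letters."""
    if not text.isascii():
        raise ValueError("this sketch only handles ASCII text")
    words = text.split(" ")
    kinds = [_kind(w) for w in words]
    out, in_run = [], False
    for i, (word, kind) in enumerate(zip(words, kinds)):
        if kind == "upper":
            if in_run:
                out.append(word.lower())  # run continues, no marker needed
            elif i + 1 < len(words) and kinds[i + 1] == "upper":
                out.append("A" + word.lower())  # start of an all-caps run
                in_run = True
            else:
                out.append("U" + word.lower())  # isolated all-caps word
        elif kind == "title":
            out.append("T" + word.lower())
            in_run = False
        else:
            # Hypothetical "L" marker: closes an all-caps run.
            out.append(("L" if in_run else "") + word)
            in_run = False
    return " ".join(out)


def decode_case(encoded):
    """Invert encode_case; any uppercase letter in the input is a marker."""
    out, in_run = [], False
    for word in encoded.split(" "):
        if word[:1] in MARKERS:
            marker, body = word[0], word[1:]
        else:
            marker, body = "", word
        if marker == "T":
            out.append(body.capitalize())
            in_run = False
        elif marker == "U":
            out.append(body.upper())
            in_run = False
        elif marker == "A":
            out.append(body.upper())
            in_run = True
        elif marker == "L":
            out.append(body)
            in_run = False
        else:
            out.append(body.upper() if in_run else body)
    return " ".join(out)


print(encode_case("This IS a TEST OF THE CASING"))
# prints: Tthis Uis a Atest of the casing
```

Because the encoded text is fully lowercased, every uppercase letter left in it must be a marker, which is what makes the toy scheme reversible within its stated limits.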
1 parent 60c17dd, commit acb66f4
Showing 18 changed files with 66,932 additions and 136,107 deletions.