Extracts all unique characters from each column of a CSV file, combines and manipulates the results, and stores them in text files for further use.
This tool was developed to create font assets for TextMesh Pro in Unity. Creating textures with just the characters you need is essential for languages with large character sets, such as Chinese, Japanese or Korean.
- The input file can be defined in config.xml; the default value is "in/example.csv"
- Languages are defined in columns; the first row of each column defines the language name (see example.csv)
- The columns `ID` and `Description` will be ignored
- Newline characters (`\n`, `\r`) and all emojis will be ignored
- Text files are created for each language and named "ColumnName.txt". Output path can be defined in config.xml
- One file per column, expecting to have one language per column
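The extraction rules above can be illustrated with a minimal sketch. It is not the tool's actual implementation: the class and method names are hypothetical, the rows are assumed to be parsed already (the tool itself lists univocity-parsers as a dependency), and emoji filtering is left out to keep the example self-contained.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not the tool's real code.
final class UniqueCharsSketch {

    // Maps each language column name to the sorted set of unique code points in that column.
    // rows.get(0) is assumed to be the header row holding the column names.
    static Map<String, Set<Integer>> uniqueCharsPerColumn(List<String[]> rows) {
        String[] header = rows.get(0);
        Map<String, Set<Integer>> result = new LinkedHashMap<>();
        for (int col = 0; col < header.length; col++) {
            String name = header[col];
            // The ID and Description columns are skipped.
            if (name.equals("ID") || name.equals("Description")) {
                continue;
            }
            Set<Integer> chars = new TreeSet<>();
            for (int r = 1; r < rows.size(); r++) {
                String[] row = rows.get(r);
                if (col >= row.length) {
                    continue; // short row, nothing in this column
                }
                // Code points (not chars) so that characters outside the BMP stay intact.
                row[col].codePoints()
                        .filter(cp -> cp != '\n' && cp != '\r')
                        .forEach(chars::add);
            }
            result.put(name, chars);
        }
        return result;
    }

    // Joins a set of code points into one string, e.g. as the content of "ColumnName.txt".
    static String asString(Set<Integer> codePoints) {
        StringBuilder sb = new StringBuilder();
        for (int cp : codePoints) {
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"ID", "Description", "English", "Japanese"},
                new String[] {"1", "greeting", "Hello", "こんにちは"},
                new String[] {"2", "farewell", "Bye\n", "さようなら"});
        uniqueCharsPerColumn(rows).forEach((name, chars) ->
                System.out.println(name + ".txt -> " + asString(chars)));
    }
}
```

Using a sorted set keeps the output deterministic between runs, which makes the generated text files easier to diff.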
- Windows: Double-click Run.bat
- Windows, Mac, Linux: Run `java -jar CsvCharacterExtractor.jar` in the terminal
- With the config you can set the input and output paths as well as characters that should always or never be included. Take a look at the example config.
- Paths can be relative, e.g. `in/example.csv`
- Paths can be absolute, e.g. `C:/Users/UserName/Documents/LanguageCharacterFiles/`
- Use forward slashes (`/`) only
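As a rough illustration of how relative and absolute paths with forward slashes behave in Java, here is a small sketch. How the tool itself resolves configured paths is not documented here, so the helper name and the choice of the working directory as the base are assumptions.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch, not the tool's real code.
final class ConfigPathSketch {

    // Resolves a configured path against a base directory.
    // Path.resolve returns the argument unchanged when it is already absolute,
    // so relative and absolute values are both handled by the same call.
    static Path resolveConfiguredPath(String configured, Path workingDir) {
        return workingDir.resolve(configured).normalize();
    }

    public static void main(String[] args) {
        // Assumption: relative paths are interpreted relative to the current working directory.
        Path cwd = Paths.get("").toAbsolutePath();
        System.out.println(resolveConfiguredPath("in/example.csv", cwd));
    }
}
```

Forward slashes are accepted by `java.nio.file` on Windows as well as on Mac and Linux, which is why one notation is enough for all platforms.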
- Automatically add lower- and upper-case characters to the unique characters file
- Create union files of multiple separate columns
- Document code
- Add information on how to build the project
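The first two items in the list above (case variants and union files) could look roughly like the following sketch. It is only an illustration under assumptions: the class and method names are hypothetical, character sets are represented as sets of code points, and case conversion uses Java's one-to-one mappings, so special cases such as "ß" to "SS" are not covered.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not the tool's real code.
final class CharSetOpsSketch {

    // Returns a new sorted set containing every code point from both inputs.
    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // Returns a new sorted set that also contains the lower- and upper-case
    // variant of each code point. Characters without case are kept unchanged.
    static Set<Integer> withBothCases(Set<Integer> chars) {
        Set<Integer> result = new TreeSet<>(chars);
        for (int cp : chars) {
            result.add(Character.toLowerCase(cp));
            result.add(Character.toUpperCase(cp));
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> english = Set.of((int) 'a', (int) 'B');
        Set<Integer> german = Set.of((int) 'ä', (int) 'B');
        StringBuilder sb = new StringBuilder();
        for (int cp : withBothCases(union(english, german))) {
            sb.appendCodePoint(cp);
        }
        System.out.println(sb);
    }
}
```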
- https://github.com/uniVocity/univocity-parsers (Apache 2.0 License)
- https://github.com/vdurmont/emoji-java (MIT License)