Fix potential conversion error
And update opencc-data to 1.0.5
ayaka14732 committed Oct 4, 2020
1 parent 2cdc350 commit 979419e
Showing 4 changed files with 87 additions and 60 deletions.
13 changes: 10 additions & 3 deletions .github/workflows/build.yml
@@ -25,8 +25,15 @@ jobs:
       run: |
         build/prepare.sh
         python build/main.py
-    - name: Upload artifact
+    - name: Upload FanWunMing
       uses: actions/upload-artifact@v2
       with:
-        name: Font files
-        path: output/*.ttf
+        name: FanWunMing
+        path: |
+          output/FanWunMing-*.ttf
+          !output/FanWunMing-TW-*.ttf
+    - name: Upload FanWunMing-TW
+      uses: actions/upload-artifact@v2
+      with:
+        name: FanWunMing-TW
+        path: output/FanWunMing-TW-*.ttf
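
Note on the two upload steps: the pattern `output/FanWunMing-*.ttf` on its own would also match the `FanWunMing-TW-*` files, which is presumably why the first step adds a `!` exclusion. The sketch below only imitates that include/exclude idea with Python's `fnmatch`; it is not how `actions/upload-artifact` is implemented, and the file names and the `select` helper are hypothetical.

```python
from fnmatch import fnmatchcase

def select(paths, patterns):
    """Keep paths that match an include pattern and no '!'-prefixed exclude pattern."""
    includes = [p for p in patterns if not p.startswith('!')]
    excludes = [p[1:] for p in patterns if p.startswith('!')]
    return [
        path for path in paths
        if any(fnmatchcase(path, p) for p in includes)
        and not any(fnmatchcase(path, p) for p in excludes)
    ]

# Hypothetical build outputs, for illustration only
files = ['output/FanWunMing-SB.ttf', 'output/FanWunMing-TW-SB.ttf']

without_exclude = select(files, ['output/FanWunMing-*.ttf'])
with_exclude = select(files, ['output/FanWunMing-*.ttf', '!output/FanWunMing-TW-*.ttf'])
tw_only = select(files, ['output/FanWunMing-TW-*.ttf'])
```

Here `without_exclude` contains both files, while `with_exclude` and `tw_only` split them into the two artifacts.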
57 changes: 25 additions & 32 deletions LICENSE
@@ -1,35 +1,28 @@
Copyright 2020 Ayaka Mikazuki (https://ayaka.shn.hk/).
This Font Software is licensed under the SIL Open Font License,
Version 1.1.


Copyright 2014-2019 Adobe (http://www.adobe.com/), with Reserved Font
Name 'Source'. Source is a trademark of Adobe in the United States
and/or other countries.


This Font Software is licensed under the SIL Open Font License, Version 1.1.
This license is copied below, and is also available with a FAQ at:
http://scripts.sil.org/OFL


-----------------------------------------------------------
SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
-----------------------------------------------------------

PREAMBLE
The goals of the Open Font License (OFL) are to stimulate worldwide
development of collaborative font projects, to support the font creation
efforts of academic and linguistic communities, and to provide a free and
open framework in which fonts may be shared and improved in partnership
with others.
development of collaborative font projects, to support the font
creation efforts of academic and linguistic communities, and to
provide a free and open framework in which fonts may be shared and
improved in partnership with others.

The OFL allows the licensed fonts to be used, studied, modified and
redistributed freely as long as they are not sold by themselves. The
fonts, including any derivative works, can be bundled, embedded,
fonts, including any derivative works, can be bundled, embedded,
redistributed and/or sold with any software provided that any reserved
names are not used by derivative works. The fonts and derivatives,
however, cannot be released under any other type of license. The
requirement for fonts to remain under this license does not apply
to any document created using the fonts or their derivatives.
requirement for fonts to remain under this license does not apply to
any document created using the fonts or their derivatives.

DEFINITIONS
"Font Software" refers to the set of files released by the Copyright
@@ -39,25 +32,25 @@ include source files, build scripts and documentation.
"Reserved Font Name" refers to any names specified as such after the
copyright statement(s).

"Original Version" refers to the collection of Font Software components as
distributed by the Copyright Holder(s).
"Original Version" refers to the collection of Font Software
components as distributed by the Copyright Holder(s).

"Modified Version" refers to any derivative made by adding to, deleting,
or substituting -- in part or in whole -- any of the components of the
Original Version, by changing formats or by porting the Font Software to a
new environment.
"Modified Version" refers to any derivative made by adding to,
deleting, or substituting -- in part or in whole -- any of the
components of the Original Version, by changing formats or by porting
the Font Software to a new environment.

"Author" refers to any designer, engineer, programmer, technical
writer or other person who contributed to the Font Software.

PERMISSION & CONDITIONS
Permission is hereby granted, free of charge, to any person obtaining
a copy of the Font Software, to use, study, copy, merge, embed, modify,
redistribute, and sell modified and unmodified copies of the Font
Software, subject to the following conditions:
a copy of the Font Software, to use, study, copy, merge, embed,
modify, redistribute, and sell modified and unmodified copies of the
Font Software, subject to the following conditions:

1) Neither the Font Software nor any of its individual components,
in Original or Modified Versions, may be sold by itself.
1) Neither the Font Software nor any of its individual components, in
Original or Modified Versions, may be sold by itself.

2) Original or Modified Versions of the Font Software may be bundled,
redistributed and/or sold with any software, provided that each copy
@@ -67,9 +60,9 @@ in the appropriate machine-readable metadata fields within text or
binary files as long as those fields can be easily viewed by the user.

3) No Modified Version of the Font Software may use the Reserved Font
Name(s) unless explicit written permission is granted by the corresponding
Copyright Holder. This restriction only applies to the primary font name as
presented to the users.
Name(s) unless explicit written permission is granted by the
corresponding Copyright Holder. This restriction only applies to the
primary font name as presented to the users.

4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font
Software shall not be used to promote, endorse or advertise any
@@ -80,8 +73,8 @@ permission.
5) The Font Software, modified or unmodified, in part or in whole,
must be distributed entirely under this license, and must not be
distributed under any other license. The requirement for fonts to
remain under this license does not apply to any document created
using the Font Software.
remain under this license does not apply to any document created using
the Font Software.

TERMINATION
This license becomes null and void if any of the above conditions are
58 changes: 45 additions & 13 deletions build/main.py
@@ -1,29 +1,51 @@
 from collections import defaultdict
 from datetime import date
 from glob import glob
-from itertools import chain
+from itertools import chain, groupby
 import json
 from opencc import OpenCC
 import os
 import subprocess

-FONT_VERSION = 1.003
+FONT_VERSION = 1.004

 # Define the max entries size in a subtable.
 # We define a number that is small enough here, so that the entries will not exceed
 # the size limit.
 SUBTABLE_MAX_COUNT = 4000

-# This function is used to split a GSUB table into several subtables.
-def grouper(lst, n, start=0):
+# The following two functions are used to split a GSUB table into several subtables.
+def grouper(iterable, n=SUBTABLE_MAX_COUNT):
     '''
     Split a list into chunks of size n.
-    >>> list(grouper([1, 2, 3, 4, 5], 2))
+    >>> list(grouper([1, 2, 3, 4, 5], n=2))
     [[1, 2], [3, 4], [5]]
+    >>> list(grouper([1, 2, 3, 4, 5, 6], n=2))
+    [[1, 2], [3, 4], [5, 6]]
     '''
-    while start < len(lst):
-        yield lst[start:start+n]
-        start += n
+    iterator = iter(iterable)
+    while True:
+        lst = []
+        try:
+            for _ in range(n):
+                lst.append(next(iterator))
+        except StopIteration:
+            if lst:
+                yield lst
+            break
+        yield lst
+
+def grouper2(iterable, n=SUBTABLE_MAX_COUNT, key=None):
+    '''
+    Split an iterator into chunks of maximum size n by the given key.
+    >>> list(grouper2(['AA', 'BBB', 'CCC', 'DDD', 'EE'], n=3, key=len))
+    [['AA'], ['BBB', 'CCC', 'DDD'], ['EE']]
+    >>> list(grouper2(['AA', 'BBB', 'CCC', 'DDD', 'EE'], n=2, key=len))
+    [['AA'], ['BBB', 'CCC'], ['DDD'], ['EE']]
+    '''
+    for _, vx in groupby(iterable, key=key):
+        for vs in grouper(vx, n):
+            yield vs

 # An opentype font can hold at most 65535 glyphs.
 MAX_GLYPH_COUNT = 65535
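
Note on `grouper` and `grouper2`: the chunking behavior in the docstrings can also be written with `itertools.islice`. The following is a minimal sketch of that alternative, not the code in this commit; the names `chunk` and `chunk_by_key` are hypothetical. It also shows that `groupby` only merges adjacent items with equal keys, so the result depends on the input already being ordered by the key.

```python
from itertools import groupby, islice

def chunk(iterable, n):
    """Yield lists of at most n items, in order (islice-based variant)."""
    iterator = iter(iterable)
    while True:
        piece = list(islice(iterator, n))
        if not piece:
            return
        yield piece

def chunk_by_key(iterable, n, key):
    """Yield lists of at most n consecutive items that share the same key."""
    for _, run in groupby(iterable, key=key):
        yield from chunk(run, n)

words = ['AA', 'BBB', 'CCC', 'DDD', 'EE']

# 'AA' and 'EE' have the same length but are not adjacent, so they stay apart
as_given = list(chunk_by_key(words, 3, len))

# After sorting by length, equal-length items form a single run
presorted = list(chunk_by_key(sorted(words, key=len, reverse=True), 3, len))
```

With the sample list, `as_given` has three chunks and `presorted` has two.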
@@ -142,7 +164,8 @@ def build_opencc_word_table(codepoints_tonggui, codepoints_font, twp=False):
         codepoints.update(codepoints_v)

     # Sort from longest to shortest to force longest match
-    return sorted(((k, v) for k, v in entries.items()), key=lambda k_v: (-len(k_v[0]), k_v[0])), codepoints
+    conversion_item_len = lambda conversion_item: len(conversion_item[0])
+    return sorted(entries.items(), key=conversion_item_len, reverse=True), codepoints

 def disassociate_codepoint_and_glyph_name(obj, codepoint, glyph_name):
     '''
@@ -275,29 +298,34 @@ def insert_empty_feature(obj, feature_name):
     obj['GSUB']['features'][feature_name] = []

 def create_word2pseu_table(obj, feature_name, conversions):
+    conversion_item_len = lambda conversion_item: len(conversion_item[0])
+    subtables = [{'substitutions': [{'from': glyph_names_k, 'to': pseudo_glyph_name} for glyph_names_k, pseudo_glyph_name in subtable]} for subtable in grouper2(conversions, key=conversion_item_len)] # {from: [a1, a2, ...], to: b}
     obj['GSUB']['features'][feature_name].append('word2pseu')
     obj['GSUB']['lookups']['word2pseu'] = {
         'type': 'gsub_ligature',
         'flags': {},
-        'subtables': [{'substitutions': subtable} for subtable in grouper(conversions, SUBTABLE_MAX_COUNT)]
+        'subtables': subtables
     }
     obj['GSUB']['lookupOrder'].append('word2pseu')

 def create_char2char_table(obj, feature_name, conversions):
+    subtables = [{k: v for k, v in subtable} for subtable in grouper(conversions)]
     obj['GSUB']['features'][feature_name].append('char2char')
     obj['GSUB']['lookups']['char2char'] = {
         'type': 'gsub_single',
         'flags': {},
-        'subtables': [{k: v for k, v in subtable} for subtable in grouper(conversions, SUBTABLE_MAX_COUNT)]
+        'subtables': subtables
     }
     obj['GSUB']['lookupOrder'].append('char2char')

 def create_pseu2word_table(obj, feature_name, conversions):
+    conversion_item_len = lambda conversion_item: len(conversion_item[1])
+    subtables = [{k: v for k, v in subtable} for subtable in grouper2(conversions, key=conversion_item_len)]
     obj['GSUB']['features'][feature_name].append('pseu2word')
     obj['GSUB']['lookups']['pseu2word'] = {
         'type': 'gsub_multiple',
         'flags': {},
-        'subtables': [{k: v for k, v in subtable} for subtable in grouper(conversions, SUBTABLE_MAX_COUNT)]
+        'subtables': subtables
     }
     obj['GSUB']['lookupOrder'].append('pseu2word')

@@ -341,6 +369,8 @@ def build_dest_path_from_src_path(path, twp=False):
 def go(path, twp=False):
     font = load_font(path, ttc_index=0)

+    # Determine the final Unicode range by the original font and OpenCC convert tables
+
     codepoints_font = build_codepoints_font(font)
     codepoints_tonggui = build_codepoints_tonggui() & codepoints_font

@@ -358,6 +388,8 @@ def go(path, twp=False):
     available_glyph_count = MAX_GLYPH_COUNT - get_glyph_count(font)
     assert available_glyph_count >= len(entries_word)

+    # Build glyph substitution tables and insert into font
+
     word2pseu_table = []
     char2char_table = []
     pseu2word_table = []
@@ -367,7 +399,7 @@ def go(path, twp=False):
         glyph_names_k = [codepoint_to_glyph_name(font, codepoint) for codepoint in codepoints_k]
         glyph_names_v = [codepoint_to_glyph_name(font, codepoint) for codepoint in codepoints_v]
         insert_empty_glyph(font, pseudo_glyph_name)
-        word2pseu_table.append({'from': glyph_names_k, 'to': pseudo_glyph_name})
+        word2pseu_table.append((glyph_names_k, pseudo_glyph_name))
         pseu2word_table.append((pseudo_glyph_name, glyph_names_v))

     for codepoint_k, codepoint_v in entries_char:
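
Note on the changed sort key in `build_opencc_word_table`: the old key `(-len(k), k)` ordered equal-length entries by comparing the keys themselves, while the new key compares lengths only. Because Python's `sorted` is stable, also with `reverse=True`, equal-length entries now keep the order in which they appear in `entries`. The sketch below illustrates this one observable difference with made-up data; it does not claim to reproduce the conversion error the commit title refers to.

```python
from itertools import groupby

# Hypothetical conversion entries, in dictionary insertion order
entries = {'bb': 'X', 'a': 'Y', 'ccc': 'Z', 'ab': 'W'}

def key_len(item):
    """Length of the source side of a (source, target) pair."""
    return len(item[0])

# Old key: longest first, ties broken by comparing the source strings
old_order = sorted(entries.items(), key=lambda kv: (-len(kv[0]), kv[0]))

# New key: longest first only; ties keep insertion order (stable sort)
new_order = sorted(entries.items(), key=key_len, reverse=True)

# Runs of equal source length, as a groupby-based splitter would see them
length_runs = [[k for k, _ in run] for _, run in groupby(new_order, key=key_len)]
```

Since the list is ordered by length before it reaches `grouper2`, each run of equal-length entries is contiguous and can be split into subtables of a single length.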
19 changes: 7 additions & 12 deletions build/prepare.sh
@@ -1,13 +1,8 @@
 #!/bin/sh
-mkdir -p output
-wget -q -nc -P cache https://github.com/ButTaiwan/genyo-font/releases/download/v1.501/GenYoMin.zip
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/STCharacters.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/STPhrases.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/TWPhrasesIT.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/TWPhrasesName.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/TWPhrasesOther.txt
-wget -q -nc -P cache https://cdn.jsdelivr.net/npm/opencc-data@1.0.4/data/TWVariants.txt
-cat cache/TWPhrasesIT.txt cache/TWPhrasesName.txt cache/TWPhrasesOther.txt > cache/TWPhrases.txt
-wget -q -nc -P cache https://gist.githubusercontent.com/fatum12/941a10f31ac1ad48ccbc/raw/59d7e29b307ae3439317a975ef390cd729f9bc17/ttc2ttf.pe
-wget -q -nc -P cache https://raw.githubusercontent.com/rime-aca/character_set/e7d009a8a185a83f62ad2c903565b8bb85719221/通用規範漢字表.txt
-unzip -q -n -d cache cache/GenYoMin.zip "*.ttc"
+mkdir -p cache output
+cd cache
+curl -LsSO https://github.com/ButTaiwan/genyo-font/releases/download/v1.501/GenYoMin.zip
+curl -LsSZ --remote-name-all https://cdn.jsdelivr.net/npm/opencc-data@1.0.5/data/{STCharacters.txt,STPhrases.txt,TWPhrasesIT.txt,TWPhrasesName.txt,TWPhrasesOther.txt,TWVariants.txt}
+curl -LsSo 通用規範漢字表.txt https://raw.githubusercontent.com/rime-aca/character_set/e7d009a8a185a83f62ad2c903565b8bb85719221/%E9%80%9A%E7%94%A8%E8%A6%8F%E7%AF%84%E6%BC%A2%E5%AD%97%E8%A1%A8.txt
+cat TWPhrasesIT.txt TWPhrasesName.txt TWPhrasesOther.txt > TWPhrases.txt
+unzip -q -n GenYoMin.zip "*.ttc"
