Merge pull request #562 from StephenMcConnel/BL-13305-EvenMoreWorkOnG…
…rids

Script for generating reduced-langtags.json (BL-13305) (#562)
andrew-polk authored Jul 26, 2024
2 parents c687819 + 6b0029f commit 3d1461f
Showing 9 changed files with 19,095 additions and 25,857 deletions.
33 changes: 33 additions & 0 deletions scripts/langtags/README.md
@@ -0,0 +1,33 @@
## Generating a new reduced-langtags.json file

Run the `reduce.sh` script from this directory in a Git Bash shell window:
```
./reduce.sh
```
If you want to ensure a fresh copy of the `langtags.json` file, delete any
existing copy in this folder first:
```
rm langtags.json
./reduce.sh
```
(If a copy of `langtags.json` already exists in this folder, the script reuses it to save download time.)

After generating a new version of the `reduced-langtags.json` file, copy (or move)
it to the `src/components/AggregateGrid` folder:
```
cp reduced-langtags.json ../../src/components/AggregateGrid
```
or
```
mv reduced-langtags.json ../../src/components/AggregateGrid
```

### Developer notes

As far as I can tell, there is no need to run either npm or yarn before running the
script: it imports only `fs`, `path`, and `url`, which are built into Node. Perhaps
Node is trying to keep up with Bun?

The shell script may need to have its line endings changed if you want to run it
under another shell such as the Cygwin bash shell window or the Windows Subsystem
for Linux shell window.
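
As a rough illustration of the line-ending note above, here is a minimal sketch that assumes the problem is Windows-style CRLF endings (the note does not say so explicitly). The helper name `toUnixLineEndings` is hypothetical and is not part of this commit.

```javascript
// Hypothetical helper (not part of this commit): convert Windows (CRLF)
// line endings to Unix (LF) so that a POSIX shell can read the script.
function toUnixLineEndings(text) {
  return text.replace(/\r\n/g, "\n");
}

// Demo on an in-memory string standing in for the contents of reduce.sh.
const windowsScript = "#!/bin/sh\r\nnode extract-reduction.js\r\n";
const unixScript = toUnixLineEndings(windowsScript);

// JSON.stringify makes any remaining \r characters visible in the output.
console.log(JSON.stringify(unixScript));
```

To apply this to the real file, the string could be read with `fs.readFileSync("reduce.sh", "utf8")` and written back with `fs.writeFileSync`. Dedicated tools, or an editor's line-ending setting, would do the same job.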
38 changes: 38 additions & 0 deletions scripts/langtags/extract-reduction.js
@@ -0,0 +1,38 @@
import * as fs from "fs";
import * as path from "path";
import { fileURLToPath } from "url";

const __filename = fileURLToPath(import.meta.url); // get the resolved path to the file
const __dirname = path.dirname(__filename); // get the name of the directory

const langtags = JSON.parse(
  fs.readFileSync(path.join(__dirname, "langtags.json"), "utf8")
);

const reduced_langtags = langtags
  .filter(
    (langtag) => !!langtag.full
    // I'm not convinced that the following three lines are a good idea.
    // && !(/^[a-z]{2,3}-[A-Z]{2}$/.test(langtag.tag))
    // && !(/^[a-z]{2,3}-[A-Z][a-z]{3}$/.test(langtag.tag))
    // && !(/^[a-z]{2,3}-[A-Z][a-z]{3}-[A-Z]{2}$/.test(langtag.tag))
  )
  .map((langtag) => {
    const reduced = {
      tag: langtag.tag,
      name: langtag.name,
      names: langtag.names,
      region: langtag.region,
      regionname: langtag.regionname,
      // we aren't using these fields currently
      //regions: langtag.regions,
      //iso639_3: langtag.iso639_3 && langtag.iso639_3 !== langtag.tag ? langtag.iso639_3 : undefined,
    };
    return reduced;
  });

fs.writeFileSync(
  path.join(__dirname, "reduced-langtags.json"),
  JSON.stringify(reduced_langtags, null, 2),
  "utf8"
);
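
To make the effect of the filter and map steps above easier to see, the following sketch applies the same reduction to a small in-memory array. The sample entries and the function name `reduceLangtags` are hypothetical and purely illustrative; they are not taken from the real `langtags.json`.

```javascript
// Hypothetical sample standing in for langtags.json; values are illustrative only.
const sampleLangtags = [
  // No `full` field, so the filter drops this entry.
  { tag: "_example", note: "metadata-style entry" },
  {
    tag: "aa",
    full: "aa-Latn-ET",
    name: "Afar",
    names: ["Afaraf"],
    region: "ET",
    regionname: "Ethiopia",
    script: "Latn", // not copied into the reduced record
  },
  // No `names` field: the reduced record carries `names: undefined`.
  { tag: "qaa", full: "qaa-Latn-XX", name: "Sample Language", region: "XX", regionname: "Sample Region" },
];

// Same shape of reduction as the script: keep entries that have `full`,
// then copy only the five fields the grid uses.
function reduceLangtags(entries) {
  return entries
    .filter((entry) => !!entry.full)
    .map((entry) => ({
      tag: entry.tag,
      name: entry.name,
      names: entry.names,
      region: entry.region,
      regionname: entry.regionname,
    }));
}

const reduced = reduceLangtags(sampleLangtags);
console.log(JSON.stringify(reduced, null, 2));
```

Because `JSON.stringify` omits properties whose value is `undefined`, an entry without `names` is written to the output file with no `names` key at all, which fits the optional `names?` member of `ILangTagData`.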
11 changes: 11 additions & 0 deletions scripts/langtags/package.json
@@ -0,0 +1,11 @@
{
  "name": "reduce-langtags",
  "version": "1.0.0",
  "type": "module",
  "main": "reduce-langtags.js",
  "dependencies": {
    "fs": "^0.0.1-security",
    "path": "0.12.7",
    "url": "0.11.0"
  }
}
8 changes: 8 additions & 0 deletions scripts/langtags/reduce.sh
@@ -0,0 +1,8 @@
#!/bin/sh
# download langtags.json if we don't already have it
if [ ! -f langtags.json ]; then
    wget 'https://ldml.api.sil.org/index.html?query=langtags&ext=json' -O langtags.json
fi

# run the javascript program to extract the reduced data
node extract-reduction.js
3 changes: 2 additions & 1 deletion src/components/AggregateGrid/AggregateGridInterfaces.ts
@@ -4,7 +4,8 @@ export interface ILangTagData {
     names?: string[];
     region: string;
     regionname: string;
-    regions?: string[];
+    //regions?: string[];
+    //iso639_3?: string;
 }

 // If we go back to using the country id data, we'll need this interface.