Script for generating reduced-langtags.json (BL-13305)
Plus a few other small bug fixes and tweaks.
StephenMcConnel committed Jul 25, 2024
1 parent c687819 commit 6b0029f
Showing 9 changed files with 19,095 additions and 25,857 deletions.
33 changes: 33 additions & 0 deletions scripts/langtags/README.md
@@ -0,0 +1,33 @@
## Generating a new reduced-langtags.json file

Run the `reduce.sh` script from this directory in a Git Bash shell window:
```
./reduce.sh
```
If you want to ensure a fresh copy of the `langtags.json` file, delete any
existing copy in this folder first:
```
rm langtags.json
./reduce.sh
```
(An existing copy of `langtags.json` is used for processing to save download time.)

After generating a new `reduced-langtags.json` file, copy (or move) it to the
`src/components/AggregateGrid` folder:
```
cp reduced-langtags.json ../../src/components/AggregateGrid
```
or
```
mv reduced-langtags.json ../../src/components/AggregateGrid
```

### Developer notes

There is no need to run either npm or yarn as far as I can tell:
`extract-reduction.js` imports only `fs`, `path`, and `url`, which are built into
node. Perhaps node is trying to keep up with bun?

The shell script may need to have its line endings changed if you want to run it
under another shell such as the Cygwin bash shell window or the Windows Subsystem
for Linux shell window.
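
The line-ending change mentioned above can be sketched in JavaScript. This is a minimal illustration, not part of the commit: `normalizeLineEndings` is a hypothetical helper name, and the sample script text is made up for the example.

```javascript
// Hypothetical helper (not in the commit): convert Windows (CRLF) and
// lone CR line endings to Unix (LF) line endings.
function normalizeLineEndings(text) {
  return text.replace(/\r\n?/g, "\n");
}

// Made-up sample: a two-line shell script saved with CRLF endings.
const crlfScript = "#!/bin/sh\r\nnode extract-reduction.js\r\n";
const lfScript = normalizeLineEndings(crlfScript);

// JSON.stringify makes the remaining line-ending characters visible.
console.log(JSON.stringify(lfScript));
```

To apply this to `reduce.sh` itself, the file contents could be read and written back with node's `fs` module; a tool such as `dos2unix`, where installed, does the same job.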
38 changes: 38 additions & 0 deletions scripts/langtags/extract-reduction.js
@@ -0,0 +1,38 @@
import * as fs from "fs";
import * as path from "path";
import { fileURLToPath } from "url";

const __filename = fileURLToPath(import.meta.url); // get the resolved path to the file
const __dirname = path.dirname(__filename); // get the name of the directory

const langtags = JSON.parse(
    fs.readFileSync(path.join(__dirname, "langtags.json"), "utf8")
);

const reduced_langtags = langtags
    .filter(
        (langtag) => !!langtag.full
        // I'm not convinced that the following three lines are a good idea.
        // && !(/^[a-z]{2,3}-[A-Z]{2}$/.test(langtag.tag))
        // && !(/^[a-z]{2,3}-[A-Z][a-z]{3}$/.test(langtag.tag))
        // && !(/^[a-z]{2,3}-[A-Z][a-z]{3}-[A-Z]{2}$/.test(langtag.tag))
    )
    .map((langtag) => {
        const reduced = {
            tag: langtag.tag,
            name: langtag.name,
            names: langtag.names,
            region: langtag.region,
            regionname: langtag.regionname,
            // we aren't using these fields currently
            //regions: langtag.regions,
            //iso639_3: langtag.iso639_3 && langtag.iso639_3 !== langtag.tag ? langtag.iso639_3 : undefined,
        };
        return reduced;
    });

fs.writeFileSync(
    path.join(__dirname, "reduced-langtags.json"),
    JSON.stringify(reduced_langtags, null, 2),
    "utf8"
);
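
To make the effect of the filter and the field selection concrete, here is a small sketch that applies the same logic to a handful of records. The sample records are hypothetical and far smaller than real `langtags.json` entries, and `reduceLangtags` and `optionalExclusions` are names introduced only for this illustration. The sketch also shows which tags the three commented-out patterns would remove if they were enabled.

```javascript
// Hypothetical sample records; real langtags.json entries carry more fields.
const sampleLangtags = [
  { tag: "_meta" }, // no `full` field, so the filter drops it
  { full: "en-Latn-US", tag: "en", name: "English", names: ["Anglais"],
    region: "US", regionname: "United States", iso639_3: "eng" },
  { full: "en-Latn-GB", tag: "en-GB", name: "English",
    region: "GB", regionname: "United Kingdom", iso639_3: "eng" },
  { full: "sr-Latn-RS", tag: "sr-Latn", name: "Serbian",
    region: "RS", regionname: "Serbia", iso639_3: "srp" },
  { full: "sr-Latn-ME", tag: "sr-Latn-ME", name: "Serbian",
    region: "ME", regionname: "Montenegro", iso639_3: "srp" },
];

// Same filter and field selection as extract-reduction.js.
function reduceLangtags(langtags) {
  return langtags
    .filter((langtag) => !!langtag.full)
    .map((langtag) => ({
      tag: langtag.tag,
      name: langtag.name,
      names: langtag.names,
      region: langtag.region,
      regionname: langtag.regionname,
    }));
}

// The three patterns that are commented out in the filter.
const optionalExclusions = [
  /^[a-z]{2,3}-[A-Z]{2}$/, // language-REGION, such as "en-GB"
  /^[a-z]{2,3}-[A-Z][a-z]{3}$/, // language-Script, such as "sr-Latn"
  /^[a-z]{2,3}-[A-Z][a-z]{3}-[A-Z]{2}$/, // language-Script-REGION
];

const reduced = reduceLangtags(sampleLangtags);

// Tags that survive today but would be dropped if the patterns were enabled.
const wouldBeExcluded = reduced
  .map((entry) => entry.tag)
  .filter((tag) => optionalExclusions.some((pattern) => pattern.test(tag)));

console.log(reduced.map((entry) => entry.tag));
console.log(wouldBeExcluded);
console.log(JSON.stringify(reduced[1]));
```

Note that `JSON.stringify` omits properties whose value is `undefined`, so an entry without a `names` array simply has no `names` key in the written file.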
11 changes: 11 additions & 0 deletions scripts/langtags/package.json
@@ -0,0 +1,11 @@
{
  "name": "reduce-langtags",
  "version": "1.0.0",
  "type": "module",
  "main": "extract-reduction.js",
  "dependencies": {
    "fs": "^0.0.1-security",
    "path": "0.12.7",
    "url": "0.11.0"
  }
}
8 changes: 8 additions & 0 deletions scripts/langtags/reduce.sh
@@ -0,0 +1,8 @@
#!/bin/sh
# download langtags.json if we don't already have it
if [ ! -f langtags.json ]; then
    wget 'https://ldml.api.sil.org/index.html?query=langtags&ext=json' -O langtags.json
fi

# run the javascript program to extract the reduced data
node extract-reduction.js
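
Because the script reuses any existing `langtags.json`, an empty or truncated file left behind by a failed download would be picked up again on the next run. The following is a hedged sketch of a sanity check that could be applied to the file contents before the reduction; `looksLikeLangtags` is a hypothetical helper, not part of the commit, and the only assumption it makes about the data is the one the reduction script already relies on: a top-level array of objects with a `tag`.

```javascript
// Hypothetical helper (not in the commit): return true when `text` parses
// as a non-empty JSON array containing at least one object with a string `tag`.
function looksLikeLangtags(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch {
    return false; // empty, truncated, or otherwise invalid JSON
  }
  return (
    Array.isArray(data) &&
    data.some(
      (entry) =>
        entry !== null &&
        typeof entry === "object" &&
        typeof entry.tag === "string"
    )
  );
}

console.log(looksLikeLangtags("")); // contents of an empty file
console.log(looksLikeLangtags('[{"tag":"en","full":"en-Latn-US"}]'));
```

If the check fails, deleting `langtags.json` and rerunning `reduce.sh` forces a fresh download, as the README describes.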
3 changes: 2 additions & 1 deletion src/components/AggregateGrid/AggregateGridInterfaces.ts
@@ -4,7 +4,8 @@ export interface ILangTagData {
     names?: string[];
     region: string;
     regionname: string;
-    regions?: string[];
+    //regions?: string[];
+    //iso639_3?: string;
 }
 
 // If we go back to using the country id data, we'll need this interface.