Prepare v7.1 (#275)

* Update README.md
* Update docs.
* Update changelog.
* Update version and dependencies
* test: New baseline.
* fix (docs): Incorporate changes to the vuepress2 config api in beta 47.
* Update CHANGELOG.md
about-code · Sep 18, 2023
1 parent 5e37def
commit d942175

Showing 12 changed files with 961 additions and 962 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,21 +2,34 @@
 
 All notable changes to this project will be documented in this file. See [standard-version](https://github.com/conventional-changelog/standard-version) for commit guidelines.
 
+## [7.1.0](https://github.com/about-code/glossarify-md/compare/v7.0.0...v7.1.0) (2023-08-27)
+
+
+### Features
+
+* New option `linking.sortAlternatives` that sorts alternative definitions so the most likely appropriate one is linked first when a term has multiple definitions. For more, see [Multiple Glossaries and Ambiguities](https://github.com/about-code/glossarify-md/blob/master/doc/ambiguities.md#selecting-the-most-appropriate-definition) ([#269](https://github.com/about-code/glossarify-md/issues/269)) ([74744a3](https://github.com/about-code/glossarify-md/commit/74744a3ede184805a6064e41c2cf6a7a4ab97f9c))
+
+
+### Documentation Updates
+
+* Minor changes to story and examples. ([d3fe139](https://github.com/about-code/glossarify-md/commit/d3fe139b6844cbf8cdbfb6a0e1a427fd46c56e8c))
+* Incorporate changes to the vuepress2 config api in beta 47. ([6a57fff](https://github.com/about-code/glossarify-md/commit/6a57fffdfe0091b24e97a33e06c06e3c3a1241cd))
+
 ## [7.0.0](https://github.com/about-code/glossarify-md/compare/v6.3.3...v7.0.0) (2023-03-26)
 
 
 ### ⚠ BREAKING CHANGES
 
 * End of support for NodeJS 14.x.
-* No longer supporting CommonJS module system, see [using with vuepress 1.x](https://github.com/about-code/glossarify-md/blob/v7.0.0/docuse-with-vuepress.md)
+* No longer supporting CommonJS module system, see [using with vuepress 1.x](https://github.com/about-code/glossarify-md/blob/v7.0.0/doc/use-with-vuepress.md)
 * Bumped `glob` dependency from `v7` to `v9`. **This *might* affect you** when using glob patterns in a glossarify-md config, e.g.:
   - `glossaries` with `[{ "file": "./some/**/glob/**/pattern*.md" }]`
   - `includeFiles`
   - `excludeFiles`
   - `keepRawFiles`
 
   Notable changes:
-  
+
   - `\` is now only used as an escape character, and never as a path separator in glob patterns, so that **Windows users** have a way to match against filenames containing literal glob pattern characters.
   - **Glob pattern paths must use forward-slashes as path separators**, since `\` is an escape character to match literal glob pattern characters.
   - further changes see [glob/changelog.md](https://github.com/isaacs/node-glob/blob/main/changelog.md)

diff --git a/README.md b/README.md
@@ -5,6 +5,8 @@
 
 [CommonMark]: https://www.commonmark.org
 
+[doc-ambiguity]: https://github.com/about-code/glossarify-md/blob/master/doc/ambiguities.md
+
 [doc-book-index]: https://github.com/about-code/glossarify-md/blob/master/doc/gen-book-index.md
 
 [doc-config]: https://github.com/about-code/glossarify-md/blob/master/conf/README.md
@@ -307,8 +309,17 @@ Superficial injury of knee or lower leg
 ```
 
 With adding *who-icd-codes.md* to the list of glossaries every mention of [⚕NC32](#nc32 "Fracture of forearm") or [⚕NC90](#nc90 "Superficial injury of knee or lower leg") in documents will have a tooltip and link to the glossary definition, too.
+**Since v5.0.0** `file` can also be used with a [glob] file pattern:
+
+```json
+"glossaries": [
+    { "file": "./**/*.md" }
+]
+```
+
+This way each markdown file matching the pattern will be processed like a glossary. For more, see [Cross-Linking][doc-cross-linking] and [Multiple Glossaries and Ambiguity][doc-ambiguity].
 
-> **ⓘ Since v5.0.0** `file` can also be used with a [glob] file pattern. This way each markdown file matching a pattern will be processed like a glossary. More see [Cross-Linking][doc-cross-linking].
+> **ⓘ Note:** `termHint` only works for `file` pointing at a particular file name.
 
 ## Sorting Glossaries
 
@@ -337,7 +348,7 @@ The `i18n` object is passed *as is* to the collator function. Thus you can use a
 
 ## [Advanced Topics][doc-extended]
 
-See **[here][doc-extended]**, for example:
+See **[here][doc-extended]**, for advanced topics:
 
 - Importing and exporting terms
 - Generating files, such as a book index, lists of figures, etc.

diff --git a/doc/ambiguities.md b/doc/ambiguities.md
diff --git a/doc/sort-alternatives-by-ref-count.md b/doc/sort-alternatives-by-ref-count.md
@@ -0,0 +1,151 @@
+# [Sorting Alternatives by Counting Glossary References](#sorting-alternatives-by-counting-glossary-references)
+
+[multiple glossaries]: ../README.md#multiple-glossaries
+
+[A]: ./glossary-a.md#ambiguous-term "Term definition in glossary A"
+
+[B]: ./glossary-b.md#ambiguous-term "Term definition in glossary B"
+
+[C]: ./glossary-c.md#ambiguous-term "Term definition in glossary C"
+
+[D]: ./glossary-d.md#ambiguous-term "Term definition in glossary D"
+
+Read [Ambiguities][1] first to get a bit more context on what this aims to achieve.
+
+```json
+"linking": {
+  "sortAlternatives": {
+    "by": "glossary-ref-count"
+   }
+}
+```
+
+By counting how often terms of various glossaries occur we get a distribution like, for example:
+
+    refCount
+        ^
+        |
+      3_|       _
+      2_|      | |  _
+      1_|   _  | | | |  _
+      0_|  |1| |3| |2| |1|
+        +-|---|---|---|---|--> glossary
+          | A | B | C | D |
+
+The distribution tells us that the writer has mentioned (whatever) terms defined in glossary `B` *three times*, terms defined in glossary `C` *two times* and terms defined in glossaries `A` or `D` *once*, each. Sorting the bars by glossary reference count (descending) yields a glossary priority:
+
+    refCount
+        ^
+        |
+      3_|   _
+      2_|  | |  _
+      1_|  | | | |  _   _
+      0_|  |3| |2| |1| |1|
+        +-|---|---|---|---|--> glossary
+          | B | C | A | D |
+
+The order `B,C,A,D` is a *context-sensitive glossary priority derived from a writer's actual use of glossary terminology*. When we find a [term occurrence][2] *Ambiguous Term* with definitions in several glossaries, e.g. `A,B,C`, the distribution above suggests linking term definitions in glossary order `B,C,A`, producing the linkified result *[Ambiguous Term][B]<sup>[2)][C],[3)][A]</sup>* (move your mouse over the links to get a hint on the link target).
+
+> **Note:** Because glossary priority is derived from a writer's use of glossary terminology, linkification results are expected to change between runs of glossarify-md when that use has changed between those runs.
+
+#### [Different priorities for different sections](#different-priorities-for-different-sections)
+
+```json
+"linking": {
+  "sortAlternatives": {
+    "by": "glossary-ref-count",
+    "perSectionDepth": 1
+   }
+}
+```
+
+Let's assume a Markdown document with a *[Table of Contents][3]:*
+
+*   `# Section 1`
+*   `# Section 2`
+*   `## Section 2.1`
+*   `## Section 2.2`
+*   `### Section 2.2.1`
+
+With `perSectionDepth: 0` the system counts and aggregates a single distribution per markdown file. Counting and sorting with `perSectionDepth: 1` increases sensitivity towards *section-specific* use of terminology and makes glossarify-md aggregate distinct distributions for sections at heading level `# Heading 1`, which add up to the total count for the file:
+
+*Each dot denotes an occurrence of some term from glossary A, B, C or D in the section:*
+
+    Heading
+     Depth                   .
+       |                   . : .
+       |                 : : : :
+       |                 : : : :
+       |                 A,B,C,D
+       |                  Total
+     0_|                    o          .
+       |         .         / \     . . : :
+       |       . : : .    /   \    : : : :
+       |       A,B,C,D   /     \   A,B,C,D
+     1_|_#    Section 1 o       o Section 2
+       |
+       V
+
+The new setting changes [term definition][4] priorities to
+
+*   `B,C,A,D` in context of Section 1
+*   `C,D,A,B` in context of Section 2
+
+Subsections *Section 2.1, Section 2.2* and *Section 2.2.1* inherit [term definition][4] priority `C,D,A,B` from their parent section *Section 2*. You may notice that the total count hasn't changed. Upper nodes summarize their child nodes.
+
+The **default** sensitivity when sorting by `glossary-ref-count` is `perSectionDepth: 2`. Omitting `perSectionDepth` is equivalent to `perSectionDepth: 2`, and in our example it may reveal these distributions:
+
+    Heading
+     Depth                   .
+       |                   . : .
+       |                 : : : :
+       |                 : : : :
+       |                 A,B,C,D
+       |                  Total
+     0_|                    o          .
+       |         .         / \     . . : :
+       |       . : : .    /   \    : : : :
+       |       A,B,C,D   /     \   A,B,C,D
+     1_|_#    Section 1 o       o Section 2
+       |                       / \
+       |               .      /   \           :
+       |         . : . :     /     \      : . : .
+       |         A,B,C,D    /       \     A,B,C,D
+     2_|_##    Section 2.1 o         o  Section 2.2
+       |
+       V
+
+The new tree continues to suggest a glossary priority
+
+*   `B,C,A,D` in context of Section 1
+*   `C,D,A,B` in context of Section 2
+
+but now suggests different [term definition][4] priorities
+
+*   `D,B,A,C` in context of Section 2.1 and deeper
+*   `C,A,B,D` in context of Section 2.2 and deeper.
+
+*Section 2.2.1* now inherits a priority `C,A,B,D` from its parent section *Section 2.2*.
+
+> **ⓘ What's the "right" value for `perSectionDepth`?**
+>
+> Short story: there's none. You may want to ask yourself two questions for a good *tradeoff*:
+>
+> 1.  At which section level do you think your use of glossary terminology changes in a way that terms with a meaning in section A could have a different meaning in section B?
+> 2.  Are your sections verbose enough at the given section depth and do they mention enough glossary terms (on average) to help selecting the most appropriate definition for an ambiguous term occurrence?
+>
+> The default is `perSectionDepth: 2`. See details for our own answers which drove that decision.
+>
+> <details><ol><li> We expect a heading at level 1 to be a book title, especially in single-file projects. Then headings at level 2 denote book chapters. We assume it is more likely to introduce new topics with their own terminology at the level of chapters than at the level of sections within a chapter. Deeper sections may add details to a chapter's topic but do not change it significantly. Therefore having separate term definition priorities at levels deeper than 2 may not be required in many situations.</li><li>The deeper the level, the fewer words are being scanned and the fewer term occurrences can contribute to a glossary-ref-count distribution. The fewer glossary term occurrences there are, the higher the weight of those few occurrences when it comes to deciding on the most appropriate definition for an ambiguous term occurrence. At times this might be exactly what you want. Then changing the default and counting separately for deeper levels would be sane. However, in general, the fewer words are being evaluated, the higher the risk of not finding and counting enough glossary term occurrences in total to make good decisions for ambiguous term occurrences in particular.</li> </ol>
+>
+> From this reasoning we concluded that `perSectionDepth: 2` seems to be a good tradeoff and sensible default.
+
+</details>
+
+[1]: ./ambiguities.md#linking-to-the-most-appropriate-term-definition
+
+[2]: https://github.com/about-code/glossarify-md/blob/master/doc/glossary.md#term-occurrence "A phrase in a Markdown file A which matches the phrase of a heading in a Markdown file B where B was configured to be a glossary file."
+
+[3]: https://github.com/about-code/glossarify-md/blob/master/README.md
+
+[4]: https://github.com/about-code/glossarify-md/blob/master/doc/glossary.md#term-definition "A term definition is, technically, the phrase of a heading in a Markdown file which was configured to be a glossary file."
diff --git a/doc/use-with-vuepress.md b/doc/use-with-vuepress.md
@@ -75,36 +75,32 @@ Installs glossarify-md with a syntax plug-in for *frontmatter* syntax.
 
 ## [Configure vuepress](#configure-vuepress)
 
-glossarify-md and [vuepress 🌎][1] need to be aligned in how they create hyperlink URLs with browser-friendly URL-hashes `#...`, also called *[slugs][2]*.
+> *   vuepress v1
+> *   vuepress v2 (beta 47+)
 
-> ⚠ **Important (Non-English / Non-ASCII charsets):** vuepress's default slugger creates hashes with lowercase *ASCII characters, only*. [github-slugger] instead maps UNICODE characters onto their lowercase UNICODE equivalent.
-> For example, non-ASCII *Äquator* (German) becomes `#aquator` with vuepress defaults but becomes `#äquator` when using vuepress with [github-slugger]. Some consequences to consider:
->
-> 1.  Bookmarks onto published web pages continue to resolve to the web page but a browser may no longer resolve the page section and stops scrolling when sections outside the visible viewport.
-> 2.  As a Markdown writer you may have authored links `[Foo](#aquator)`, manually, which have to be changed to `[Foo](#äquator)`.
-
-### [Configure vuepress 2.x](#configure-vuepress-2x)
+glossarify-md and [vuepress 🌎][1] need to be aligned in how they create hyperlink URLs with browser-friendly URL-hashes `#...`, also called *[slugs][2]*.
 
 <em>./docs/.vuepress/config.js</em>
 
 ```js
 import { getSlugger } from "glossarify-md"
 
-const slugify = {
-  slugify: getSlugger()
-};
 module.exports = {
-    markdown: {               // vuepress v2.x
-      toc: { ...slugify },
-      anchor: { ...slugify },
-      extractHeaders: { ...slugify }
+    markdown: {
+      slugify: getSlugger()
     }
 };
 ```
 
-### [Configure vuepress 1.x](#configure-vuepress-1x)
+> ⚠ **Important (Non-English / Non-ASCII charsets):** vuepress's default slugger creates hashes with lowercase *ASCII characters, only*. [github-slugger] instead maps UNICODE characters onto their lowercase UNICODE equivalent.
+> For example, non-ASCII *Äquator* (German) becomes `#aquator` with vuepress defaults but becomes `#äquator` when using vuepress with [github-slugger]. Some consequences to consider:
+>
+> 1.  Bookmarks onto published web pages continue to resolve to the web page, but a browser may no longer resolve the page section and may stop scrolling to sections outside the visible viewport.
+> 2.  As a Markdown writer you may have authored links `[Foo](#aquator)`, manually, which have to be changed to `[Foo](#äquator)`.
+
+### [Get rid of glossarify-md](#get-rid-of-glossarify-md)
 
-> ⚠ **We recommend [using vuepress 1.x with glossarify-md <= v6, only][doc-v6]**. Using glossarify-md v7 with vuepress 1.x requires you to install a CommonJS version of [github-slugger v1][github-slugger] for yourself while glossarify-md uses [github-slugger v2][github-slugger]. Slugs should be compatible, because [github-slugger v1 and v2 still implement the same algorithm][github-slugger-diff] but the mere fact that vuepress and glossarify-md no longer physically execute the same code to generate slugs makes it more likely to break in a future when some major release of glossarify-md starts using a potentially incompatbile [github-slugger v3][github-slugger].
+Suppose you want to get rid of glossarify-md but keep using [vuepress 🌎][1]. Then you may not want URLs and [URL][3] [slugs][2] to change. To keep them stable while dropping glossarify-md, just [import][4] `github-slugger` yourself.
 
     npm i --save github-slugger@^1.5.0
 
@@ -142,10 +138,14 @@ module.exports = {
 *   `npm run glossarified` builds and serves the glossarified version from `outDir`.
 *   `npm run build` just builds the glossarified [vuepress 🌎][1] site without running a server.
 
-More see [README.md][3].
+More see [README.md][5].
 
 [1]: https://vuepress.vuejs.org "A static website generator translating markdown files into a website powered by [vuejs]."
 
 [2]: https://github.com/about-code/glossarify-md/blob/master/doc/glossary.md#slug "A slug is a URL-friendly identifier that can be used within URL fragments to address headings / sections on a page."
 
-[3]: ../README.md
+[3]: https://github.com/about-code/glossarify-md/blob/master/doc/glossary.md#uri--url "Uniform Resource Identifier and Uniform Resource Locator are both the same thing, which is an ID with a syntax scheme://authority.tld/path/#fragment?query like https://my.org/foo/#bar?q=123."
+
+[4]: https://github.com/about-code/glossarify-md/blob/master/doc/import.md#importing-terms "⚠ Important: glossarify-md is able to import terms and definitions from a remote location using https, when configured this way."
+
+[5]: ../README.md
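
As an illustration of the `glossary-ref-count` ordering documented in doc/sort-alternatives-by-ref-count.md in this commit, here is a minimal sketch in plain JavaScript. It is not glossarify-md's implementation: the function names `glossaryPriority` and `sortAlternatives` and the shape of the `refCounts` object are assumptions made for this example only. It reproduces the `B,C,A,D` glossary priority and the `B,C,A` link order from the documented example.

```javascript
// Hypothetical helpers for illustration; not part of glossarify-md's API.

// Derive a glossary priority from reference counts, highest count first.
// Array.prototype.sort is stable (ES2019+), so glossaries with equal counts
// keep their original relative order (here: A before D).
function glossaryPriority(refCounts) {
  return Object.entries(refCounts)
    .sort(([, countA], [, countB]) => countB - countA)
    .map(([glossary]) => glossary);
}

// Order the glossaries that define an ambiguous term by the same counts.
// Glossaries without any counted reference are treated as count 0.
function sortAlternatives(candidates, refCounts) {
  return [...candidates].sort(
    (x, y) => (refCounts[y] ?? 0) - (refCounts[x] ?? 0)
  );
}

// Distribution from the documented example: B was referenced three times,
// C twice, A and D once each.
const refCounts = { A: 1, B: 3, C: 2, D: 1 };

console.log(glossaryPriority(refCounts).join(","));                // B,C,A,D
console.log(sortAlternatives(["A", "B", "C"], refCounts).join(",")); // B,C,A
```

With `perSectionDepth` greater than 0, the same sorting would presumably be applied to one `refCounts` object per section rather than one per file; that per-section bookkeeping is left out of this sketch.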