From ee7e308901265927cf4077e2db6c5d282b3ab0c2 Mon Sep 17 00:00:00 2001
From: Arthit Suriyawongkul <arthit@gmail.com>
Date: Sat, 9 Nov 2024 18:51:28 +0000
Subject: [PATCH] Add TOC to README

---
 README.md | 123 +++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 103 insertions(+), 20 deletions(-)
diff --git a/README.md b/README.md
index 131c6f6..071f545 100644
--- a/README.md
+++ b/README.md
@@ -8,35 +8,69 @@ SPDX-License-Identifier: Apache-2.0
 Thai natural language processing library in Rust,
 with Python and Node bindings. Formerly oxidized-thainlp.
 
+## Table of contents
+
+- [Features](#features)
+- [Dictionary file](#dictionary-file)
+- [Usage](#usage)
+  - [Node.js binding](#nodejs-binding)
+  - [Python binding](#python-binding)
+  - [Rust library](#rust-library)
+  - [Command-line interface](#command-line-interface)
+- [Build](#build)
+- [Development](#development)
+- [License](#license)
+
 ## Features
 
 - Thai word tokenizer
-  - Use maximal-matching dictionary-based tokenization algorithm and honor Thai Character Cluster boundaries
-    - [2.5x faster](https://github.com/PyThaiNLP/nlpo3/blob/main/nlpo3-python/notebooks/nlpo3_segment_benchmarks.ipynb) than similar pure Python implementation (PyThaiNLP's newmm)
-  - Load a dictionary from a plain text file (one word per line) or from `Vec<String>`
+  - Uses a maximal-matching dictionary-based tokenization algorithm
+    and honors [Thai Character Cluster][tcc] boundaries
+    - [2.5x faster][benchmark]
+      than a similar pure Python implementation (PyThaiNLP's newmm)
+  - Loads a dictionary from a plain text file (one word per line)
+    or from `Vec<String>`
+
+[tcc]: https://dl.acm.org/doi/10.1145/355214.355225
+[benchmark]: https://github.com/PyThaiNLP/nlpo3/blob/main/nlpo3-python/notebooks/nlpo3_segment_benchmarks.ipynb
 
 ## Dictionary file
 
-- For the interest of library size, nlpO3 does not assume what dictionary the developer would like to use.
-  It does not come with a dictionary. A dictionary is needed for the dictionary-based word tokenizer.
+- In the interest of library size, nlpO3 does not assume which dictionary the
+  user would like to use, and it does not come with a dictionary.
+- A dictionary is needed for the dictionary-based word tokenizer.
 - For tokenization dictionary, try
-  - [words_th.tx](https://github.com/PyThaiNLP/pythainlp/blob/dev/pythainlp/corpus/words_th.txt) from [PyThaiNLP](https://github.com/PyThaiNLP/pythainlp/) - around 62,000 words (CC0)
-  - [word break dictionary](https://github.com/tlwg/libthai/tree/master/data) from [libthai](https://github.com/tlwg/libthai/) - consists of dictionaries in different categories, with make script (LGPL-2.1)
+  - [words_th.txt][dict-pythainlp] from [PyThaiNLP][pythainlp]
+    - ~62,000 words
+    - CC0-1.0
+  - [word break dictionary][dict-libthai] from [libthai][libthai]
+    - consists of dictionaries in different categories, with a make script
+    - LGPL-2.1
+
+[pythainlp]: https://github.com/PyThaiNLP/pythainlp
+[libthai]: https://github.com/tlwg/libthai/
+[dict-pythainlp]: https://github.com/PyThaiNLP/pythainlp/blob/dev/pythainlp/corpus/words_th.txt
+[dict-libthai]: https://github.com/tlwg/libthai/tree/master/data
 
 ## Usage
 
-### Command-line interface
+### Node.js binding
 
-- [nlpo3-cli](nlpo3-cli/) <a href="https://crates.io/crates/nlpo3-cli/"><img alt="crates.io" src="https://img.shields.io/crates/v/nlpo3-cli.svg"/></a>
+#### Source code
 
-```bash
-echo "ฉันกินข้าว" | nlpo3 segment
-```
+See the [nlpo3-nodejs](./nlpo3-nodejs/) directory.
+
+### Python binding
+
+[![PyPI](https://img.shields.io/pypi/v/nlpo3.svg "PyPI")](https://pypi.python.org/pypi/nlpo3)
 
-### Bindings
+#### Install
+
+```shell
+pip install nlpo3
+```
 
-- [Node.js](nlpo3-nodejs/)
-- [Python](nlpo3-python/) <a href="https://pypi.python.org/pypi/nlpo3"><img alt="pypi" src="https://img.shields.io/pypi/v/nlpo3.svg"/></a>
+#### Example
 
 ```python
 from nlpo3 import load_dict, segment
@@ -45,9 +79,19 @@ load_dict("path/to/dict.file", "dict_name")
 segment("สวัสดีครับ", "dict_name")
 ```
 
-### As Rust library
+#### Source code
 
-<a href="https://crates.io/crates/nlpo3/"><img alt="crates.io" src="https://img.shields.io/crates/v/nlpo3.svg"/></a>
+See the [nlpo3-python](./nlpo3-python/) directory.
+
+### Rust library
+
+[![crates.io](https://img.shields.io/crates/v/nlpo3.svg "crates.io")](https://crates.io/crates/nlpo3/)
+
+#### Install
+
+```shell
+cargo add nlpo3
+```
 
 In `Cargo.toml`:
 
@@ -57,6 +101,8 @@ In `Cargo.toml`:
 nlpo3 = "1.3.2"
 ```
 
+#### Example
+
 Create a tokenizer using a dictionary from file,
 then use it to tokenize a string (safe mode = true, and parallel mode = false):
 
@@ -87,6 +133,36 @@ Remove words from an existing tokenizer:
 tokenizer.remove_word(&["กระเพรา", "ชานชลา"]);
 ```
 
+#### Source code
+
+See the [root](/) directory.
+
+### Command-line interface
+
+[![crates.io](https://img.shields.io/crates/v/nlpo3-cli.svg "crates.io")](https://crates.io/crates/nlpo3-cli/)
+
+#### Install
+
+```shell
+cargo install nlpo3-cli
+```
+
+#### Usage
+
+```shell
+nlpo3 help
+```
+
+#### Example
+
+```bash
+echo "ฉันกินข้าว" | nlpo3 segment
+```
+
+#### Source code
+
+See the [nlpo3-cli](./nlpo3-cli/) directory.
+
 ## Build
 
 ### Requirements
@@ -115,10 +191,17 @@ cargo build --release
 
 Check `target/` for build artifacts.
 
-## Development documents
+## Development
+
+Development document:
 
 - [Notes on custom string](src/NOTE_ON_STRING.md)
 
-## Issues
+Issues:
+
+- Please report issues at <https://github.com/PyThaiNLP/nlpo3/issues>
+
+## License
 
-Please report issues at <https://github.com/PyThaiNLP/nlpo3/issues>
+nlpO3 is copyrighted by its authors and licensed under the terms of the
+Apache License 2.0 (Apache-2.0). See the [LICENSE](./LICENSE) file for details.