steno is a personal project to digitize my stenographic writings.
In my youth I wrote most of my notes in shorthand (Duployan system). This project aims to digitize those old notes and integrate them into my current note system (org-roam).
The method has two steps:
- scan the notes page by page
- transform each image into a text file
Another goal of this project is to check the viability of basilisp (a Clojure dialect that runs on the Python VM). Python has many interesting libraries, especially for scientific computing, but I find its syntax horrible, so basilisp could be a very good solution.
The application should be a simple pipeline:
- extractor
  - split the page image into word images
- image-processor
  - clean and simplify the word image
- converter
  - convert the word image into a sequence of numbers
- translator
  - convert the number sequence into a string of characters
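The four stages above can be sketched as plain functions chained together. This is only an illustration in plain Python, not the project's basilisp code: the binary-image representation, the blank-column segmentation, the per-column ink-count encoding, and the number-to-character table are all assumptions made up for the example.

```python
from functools import reduce

Image = list[list[int]]  # binary image: 1 = ink, 0 = background (assumed format)


def extractor(page: Image) -> list[Image]:
    """Split a page into word images at fully blank columns (toy segmentation)."""
    width = len(page[0]) if page else 0
    words, start = [], None
    # Iterate one column past the end so a word touching the right edge is closed.
    for x in range(width + 1):
        has_ink = x < width and any(row[x] for row in page)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            words.append([row[start:x] for row in page])
            start = None
    return words


def image_processor(words: list[Image]) -> list[Image]:
    """Clean each word image by dropping rows that contain no ink."""
    return [[row for row in word if any(row)] for word in words]


def converter(words: list[Image]) -> list[list[int]]:
    """Encode each word as the ink count of every column (placeholder encoding)."""
    return [[sum(col) for col in zip(*word)] for word in words]


def translator(codes: list[list[int]]) -> str:
    """Map each number to a character through a made-up lookup table."""
    alphabet = {1: "a", 2: "b", 3: "c"}  # hypothetical mapping
    return " ".join("".join(alphabet.get(n, "?") for n in code) for code in codes)


def pipeline(*stages):
    """Compose stages left to right: the output of one feeds the next."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)


steno = pipeline(extractor, image_processor, converter, translator)

page = [
    [1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
print(steno(page))  # prints "b ab"
```

Keeping every stage a function from one value to the next means a stage can be swapped (for example, a real skeletonization step in image_processor) without touching the others.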
- Install
  - direnv
  - nix (https://nixos.org/download/#nix-install-linux)
- Create an .envrc.local file (see .envrc.local.example).
- In the project folder run:
  direnv allow
  The first run takes a long time because all packages and libraries have to be downloaded.
- Install
  - python 3.12+
  - uv (https://docs.astral.sh/uv/getting-started/installation/)
  - babashka
  - cljstyle
  - kondo
- Manually create the user variables defined in .envrc.local.example.
- In the project folder run:
  uv venv
  uv sync
The application can be run with the command:
bb app <params>
To see the available params, run:
bb app -h
To format the code run:
bb format
To lint the code run:
bb kondo
To keep unformatted commits and commits with lint errors out of the main branch, run this once:
git config core.hooksPath hooks
The pre-push script will block the push if there are style or lint errors in the code.
- https://en.wikipedia.org/wiki/Duployan_shorthand
- https://opencv.org/
- https://theailearner.com/tag/skeletonization-opencv/
- https://github.com/Wesley-Li/skeleton
- https://docs.opencv.org/4.x/d9/d61/tutorial_py_morphological_ops.html
This project is released under the GNU General Public License. See the file for details.