rvankoert · rvankoert · Sep 27, 2023 · Aug 31, 2023 · Sep 4, 2023 · Sep 11, 2023
diff --git a/.github/workflows/run_tests.yml b/.github/workflows/run_tests.yml
@@ -21,22 +21,22 @@ jobs:
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip
-        pip install -r src/requirements.txt
+        pip install -r requirements.txt
 
     - name: Test utils
       run: |
         python -m unittest tests/test_utils.py
       env:
         TF_CPP_MIN_LOG_LEVEL: '2'
 
-    - name: Test DataGeneratorNew2
+    - name: Test DataGenerator
       if: always()
       run: |
         python -m unittest tests/test_datagenerator.py
       env:
         TF_CPP_MIN_LOG_LEVEL: '2'
 
-    - name: Test DataLoaderNew
+    - name: Test DataLoader
       if: always()
       run: |
         python -m unittest tests/test_dataloader.py
@@ -49,3 +49,10 @@ jobs:
         python -m unittest tests/test_model_creation.py
       env:
         TF_CPP_MIN_LOG_LEVEL: '2'
+
+    - name: Test replacing model layers
+      if: always()
+      run: |
+        python -m unittest tests/test_model_replacement.py
+      env:
+        TF_CPP_MIN_LOG_LEVEL: '2'
diff --git a/README.md b/README.md
@@ -9,6 +9,8 @@ Loghi HTR also works on machine printed text.
 1. [Installation](#installation)
 2. [Usage](#usage)
 3. [Variable-size Graph Specification Language (VGSL)](#variable-size-graph-specification-language-vgsl)
+4. [API Usage Guide](#api-usage-guide)
+4. [Frequently Asked Questions (FAQ)](#FAQ)
 
 ## Installation
 
@@ -40,7 +42,7 @@ python3 -m pip install .
 
 ```bash
 git clone https://github.com/rvankoert/loghi-htr.git
-cd loghi-htr/src/
+cd loghi-htr
 python3 -m pip install -r requirements.txt
 ```
 
@@ -73,9 +75,9 @@ The command-line options include, but are not limited to:
 - `--do_train`: Enable the training stage.
 - `--do_validate`: Enable the validation stage.
 - `--do_inference`: Perform inference.
-- `--train_list`: List of files containing training data. Format: `path/to/textline/image <TAB> transcription`.
-- `--validation_list`: List of files containing validation data. Format: `path/to/textline/image <TAB> transcription`.
-- `--inference_list`: List of files containing data to perform inference on. Format: `path/to/textline/image`.
+- `--train_list`: List of files containing training data. Format: `/path/to/textline/image <TAB> transcription`.
+- `--validation_list`: List of files containing validation data. Format: `/path/to/textline/image <TAB> transcription`.
+- `--inference_list`: List of files containing data to perform inference on. Format: `/path/to/textline/image`.
 - `--learning_rate`: Set the learning rate. Recommended values range from 0.001 to 0.000001, with 0.0003 being the default.
 - `--channels`: Number of image channels. Use 3 for standard RGB-images, and 4 for images with an alpha channel containing the textline polygon-mask.
 - `--gpu`: GPU configuration. Use -1 for CPU, 0 for the first GPU, and so on.
@@ -84,6 +86,8 @@ The command-line options include, but are not limited to:
 - `--use_mask`: Enable when using `batch_size` > 1.
 - `--results_file`: The inference results are aggregated in this file.
 - `--config_file_output`: The output location of the config.
+- `--replace_recurrent_layer`: Specifies the [VGSL string](#variable-size-graph-specification-language-vgsl) to define the architecture of the recurrent layers that will replace the recurrent layers of an existing model. This argument is required when you want to modify the recurrent layers of a model specified by `--existing_model`. The VGSL string describes the type, direction, and number of units for the recurrent layers. For example, "Lfs128 Lf64" describes two LSTM layers with 128 and 64 units respectively. When using this argument, ensure that `--existing_model` is also provided to specify the model whose recurrent layers you want to replace.
+- `--replace_final_layer`: Enables the replacement of the final dense layer of an existing model. This is useful when you want to adjust the number of output characters or change the masking option without modifying the rest of the model. When this argument is used, the model specified by `--existing_model` will have its final dense layer replaced with a new one. The number of output units will be adjusted based on the value of `number_characters` and whether the `--use_mask` option is enabled.
 
 For detailed options and configurations, you can also refer to the help command:
 
@@ -117,8 +121,8 @@ python3 main.py
 ```bash
 CUDA_VISIBLE_DEVICES=0 
 python3 main.py 
---existing_model path/to/existing/model 
---charlist path/to/existing/model/charlist.txt
+--existing_model /path/to/existing/model 
+--charlist /path/to/existing/model/charlist.txt
 --do_inference 
 --inference_list "inference_lines_1.txt"
 --height 64 
@@ -254,3 +258,98 @@ In this example, the string defines a neural network with input layers, convolut
 - **ResidualBlock**: Documentation in progress.
 - **CTCLayer**: Documentation in progress.
 
+## API Usage Guide
+
+This guide walks you through the process of setting up and running the API, as well as how to interact with it.
+
+### 1. Setting up the API
+
+Navigate to the `src/api` directory in your project:
+
+```bash
+cd src/api
+```
+
+#### Starting the API
+
+You have the choice to run the API using either `gunicorn` (recommended) or `flask`. To start the server:
+
+Using `gunicorn`:
+
+```bash
+python3 gunicorn_app.py
+```
+
+Or using `flask`:
+
+```bash
+python3 flask_app.py
+```
+
+#### Environment Variables Configuration
+
+Before running the app, you must set several environment variables. The app fetches configurations from these variables:
+
+**Gunicorn Options:**
+
+```bash
+GUNICORN_RUN_HOST        # Default: "127.0.0.1:8000": The host and port where the API should run.
+GUNICORN_WORKERS         # Default: "1": Number of worker processes.
+GUNICORN_THREADS         # Default: "1": Number of threads per worker.
+GUNICORN_ACCESSLOG       # Default: "-": Access log settings.
+```
+
+**Loghi-HTR Options:**
+
+```bash
+LOGHI_MODEL_PATH         # Path to the model.
+LOGHI_CHARLIST_PATH      # Path to the character list.
+LOGHI_MODEL_CHANNELS     # Number of channels in the model.
+LOGHI_BATCH_SIZE         # Default: "256": Batch size for processing.
+LOGHI_OUTPUT_PATH        # Directory where predictions are saved.
+LOGHI_MAX_QUEUE_SIZE     # Default: "10000": Maximum size of the processing queue.
+```
+
+**GPU Options:**
+
+```bash
+LOGHI_GPUS               # Default: "0": GPU configuration.
+```
+
+You can set these variables in your shell or use a script. An example script to start a `gunicorn` server can be found in `src/api/start_local_app.sh`.
+
+### 2. Interacting with the running API
+
+Once the API is up and running, you can send HTR requests using curl. Here's how:
+
+```bash
+curl -X POST -F "image=@$input_path" -F "group_id=$group_id" -F "identifier=$filename" http://localhost:5000/predict
+```
+
+Replace `$input_path`, `$group_id`, and `$filename` with your specific values. The model processes the image, predicts the handwritten text, and saves the predictions in the specified output path (from the `LOGHI_OUTPUT_PATH` environment variable).
+
+---
+
+This guide should help you get started with the API. For advanced configurations or troubleshooting, please reach out for support.
+
+## FAQ
+
+If you're new to using this tool or encounter issues, this FAQ section provides answers to common questions and problems. If you don't find your answer here, please reach out for further assistance.
+
+### How do I use `replace_recurrent_layer`?
+
+The `replace_recurrent_layer` is a feature that allows you to replace the recurrent layers of an existing model with a new architecture defined by a VGSL string. To use it:
+
+1. Specify the model you want to modify using the `--existing_model` argument.
+2. Provide the VGSL string that defines the new recurrent layer architecture with the `--replace_recurrent_layer` argument. The VGSL string describes the type, direction, and number of units for the recurrent layers. For example, "Lfs128 Lfs64" describes two LSTM layers with 128 and 64 units respectively, with both layers returning sequences.
+3. (Optional) Use `--use_mask` if you want the replaced layer to account for masking.
+4. Execute your script or command, and the tool will replace the recurrent layers of your existing model based on the VGSL string you provided.
+
+### I'm getting the following error when I want to use `replace_recurrent_layer`: `Input 0 of layer "lstm_1" is incompatible with the layer: expected ndim=3, found ndim=2.` What do I do?
+
+This error usually indicates that there is a mismatch in the expected input dimensions of the LSTM layer. Often, this is because the VGSL spec for the recurrent layers is missing the `[s]` argument, which signifies that the layer should return sequences.
+
+To resolve this:
+- Ensure that your VGSL string for the LSTM layer has an `s` in it, which will make the layer return sequences. For instance, instead of "Lf128", use "Lfs128".
+- Re-run the script or command with the corrected VGSL string.
+
diff --git a/model/charList2.txt b/model/charList2.txt
diff --git a/src/requirements.txt → requirements.txt b/src/requirements.txt → requirements.txt
diff --git a/src/AppLocker.py b/src/AppLocker.py