Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File cleanup #3

Merged
merged 20 commits into from
Sep 27, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
ec94aa1
Confidence interval, removal of unused code
Thelukepet Aug 31, 2023
7d92050
Cer confidence (line-level) based on certainty input
Thelukepet Sep 4, 2023
ef6b397
Integrated Tim's Gunicorn solution as replacement for Flask
Thelukepet Sep 11, 2023
12e2210
Removed test files, normalised data_generator and data_loader, refact…
Thelukepet Sep 21, 2023
1317a5b
Re-adding the test files for github tests
Thelukepet Sep 21, 2023
0cb14df
Adjusted class names / param names for current version
Thelukepet Sep 21, 2023
ca7c823
Remove pycache, move requirements, update README
TimKoornstra Sep 22, 2023
3cea407
Update tests with new requirements location
TimKoornstra Sep 22, 2023
978f3d1
Remove redundant model.py functions
TimKoornstra Sep 22, 2023
f45ab47
Uncomment config.py, remove redundant args from parser
TimKoornstra Sep 25, 2023
655400a
Readd args used for replace_recurrent_layer
TimKoornstra Sep 25, 2023
f2abad8
Make replace_recurrent_layer use VGSL spec
TimKoornstra Sep 25, 2023
87912f8
Add unittests for replace_recurrent_layer
TimKoornstra Sep 25, 2023
dd4a4ab
Unittests can be run from within root or tests/ folder
TimKoornstra Sep 25, 2023
81f2df5
Update README.md with FAQ + add no model error handling for VGSL
TimKoornstra Sep 25, 2023
46dd41c
Add API instructions to README
TimKoornstra Sep 25, 2023
290686a
Add custom model names to VGSLSpec
TimKoornstra Sep 25, 2023
cc7219a
Bugfix Bidirectional layer
Thelukepet Sep 26, 2023
92461a2
Solve merge conflicts
TimKoornstra Sep 27, 2023
c711c66
Remove files and comments
TimKoornstra Sep 27, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
  •  
  •  
  •  
13 changes: 10 additions & 3 deletions .github/workflows/run_tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -21,22 +21,22 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r src/requirements.txt
pip install -r requirements.txt

- name: Test utils
run: |
python -m unittest tests/test_utils.py
env:
TF_CPP_MIN_LOG_LEVEL: '2'

- name: Test DataGeneratorNew2
- name: Test DataGenerator
if: always()
run: |
python -m unittest tests/test_datagenerator.py
env:
TF_CPP_MIN_LOG_LEVEL: '2'

- name: Test DataLoaderNew
- name: Test DataLoader
if: always()
run: |
python -m unittest tests/test_dataloader.py
Expand All @@ -49,3 +49,10 @@ jobs:
python -m unittest tests/test_model_creation.py
env:
TF_CPP_MIN_LOG_LEVEL: '2'

- name: Test replacing model layers
if: always()
run: |
python -m unittest tests/test_model_replacement.py
env:
TF_CPP_MIN_LOG_LEVEL: '2'
111 changes: 105 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ Loghi HTR also works on machine printed text.
1. [Installation](#installation)
2. [Usage](#usage)
3. [Variable-size Graph Specification Language (VGSL)](#variable-size-graph-specification-language-vgsl)
4. [API Usage Guide](#api-usage-guide)
4. [Frequently Asked Questions (FAQ)](#FAQ)

## Installation

Expand Down Expand Up @@ -40,7 +42,7 @@ python3 -m pip install .

```bash
git clone https://github.com/rvankoert/loghi-htr.git
cd loghi-htr/src/
cd loghi-htr
python3 -m pip install -r requirements.txt
```

Expand Down Expand Up @@ -73,9 +75,9 @@ The command-line options include, but are not limited to:
- `--do_train`: Enable the training stage.
- `--do_validate`: Enable the validation stage.
- `--do_inference`: Perform inference.
- `--train_list`: List of files containing training data. Format: `path/to/textline/image <TAB> transcription`.
- `--validation_list`: List of files containing validation data. Format: `path/to/textline/image <TAB> transcription`.
- `--inference_list`: List of files containing data to perform inference on. Format: `path/to/textline/image`.
- `--train_list`: List of files containing training data. Format: `/path/to/textline/image <TAB> transcription`.
- `--validation_list`: List of files containing validation data. Format: `/path/to/textline/image <TAB> transcription`.
- `--inference_list`: List of files containing data to perform inference on. Format: `/path/to/textline/image`.
- `--learning_rate`: Set the learning rate. Recommended values range from 0.001 to 0.000001, with 0.0003 being the default.
- `--channels`: Number of image channels. Use 3 for standard RGB-images, and 4 for images with an alpha channel containing the textline polygon-mask.
- `--gpu`: GPU configuration. Use -1 for CPU, 0 for the first GPU, and so on.
Expand All @@ -84,6 +86,8 @@ The command-line options include, but are not limited to:
- `--use_mask`: Enable when using `batch_size` > 1.
- `--results_file`: The inference results are aggregated in this file.
- `--config_file_output`: The output location of the config.
- `--replace_recurrent_layer`: Specifies the [VGSL string](#variable-size-graph-specification-language-vgsl) to define the architecture of the recurrent layers that will replace the recurrent layers of an existing model. This argument is required when you want to modify the recurrent layers of a model specified by `--existing_model`. The VGSL string describes the type, direction, and number of units for the recurrent layers. For example, "Lfs128 Lf64" describes two LSTM layers with 128 and 64 units respectively. When using this argument, ensure that `--existing_model` is also provided to specify the model whose recurrent layers you want to replace.
- `--replace_final_layer`: Enables the replacement of the final dense layer of an existing model. This is useful when you want to adjust the number of output characters or change the masking option without modifying the rest of the model. When this argument is used, the model specified by `--existing_model` will have its final dense layer replaced with a new one. The number of output units will be adjusted based on the value of `number_characters` and whether the `--use_mask` option is enabled.

For detailed options and configurations, you can also refer to the help command:

Expand Down Expand Up @@ -117,8 +121,8 @@ python3 main.py
```bash
CUDA_VISIBLE_DEVICES=0
python3 main.py
--existing_model path/to/existing/model
--charlist path/to/existing/model/charlist.txt
--existing_model /path/to/existing/model
--charlist /path/to/existing/model/charlist.txt
--do_inference
--inference_list "inference_lines_1.txt"
--height 64
Expand Down Expand Up @@ -254,3 +258,98 @@ In this example, the string defines a neural network with input layers, convolut
- **ResidualBlock**: Documentation in progress.
- **CTCLayer**: Documentation in progress.

## API Usage Guide

This guide walks you through the process of setting up and running the API, as well as how to interact with it.

### 1. Setting up the API

Navigate to the `src/api` directory in your project:

```bash
cd src/api
```

#### Starting the API

You have the choice to run the API using either `gunicorn` (recommended) or `flask`. To start the server:

Using `gunicorn`:

```bash
python3 gunicorn_app.py
```

Or using `flask`:

```bash
python3 flask_app.py
```

#### Environment Variables Configuration

Before running the app, you must set several environment variables. The app fetches configurations from these variables:

**Gunicorn Options:**

```bash
GUNICORN_RUN_HOST # Default: "127.0.0.1:8000": The host and port where the API should run.
GUNICORN_WORKERS # Default: "1": Number of worker processes.
GUNICORN_THREADS # Default: "1": Number of threads per worker.
GUNICORN_ACCESSLOG # Default: "-": Access log settings.
```

**Loghi-HTR Options:**

```bash
LOGHI_MODEL_PATH # Path to the model.
LOGHI_CHARLIST_PATH # Path to the character list.
LOGHI_MODEL_CHANNELS # Number of channels in the model.
LOGHI_BATCH_SIZE # Default: "256": Batch size for processing.
LOGHI_OUTPUT_PATH # Directory where predictions are saved.
LOGHI_MAX_QUEUE_SIZE # Default: "10000": Maximum size of the processing queue.
```

**GPU Options:**

```bash
LOGHI_GPUS # Default: "0": GPU configuration.
```

You can set these variables in your shell or use a script. An example script to start a `gunicorn` server can be found in `src/api/start_local_app.sh`.

### 2. Interacting with the running API

Once the API is up and running, you can send HTR requests using curl. Here's how:

```bash
curl -X POST -F "image=@$input_path" -F "group_id=$group_id" -F "identifier=$filename" http://localhost:5000/predict
```

Replace `$input_path`, `$group_id`, and `$filename` with your specific values. The model processes the image, predicts the handwritten text, and saves the predictions in the specified output path (from the `LOGHI_OUTPUT_PATH` environment variable).

---

This guide should help you get started with the API. For advanced configurations or troubleshooting, please reach out for support.

## FAQ

If you're new to using this tool or encounter issues, this FAQ section provides answers to common questions and problems. If you don't find your answer here, please reach out for further assistance.

### How do I use `replace_recurrent_layer`?

The `replace_recurrent_layer` is a feature that allows you to replace the recurrent layers of an existing model with a new architecture defined by a VGSL string. To use it:

1. Specify the model you want to modify using the `--existing_model` argument.
2. Provide the VGSL string that defines the new recurrent layer architecture with the `--replace_recurrent_layer` argument. The VGSL string describes the type, direction, and number of units for the recurrent layers. For example, "Lfs128 Lfs64" describes two LSTM layers with 128 and 64 units respectively, with both layers returning sequences.
3. (Optional) Use `--use_mask` if you want the replaced layer to account for masking.
4. Execute your script or command, and the tool will replace the recurrent layers of your existing model based on the VGSL string you provided.

### I'm getting the following error when I want to use `replace_recurrent_layer`: `Input 0 of layer "lstm_1" is incompatible with the layer: expected ndim=3, found ndim=2.` What do I do?

This error usually indicates that there is a mismatch in the expected input dimensions of the LSTM layer. Often, this is because the VGSL spec for the recurrent layers is missing the `[s]` argument, which signifies that the layer should return sequences.

To resolve this:
- Ensure that your VGSL string for the LSTM layer has an `s` in it, which will make the layer return sequences. For instance, instead of "Lf128", use "Lfs128".
- Re-run the script or command with the corrected VGSL string.

1 change: 0 additions & 1 deletion model/charList2.txt

This file was deleted.

File renamed without changes.
18 changes: 0 additions & 18 deletions src/AppLocker.py

This file was deleted.

Loading