Loghi-HTR V2 #25

Merged
merged 272 commits into from
Mar 25, 2024

TimKoornstra
Collaborator

@TimKoornstra TimKoornstra commented Mar 15, 2024

V2

This combined pull request encompasses a range of enhancements and refactors across various aspects of the project. Below are the key changes and improvements grouped by their original pull request numbers.

Visualize files revamp #11

  • Refactored the old visualization files and made them independent of the data loader and utils files; added dark/light-mode support.
  • Time-step prediction visualizer: takes the top-3 most probable characters per time step, generates a CTC encoding table, and pre-processes "invisible" or unknown characters.
  • Filter activations visualizer: plots conv layer activations on top of the input sample image, plots conv layer activations for a random noise image, and accounts for models with different channel counts and sample image shapes.
  • PDF combiner that merges the generated visualizations into a single sheet.
  • vis_arg_parser: processes command-line arguments in the same way loghi-htr does.
  • README instructions.
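
To illustrate the top-3 selection the time-step prediction visualizer performs, here is a minimal sketch in plain Python. It is not the visualizer's actual code: the function name `top_k_per_timestep`, the toy charset, the `[blank]` placeholder, and the probabilities are all assumptions made for the example.

```python
def top_k_per_timestep(probs, charset, k=3):
    """Return, for every time step, the k most probable (char, prob) pairs."""
    table = []
    for step in probs:
        # Rank character indices by probability, highest first.
        ranked = sorted(range(len(step)), key=lambda i: step[i], reverse=True)[:k]
        table.append([(charset[i], step[i]) for i in ranked])
    return table


# Hypothetical charset: index 0 stands for the CTC blank, rendered with a
# visible placeholder so that "invisible" characters still show up in a table.
charset = ["[blank]", "a", "b", "c"]

# One row of per-character probabilities per time step (made-up values).
probs = [
    [0.10, 0.70, 0.15, 0.05],
    [0.80, 0.10, 0.06, 0.04],
    [0.04, 0.06, 0.60, 0.30],
]

table = top_k_per_timestep(probs, charset)
for t, row in enumerate(table):
    print(t, row)
```

Each row of `table` corresponds to one time step and could feed one column of a CTC encoding table.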

Main refactor & organize files into subfolders #13

  • Extensive refactoring of main.py, organizing functions into separate files within subfolders inside the src directory.
  • Overhauled logging for better clarity and structured logging.
  • Improved GPU handling by shifting selection from environment variable manipulation to direct TensorFlow handling.
  • Adjusted TensorFlow logging to clean up output.
  • Deprecation notices for specific arguments effective May 2024.
  • Updates in ResidualBlock to incorporate ReLU activation by default.
  • This update constitutes a major change in the codebase.

Improved LR schedule #14

  • Added a custom LoghiLearningRateSchedule class for more flexible learning rate adjustments.
  • Integrated learning rate logging in each step within CustomLoghiCallback.
  • Added unit tests for the LoghiLearningRateSchedule class and incorporated these tests into GitHub Actions.
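
This description does not spell out what the schedule computes, so the following is only a sketch of the kind of step-wise schedule such a class could implement: linear warmup followed by exponential decay. The function name, parameter names, and default values are hypothetical and are not taken from LoghiLearningRateSchedule.

```python
def learning_rate(step, initial_lr=3e-4, warmup_steps=100,
                  decay_rate=0.99, decay_steps=1000):
    """Hypothetical schedule: linear warmup, then exponential decay."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up linearly so that the last warmup step reaches initial_lr.
        return initial_lr * (step + 1) / warmup_steps
    # After warmup, multiply by decay_rate once per decay_steps steps.
    return initial_lr * decay_rate ** ((step - warmup_steps) / decay_steps)


for step in (0, 50, 99, 100, 1100):
    print(step, learning_rate(step))
```

Computing the rate per step rather than per epoch is what makes per-step logging in a callback meaningful.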

Improved args, and added config file #15

  • Implemented the ability to run Loghi using a --config_file argument.
  • Completely overhauled config.json file structure for better organization.
  • Added a Config class in src/setup/config.py for enhanced management of configuration settings.
  • API updated to support both old and new config.json structures for backward compatibility.
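
As a rough sketch of how a --config_file argument can coexist with regular command-line arguments and with two file layouts, consider the example below. The nested "args" key, the `batch_size` option, and the helper names are assumptions for illustration; the actual structure is defined by config.json and src/setup/config.py.

```python
import argparse
import json

DEFAULTS = {"batch_size": 4}


def normalize_config(raw):
    """Accept a nested layout (hypothetical "args" key) or an old flat layout."""
    if isinstance(raw.get("args"), dict):
        return dict(raw["args"])
    return dict(raw)


def load_config(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--config_file", default=None)
    parser.add_argument("--batch_size", type=int, default=None)
    args = parser.parse_args(argv)

    config = dict(DEFAULTS)
    if args.config_file:
        with open(args.config_file, encoding="utf-8") as f:
            config.update(normalize_config(json.load(f)))
    # Explicit command-line values take precedence over the file.
    if args.batch_size is not None:
        config["batch_size"] = args.batch_size
    return config


print(load_config(["--batch_size", "8"]))
```

Letting explicit command-line values override the file is a design choice of this sketch, not something stated above.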

API v2 #17

  • Significant updates to the model loading process, refactoring for efficiency, and new environment variables.
  • Gunicorn integration and endpoint monitoring for health checks.
  • Security enhancements and separate decoding process for better GPU utilization.
  • Logging improvements and output format changes for predictions.

Minor improvements #18

  • General code cleanup and custom callback enhancement.
  • Introduced RMSProp as an additional optimizer option.
  • Adjustments in CTC Loss functionality and relocation for better organization.

V1 DataGenerator and Data Augment Revamp on GPU (no unittest updates) #21

  • Refactored data augmentations to be part of the final Sequential model for GPU support.
  • Custom Keras Layer classes for each type of augmentation, easily extendable for future augments.
  • Data augmentation visualizations with --visualize_augments.

QOL changes and code simplifications #22

  • Simplifications across the code enhancing readability and maintainability.
  • Improved the greedy decoding confidence score and sped up test and validation modes.
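
For context on what a greedy decoding confidence score can look like, here is a minimal sketch of CTC greedy decoding. Taking the product of the per-step maximum probabilities is one common choice and is an assumption here, not necessarily the formula this project uses; the charset and probabilities are made up.

```python
def greedy_decode(probs, charset, blank=0):
    """Pick the best class per step, collapse repeats, then drop blanks."""
    text = []
    confidence = 1.0
    previous = blank
    for step in probs:
        best = max(range(len(step)), key=lambda i: step[i])
        confidence *= step[best]  # assumed score: product of step maxima
        if best != blank and best != previous:
            text.append(charset[best])
        previous = best
    return "".join(text), confidence


charset = ["", "a", "b"]  # index 0 is the CTC blank
probs = [
    [0.05, 0.90, 0.05],  # a
    [0.10, 0.80, 0.10],  # a again, collapsed into the previous one
    [0.70, 0.20, 0.10],  # blank
    [0.30, 0.10, 0.60],  # b
]
text, confidence = greedy_decode(probs, charset)
print(text, confidence)
```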

Improve code quality #23

  • Replaced f-strings in logging calls with lazy %-style formatting.
  • Removed unused variables and added encoding to all with open() statements.
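
Both changes follow standard Python practice. A small illustration, with a made-up logger name and helper functions:

```python
import logging

logger = logging.getLogger("example")


def log_epoch(epoch, loss):
    # Lazy %-style formatting: the message is only built if a handler
    # actually emits the record, unlike an f-string, which is always built.
    logger.info("Epoch %d finished with loss %.4f", epoch, loss)


def read_lines(path):
    # An explicit encoding avoids depending on the platform default.
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()
```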

Data loader upgrade #24

  • Refactored DataLoader and DataGenerator classes, renaming and improving code documentation and clarity.
  • Fixed an augmentation bug and split functions into smaller subroutines for better readability.
  • Added the ability to use "sample weights" in training data; they should be supplied in the second column of the txt file.
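
As an illustration of the sample-weight column, the sketch below parses one line of a training list. It assumes tab-separated columns in the order image path, weight, transcription, and falls back to a weight of 1.0 when weights are disabled. The function name, the flag, and the exact layout are assumptions, not the DataLoader's actual code.

```python
def parse_line(line, use_sample_weights=False):
    """Split one tab-separated line into (path, text, weight)."""
    fields = line.rstrip("\n").split("\t")
    path = fields[0]
    if use_sample_weights:
        weight = float(fields[1])      # second column holds the weight
        text = "\t".join(fields[2:])
    else:
        weight = 1.0                   # neutral weight when none is given
        text = "\t".join(fields[1:])
    return path, text, weight


print(parse_line("img_001.png\t0.5\thello world\n", use_sample_weights=True))
print(parse_line("img_002.png\tplain line\n"))
```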

Other changes include deprecating arguments in favor of new ones, an updated requirements.txt, and a recommended model added to the model library.

TimKoornstra and others added 30 commits November 24, 2023 09:40
Oh yeah, it's all coming together
Next: another revision and split functions into files
…oved parts of visualize prep to separate vis_utils.py. Added error handling and docstrings
…alents, docstrings, normalisation of variable names
…rements.txt, pruned vis_arg_parser, reintroduced main() functions, PdfMaker changes to structure
Main refactor & organize files into subfolders
@TimKoornstra TimKoornstra merged commit 59af90a into master Mar 25, 2024
5 checks passed
@TimKoornstra TimKoornstra deleted the v2 branch March 25, 2024 09:57
3 participants