diff --git a/README.md b/README.md
index e6a617c..e689e9d 100644
--- a/README.md
+++ b/README.md
@@ -12,30 +12,27 @@
 ## πŸ€— Introduction

-**torchlm** is a PyTorch landmarks-only library with **100+ data augmentations**, support **training** and **inference**. **torchlm** is aims at only focus on any landmark detection, such as face landmarks, hand keypoints and body keypoints, etc. It provides **30+** native data augmentations and can **bind** with **80+** transforms from torchvision and albumentations, no matter the input is a np.ndarray or a torch Tensor, **torchlm** will automatically be compatible with different data types and then wrap it back to the original type through a **autodtype** wrapper. Further, **torchlm** will add modules for **training** and **inference** in the future.
+**torchlm** aims to build a high-level pipeline for face landmark detection. It supports **100+ data augmentations**, **training**, and **inference**, and can be easily installed with **pip**.

❀️ Star πŸŒŸπŸ‘†πŸ» this repo to support me if it helps you, thanks ~

+## πŸ‘‹ Core Features
+* High-level pipeline for **training** and **inference**.
+* Provides **30+** native landmarks data augmentations.
+* Can **bind 80+** transforms from torchvision and albumentations with **one line of code**.
+* Supports models for face landmark detection such as PIPNet, with YOLOX, YOLOv5, ResNet, MobileNet and ShuffleNet planned.

-# πŸ†• What's New
-
+## πŸ†• What's New
+* [2022/03/08]: Add **PIPNet**: [Towards Efficient Facial Landmark Detection in the Wild, CVPR2021](https://github.com/jhb86253817/PIPNet)
 * [2022/02/13]: Add **30+** native data augmentations and **bind** **80+** transforms from torchvision and albumentations.

 ## πŸ› οΈ Usage

@@ -44,53 +41,56 @@
 * opencv-python-headless>=4.5.2
 * numpy>=1.14.4
 * torch>=1.6.0
-* torchvision>=0.9.0
+* torchvision>=0.8.0
 * albumentations>=1.1.0
+* onnx>=1.8.0
+* onnxruntime>=1.7.0
+* tqdm>=4.10.0

 ### Installation
-you can install **torchlm** directly from [pypi](https://pypi.org/project/torchlm/).
+You can install **torchlm** directly from [pypi](https://pypi.org/project/torchlm/). See the [NOTE](#torchlm-NOTE) below before installation!

```shell
pip3 install torchlm
# install from specific pypi mirrors using '-i'
pip3 install torchlm -i https://pypi.org/simple/
```

-or install from source.
+Or install from source in editable mode with `-e` if you want the latest torchlm.

```shell
# clone the torchlm repository locally if you want the latest torchlm
git clone --depth=1 https://github.com/DefTruth/torchlm.git
cd torchlm
# install in editable mode
pip install -e .
```
+
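To quickly sanity-check the installation afterwards (a plain import test; no torchlm-specific API is assumed here):

```shell
python3 -c "import torchlm, cv2, albumentations; print('torchlm OK')"
```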
+
+**NOTE**: If you hit a conflict between different installed versions of opencv (opencv-python vs. opencv-python-headless; `albumentations` needs opencv-python-headless), please uninstall both opencv-python and opencv-python-headless first, and then reinstall torchlm. See [albumentations#1139](https://github.com/albumentations-team/albumentations/issues/1139) for more details.
+
+```shell
+# first, uninstall the conflicting opencv packages
+pip uninstall opencv-python
+pip uninstall opencv-python-headless
+pip uninstall torchlm # if you have already installed torchlm
+# then, reinstall torchlm
+pip install torchlm # will also install deps, e.g. opencv
+```

-### Data Augmentation
+### 🌟🌟Data Augmentation
 **torchlm** provides **30+** native data augmentations for landmarks and can **bind** with **80+** transforms from torchvision and albumentations through the **torchlm.bind** method. Further, **torchlm.bind** provides a `prob` param at bind-level to force any transform or callable to behave as a random-style augmentation. The data augmentations in **torchlm** are `safe` and `simple`: any transform operation that would push landmarks outside the image at runtime is automatically dropped, so the number of landmarks stays unchanged. The layout format of landmarks is `xy` with shape `(N, 2)`, where `N` denotes the number of input landmarks. No matter whether the input is a np.ndarray or a torch Tensor, **torchlm** will automatically adapt to the data type and wrap the result back to the original type through an **autodtype** wrapper.

* use the **30+** native transforms from **torchlm** directly
```python
import torchlm
transform = torchlm.LandmarksCompose([
-    # use native torchlm transforms
-    torchlm.LandmarksRandomScale(prob=0.5),
-    torchlm.LandmarksRandomTranslate(prob=0.5),
-    torchlm.LandmarksRandomShear(prob=0.5),
-    torchlm.LandmarksRandomMask(prob=0.5),
-    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
-    torchlm.LandmarksRandomBrightness(prob=0.),
-    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
-    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5),
-    # ...
-])
+    torchlm.LandmarksRandomScale(prob=0.5),
+    torchlm.LandmarksRandomMask(prob=0.5),
+    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
+    torchlm.LandmarksRandomBrightness(prob=0.),
+    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
+    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5)
+])
```
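A composed pipeline is directly callable on an image and its landmarks. A minimal usage sketch (the image path and the random 98-point array are placeholders for illustration, not part of the library):

```python
import cv2
import numpy as np
import torchlm

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
])

img = cv2.imread("./1.jpg")  # hypothetical input image, HWC uint8
landmarks = np.random.rand(98, 2).astype(np.float32) * 255.  # hypothetical (N, 2) xy landmarks
new_img, new_landmarks = transform(img, landmarks)  # outputs keep the input types
```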
@@ -102,76 +102,45 @@ transform = torchlm.LandmarksCompose([

* **bind** **80+** transforms from torchvision and albumentations through **torchlm.bind**
```python
 import torchvision
 import albumentations
 import torchlm
 transform = torchlm.LandmarksCompose([
-    # use native torchlm transforms
-    torchlm.LandmarksRandomScale(prob=0.5),
-    # bind torchvision image-only transforms, bind with a given prob
-    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
-    torchlm.bind(torchvision.transforms.RandomAutocontrast(p=0.5)),
-    # bind albumentations image-only transforms
-    torchlm.bind(albumentations.ColorJitter(p=0.5)),
-    torchlm.bind(albumentations.GlassBlur(p=0.5)),
-    # bind albumentations dual transforms
-    torchlm.bind(albumentations.RandomCrop(height=200, width=200, p=0.5)),
-    torchlm.bind(albumentations.Rotate(p=0.5)),
-    # ...
-])
+    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
+    torchlm.bind(albumentations.ColorJitter(p=0.5))
+])
```

-* **bind** custom callable array or Tensor functions through **torchlm.bind**
+See [transforms.md](docs/api/transforms.md) for the supported transforms sets; more examples can be found at [test/transforms.py](test/transforms.py).
+
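As noted above, the bind-level `prob` can turn an otherwise deterministic transform into a random-style augmentation. A small sketch, with `Grayscale` chosen only as an illustrative image-only transform that is safe for landmarks:

```python
import torchvision
import torchlm

transform = torchlm.LandmarksCompose([
    # Grayscale(3) is deterministic on its own; the bind-level prob makes it
    # fire randomly, here on roughly 30% of the samples
    torchlm.bind(torchvision.transforms.Grayscale(num_output_channels=3), prob=0.3),
])
```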
+* **bind** custom callable array or Tensor functions through **torchlm.bind**

```python
# First, define your custom functions (the imports below are what the snippet needs)
from typing import Tuple
import numpy as np
from torch import Tensor

def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    # do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]:
    # do some transform here ...
    return img, landmarks
```

```python
# Then, bind your functions and put them into the transforms pipeline.
transform = torchlm.LandmarksCompose([
-    # use native torchlm transforms
-    torchlm.LandmarksRandomScale(prob=0.5),
-    # bind custom callable array functions
     torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
-    # bind custom callable Tensor functions with a given prob
-    torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5),
-    # ...
-])
+    torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5)
+])
```
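Beyond no-ops, any function with this signature can carry a real augmentation. A sketch of a hypothetical array-level noise transform (the function name and noise scale are made up for illustration):

```python
from typing import Tuple
import numpy as np
import torchlm

def callable_array_gaussian_noise(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    # add mild pixel noise; the geometry is untouched, so landmarks pass through unchanged
    noise = np.random.normal(0., 8., size=img.shape)
    img = np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    return img, landmarks

transform = torchlm.LandmarksCompose([
    torchlm.bind(callable_array_gaussian_noise, bind_type=torchlm.BindEnum.Callable_Array, prob=0.5),
])
```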
+ some global debug setting for torchlm's transform * setup logging mode as `True` globally might help you figure out the runtime details ```python -import torchlm # some global setting torchlm.set_transforms_debug(True) torchlm.set_transforms_logging(True) torchlm.set_autodtype_logging(True) -``` +``` + some detail information will show you at each runtime, the infos might look like ```shell LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut @@ -194,21 +163,98 @@ LandmarksRandomTranslate() Execution Flag: False But, is ok if you pass a Tensor to a np.ndarray-like transform, **torchlm** will automatically be compatible with different data types and then wrap it back to the original type through a **autodtype** wrapper. +
+
+### πŸŽ‰πŸŽ‰Training
+In **torchlm**, each model has a high-level and user-friendly API named `training`. Here is an example for [PIPNet](https://github.com/jhb86253817/PIPNet).
+```python
+from torchlm.models import pipnet
+
+model = pipnet(
+    backbone="resnet18",
+    pretrained=False,
+    num_nb=10,
+    num_lms=98,
+    net_stride=32,
+    input_size=256,
+    meanface_type="wflw",
+    backbone_pretrained=True,
+    map_location="cuda",
+    checkpoint=None
+)
+
+# the full signature of the high-level training API:
+model.training(
+    self,
+    annotation_path: str,
+    criterion_cls: nn.Module = nn.MSELoss(),
+    criterion_reg: nn.Module = nn.L1Loss(),
+    learning_rate: float = 0.0001,
+    cls_loss_weight: float = 10.,
+    reg_loss_weight: float = 1.,
+    num_nb: int = 10,
+    num_epochs: int = 60,
+    save_dir: Optional[str] = "./save",
+    save_interval: Optional[int] = 10,
+    save_prefix: Optional[str] = "",
+    decay_steps: Optional[List[int]] = (30, 50),
+    decay_gamma: Optional[float] = 0.1,
+    device: Optional[Union[str, torch.device]] = "cuda",
+    transform: Optional[transforms.LandmarksCompose] = None,
+    coordinates_already_normalized: Optional[bool] = False,
+    **kwargs: Any  # params for DataLoader
+) -> nn.Module:
+```
+Please jump to the entry point of the function for the detailed documentation of the **training** API of each model defined in torchlm, e.g. [pipnet/_impls.py#L159](https://github.com/DefTruth/torchlm/blob/main/torchlm/models/pipnet/_impls.py#L159). Further, the model implementation plan is as follows:
-* Supported Transforms Sets, see [transforms.md](docs/api/transforms.md). A detail example can be found at [test/transforms.py](test/transforms.py).
+
+❔ YOLOX ❔ YOLOv5 ❔ NanoDet βœ… [PIPNet](https://github.com/jhb86253817/PIPNet) ❔ ResNet ❔ MobileNet ❔ ShuffleNet ❔ ...
+
-### Training(TODO)
-* [ ] YOLOX
-* [ ] YOLOv5
-* [ ] NanoDet
-* [ ] PIPNet
-* [ ] ResNet
-* [ ] MobileNet
-* [ ] ShuffleNet
-* [ ] ...
+βœ… = known to work and officially supported, ❔ = planned, but not coming soon.

-### Inference
+### πŸ‘€πŸ‘‡ Inference
+#### C++ API
 The ONNXRuntime(CPU/GPU), MNN, NCNN and TNN C++ inference of **torchlm** will be released at [lite.ai.toolkit](https://github.com/DefTruth/lite.ai.toolkit).
+#### Python API
+In **torchlm**, we offer a high-level API named `runtime.bind` to bind any model in torchlm; you can then run the `runtime.forward` API to get the output landmarks and bboxes. Here is an example for [PIPNet](https://github.com/jhb86253817/PIPNet).
+```python
+import cv2
+import torchlm
+from torchlm.tools import faceboxesv2
+from torchlm.models import pipnet
+
+def test_pipnet_runtime():
+    img_path = "./1.jpg"
+    save_path = "./1_out.jpg"  # use a distinct path so the input image is not overwritten
+    checkpoint = "./pipnet_resnet18_10x98x32x256_wflw.pth"
+    image = cv2.imread(img_path)
+
+    torchlm.runtime.bind(faceboxesv2())
+    torchlm.runtime.bind(
+        pipnet(
+            backbone="resnet18",
+            pretrained=True,
+            num_nb=10,
+            num_lms=98,
+            net_stride=32,
+            input_size=256,
+            meanface_type="wflw",
+            backbone_pretrained=True,
+            map_location="cpu",
+            checkpoint=checkpoint
+        )
+    )
+    landmarks, bboxes = torchlm.runtime.forward(image)
+    image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
+    image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
+
+    cv2.imwrite(save_path, image)
+```
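Since `runtime.forward` operates on a single image, per-frame video inference is just a loop around it. A sketch assuming the same bound models as above (the video paths and FourCC choice are illustrative, not torchlm API):

```python
import cv2
import torchlm

# assumes the faceboxesv2 detector and a pipnet model have already been
# bound via torchlm.runtime.bind(...) as in the snippet above
cap = cv2.VideoCapture("./input.mp4")  # hypothetical video path
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("./output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    landmarks, bboxes = torchlm.runtime.forward(frame)
    frame = torchlm.utils.draw_bboxes(frame, bboxes=bboxes)
    frame = torchlm.utils.draw_landmarks(frame, landmarks=landmarks)
    out.write(frame)

cap.release()
out.release()
```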
## πŸ“– Documentation
* [x] [Data Augmentation's API](docs/api/transforms.md)
diff --git a/docs/assets/pipnet0.jpg b/docs/assets/pipnet0.jpg
new file mode 100644
index 0000000..7702fa7
Binary files /dev/null and b/docs/assets/pipnet0.jpg differ
diff --git a/docs/assets/pipnet_300W_CELEBA_model.gif b/docs/assets/pipnet_300W_CELEBA_model.gif
new file mode 100644
index 0000000..5685312
Binary files /dev/null and b/docs/assets/pipnet_300W_CELEBA_model.gif differ
diff --git a/docs/assets/pipnet_WFLW_model.gif b/docs/assets/pipnet_WFLW_model.gif
new file mode 100644
index 0000000..253bf59
Binary files /dev/null and b/docs/assets/pipnet_WFLW_model.gif differ
diff --git a/docs/assets/pipnet_shaolin_soccer.gif b/docs/assets/pipnet_shaolin_soccer.gif
new file mode 100644
index 0000000..2600e9b
Binary files /dev/null and b/docs/assets/pipnet_shaolin_soccer.gif differ
diff --git a/requirements.txt b/requirements.txt
index 0761cf4..c00b387 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,5 @@
 # torchlm
-opencv-python-headless>=4.5.2
+opencv-python-headless>=4.3.0
 numpy>=1.14.4
 torch>=1.6.0
 torchvision>=0.9.0
diff --git a/setup.py b/setup.py
index 450d97a..f72d3cc 100644
--- a/setup.py
+++ b/setup.py
@@ -25,14 +25,14 @@ def get_long_description():
     url="https://github.com/DefTruth/torchlm",
     packages=setuptools.find_packages(),
     install_requires=[
-        "opencv-python-headless>=4.5.2",
+        "opencv-python-headless>=4.3.0",
         "numpy>=1.14.4",
         "torch>=1.6.0",
         "torchvision>=0.8.0",
         "albumentations>=1.1.0",
         "onnx>=1.8.0",
         "onnxruntime>=1.7.0",
-        "tqdm>=4.60.0"
+        "tqdm>=4.10.0"
     ],
     classifiers=[
         "Programming Language :: Python :: 3",
diff --git a/torchlm/data/_converters.py b/torchlm/data/_converters.py
index e69de29..d9acdbc 100644
--- a/torchlm/data/_converters.py
+++ b/torchlm/data/_converters.py
@@ -0,0 +1,13 @@
+import os
+import cv2
+import numpy as np
+from abc import ABCMeta, abstractmethod
+from typing import Tuple, Optional, List
+
+
+class BaseConverter(metaclass=ABCMeta):
+
+    @abstractmethod
+    def convert(self, *args, **kwargs):
+        raise NotImplementedError
\ No newline at end of file
diff --git a/torchlm/models/pipnet/_impls.py b/torchlm/models/pipnet/_impls.py
index 35e1648..efaca9b 100644
--- a/torchlm/models/pipnet/_impls.py
+++ b/torchlm/models/pipnet/_impls.py
@@ -157,6 +157,33 @@ def training(
         coordinates_already_normalized: Optional[bool] = False,
         **kwargs: Any  # params for DataLoader
 ) -> nn.Module:
+    """
+    :param annotation_path: the path to an annotation file, where each line must be formatted as
+       "img0_path x0 y0 x1 y1 ... xn-1 yn-1"
+       "img1_path x0 y0 x1 y1 ... xn-1 yn-1"
+       "img2_path x0 y0 x1 y1 ... xn-1 yn-1"
+       "img3_path x0 y0 x1 y1 ... xn-1 yn-1"
+       ...
+    :param criterion_cls: loss criterion for PIPNet heatmap classification, default MSELoss
+    :param criterion_reg: loss criterion for PIPNet offsets regression, default L1Loss
+    :param learning_rate: learning rate, default 0.0001
+    :param cls_loss_weight: weight for the heatmap classification loss
+    :param reg_loss_weight: weight for the offsets regression loss
+    :param num_nb: the number of nearest-neighbor landmarks for NRM, default 10
+    :param num_epochs: the number of training epochs
+    :param save_dir: the dir to save checkpoints
+    :param save_interval: the interval (in epochs) at which to save checkpoints
+    :param save_prefix: the prefix for saved checkpoints; the saved name would look like
+       {save_prefix}-epoch{epoch}-loss{epoch_loss}.pth
+    :param decay_steps: decay steps for the learning rate
+    :param decay_gamma: decay gamma for the learning rate
+    :param device: training device, default cuda.
+    :param transform: user-specific transform. If None, torchlm will build a default transform;
+       more details can be found at `torchlm.transforms.build_default_transform`
+    :param coordinates_already_normalized: indicates whether the labels in annotation_path are
+       already normalized (by image size) or not
+    :param kwargs: params for DataLoader
+    :return: a trained model.
+    """
     device = device if torch.cuda.is_available() else "cpu"
     # prepare dataset
     default_dataset = _PIPTrainDataset(