diff --git a/README.md b/README.md
index e6a617c..e689e9d 100644
--- a/README.md
+++ b/README.md
@@ -12,30 +12,27 @@
## π€ Introduction
-**torchlm** is a PyTorch landmarks-only library with **100+ data augmentations**, support **training** and **inference**. **torchlm** is aims at only focus on any landmark detection, such as face landmarks, hand keypoints and body keypoints, etc. It provides **30+** native data augmentations and can **bind** with **80+** transforms from torchvision and albumentations, no matter the input is a np.ndarray or a torch Tensor, **torchlm** will automatically be compatible with different data types and then wrap it back to the original type through a **autodtype** wrapper. Further, **torchlm** will add modules for **training** and **inference** in the future.
+**torchlm** aims to build a high-level pipeline for face landmarks detection. It supports **100+ data augmentations**, **training** and **inference**, and can be easily installed with **pip**.
β€οΈ Star πππ» this repo to support me if it does any helps to you, thanks ~
+## π Core Features
+* High-level pipeline for **training** and **inference**.
+* Provides **30+** native landmarks data augmentations.
+* Can **bind 80+** transforms from torchvision and albumentations with **one line of code**.
+* Supports models for face landmarks detection: **PIPNet** is available now; YOLOX, YOLOv5, NanoDet, ResNet, MobileNet, ShuffleNet, etc. are planned.
-# π What's New
-
+## π What's New
+* [2022/03/08]: Add **PIPNet**: [Towards Efficient Facial Landmark Detection in the Wild, CVPR2021](https://github.com/jhb86253817/PIPNet)
* [2022/02/13]: Add **30+** native data augmentations and **bind** **80+** transforms from torchvision and albumentations.
## π οΈ Usage
@@ -44,53 +41,56 @@
* opencv-python-headless>=4.5.2
* numpy>=1.14.4
* torch>=1.6.0
-* torchvision>=0.9.0
+* torchvision>=0.8.0
* albumentations>=1.1.0
+* onnx>=1.8.0
+* onnxruntime>=1.7.0
+* tqdm>=4.10.0
### Installation
-you can install **torchlm** directly from [pypi](https://pypi.org/project/torchlm/).
+You can install **torchlm** directly from [PyPI](https://pypi.org/project/torchlm/). See the [NOTE](#torchlm-NOTE) before installation!
```shell
pip3 install torchlm
# install from specific pypi mirrors use '-i'
pip3 install torchlm -i https://pypi.org/simple/
```
-or install from source.
+or install from source to get the latest torchlm; the `-e` flag below installs it in editable mode.
```shell
-# clone torchlm repository locally
+# clone torchlm repository locally if you want the latest torchlm
git clone --depth=1 https://github.com/DefTruth/torchlm.git
cd torchlm
# install in editable mode
pip install -e .
```
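+
+To double-check the installation, a minimal sanity check (nothing beyond the import itself is assumed here):
+```python
+import torchlm
+print(torchlm.__file__)  # prints where the installed (or editable) package lives
+```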
+
-
-
-
-
-
-
-
-
@@ -102,76 +102,45 @@ transform = torchlm.LandmarksCompose([
* **bind** **80+** torchvision and albumentations's transforms through **torchlm.bind**
```python
import torchvision
import albumentations
import torchlm
transform = torchlm.LandmarksCompose([
- # use native torchlm transforms
- torchlm.LandmarksRandomScale(prob=0.5),
- # bind torchvision image only transforms, bind with a given prob
- torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
- torchlm.bind(torchvision.transforms.RandomAutocontrast(p=0.5)),
- # bind albumentations image only transforms
- torchlm.bind(albumentations.ColorJitter(p=0.5)),
- torchlm.bind(albumentations.GlassBlur(p=0.5)),
- # bind albumentations dual transforms
- torchlm.bind(albumentations.RandomCrop(height=200, width=200, p=0.5)),
- torchlm.bind(albumentations.Rotate(p=0.5)),
- # ...
- ])
+ torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
+ torchlm.bind(albumentations.ColorJitter(p=0.5))
+])
```
-* **bind** custom callable array or Tensor functions through **torchlm.bind**
+See [transforms.md](docs/api/transforms.md) for the supported transform sets; more examples can be found at [test/transforms.py](test/transforms.py).
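+
+As a quick sketch of how the composed pipeline above is applied (the random image and the (98, 2) landmarks array are placeholders, not required shapes):
+```python
+import numpy as np
+# a fake HWC uint8 image and an (N, 2) float32 landmarks array, only for illustration
+img = np.random.randint(0, 255, size=(256, 256, 3), dtype=np.uint8)
+landmarks = np.random.rand(98, 2).astype(np.float32) * 256.0
+new_img, new_landmarks = transform(img, landmarks)  # returns the transformed (image, landmarks) pair
+```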
+
+
+* **bind** custom callable array or Tensor functions through **torchlm.bind**
```python
# First, defined your custom functions
+from typing import Tuple
+import numpy as np
+from torch import Tensor
+
def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    # do some transform here ...
return img.astype(np.uint32), landmarks.astype(np.float32)
def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]:
    # do some transform here ...
return img, landmarks
```
```python
# Then, bind your functions and put it into the transforms pipeline.
transform = torchlm.LandmarksCompose([
- # use native torchlm transforms
- torchlm.LandmarksRandomScale(prob=0.5),
- # bind custom callable array functions
torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
- # bind custom callable Tensor functions with a given prob
- torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5),
- # ...
- ])
+ torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5)
+])
```
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+* some global debug settings for **torchlm**'s transforms
* setup logging mode as `True` globally might help you figure out the runtime details
```python
-import torchlm
# some global setting
torchlm.set_transforms_debug(True)
torchlm.set_transforms_logging(True)
torchlm.set_autodtype_logging(True)
-```
+```
+
some detail information will show you at each runtime, the infos might look like
```shell
LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut
@@ -194,21 +163,98 @@ LandmarksRandomTranslate() Execution Flag: False
But, is ok if you pass a Tensor to a np.ndarray-like transform, **torchlm** will automatically be compatible with different data types and then wrap it back to the original type through a **autodtype** wrapper.
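+
+For example, a minimal sketch of this behavior (the HWC tensor layout and the shapes here are only illustrative):
+```python
+import torch
+import numpy as np
+import torchlm
+
+transform = torchlm.LandmarksCompose([torchlm.LandmarksRandomScale(prob=1.)])
+# pass Tensors into an array-based transform: an HWC uint8 image and (N, 2) float landmarks
+img_t = torch.from_numpy(np.random.randint(0, 255, size=(256, 256, 3), dtype=np.uint8))
+lms_t = torch.rand(98, 2) * 256.
+new_img, new_lms = transform(img_t, lms_t)
+print(type(new_img), type(new_lms))  # wrapped back to torch.Tensor by the autodtype wrapper
+```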
+
+
+
+### ππ Training
+In **torchlm**, each model has a high-level and user-friendly API named `training`. Here is an example with [PIPNet](https://github.com/jhb86253817/PIPNet).
+```python
+from torchlm.models import pipnet
+
+model = pipnet(
+ backbone="resnet18",
+ pretrained=False,
+ num_nb=10,
+ num_lms=98,
+ net_stride=32,
+ input_size=256,
+ meanface_type="wflw",
+ backbone_pretrained=True,
+ map_location="cuda",
+ checkpoint=None
+)
+
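+# the signature of the high-level `training` API (shown for reference, not a literal call):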
+model.training(
+ self,
+ annotation_path: str,
+ criterion_cls: nn.Module = nn.MSELoss(),
+ criterion_reg: nn.Module = nn.L1Loss(),
+ learning_rate: float = 0.0001,
+ cls_loss_weight: float = 10.,
+ reg_loss_weight: float = 1.,
+ num_nb: int = 10,
+ num_epochs: int = 60,
+ save_dir: Optional[str] = "./save",
+ save_interval: Optional[int] = 10,
+ save_prefix: Optional[str] = "",
+ decay_steps: Optional[List[int]] = (30, 50),
+ decay_gamma: Optional[float] = 0.1,
+ device: Optional[Union[str, torch.device]] = "cuda",
+ transform: Optional[transforms.LandmarksCompose] = None,
+ coordinates_already_normalized: Optional[bool] = False,
+ **kwargs: Any # params for DataLoader
+) -> nn.Module:
+```
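+A hedged usage sketch of the call above (the annotation path, save prefix and DataLoader kwargs are illustrative placeholders, not shipped files or required values):
+```python
+trained_model = model.training(
+    annotation_path="./data/WFLW/train.txt",  # hypothetical annotation file in the documented format
+    num_epochs=60,
+    save_dir="./save/pipnet",
+    save_prefix="pipnet-resnet18-wflw",
+    coordinates_already_normalized=False,
+    batch_size=16,  # extra keyword arguments are forwarded to the DataLoader
+    num_workers=4,
+    shuffle=True
+)
+```
+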
+Please jump to the entry point of the function for the detailed documentation of the **training** API of each model defined in torchlm, e.g. [pipnet/_impls.py#L159](https://github.com/DefTruth/torchlm/blob/main/torchlm/models/pipnet/_impls.py#L159). Further, the model implementation plan is as follows:
-* Supported Transforms Sets, see [transforms.md](docs/api/transforms.md). A detail example can be found at [test/transforms.py](test/transforms.py).
+⏳ YOLOX ⏳ YOLOv5 ⏳ NanoDet ✅ [PIPNet](https://github.com/jhb86253817/PIPNet) ⏳ ResNet ⏳ MobileNet ⏳ ShuffleNet ⏳ ...
-### Training(TODO)
-* [ ] YOLOX
-* [ ] YOLOv5
-* [ ] NanoDet
-* [ ] PIPNet
-* [ ] ResNet
-* [ ] MobileNet
-* [ ] ShuffleNet
-* [ ] ...
+✅ = known to work and officially supported, ⏳ = in my plan, but not coming soon.
-### Inference
+### ππ Inference
+#### C++ API
The ONNXRuntime(CPU/GPU), MNN, NCNN and TNN C++ inference of **torchlm** will be release at [lite.ai.toolkit](https://github.com/DefTruth/lite.ai.toolkit).
+#### Python API
+In **torchlm**, we offer a high-level API named `runtime.bind` to bind any model in torchlm; you can then run the `runtime.forward` API to get the output landmarks and bboxes. Here is an example with [PIPNet](https://github.com/jhb86253817/PIPNet).
+```python
+import cv2
+import torchlm
+from torchlm.tools import faceboxesv2
+from torchlm.models import pipnet
+
+def test_pipnet_runtime():
+ img_path = "./1.jpg"
+ save_path = "./1.jpg"
+ checkpoint = "./pipnet_resnet18_10x98x32x256_wflw.pth"
+ image = cv2.imread(img_path)
+
+ torchlm.runtime.bind(faceboxesv2())
+ torchlm.runtime.bind(
+ pipnet(
+ backbone="resnet18",
+ pretrained=True,
+ num_nb=10,
+ num_lms=98,
+ net_stride=32,
+ input_size=256,
+ meanface_type="wflw",
+ backbone_pretrained=True,
+ map_location="cpu",
+ checkpoint=checkpoint
+ )
+ )
+ landmarks, bboxes = torchlm.runtime.forward(image)
+ image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
+ image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
+
+ cv2.imwrite(save_path, image)
+```
+
## π Documentations
* [x] [Data Augmentation's API](docs/api/transforms.md)
diff --git a/docs/assets/pipnet0.jpg b/docs/assets/pipnet0.jpg
new file mode 100644
index 0000000..7702fa7
Binary files /dev/null and b/docs/assets/pipnet0.jpg differ
diff --git a/docs/assets/pipnet_300W_CELEBA_model.gif b/docs/assets/pipnet_300W_CELEBA_model.gif
new file mode 100644
index 0000000..5685312
Binary files /dev/null and b/docs/assets/pipnet_300W_CELEBA_model.gif differ
diff --git a/docs/assets/pipnet_WFLW_model.gif b/docs/assets/pipnet_WFLW_model.gif
new file mode 100644
index 0000000..253bf59
Binary files /dev/null and b/docs/assets/pipnet_WFLW_model.gif differ
diff --git a/docs/assets/pipnet_shaolin_soccer.gif b/docs/assets/pipnet_shaolin_soccer.gif
new file mode 100644
index 0000000..2600e9b
Binary files /dev/null and b/docs/assets/pipnet_shaolin_soccer.gif differ
diff --git a/requirements.txt b/requirements.txt
index 0761cf4..c00b387 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,5 @@
# torchlm
-opencv-python-headless>=4.5.2
+opencv-python-headless>=4.3.0
numpy>=1.14.4
torch>=1.6.0
torchvision>=0.9.0
diff --git a/setup.py b/setup.py
index 450d97a..f72d3cc 100644
--- a/setup.py
+++ b/setup.py
@@ -25,14 +25,14 @@ def get_long_description():
url="https://github.com/DefTruth/torchlm",
packages=setuptools.find_packages(),
install_requires=[
- "opencv-python-headless>=4.5.2",
+ "opencv-python-headless>=4.3.0",
"numpy>=1.14.4",
"torch>=1.6.0",
"torchvision>=0.8.0",
"albumentations>=1.1.0",
"onnx>=1.8.0",
"onnxruntime>=1.7.0",
- "tqdm>=4.60.0"
+ "tqdm>=4.10.0"
],
classifiers=[
"Programming Language :: Python :: 3",
diff --git a/torchlm/data/_converters.py b/torchlm/data/_converters.py
index e69de29..d9acdbc 100644
--- a/torchlm/data/_converters.py
+++ b/torchlm/data/_converters.py
@@ -0,0 +1,13 @@
+import os
+import cv2
+import numpy as np
+from abc import ABCMeta, abstractmethod
+from typing import Tuple, Optional, List
+
+
+class BaseConverter(metaclass=ABCMeta):
+
+ @abstractmethod
+ def convert(self, *args, **kwargs):
+ raise NotImplementedError
\ No newline at end of file
diff --git a/torchlm/models/pipnet/_impls.py b/torchlm/models/pipnet/_impls.py
index 35e1648..efaca9b 100644
--- a/torchlm/models/pipnet/_impls.py
+++ b/torchlm/models/pipnet/_impls.py
@@ -157,6 +157,33 @@ def training(
coordinates_already_normalized: Optional[bool] = False,
**kwargs: Any # params for DataLoader
) -> nn.Module:
+ """
+ :param annotation_path: the path to an annotation file, the format must be
+ "img0_path x0 y0 x1 y1 ... xn-1 yn-1"
+ "img1_path x0 y0 x1 y1 ... xn-1 yn-1"
+ "img2_path x0 y0 x1 y1 ... xn-1 yn-1"
+ "img3_path x0 y0 x1 y1 ... xn-1 yn-1"
+ ...
+ :param criterion_cls: loss criterion for PIPNet heatmap classification, default MSELoss
+ :param criterion_reg: loss criterion for PIPNet offsets regression, default L1Loss
+ :param learning_rate: learning rate, default 0.0001
+ :param cls_loss_weight: weight for heatmap classification
+ :param reg_loss_weight: weight for offsets regression
+ :param num_nb: the number of Nearest-neighbor landmarks for NRM, default 10
+ :param num_epochs: the number of training epochs
+ :param save_dir: the dir to save checkpoints
+ :param save_interval: the interval to save checkpoints
+ :param save_prefix: the prefix to save checkpoints, the saved name would look like
+ {save_prefix}-epoch{epoch}-loss{epoch_loss}.pth
+ :param decay_steps: decay steps for learning rate
+ :param decay_gamma: decay gamma for learning rate
+ :param device: training device, default cuda.
+ :param transform: user specific transform. If None, torchlm will build a default transform,
+ more details can be found at `torchlm.transforms.build_default_transform`
+ :param coordinates_already_normalized: denotes whether the landmark coordinates in annotation_path have already been normalized (by image size) or not
+ :param kwargs: params for DataLoader
+ :return: A trained model.
+ """
device = device if torch.cuda.is_available() else "cpu"
# prepare dataset
default_dataset = _PIPTrainDataset(