## Model Training

The LRASPP MobileNetV3-Large[^1] model was trained using a stochastic gradient descent
optimizer with the learning rate set to $0.35$ and L2 weight decay set to
$3 \times 10^{-6}$. The model was trained for a total of 100 epochs. A cosine annealing
learning rate schedule[^2] was used to improve accuracy. The loss function used for
training was Focal Tversky Loss[^3], with parameters
$\alpha=0.7$, $\beta=0.3$, and $\gamma=4/3$.
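
For readers who want to reproduce this configuration, a minimal PyTorch sketch of the
optimizer, schedule, and loss might look like the following. The model constructor
arguments and the `FocalTverskyLoss` class are illustrative assumptions, not code from
the training repository; the loss follows the $(1 - \mathrm{TI})^{1/\gamma}$
formulation from the Focal Tversky paper, where $\mathrm{TI}$ is the Tversky index.

```python
import torch
from torch import nn
from torchvision.models.segmentation import lraspp_mobilenet_v3_large

# Illustrative model setup; the number of output classes is an assumption.
model = lraspp_mobilenet_v3_large(num_classes=2)

# SGD with the hyperparameters quoted above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.35, weight_decay=3e-6)

# Cosine annealing over the full 100-epoch run (Loshchilov & Hutter, 2016).
EPOCHS = 100
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)


class FocalTverskyLoss(nn.Module):
    """Focal Tversky Loss (Abraham & Khan, 2018): (1 - TI)^(1/gamma),
    where TI = TP / (TP + alpha*FN + beta*FP)."""

    def __init__(self, alpha=0.7, beta=0.3, gamma=4.0 / 3.0, eps=1e-7):
        super().__init__()
        self.alpha, self.beta, self.gamma, self.eps = alpha, beta, gamma, eps

    def forward(self, probs, targets):
        # probs: predicted foreground probabilities; targets: binary ground truth.
        tp = (probs * targets).sum()
        fn = ((1.0 - probs) * targets).sum()
        fp = (probs * (1.0 - targets)).sum()
        ti = (tp + self.eps) / (tp + self.alpha * fn + self.beta * fp + self.eps)
        return (1.0 - ti) ** (1.0 / self.gamma)


criterion = FocalTverskyLoss()
```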

The model was trained on an AWS p3.8xlarge instance with 4 Nvidia Tesla V100 GPUs and
took 18 hours to finish. At the end of training, the model parameters which achieved
the best IoU~$\mu$~ score on the validation data split were saved for inference. These
parameters were used to calculate the final performance statistics for the model on
the test split.
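
In outline, that checkpoint selection could be implemented as below;
`train_one_epoch` and `evaluate_mean_iou` are hypothetical helpers, not names from the
actual training code.

```python
# Keep the weights that achieve the best validation IoU across all epochs.
best_iou = 0.0
for epoch in range(EPOCHS):
    train_one_epoch(model, criterion, optimizer)  # assumed helper
    scheduler.step()
    val_iou = evaluate_mean_iou(model)  # assumed helper: IoU over validation split
    if val_iou > best_iou:
        best_iou = val_iou
        torch.save(model.state_dict(), "best_model.pt")
```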

Source code for model training is available on GitHub
at [:material-github: hakai-ml-train](https://github.com/tayden/hakai-ml-train).

[^1]: [A. Howard et al., ‘Searching for MobileNetV3’. arXiv, 2019.](https://arxiv.org/abs/1905.02244)
[^2]: [I. Loshchilov and F. Hutter, ‘SGDR: Stochastic Gradient Descent with Warm Restarts’. arXiv, 2016.](https://arxiv.org/abs/1608.03983)
[^3]: [N. Abraham and N. M. Khan, ‘A Novel Focal Tversky loss function with improved Attention U-Net for lesion segmentation’. arXiv, 2018.](https://arxiv.org/abs/1810.07842)