Recently I was trying to train a segmenter model from the command-line and I encountered the following error:
RuntimeError: Predictions and targets are expected to have the same shape, but
got torch.Size([1, 8, 450, 348]) and torch.Size([1, 7, 450, 348]).
The model I was using as the base model was not model_best.mlmodel because the training process did not produce it (it simply stopped at the 50th epoch). The training data is of the same type as the one used to train the base model.
Can someone please explain what the error means? I do not see how the model parameters could have changed given that I took an output from the previous training cycle and used it as the base model for the next one, using very similar training data.
On 25/03/30 02:44PM, Matthew Ong wrote:
megamattc created an issue (mittagessen/kraken#695)
Hi,
Recently I was trying to train a segmenter model from the command-line and I encountered the following error:
```
RuntimeError: Predictions and targets are expected to have the same shape, but
got torch.Size([1, 8, 450, 348]) and torch.Size([1, 7, 450, 348]).
srun: error: r13g04: task 0: Exited with exit code 1
srun: Terminating StepId=27328777.
```
The model I was using as the base model was not `model_best.mlmodel` but apparently one from a previous epoch (49/50; I don't know why the training did not complete to the 50th epoch). The training data is of the same type as the one used to train the base model.
Can someone please explain what the error means? I do not see how the model parameters could have changed given that I took an output from the previous training cycle and used it as the base model for the next one, using very similar training data.
This looks like a bug in the typemap computation that was fixed a while
ago. Could you install `main` and see if the error still occurs?
The epochs just show 49/50 because the indicator starts counting from 0
and the second indicator shows the overall number of epochs. People have
been asking about it multiple times already so I just changed it.
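The 49/50 display follows directly from a zero-based counter. A minimal sketch of the effect, not kraken's actual progress code (`progress_labels` is a hypothetical helper introduced only for illustration):

```python
def progress_labels(total, zero_based=True):
    """Return the label shown for each epoch of a run with `total` epochs."""
    start = 0 if zero_based else 1
    return [f"{start + i}/{total}" for i in range(total)]

total_epochs = 50

# Zero-based counting: all 50 epochs run, but the last label reads 49/50.
print(progress_labels(total_epochs)[-1])

# One-based counting: same 50 epochs, last label reads 50/50.
print(progress_labels(total_epochs, zero_based=False)[-1])
```

In both cases the run covers the full 50 epochs; only the label differs.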
I believe I installed from the main branch of the repository (`pip install git+https://github.com/mittagessen/kraken.git`) on a VM and ran the same command as above, but I got the same error:
RuntimeError: Predictions and targets are expected to have the same shape, but got torch.Size([1, 9, 450, 557]) and torch.Size([1, 8, 450, 557]).
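On what the message itself means: in both errors the two shapes differ only in the second dimension, which in a `[batch, channels, height, width]` tensor is the channel axis. A segmentation model emits one channel per class, so the prediction has one more class than the target built from the training data. The sketch below only illustrates how a one-class difference produces exactly this pair of shapes. It does not use kraken's API, and the mapping layout and class names are hypothetical and not taken from the issue.

```python
def channel_count(mapping):
    """Number of output channels implied by a class mapping (one per class)."""
    return sum(len(classes) for classes in mapping.values())

def compare(model_map, data_map):
    """Return (classes only in the model, classes only in the data)."""
    model_names = {(group, name) for group, cls in model_map.items() for name in cls}
    data_names = {(group, name) for group, cls in data_map.items() for name in cls}
    return sorted(model_names - data_names), sorted(data_names - model_names)

# Hypothetical mappings: the base model knows one region type more
# than the mapping computed from the new training data.
model_map = {
    "aux": ["start_separator", "end_separator"],
    "baselines": ["default", "heading"],
    "regions": ["text", "margin", "image", "table"],
}
data_map = {
    "aux": ["start_separator", "end_separator"],
    "baselines": ["default", "heading"],
    "regions": ["text", "margin", "image"],
}

height, width = 450, 348
pred_shape = (1, channel_count(model_map), height, width)
target_shape = (1, channel_count(data_map), height, width)
print(pred_shape, target_shape)  # (1, 8, 450, 348) (1, 7, 450, 348)

only_model, only_data = compare(model_map, data_map)
print(only_model)  # [('regions', 'table')]
print(only_data)   # []
```

Comparing the class list stored in the base model with the classes found in the training data in this way would show which class accounts for the extra channel, whether it comes from the data or from a bug in how the mapping is computed.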