int64 support for some operations not supported #10

Open
ryx2 opened this issue Nov 13, 2019 · 3 comments

@ryx2

ryx2 commented Nov 13, 2019

I have installed all the pip packages in a venv, and when I pip list, everything matches up. I also installed pytorch from source. When I attempt to run

```
python3 train.py -c /tank/home/xury1/segmentation/bonnetal/train/tasks/segmentation/config/persons/mobilenetv2_test.yaml --log /tank/home/xury1/segmentation/bonnetal/train/tasks/segmentation/log -p /dev/null
```

I get the following output:

```
INTERFACE:
config yaml: /tank/home/xury1/segmentation/bonnetal/train/tasks/segmentation/config/persons/mobilenetv2_test.yaml
log dir /tank/home/xury1/segmentation/bonnetal/train/tasks/segmentation/log
model path /dev/null
eval only False
No batchnorm False

Commit hash (training version): b'5368eed'

Opening config file /tank/home/xury1/segmentation/bonnetal/train/tasks/segmentation/config/persons/mobilenetv2_test.yaml
model folder doesnt exist! Start with random weights...
Copying files to /tank/home/xury1/segmentation/bonnetal/train/tasks/segmentation/log for further reference.
Images from: /tank/home/xury1/segmentation_data/persons/roads_annotated/ds1/train/img
Labels from: /tank/home/xury1/segmentation_data/persons/roads_annotated/ds1/train/lbl
Inference batch size: 3
Images from: /tank/home/xury1/segmentation_data/persons/roads_annotated/ds1/valid/img
Labels from: /tank/home/xury1/segmentation_data/persons/roads_annotated/ds1/valid/lbl
Original OS: 32
New OS: 16.0
[Decoder] os: 8 in: 32 skip: 32 out: 32
[Decoder] os: 4 in: 32 skip: 24 out: 24
[Decoder] os: 2 in: 24 skip: 16 out: 16
[Decoder] os: 1 in: 16 skip: 3 out: 16
Using normalized weights as bias for head.

Couldn't load backbone, using random weights. Error: [Errno 20] Not a directory: '/dev/null/backbone'
Couldn't load decoder, using random weights. Error: [Errno 20] Not a directory: '/dev/null/segmentation_decoder'
Couldn't load head, using random weights. Error: [Errno 20] Not a directory: '/dev/null/segmentation_head'
Total number of parameters: 2154794
Total number of parameters requires_grad: 2154794
Param encoder 1812800
Param decoder 341960
Param head 34
Training in device: cuda
/tank/home/xury1/segmentation/bonnetal/train/tasks/segmentation/bonnetal/lib/python3.5/site-packages/torch/optim/lr_scheduler.py:100: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
[IOU EVAL] IGNORE: tensor([], dtype=torch.int64)
[IOU EVAL] INCLUDE: tensor([0, 1])
Traceback (most recent call last):
  File "train.py", line 118, in <module>
    trainer.train()
  File "../../tasks/segmentation/modules/trainer.py", line 302, in train
    scheduler=self.scheduler)
  File "../../tasks/segmentation/modules/trainer.py", line 494, in train_epoch
    evaluator.addBatch(output.argmax(dim=1), target)
  File "../../tasks/segmentation/modules/ioueval.py", line 42, in addBatch
    tuple(idxs), self.ones, accumulate=True)
RuntimeError: "embedding_backward" not implemented for 'Long'
```
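For reference, the failing call in `ioueval.py` is a confusion-matrix accumulation via `index_put_(..., accumulate=True)` on an int64 ("Long") tensor, and the `embedding_backward` message appears to be how the missing int64 kernel surfaced on some PyTorch builds of that era. A minimal sketch of the operation, plus a possible workaround that accumulates in float32 and casts back (the tensor names and 2-class data here are mine, not from the repo):

```python
import torch

# hypothetical 2-class predictions/targets, mirroring the
# index_put_(..., accumulate=True) call in ioueval.py
target = torch.tensor([0, 1, 0, 0])
pred = torch.tensor([0, 1, 1, 0])

# accumulating directly into an int64 matrix is what raised
# RuntimeError: "embedding_backward" not implemented for 'Long'
# on affected builds; accumulating in float32 and casting back avoids it
conf = torch.zeros(2, 2, dtype=torch.float32)
ones = torch.ones(target.numel(), dtype=torch.float32)
conf.index_put_((target, pred), ones, accumulate=True)
conf = conf.long()
print(conf)  # rows = target class, cols = predicted class
```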


ryx2 commented Nov 13, 2019

I should also include my yaml file:

```yaml
# training parameters
train:
  loss: "xentropy"       # must be either xentropy or iou
  max_epochs: 300
  max_lr: 0.01           # sgd learning rate max
  min_lr: 0.001          # warmup initial learning rate
  up_epochs: 0.5         # warmup during first XX epochs (can be float)
  down_epochs:  30       # warmdown during second XX epochs  (can be float)
  max_momentum: 0.9      # sgd momentum max when lr is min
  min_momentum: 0.85     # sgd momentum min when lr is max
  final_decay: 0.995     # learning rate decay per epoch after initial cycle (from min lr)
  w_decay: 0.0005        # weight decay
  batch_size: 5          # batch size
  report_batch: 1        # every x batches, report loss
  report_epoch: 1        # every x epochs, report validation set
  save_summary: False    # Summary of weight histograms for tensorboard
  save_imgs: True        # False doesn't save anything, True saves some
                         # sample images (one per batch of the last calculated batch)
                         # in log folder
  avg_N: 3               # average the N best models
  crop_prop:
    height: 480
    width: 480

# backbone parameters
backbone:
  name: "mobilenetv2"
  dropout: 0.02
  bn_d: 0.05
  OS: 16 # output stride
  train: True # train backbone?
  extra:
    width_mult: 1.0
    shallow_feats: True # get features before the last layer (mn2)

decoder:
  name: "aspp_progressive"
  dropout: 0.02
  bn_d: 0.05
  train: True # train decoder?
  extra:
    aspp_channels: 32
    last_channels: 16

# classification head parameters
head:
  name: "segmentation"
  dropout: 0.1

# dataset (to find parser)
dataset:
  name: "persons"
  location: "/tank/home/xury1/segmentation_data/persons/roads_annotated/ds1/"
  workers: 3 # number of threads to get data
  img_means: #rgb
    - 0.46992042
    - 0.45250652
    - 0.42510188
  img_stds: #rgb
    - 0.29184756
    - 0.28221624
    - 0.29719201
  img_prop:
    width: 640
    height: 480
    depth: 3
  labels:
    0: 'background'
    1: 'person'
  labels_w:
    0: 1.0
    1: 1.0
  color_map: # bgr
    0: [0,0,0]
    1: [0,255,0]
```
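As a side note, the `labels` / `labels_w` maps in such a config can be turned into the per-class weight tensor that a cross-entropy loss consumes. A minimal sketch (the variable names are mine, and the snippet inlines the relevant fragment instead of reading the real file):

```python
import yaml
import torch

# inline copy of the labels_w fragment from the config above
fragment = """
dataset:
  labels_w:
    0: 1.0
    1: 1.0
"""
cfg = yaml.safe_load(fragment)
labels_w = cfg["dataset"]["labels_w"]

# order by class index so weights[i] is the weight of label i
weights = torch.tensor([labels_w[k] for k in sorted(labels_w)])
print(weights)  # tensor([1., 1.])
```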

The imgs and lbls in that dataset folder are float32 and uint8, respectively.
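Since cross-entropy and the IoU bookkeeping both expect int64 class indices as targets, a uint8 label mask has to be widened when it is loaded. A minimal sketch of that conversion (the mask contents here are made up):

```python
import numpy as np
import torch

# a made-up uint8 label mask like the ones under lbl/
lbl = np.zeros((4, 4), dtype=np.uint8)
lbl[1:3, 1:3] = 1  # a small "person" blob

# widen to int64 ("Long"), the target dtype PyTorch losses expect
target = torch.from_numpy(lbl.astype(np.int64))
print(target.dtype)  # torch.int64
```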

@duda1202

Hi,

Were you able to resolve this issue? I am having the exact same issue when using my own Docker image, but it worked in the bonnetal Docker image. For both I use the exact same dataset and config files.


ryx2 commented Apr 22, 2020

@duda1202 I was able to get this to work; it's a versioning problem. I forget which version changes made it work since this was months ago, but I just pasted my pip freeze here.

```
Package                Version
---------------------- --------
absl-py                0.8.1
appdirs                1.4.3
astor                  0.8.0
backcall               0.1.0
cycler                 0.10.0
decorator              4.4.1
gast                   0.3.2
genpy                  2016.1.3
grpcio                 1.25.0
h5py                   2.10.0
imageio                2.6.1
imgaug                 0.3.0
ipdb                   0.12.3
ipython                7.9.0
ipython-genutils       0.2.0
jedi                   0.15.1
Keras-Applications     1.0.8
Keras-Preprocessing    1.1.0
kiwisolver             1.1.0
Mako                   1.1.0
Markdown               3.1.1
MarkupSafe             1.1.1
matplotlib             3.0.3
mock                   3.0.5
networkx               2.4
numpy                  1.17.4
onnx                   1.5.0
opencv-python          3.4.0.12
opencv-python-headless 4.1.2.30
parso                  0.5.1
pexpect                4.7.0
pickleshare            0.7.5
Pillow                 6.0.0
pip                    19.3.1
pkg-resources          0.0.0
prompt-toolkit         2.0.10
protobuf               3.10.0
ptyprocess             0.6.0
pycuda                 2019.1.2
Pygments               2.5.2
pyparsing              2.4.5
python-dateutil        2.8.1
pytools                2019.1.1
PyWavelets             1.1.1
PyYAML                 5.1
scikit-image           0.15.0
scikit-learn           0.20.3
scipy                  0.19.1
setuptools             20.7.0
Shapely                1.6.4.post2
six                    1.13.0
tensorboard            1.13.1
tensorflow             1.13.1
tensorflow-estimator   1.13.0
termcolor              1.1.0
torch                  1.3.1
torchvision            0.4.2
traitlets              4.3.3
typing                 3.7.4.1
typing-extensions      3.7.4.1
wcwidth                0.1.7
Werkzeug               0.16.0
wheel                  0.33.6
```
