UNITER pretrained checkpoint missing config #1174

Open
Maxsparrow opened this issue Dec 12, 2021 · 2 comments
🐛 Bug

Following the instructions in #1144 by @Ryan-Qiyu-Jiang, I tried to run the command to fine-tune UNITER on VQA2, but the pretrained checkpoint appears to be missing the necessary config.

Also, the uniter.pretrained.tar.gz file appears to contain an invalid folder path, but I was able to resolve this manually (described below).

Command

mmf_run config=projects/uniter/configs/vqa2/defaults.yaml run_type=train_val dataset=vqa2 model=uniter checkpoint.resume_zoo=uniter.pretrained

To Reproduce

Steps to reproduce the behavior:

  1. mmf_run config=projects/uniter/configs/vqa2/defaults.yaml run_type=train_val dataset=vqa2 model=uniter checkpoint.resume_zoo=uniter.pretrained
  2. Fix the tarball folder path:
    • If you skip this, you get an error like "AssertionError: None or multiple checkpoints files. MMF doesn't know what to do.", because MMF can't find the checkpoint file in the nested folder
    • mv ~/.cache/torch/mmf/data/models/uniter.pretrained/private/home/ryanjiang/winoground/pretrained_models/uniter_pretrained_mmf.pth ~/.cache/torch/mmf/data/models/uniter.pretrained/
  3. Run the same command from step 1 again. It now fails with:
  File "/home/maxsparrow/.pyenv/versions/miniconda3-4.7.12/envs/mmf3/bin/mmf_run", line 33, in <module>
    sys.exit(load_entry_point('mmf', 'console_scripts', 'mmf_run')())
  File "/home/maxsparrow/code/CS7643/mmf/mmf_cli/run.py", line 133, in run
    main(configuration, predict=predict)
  File "/home/maxsparrow/code/CS7643/mmf/mmf_cli/run.py", line 52, in main
    trainer.load()
  File "/home/maxsparrow/code/CS7643/mmf/mmf/trainers/mmf_trainer.py", line 46, in load
    self.on_init_start()
  File "/home/maxsparrow/code/CS7643/mmf/mmf/trainers/core/callback_hook.py", line 20, in on_init_start
    callback.on_init_start(**kwargs)
  File "/home/maxsparrow/code/CS7643/mmf/mmf/trainers/callbacks/checkpoint.py", line 30, in on_init_start
    self._checkpoint.load_state_dict()
  File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 244, in load_state_dict
    load_pretrained=ckpt_config.resume_pretrained,
  File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 268, in _load
    ckpt, should_continue = self._load_from_zoo(file)
  File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 448, in _load_from_zoo
    zoo_ckpt = load_pretrained_model(file)
  File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 163, in load_pretrained_model
    return _load_pretrained_model(model_name_or_path_or_checkpoint, args, kwargs)
  File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 140, in _load_pretrained_model
    config = get_config_from_folder_or_ckpt(download_path, ckpt)
  File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 90, in get_config_from_folder_or_ckpt
    "No configs provided with pretrained model"
AssertionError: No configs provided with pretrained model while checkpoint also doesn't have configuration.
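
The manual fix in step 2 can also be scripted. The following is a minimal sketch, not part of MMF: the helper name `flatten_checkpoint` is hypothetical, and it assumes the download folder contains exactly one `.pth` file somewhere below it, which it moves to the top level of that folder.

```python
import shutil
from pathlib import Path


def flatten_checkpoint(model_dir: Path) -> Path:
    """Move the single nested .pth file to the top level of model_dir.

    Hypothetical helper for illustration; this is not an MMF API.
    """
    # Search the whole tree, since the tarball nests the file several levels deep.
    checkpoints = sorted(model_dir.rglob("*.pth"))
    if len(checkpoints) != 1:
        raise RuntimeError(
            f"expected exactly one .pth file under {model_dir}, "
            f"found {len(checkpoints)}"
        )
    source = checkpoints[0]
    target = model_dir / source.name
    if source != target:
        shutil.move(str(source), str(target))
    return target


if __name__ == "__main__":
    # Path taken from the report above; adjust if your MMF cache lives elsewhere.
    model_dir = Path.home() / ".cache/torch/mmf/data/models/uniter.pretrained"
    if model_dir.is_dir():
        print(flatten_checkpoint(model_dir))
```

This only addresses the nested-path problem. The missing config reported in the traceback above is a separate issue and is not changed by moving the file.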

Expected behavior

I expected training to start from the pretrained UNITER checkpoint via resume_zoo. Other pretrained resume_zoo checkpoints include a config.yaml file in the downloaded tarball, but this one does not.
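
To see which of the two assertions a given download folder is likely to trip, it can help to list what the folder actually contains. The sketch below is an approximation and not MMF's actual lookup logic: `inspect_download` is a hypothetical name, and the check assumes that only top-level `.pth` and `.yaml` files matter.

```python
from pathlib import Path


def inspect_download(model_dir: Path) -> dict:
    """Summarize top-level checkpoint and config files in model_dir.

    Hypothetical helper; it approximates, but does not reproduce, MMF's checks.
    """
    checkpoints = sorted(p.name for p in model_dir.glob("*.pth"))
    configs = sorted(p.name for p in model_dir.glob("*.yaml"))
    return {
        "checkpoints": checkpoints,
        "configs": configs,
        # Zero or several checkpoints matches the first error in this report.
        "single_checkpoint": len(checkpoints) == 1,
        # No YAML file matches the "No configs provided" error.
        "has_config": len(configs) > 0,
    }


if __name__ == "__main__":
    # Path taken from the report above; adjust if your MMF cache lives elsewhere.
    model_dir = Path.home() / ".cache/torch/mmf/data/models/uniter.pretrained"
    if model_dir.is_dir():
        print(inspect_download(model_dir))
```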

@Ryan-Qiyu-Jiang Ryan-Qiyu-Jiang self-assigned this Dec 12, 2021
@Ryan-Qiyu-Jiang
Contributor

Thanks Max, you're right, will fix this!

@Ryan-Qiyu-Jiang
Contributor

Hey, an update!
I made a PR to fix this: #1175.
In the meantime, if you delete your current download directory ~/.cache/torch/mmf/data/models/uniter.pretrained/,
comment out the checksum in the model zoo config for UNITER,
and try to re-download, I expect the config issue to be resolved.
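
The first part of that workaround, removing the cached download, could be scripted roughly as follows. This is a sketch under assumptions: `clear_cached_model` is a hypothetical helper rather than an MMF function, and the path is the one quoted in this thread. The checksum step is not covered because the thread does not say where the zoo config entry lives.

```python
import shutil
from pathlib import Path


def clear_cached_model(model_dir: Path) -> bool:
    """Delete model_dir and everything in it; return True if anything was removed.

    Hypothetical helper for illustration; this is not an MMF API.
    """
    if not model_dir.is_dir():
        return False
    shutil.rmtree(model_dir)
    return True


if __name__ == "__main__":
    # Path taken from this thread; adjust if your MMF cache lives elsewhere.
    model_dir = Path.home() / ".cache/torch/mmf/data/models/uniter.pretrained"
    removed = clear_cached_model(model_dir)
    print("removed" if removed else "nothing to remove", model_dir)
```

After the folder is gone, rerunning the original mmf_run command should trigger a fresh download.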

By the way, the UNITER checkpoints were pretrained on BUTD features extracted on CC/SBU, which come from a different
object detection model than the one used to extract features in MMF.
So the pretrained checkpoints won't give good results on the pretraining tasks with MMF features,
but you can still fine-tune UNITER from the pretrained checkpoint on downstream tasks like VQA2 and reach comparable accuracy.

Added this to the projects doc in this PR stack.
Thanks for bringing this up, didn't know nautilus worked with VL!
Let me know if you have any questions :D
