Fine-Tuned Token Locations #1

Open
MrWan001 opened this issue Mar 4, 2023 · 6 comments

@MrWan001

MrWan001 commented Mar 4, 2023

DEFAULT_EMBED_PATH = "/root/downloads/da-fusion/{dataset}-tokens/{dataset}-{seed}-{examples_per_class}.pt"

Hello, the .pt file cannot be found. What effect does it have on the program?

@brandontrabucco
Owner

Hello MrWan001,

DEFAULT_EMBED_PATH in the code points to the location of the fine-tuned tokens produced by textual inversion.

We will shortly release the tokens for the three datasets we evaluated on, which you can download and place at the location specified by DEFAULT_EMBED_PATH. Which datasets are you using, or are you using a custom dataset?
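For reference, a minimal sketch of how that template is typically filled in and loaded once the tokens are in place; the /root/downloads prefix and the field values below are placeholder examples from this thread, not the exact values the code expects:

```python
import torch

# Template from the comment above; adjust the prefix for your own machine.
DEFAULT_EMBED_PATH = "/root/downloads/da-fusion/{dataset}-tokens/{dataset}-{seed}-{examples_per_class}.pt"

# Example values only; use the dataset, seed, and examples-per-class you trained with.
embed_path = DEFAULT_EMBED_PATH.format(dataset="coco", seed=0, examples_per_class=4)

# The .pt file is a PyTorch checkpoint holding the fine-tuned token embeddings.
embeddings = torch.load(embed_path, map_location="cpu")
print(type(embeddings))
```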

Best,
Brandon

brandontrabucco self-assigned this Mar 25, 2023
@jlsaint

jlsaint commented Mar 31, 2023

Hello @brandontrabucco ,

I'm dealing with a similar issue. Could you please explain the difference between DEFAULT_EMBED_PATH and DEFAULT_SYNTHETIC_DIR? That is, if the former is intended to point to the text inversion tokens as you've said, then what should the latter point to? Asking because intuitively those two variables seem to mean the same thing.

Also, I followed your instructions here and now have learned_embeds.bin for each COCO class (.bin files are in coco-#-#/{class}/ for every class). How should I format the relevant parameters if I want to run train_classifier.py? The given template for DEFAULT_EMBED_PATH seems class-agnostic...

Best,
jl

@brandontrabucco
Owner

Hello jlsaint,

Thanks for following up on this issue! The parameter DEFAULT_SYNTHETIC_DIR points to a location on the local disk of your machine where augmented images from Stable Diffusion will be saved for caching. This serves two roles:

First, caching the images to disk means they don't have to be stored in memory, which can be crucial for datasets with many images or classes that would otherwise produce too many augmented images to hold at once. Note that, for this reason, our example training scripts generate the augmented images only once at the beginning of training, using the train_dataset.generate_augmentations(num_synthetic) method. Here train_dataset is an instance of our FewShotDataset and num_synthetic is an integer that controls how many synthetic images we generate from Stable Diffusion for each real image. If you want to generate synthetic images more than once, you need only call generate_augmentations again later in the script.

Second, having the images cached means you can inspect the augmented images when tuning hyperparameters and confirm that DA-Fusion is working as expected.
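To make that workflow concrete, here is a minimal sketch; the construction of train_dataset and the training loop are elided (see the example training scripts), only the generate_augmentations(num_synthetic) call comes from the description above, and num_synthetic = 10 is an arbitrary example value:

```python
num_synthetic = 10  # example value: synthetic images to generate per real image

# Generate the Stable Diffusion augmentations once, up front. They are written
# to DEFAULT_SYNTHETIC_DIR on disk rather than held in memory.
train_dataset.generate_augmentations(num_synthetic)

# ... run the usual classifier training loop over train_dataset ...

# If fresh synthetic images are wanted later in the script (e.g. after some
# number of epochs), simply call the method again.
train_dataset.generate_augmentations(num_synthetic)
```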

For your last point, take a look at this script: https://github.com/brandontrabucco/da-fusion/blob/main/aggregate_embeddings.py

After we run textual inversion and have several class-specific tokens, we merge them into a single dictionary containing all the tokens using the above script. The result is a single file matching the class-agnostic DEFAULT_EMBED_PATH template.
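For illustration, a rough sketch of what that aggregation amounts to; the directory layout and output path below are example values following the templates in this thread, and aggregate_embeddings.py in the repository is the authoritative implementation:

```python
import glob
import os
import torch

# Example paths only: per-class textual inversion outputs, and the merged file.
token_dir = "/root/downloads/da-fusion/coco-0-4"                    # contains {class}/learned_embeds.bin
output_path = "/root/downloads/da-fusion/coco-tokens/coco-0-4.pt"   # matches the DEFAULT_EMBED_PATH template

merged = {}
for path in sorted(glob.glob(os.path.join(token_dir, "*", "learned_embeds.bin"))):
    # Each file maps a placeholder token (e.g. "<class-name>") to its embedding tensor.
    embeds = torch.load(path, map_location="cpu")
    merged.update(embeds)

os.makedirs(os.path.dirname(output_path), exist_ok=True)
torch.save(merged, output_path)
print(f"Merged {len(merged)} tokens into {output_path}")
```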

Let me know if you have other questions!

Best,
Brandon

@jameelhassan

Hello @brandontrabucco,
Is it possible to share the fine-tuned tokens from text-inversion for the three datasets?

I am hoping to run it for ImageNet.
Thanks.

@brandontrabucco
Owner

Sure! We have uploaded the current set of tokens here:
https://drive.google.com/drive/folders/1JxPq05zy1_MGbmgHfVIeeFMjL56Cef53?usp=sharing

brandontrabucco changed the title from problem to Fine-Tuned Token Locations Apr 2, 2023
@jameelhassan

Thank you very much.
