This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

INSTALL.md

File metadata and controls

67 lines (50 loc) · 3.35 KB

Installation

By default, we train FLIP models on TPUs using the tpu-vm-base TPU software version. We use Python 3.8.10 and install the following dependencies:

sudo pip3 install clu==0.0.7
sudo pip3 install timm==0.4.12
sudo pip3 install tensorflow_addons
sudo pip3 install "jax[tpu]==0.3.15" jaxlib==0.3.15 flax==0.5.3 -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
sudo pip3 install absl-py cached_property gin-config numpy seqio-nightly tensorstore
# versions compatible with jax==0.3.15
sudo pip3 install orbax==0.0.12
sudo pip3 install chex==0.1.6

sudo pip3 uninstall -y tensorflow
sudo pip3 uninstall -y tensorflow  # run twice in case of duplicate installations
sudo pip3 uninstall -y tf-nightly
sudo pip3 install tensorflow==2.8.0
sudo pip3 install tensorflow-text==2.7.3
sudo pip3 install pandas==1.4.4
sudo pip3 install transformers==4.21.3
sudo pip3 install numpy==1.23.3
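
Because several of the packages above are installed, uninstalled, and reinstalled, it can be useful to confirm afterwards that the pinned versions are the ones actually in place. The following is a minimal sketch, not part of the FLIP repository: `check_pins` is a hypothetical helper that reads `pip3 freeze` output on stdin and compares it against a subset of the pins listed above. It only flags installed packages whose version differs; it does not detect packages that are missing.

```shell
# Hypothetical helper (an assumption, not part of FLIP): reads `pip3 freeze`
# output on stdin and reports installed packages whose version differs from
# the pins used in this guide. Missing packages are not reported.
check_pins() {
  awk -F'==' '
    BEGIN {
      # Subset of the versions pinned in the install commands above.
      pin["jax"] = "0.3.15"; pin["jaxlib"] = "0.3.15"; pin["flax"] = "0.5.3"
      pin["tensorflow"] = "2.8.0"; pin["tensorflow-text"] = "2.7.3"
      pin["numpy"] = "1.23.3"; pin["chex"] = "0.1.6"
    }
    { name = tolower($1) }
    (name in pin) && $2 != pin[name] {
      printf "MISMATCH %s: found %s, expected %s\n", name, $2, pin[name]
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Example usage:
#   pip3 freeze | check_pins && echo "pinned versions match"
```

The function exits with a non-zero status when any mismatch is found, so it can be chained with `&&` or used in a setup script.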

Dataset preparation

We train on the LAION datasets and evaluate on the ImageNet-1K dataset.

ImageNet-1K

For ImageNet-1K, we follow the link to prepare the dataset.

LAION datasets

We download the LAION-400M dataset and the LAION-2B-en dataset (the English subset of LAION-5B) and convert them to TFRecord files using the img2dataset tool.

Specifically, we first download the metadata *.parquet files (containing text and image URLs) from Hugging Face datasets for LAION-400M and LAION-2B-en. We then use img2dataset to download the images, resize them so that the shorter side is 480 pixels (keeping the aspect ratio), and pack them into TFRecord files as follows. (Here we write the output to a Google Cloud Storage bucket, OUTPUT_DIR, but you can also use a local directory for OUTPUT_DIR as long as it has enough disk space.)

LAION-400M:

pip3 install img2dataset tensorflow tensorflow-io

META_DATA=path/to/laion400m-meta  # download from https://huggingface.co/datasets/laion/laion400m/tree/main
OUTPUT_DIR=gs://your-gcs-bucket/laion-400m/tfrecord_dataset_img480

img2dataset --url_list $META_DATA --input_format "parquet" \
  --url_col "URL" --caption_col "TEXT" --output_format tfrecord \
  --output_folder $OUTPUT_DIR \
  --processes_count 32 --thread_count 256 --image_size 480 --resize_mode keep_ratio  \
  --save_additional_columns '["NSFW","similarity","LICENSE"]'

LAION-2B-en:

pip3 install img2dataset tensorflow tensorflow-io

META_DATA=path/to/laion2b-en-meta  # download from https://huggingface.co/datasets/laion/laion2B-en/tree/main
OUTPUT_DIR=gs://your-gcs-bucket/laion-2b-en/tfrecord_dataset_img480

img2dataset --url_list $META_DATA --input_format "parquet" \
  --url_col "URL" --caption_col "TEXT" --output_format tfrecord \
  --output_folder $OUTPUT_DIR \
  --processes_count 32 --thread_count 256 --image_size 480 --resize_mode keep_ratio  \
  --save_additional_columns '["NSFW","similarity","LICENSE"]'
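
Since these jobs run for a long time, a quick file count before and after can catch an empty metadata directory or a stalled download early. The sketch below assumes local directories and assumes that the output shards carry a `.tfrecord` extension; `count_files` is a hypothetical helper, not part of img2dataset. For `gs://` paths, a tool such as `gsutil` would be needed instead.

```shell
# Hypothetical helper (not part of img2dataset): count files under a local
# directory whose names match a glob pattern.
count_files() {
  find "$1" -type f -name "$2" | wc -l | tr -d ' '
}

# Example usage with local paths (assumed; gs:// paths are not handled here):
#   count_files "$META_DATA" '*.parquet'     # metadata files to be read
#   count_files "$OUTPUT_DIR" '*.tfrecord'   # shards written so far
```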

More details can be found in the img2dataset examples for LAION-400M and LAION-2B-en.