Usage:

```
usage: main.py [-h] [--train_dir TRAIN_DIR] [--test_dir TEST_DIR]
               [--output_dir OUTPUT_DIR]
               [--target_cols TARGET_COLS [TARGET_COLS ...]]
               [--gpus GPUS [GPUS ...]] [--do_train] [--do_infer]
               [--ckpt_dir CKPT_DIR] [--image_model IMAGE_MODEL]
               [--bert_model BERT_MODEL] [--word_embedding WORD_EMBEDDING]
               [--max_vocab MAX_VOCAB] [--max_word MAX_WORD]
               [--image_size IMAGE_SIZE] [--max_len MAX_LEN] [--lower]
               [--text_separator TEXT_SEPARATOR] [--n_hiddens N_HIDDENS]
               [--drop_rate DROP_RATE] [--lr LR] [--batch_size BATCH_SIZE]
               [--n_epochs N_EPOCHS] [--kaggle] [--seed SEED]

ICDAR 2021: Multimodal Emotion Recognition on Comics scenes (EmoRecCom)

optional arguments:
  -h, --help            show this help message and exit
  --train_dir TRAIN_DIR
                        path to the train data directory used for training
  --test_dir TEST_DIR   path to the test data directory used for prediction
  --output_dir OUTPUT_DIR
                        path to the directory where models are saved
  --target_cols TARGET_COLS [TARGET_COLS ...]
                        target columns to predict
  --gpus GPUS [GPUS ...]
                        select GPUs to use
  --do_train            whether to train the model on the provided train
                        data
  --do_infer            whether to predict on the provided test data with
                        the trained models from the checkpoint directory
  --ckpt_dir CKPT_DIR   path to the directory containing model checkpoints
                        (.h5)
  --image_model IMAGE_MODEL
                        pretrained image model name, one of ['efn-b0',
                        'efn-b1', 'efn-b2', 'efn-b3', 'efn-b4', 'efn-b5',
                        'efn-b6', 'efn-b7']; None to use a unimodal
                        (text-only) model
  --bert_model BERT_MODEL
                        name or path of a pretrained BERT model (e.g. from
                        https://huggingface.co/models)
  --word_embedding WORD_EMBEDDING
                        pretrained static word embedding, one of
                        ['glove.840B.300d', 'wiki.en.vec',
                        'crawl-300d-2M.vec', 'wiki-news-300d-1M.vec']; None
                        to represent text with the BERT model only
  --max_vocab MAX_VOCAB
                        maximum number of words in the vocabulary
                        (TensorFlow word tokenizer)
  --max_word MAX_WORD   maximum number of words per text sample (TensorFlow
                        word tokenizer)
  --image_size IMAGE_SIZE
                        input image size
  --max_len MAX_LEN     maximum sequence length for padding and truncation
                        (BERT word tokenizer)
  --lower               whether to lowercase the text
  --text_separator TEXT_SEPARATOR
                        separator used to join conversations into one text
  --n_hiddens N_HIDDENS
                        concatenate the last n_hiddens hidden layers to get
                        the sequence's BERT embedding; -1 to use the [CLS]
                        token embedding only
  --drop_rate DROP_RATE
                        dropout rate for both the image and text encoders
  --lr LR               learning rate
  --batch_size BATCH_SIZE
                        number of examples per batch
  --n_epochs N_EPOCHS   number of training epochs
  --kaggle              whether running in a Kaggle environment
  --seed SEED           random seed for reproducibility
```
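
For example, a multimodal training run might look like the sketch below. The data paths and hyperparameter values are illustrative assumptions, not values prescribed by this repository:

```bash
# Train a multimodal model (EfficientNet-B3 image encoder + BERT text encoder).
# All paths and hyperparameter values here are assumptions for illustration.
python main.py --do_train \
    --train_dir ./data/train \
    --output_dir ./checkpoints \
    --image_model efn-b3 \
    --bert_model bert-base-uncased \
    --image_size 256 --max_len 128 --lower \
    --batch_size 32 --lr 3e-5 --n_epochs 10 --seed 42
```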
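
Inference from saved checkpoints follows the same pattern; again, the directory paths below are assumptions:

```bash
# Predict on the test split using the trained .h5 checkpoints
# produced by the training run above.
python main.py --do_infer \
    --test_dir ./data/test \
    --ckpt_dir ./checkpoints \
    --output_dir ./predictions
```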