
KuDA

Codebase for EMNLP 2024 Findings Paper:

Knowledge-Guided Dynamic Modality Attention Fusion Framework for Multimodal Sentiment Analysis

Model Architecture:

[Figure: KuDA model architecture]

The code was refactored to integrate all datasets; please contact me if you find any bugs. Thanks.

Content

  • Data Preparation
  • Environment
  • Running
  • Note
  • Citation

Data Preparation

KuDA uses four MSA datasets, together with a BERT model for each language: Chinese (CH-SIMS, CH-SIMSv2) and English (CMU-MOSI, CMU-MOSEI).

Datasets

  • CH-SIMS / CMU-MOSI / CMU-MOSEI can be downloaded from MMSA.
  • CH-SIMSv2 can be downloaded from ch-sims-v2 (Supervised).

Pretrained Language Model
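
KuDA needs a BERT checkpoint for each language, placed under pretrainedModel/BERT (see the file structure below). A minimal sketch for fetching them with HuggingFace transformers (the model names and target paths are assumptions; check opts.py for what the code actually expects):

    from transformers import BertModel, BertTokenizer

    # Model names are assumptions: bert-base-uncased for English (CMU-MOSI,
    # CMU-MOSEI) and bert-base-chinese for Chinese (CH-SIMS, CH-SIMSv2).
    for name in ("bert-base-uncased", "bert-base-chinese"):
        BertTokenizer.from_pretrained(name).save_pretrained(f"pretrainedModel/BERT/{name}")
        BertModel.from_pretrained(name).save_pretrained(f"pretrainedModel/BERT/{name}")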

Environment

The results in the paper were obtained with Python 3.8 and PyTorch 1.9.0 on a single NVIDIA RTX 3090. Note that different hardware and software environments can cause the results to fluctuate.
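
A quick environment sanity check (other versions may work, but results can fluctuate):

    import sys
    import torch

    print(sys.version.split()[0])     # expected: 3.8.x
    print(torch.__version__)          # expected: 1.9.0
    print(torch.cuda.is_available())  # expected: True
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))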

Running

Note: The parameters of the two stages need to be adjusted for each dataset, because the sequence lengths and feature dimensions differ across datasets; the sketch below shows one way to check them.
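
A minimal sketch for inspecting those lengths and dimensions, assuming the MMSA-style pickle layout (the path, file name, and keys are assumptions; print the keys first if unsure):

    import pickle

    # Path and file name are assumptions; point this at your download.
    with open("data/MOSI/unaligned_50.pkl", "rb") as f:
        data = pickle.load(f)

    print(data.keys())  # typically splits such as train / valid / test
    train = data["train"]
    print(train.keys())
    for field in ("text", "audio", "vision"):
        if field in train:
            # the shape reveals the sequence length and feature dimension
            print(field, train[field].shape)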

Stage 1: Knowledge Injection Pretraining

There are two ways to obtain the knowledge-injection weights:

  1. Download the translated text file from this link (required for CMU-MOSI and CMU-MOSEI; not required for CH-SIMS and CH-SIMSv2), then execute the following command to pretrain each modality:

    python pretrain.py
  2. Alternatively, the weights we have already trained can be downloaded from this link (a loading sketch follows this list).
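
A minimal sketch for loading the downloaded weights, assuming they are standard PyTorch state dicts (the file name is an assumption; check pretrain.py for how the checkpoints are actually saved and consumed):

    import torch

    # File name is an assumption; point this at your download inside
    # pretrainedModel/KnowledgeInjectPretraining/.
    state_dict = torch.load(
        "pretrainedModel/KnowledgeInjectPretraining/text_adapter.pt",
        map_location="cpu",
    )
    # `model` stands for the corresponding modality encoder built by the repo:
    # model.load_state_dict(state_dict)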

Stage 2: Multimodal Sentiment Analysis

python train.py

Note

  1. In Encoder_KIAdapter.py, you need to modify the source code of torch.nn.TransformerEncoder so that it returns the intermediate hidden states, as in the patch below (a usage sketch follows it):

    # This edits torch/nn/modules/transformer.py, where Module, Tensor,
    # Optional, and _get_clones are already defined.
    class TransformerEncoder(Module):
        r"""TransformerEncoder is a stack of N encoder layers."""
        __constants__ = ['norm']

        def __init__(self, encoder_layer, num_layers, norm=None):
            super(TransformerEncoder, self).__init__()
            self.layers = _get_clones(encoder_layer, num_layers)
            self.num_layers = num_layers
            self.norm = norm

        def forward(self, src: Tensor, mask: Optional[Tensor] = None,
                    src_key_padding_mask: Optional[Tensor] = None):
            r"""Pass the input through the encoder layers in turn, collecting
            the hidden state after every layer (including the input itself)."""
            output = src
            hidden_state_list = [output]

            for mod in self.layers:
                output = mod(output, src_mask=mask,
                             src_key_padding_mask=src_key_padding_mask)
                hidden_state_list.append(output)

            if self.norm is not None:
                output = self.norm(output)

            # Changed: also return the intermediate hidden states.
            return output, hidden_state_list
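
  After this patch, forward returns both the final output and the per-layer hidden states. A minimal usage sketch (layer sizes are arbitrary):

    import torch
    from torch import nn

    encoder_layer = nn.TransformerEncoderLayer(d_model=128, nhead=8)
    encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)

    src = torch.rand(50, 32, 128)          # (seq_len, batch, d_model)
    output, hidden_states = encoder(src)   # patched forward returns a tuple
    print(output.shape)                    # torch.Size([50, 32, 128])
    print(len(hidden_states))              # num_layers + 1 == 5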
  2. After the data and models are prepared, the file structure is as follows:

    ├─core
    ├─data
    │  ├─CH-SIMS
    │  ├─CH-SIMSv2
    │  ├─MOSI
    │  └─MOSEI
    ├─log
    ├─models
    ├─pretrainedModel
    │  ├─BERT
    │  └─KnowledgeInjectPretraining
    ├─opts.py
    ├─pretrain.py
    ├─train.py
  3. We gratefully acknowledge the open-source projects that helped this work 🎉🎉🎉, including MMSA, ALMT, TMBL, TETFN, CENet, CubeMLP, Self-MM, MMIM, BBFN, MISA, MulT, LMF, TFN, etc. 😄

Citation

The paper is available at:

Knowledge-Guided Dynamic Modality Attention Fusion Framework for Multimodal Sentiment Analysis

Please cite our paper if you find it valuable for your research (humbly begging for a citation T^T):

@inproceedings{feng2024knowledge,
  title={Knowledge-Guided Dynamic Modality Attention Fusion Framework for Multimodal Sentiment Analysis},
  author={Feng, Xinyu and Lin, Yuming and He, Lihua and Li, You and Chang, Liang and Zhou, Ya},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
  pages={14755--14766},
  year={2024}
}
