Multilingual Multi-Figurative Language Detection (ACL 2023 Findings)

Abstract: Figures of speech help people express abstract concepts and evoke stronger emotions than literal expressions, thereby making texts more creative and engaging. Due to its pervasive and fundamental character, figurative language understanding has been addressed in Natural Language Processing, but it's highly understudied in a multilingual setting and when considering more than one figure of speech at the same time. To bridge this gap, we introduce multilingual multi-figurative language modelling, and provide a benchmark for sentence-level figurative language detection, covering three common figures of speech and seven languages. Specifically, we develop a framework for figurative language detection based on template-based prompt learning. In so doing, we unify multiple detection tasks that are interrelated across multiple figures of speech and languages, without requiring task- or language-specific modules. Experimental results show that our framework outperforms several strong baselines and may serve as a blueprint for the joint modelling of other interrelated tasks.

Overview

Quick Start

How to use

from transformers import MT5TokenizerFast, MT5ForConditionalGeneration
tokenizer = MT5TokenizerFast.from_pretrained('laihuiyuan/MMFLD')
model = MT5ForConditionalGeneration.from_pretrained('laihuiyuan/MMFLD')

prompt = 'Which figure of speech does this text contain? (A) Literal. (B) {}. | Text: {}'
task = 'Idiom' # Hyperbole and Metaphor are also supported
text = 'This is a perfect way to break the ice and start the conversation.'

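# Fill the template with the task label and the text, then tokenize
inputs = tokenizer(prompt.format(task, text), return_tensors="pt")
# Generate the answer and decode it back to a string
output = model.generate(**inputs)
pred = tokenizer.decode(output[0].tolist(), skip_special_tokens=True, clean_up_tokenization_spaces=False)
print(pred)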
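Because the same template serves all three figures of speech, one text can be checked against each of them by changing only the option label. The following is a minimal sketch of the prompt construction in plain Python; the helper name `build_prompts` and the `FIGURES` tuple are illustrative and not part of the repository.

```python
# Template copied from the usage example above.
PROMPT = 'Which figure of speech does this text contain? (A) Literal. (B) {}. | Text: {}'

# The three figures of speech named in this README.
FIGURES = ('Hyperbole', 'Idiom', 'Metaphor')


def build_prompts(text, figures=FIGURES):
    """Return one filled-in prompt per figure of speech (hypothetical helper)."""
    return {figure: PROMPT.format(figure, text) for figure in figures}


prompts = build_prompts('This is a perfect way to break the ice and start the conversation.')
for figure, prompt in prompts.items():
    print(figure, '->', prompt)
```

Each resulting string can then be passed through `tokenizer(...)` and `model.generate(...)` in the same way as the single prompt in the example above.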

Train model

python mt5_ft.py \
    -lang en_XX zh_CN de_DE es_XX it_IT fa_IR ru_RU \
    -form hyperbole idiom metaphor \
    -prompt 'Which figure of speech does this text contain? (A) Literal. (B) {}. | Text: '
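When launching training from Python rather than from a shell, it can help to assemble the same command as an argument list so the prompt string needs no manual quoting. This is a sketch only: `build_train_command` is a hypothetical helper, and it uses just the `-lang`, `-form`, and `-prompt` flags shown above, with the same values.

```python
import shlex

# Values taken from the training command in this README.
LANGS = ['en_XX', 'zh_CN', 'de_DE', 'es_XX', 'it_IT', 'fa_IR', 'ru_RU']
FORMS = ['hyperbole', 'idiom', 'metaphor']
TRAIN_PROMPT = 'Which figure of speech does this text contain? (A) Literal. (B) {}. | Text: '


def build_train_command(langs=LANGS, forms=FORMS, prompt=TRAIN_PROMPT, script='mt5_ft.py'):
    """Assemble the training command as an argument list (hypothetical helper)."""
    return ['python', script, '-lang', *langs, '-form', *forms, '-prompt', prompt]


command = build_train_command()
# shlex.join quotes the prompt so the printed line can be pasted into a shell.
print(shlex.join(command))
```

The list form can be handed directly to `subprocess.run(command, check=True)` from the repository root.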

Citation

If you use this code, please cite our paper:

@inproceedings{lai-etal-2023-multi,
    title = "Multilingual Multi-Figurative Language Detection",
    author = "Lai, Huiyuan and Toral, Antonio and Nissim, Malvina",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
}
