README

This repository contains the code and dataset for our ACL 2023 paper: Grounded Multimodal Named Entity Recognition on Social Media

Updates

20230728: Twitter10000 v2.0

In Twitter10000 v2.0, we made several detailed revisions to the BIO tagging and bounding box annotations, improving the alignment between the two so that they are more accurate and consistent with each other.

Dataset

Our dataset is built on two benchmark MNER datasets, i.e., Twitter-15 (Zhang et al., 2018) and Twitter-17 (Yu et al., 2020).

The preprocessed CoNLL-format files are provided in this repo. For each tweet, the first line is its image id, and the following lines are its text content.

Step 1: Download the images associated with the tweets via this link (https://drive.google.com/file/d/1PpvvncnQkgDNeBMKVgG2zFYuRhbL873g/view)
Step 2: Use VinVL to identify all the candidate objects, and put them in a folder named "Twitter10000_VinVL". We have also uploaded the features extracted by VinVL to Google Drive and Baidu Netdisk (code: TwVi).
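
The per-tweet layout described above (an image id line followed by the text lines) could be read with a small helper like the one below. This is only a sketch under assumptions that are not stated in this README: tweets are separated by blank lines, each text line holds a token and its BIO tag separated by whitespace, and the tag is the last column. The function name read_tweets and the sample data are hypothetical; check them against the actual files before relying on this.

```python
from typing import Iterable, Iterator, List, Tuple


def read_tweets(lines: Iterable[str]) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (image_id_line, tokens, tags) for each tweet.

    Assumptions (not confirmed by the README): blank lines separate tweets,
    and every text line is "token<whitespace>tag" with the tag in the last column.
    """
    block: List[str] = []
    # Append an empty string so the final tweet is flushed even without a trailing blank line.
    for raw in list(lines) + [""]:
        line = raw.strip()
        if line:
            block.append(line)
        elif block:
            image_id = block[0]  # first line of each tweet is its image id
            pairs = [text_line.split() for text_line in block[1:]]
            tokens = [p[0] for p in pairs]
            tags = [p[-1] for p in pairs]
            yield image_id, tokens, tags
            block = []


# Hypothetical sample, made up for illustration only.
sample = [
    "IMGID:12345",
    "Kevin\tB-PER",
    "Durant\tI-PER",
    "wins\tO",
    "",
    "IMGID:67890",
    "Hello\tO",
]

for image_id, tokens, tags in read_tweets(sample):
    print(image_id, tokens, tags)
```

To read a real file, pass an open file handle instead of the sample list, for example read_tweets(open(path, encoding="utf-8")), where path points to one of the provided CoNLL-format files.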

Requirements

pytorch 1.7.1
transformers 3.4.0
fastnlp 0.6.0

Usage

Training for H-Index

sh train.sh

Evaluation

sh test.sh

Acknowledgements

By using the dataset, you confirm that you have read and accepted the copyright terms set by Twitter and the original dataset providers.
Parts of the code are based on the code of BARTNER. Many thanks to its authors!

