Content

📢 News
Bridging the Gap in Text-Based Emotion Detection
Languages
Tracks
Dataset and Download Links
Evaluation
Important Dates and Task Phases
How to Participate
Competition Rules and Terms
Dataset paper
Communication
FAQs
Resources
References
Organizers

🛑‼️ Evaluation starting soon: Please check some important updates regarding the shared task setup. 🛑‼️

📢 News

18 February 2025

The task dataset, including the test set, is released. Download it here. To prevent our dataset from being scraped and used to train LLMs, we have encrypted the file and will share the password with participants via email.
We have released the task dataset papers (Paper 1 and Paper 2):
- Paper 1 (for all languages except Amharic, Oromo, Somali, and Tigrinya): BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages.
- Paper 2 (if you use one of the following languages: Amharic, Oromo, Somali, or Tigrinya): Evaluating the Capabilities of Large Language Models for Multi-label Emotion Understanding.

You can use the BibTeX below to cite the dataset papers.

    @misc{muhammad2025brighterbridginggaphumanannotated,
    title={BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages}, 
    author={Shamsuddeen Hassan Muhammad and Nedjma Ousidhoum and Idris Abdulmumin and Jan Philip Wahle and Terry Ruas and Meriem Beloucif and Christine de Kock and Nirmal Surange and Daniela Teodorescu and Ibrahim Said Ahmad and David Ifeoluwa Adelani and 
    Alham Fikri Aji and Felermino D. M. A. Ali and Ilseyar Alimova and Vladimir Araujo and Nikolay Babakov and Naomi Baes and Ana-Maria Bucur and Andiswa Bukula and Guanqun Cao and Rodrigo Tufino Cardenas and Rendi Chevi and Chiamaka Ijeoma Chukwuneke and 
    Alexandra Ciobotaru and Daryna Dementieva and Murja Sani Gadanya and Robert Geislinger and Bela Gipp and Oumaima Hourrane and Oana Ignat and Falalu Ibrahim Lawan and Rooweither Mabuya and Rahmad Mahendra and Vukosi Marivate and Andrew Piper and Alexander 
     Panchenko and Charles Henrique Porto Ferreira and Vitaly Protasov and Samuel Rutunda and Manish Shrivastava and Aura Cristina Udrea and Lilian Diana Awuor Wanzare and Sophie Wu and Florian Valentin Wunderlich and Hanif Muhammad Zhafran and Tianhui Zhang 
    and Yi Zhou and Saif M. Mohammad},
    year={2025},
    eprint={2502.11926},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2502.11926} 
    }

@inproceedings{belay-etal-2025-evaluating,
  title = "Evaluating the Capabilities of Large Language Models for Multi-label Emotion Understanding",
  author = "Belay, Tadesse Destaw  and Azime, Israel Abebe  and Ayele, Abinew Ali  and Sidorov, Grigori  and Klakow, Dietrich  and Slusallek, Philip  and Kolesnikova, Olga  and Yimam, Seid Muhie",
  booktitle = "Proceedings of the 31st International Conference on Computational Linguistics",
  year = "2025",
  address = "Abu Dhabi, UAE",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2025.coling-main.237/",
  pages = "3523--3540"
}

The updated task ranking is available (this is an unofficial ranking and may be subject to modifications)

Track 1 (Multi-label Emotion Detection): Unofficial Ranking

Track 2 (Emotion Intensity): Unofficial Ranking

Track 3 (Cross-lingual Emotion Detection): Unofficial Ranking
Please complete this form (https://forms.gle/UuYZEuCqmW4MF5Z18). Only teams that fill out the above form will appear in the official ranking in our task paper.
Each team will be assigned papers to review from other submissions. So, if you submit a system paper, you will be assigned a paper to review.

08 February 2025

You can now check the ranking for all three tracks (this is an unofficial ranking and may be subject to modifications). See above
The official ranking will be released in the task description paper and will include only teams that submit a system description paper.
We encourage everyone to write a system description paper (deadline: 28th Feb, AoE time). Everyone is eligible to submit a system description paper, regardless of their ranking—even if they rank last. Others can benefit from the insights and information you share!
We will share the paper submission link as soon as SemEval releases it.
For the ranking, we used the team information you provided. If your team is missing, it means you did not fill out the shared team information form. Please email us to have your team added. We will also share another form next week for you to briefly describe your system.
We will set up a post-evaluation page and release the gold labels for the test data so you can continue analyzing model performance for system paper preparation.
Our dataset paper, including baseline results and dataset details, will be released on 15th Feb.

02 February 2025

The competition has ended, and we will announce the rankings very soon.

15 January 2025

The evaluation phase has started and all the datasets have been released.

31 December 2024

🛑‼️ Please check some important updates regarding the shared task setup. 🛑‼️

Separate Codabench for Each Track

This shared task includes three tracks (A, B, and C). The three tracks were previously hosted on a single Codabench competition platform. However, due to an unresolved issue with Codabench and to improve the submission process, we now have 3 CodaBench competition pages (i.e., one per track), which means that 🛑‼️you have to register for each track separately🛑‼️. Please use the updated links below:

Track A: Multi-label Emotion Detection – Track A Competition Page
Track B: Emotion Intensity – Track B Competition Page
Track C: Cross-lingual Emotion Detection – Track C Competition Page

New Dataset Release

We have now released the datasets for all languages in the shared task (except Romanian). Details about which languages are included in each track can be found in this section.
Note: You need to re-download all the datasets and disregard the previous ones, as we have made some changes. The dataset in each split is THE SAME as the old ones BUT the IDs may have changed due to processing. Please re-run the prediction on the evaluation data, and follow the new format for submission on the competition website.

Updated Participation Guide and Submission Instructions

The participation guidelines have been updated to reflect the changes in the Codabench submission process. Please see the guide for detailed instructions.
We have updated the submission instructions on Codabench. Please refer to the "Submission Instructions" section on Codabench for a detailed guide on how to prepare your submission file.

Revised Evaluation Timeline and SemEval 2025

The evaluation period has been updated as follows:
- Start Date: 15 January 2025
- End Date: 28 January 2025
SemEval 2025 will be co-located with ACL 2025 (in Vienna, Austria). We look forward to seeing you there!

16 September 2024

The competition website is now updated on Codabench: Codabench Competition.
The dataset is now available on the competition website. If you previously downloaded the dataset via Google Drive, please download the updated version from the competition website as the data has been revised.

10 September 2024

We have released the training and development datasets for seven languages: English (eng), German (deu), Oromo (orm), Brazilian Portuguese (ptbr), Russian (rus), Somali (som), and Tigrinya (tir). More languages are on the way, and we’ll be updating the table below with release information over the next few days.
The competition website will be live soon. Stay tuned for more updates!

Bridging the Gap in Text-Based Emotion Detection

Emotions are simultaneously familiar and mysterious. On the one hand, we all express and manage our emotions every day. Yet, on the other hand, emotions are complex, nuanced, and sometimes hard to articulate.

We use language in subtle and complex ways to express emotion (Wiebe et al. 2005, Mohammad and Kiritchenko 2018, Mohammad et al. 2018). Further, people are highly variable in how they perceive and express emotions (even within the same culture or social group). Thus, we can never identify with absolute certainty how someone is feeling based on something they have said.

Emotion recognition is not one task but an umbrella term for several tasks such as detecting the emotions of the speaker, identifying what emotion a piece of text is conveying and detecting emotions evoked in a reader (Mohammad 2021, Mohammad 2023).

This task is on perceived emotions and focuses on:

Determining what emotion most people will think the speaker may be feeling given a sentence or a short text snippet uttered by the speaker.

The task is not about:

The emotion evoked in the reader.
The emotion of someone else mentioned in the text.
Or even the true emotion of the speaker (which cannot be definitively known from just a short text snippet).

We acknowledge the importance of this distinction, as perceived emotions can differ from actual emotions due to various factors such as cultural context, individual differences in emotional expression, and the limitations of text-based communication (Van Woensel and Nevil 2019, Wakefield 2021).

Languages

We include a large number of languages with many predominantly spoken in regions characterised by a relatively limited availability of NLP resources (e.g., Africa, Asia, Eastern Europe and Latin America):

Afrikaans (afr), Algerian Arabic (arq), Amharic (amh), Portuguese (Brazilian) (ptbr), Mandarin Chinese (chn), Emakhuwa (vmw), English (eng), German (deu), Hausa (hau), Hindi (hin), Igbo (ibo), Indonesian (ind), isiXhosa (xho), isiZulu (zul), Javanese (jav), Kinyarwanda (kin), Spanish (Latin American) (esp), Marathi (mar), Moroccan Arabic (ary), Portuguese (Mozambican) (pt-MZ), Nigerian-Pidgin (pcm), Oromo (orm), Romanian (ron), Russian (rus), Somali (som), Sundanese (sun), Swahili (swa), Swedish (swe), Tatar (tat), Tigrinya (tir), Ukrainian (ukr), Yoruba (yor).

Tracks

This shared task consists of three tracks: Track A, Track B, and Track C. Participants can choose to participate in one or more of these tracks.

Track A: Multi-label Emotion Detection (Competition page is here)

Given a target text snippet, predict the perceived emotion(s) of the speaker. Specifically, select whether each of the following emotions applies: joy, sadness, fear, anger, surprise, or disgust. In other words, label the text snippet with: joy (1) or no joy (0), sadness (1) or no sadness (0), fear (1) or no fear (0), anger (1) or no anger (0), surprise (1) or no surprise (0), and disgust (1) or no disgust (0).

Note that for some languages, such as English, the set of perceived emotions includes 5 emotions (joy, sadness, fear, anger, and surprise) and does not include disgust.

A training dataset with gold emotion labels will be provided for this track.

Example

Below is a sample of the English training data (Track A). A text snippet can have multiple emotions (e.g., the sentence with the ID sample_05 expresses both joy and surprise), or none (e.g., sample_04 with all the emotion values equal to 0 is considered neutral).
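The sample itself is not reproduced here, so the following is only a minimal sketch of how rows in this layout could be read. The column names (`id`, `text`, and one 0/1 column per emotion) and the two rows are assumptions invented for illustration; they are not taken from the dataset, so check the files on Codabench for the actual header.

```python
import csv
import io

# Hypothetical rows in the Track A layout (NOT real task data).
# Assumed header: id, text, then one binary column per emotion.
SAMPLE_CSV = """id,text,joy,sadness,fear,anger,surprise
sample_a,I did not expect to win and I am thrilled!,1,0,0,0,1
sample_b,The meeting starts at noon.,0,0,0,0,0
"""

# English label set as described above (no disgust).
EMOTIONS = ["joy", "sadness", "fear", "anger", "surprise"]


def read_labels(csv_text):
    """Return {id: [emotions marked 1]}; an empty list means neutral."""
    labels = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        labels[row["id"]] = [e for e in EMOTIONS if row[e] == "1"]
    return labels


labels = read_labels(SAMPLE_CSV)
print(labels)
```

In this sketch a snippet with several columns set to 1 carries several emotions at once, and a snippet whose columns are all 0 maps to an empty list, which corresponds to the neutral case.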

Track B: Emotion Intensity (Competition page is here)

Given a target text and a target perceived emotion, predict the intensity for each of the classes.

The set of the perceived emotions includes: joy, sadness, fear, anger, surprise, or disgust.

The set of ordinal intensity classes includes:

0: No emotion
1: Low degree of emotion
2: Moderate degree of emotion
3: High degree of emotion

Example

Below is a sample of the English training data (Track B). For each emotion, the value associated with it indicates its degree of intensity. For example, sample_05 has a value of 3 for joy and a value of 3 for surprise, i.e., high degrees of joy and surprise in the text snippet.
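As a rough illustration of the ordinal classes listed above, the sketch below maps per-emotion intensity values to their descriptions and rejects values outside 0 to 3. The function name `describe` and the example row are hypothetical and are not part of the task's tooling or data.

```python
# Ordinal intensity classes as listed in this section.
INTENSITY = {
    0: "No emotion",
    1: "Low degree of emotion",
    2: "Moderate degree of emotion",
    3: "High degree of emotion",
}


def describe(row):
    """Map each emotion's intensity value to its class description."""
    described = {}
    for emotion, value in row.items():
        if value not in INTENSITY:
            raise ValueError(f"{emotion}: intensity must be 0-3, got {value!r}")
        described[emotion] = INTENSITY[value]
    return described


# Hypothetical labels for one snippet (NOT real task data).
row = {"joy": 3, "sadness": 0, "fear": 0, "anger": 0, "surprise": 3}
described = describe(row)
print(described)
```

A check like this can be run on a prediction file before submission to catch values that fall outside the four classes.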

A training dataset with gold emotion labels will be provided for this track.

Note that for some languages, such as English, the set of perceived emotions includes 5 emotions (joy, sadness, fear, anger, and surprise) and does not include disgust.

Track C: Cross-lingual Emotion Detection (Competition page is here)

Given a labeled training set in one of the languages given above, predict the perceived emotion labels of a new text instance in a different target language.

The set of the six perceived emotion classes includes: joy, sadness, fear, anger, surprise, or disgust.

The dataset in this track has the same format as the dataset in Track A.

A training dataset will not be provided for this track.

Note that for some languages, such as English, the set of perceived emotions includes 5 emotions (joy, sadness, fear, anger, and surprise) and does not include disgust.
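One way to picture the Track C setup is to pool labeled data from every language except the target and keep only the emotions evaluated for the target language. The sketch below assumes an in-memory structure (`train_sets`, mapping a language code to a list of `(text, labels)` pairs) and a helper name (`build_training_pool`) that are both hypothetical; the example texts are placeholders, not task data.

```python
def build_training_pool(train_sets, target_lang, target_emotions):
    """Pool labeled examples from all languages except the target.

    Labels are restricted to the emotions evaluated for the target
    language (for example, no disgust when the target is English).
    """
    pool = []
    for lang, examples in train_sets.items():
        if lang == target_lang:
            continue  # no target-language training data is used
        for text, labels in examples:
            # None marks an emotion the source language was not annotated for.
            kept = {e: labels.get(e) for e in target_emotions}
            pool.append((lang, text, kept))
    return pool


# Hypothetical placeholder data (NOT real task data).
train_sets = {
    "deu": [("Beispieltext", {"joy": 1, "sadness": 0, "fear": 0,
                              "anger": 0, "surprise": 0, "disgust": 0})],
    "eng": [("example text", {"joy": 0, "sadness": 1, "fear": 0,
                              "anger": 0, "surprise": 0})],
}
english_emotions = ["joy", "sadness", "fear", "anger", "surprise"]
pool = build_training_pool(train_sets, "eng", english_emotions)
print(pool)
```

Marking missing emotions as `None` rather than 0 is a design choice made here so that "not annotated" is not silently treated as "absent"; how to handle such gaps is left to each team.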

Languages and Tracks

The table below lists the tracks and specifies the languages available in each track for the shared task. Each track has its own dedicated Codabench page (competition platform), and the corresponding links are provided below:

Track A: Multi-label Emotion Detection – Track A Codabench
Track B: Emotion Intensity – Track B Codabench
Track C: Cross-lingual Emotion Detection – Track C Codabench

Participants may choose to participate in one or more languages and tracks of their preference.

No.	Language	Code	Track A	Track B	Track C
1	Afrikaans	AFR	✓	✗	✓
2	Algerian Arabic	ARQ	✓	✓	✓
3	Amharic	AMH	✓	✓	✓
4	Chinese	CHN	✓	✓	✓
5	Emakhuwa	VMW	✓	✗	✓
6	English	ENG	✓	✓	✓
7	German	DEU	✓	✓	✓
8	Hausa	HAU	✓	✓	✓
9	Hindi	HIN	✓	✗	✓
10	Igbo	IBO	✓	✗	✓
11	Indonesian	IND	✗	✗	✓
12	isiXhosa	XHO	✗	✗	✓
13	isiZulu	ZUL	✗	✗	✓
14	Javanese	JAV	✓	✗	✓
15	Kinyarwanda	KIN	✓	✗	✓
16	Marathi	MAR	✓	✗	✓
17	Moroccan Arabic	ARY	✓	✗	✓
18	Nigerian-Pidgin	PCM	✓	✗	✓
19	Oromo	ORM	✓	✗	✓
20	Portuguese (Brazilian)	PTBR	✓	✓	✓
21	Portuguese (Mozambican)	PTMZ	✓	✗	✓
22	Romanian	RON	✓	✓	✓
23	Russian	RUS	✓	✓	✓
24	Somali	SOM	✓	✗	✓
25	Spanish (Latin American)	ESP	✓	✓	✓
26	Sundanese	SUN	✓	✓	✓
27	Swahili	SWA	✓	✗	✓
28	Swedish	SWE	✓	✗	✓
29	Tatar	TAT	✓	✗	✓
30	Tigrinya	TIR	✓	✗	✓
31	Ukrainian	UKR	✓	✓	✓
32	Yoruba	YOR	✓	✗	✓

Legend:

✓: The language is supported for the specified track.
✗: The language is not supported for the specified track.

Dataset and Download Links

Visit the official competition page on Codabench: https://www.codabench.org/competitions/3863/
Follow the detailed instructions provided here to download the data.
Important note: some languages do not include the Disgust class.

Evaluation

The performance of the submitted systems will be evaluated based on the following metrics:

Track A (Multi-label Emotion Detection): The evaluation metric will be the F1-macro based on the predicted labels and the gold ones.
Track B (Emotion Intensity): The evaluation metric will be the Pearson correlation between the predicted labels and the gold ones.
Track C (Cross-lingual Emotion Detection): The evaluation metric will be the F1-macro based on the predicted labels and the gold ones.
For details about the evaluation script and the submission file format checker, check this guide.
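To make the two metrics concrete, here is a minimal pure-Python sketch of macro-averaged F1 over multi-label rows and of the Pearson correlation. It is not the official evaluation script: the function names are hypothetical, scoring an undefined per-emotion F1 as 0.0 is an assumption, and how the official script aggregates Pearson across emotions is not specified here, so the linked guide remains the reference.

```python
import math


def f1_binary(gold, pred):
    """F1 for one emotion column of 0/1 labels (0.0 when undefined)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold_rows, pred_rows):
    """Unweighted mean of per-emotion F1 over multi-label rows."""
    n_labels = len(gold_rows[0])
    scores = []
    for j in range(n_labels):
        gold_col = [row[j] for row in gold_rows]
        pred_col = [row[j] for row in pred_rows]
        scores.append(f1_binary(gold_col, pred_col))
    return sum(scores) / n_labels


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("Pearson correlation is undefined for constant input")
    return cov / (sx * sy)


# Toy labels with two emotion columns (NOT real task data).
gold_rows = [[1, 0], [0, 1], [1, 1]]
pred_rows = [[1, 0], [0, 0], [1, 1]]
print(macro_f1(gold_rows, pred_rows))

# Toy intensity values for a single emotion.
print(pearson([0, 1, 2, 3], [0, 1, 3, 3]))
```

Macro averaging gives every emotion the same weight regardless of how often it occurs, so rare emotions influence the score as much as frequent ones.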

Important Dates and Task Phases

Description	Deadline
Sample Data Ready	~~15 July 2024~~
Training Data Ready	10 September 2024
Evaluation Start	15 January 2025
Evaluation End	28 January 2025
System Description Paper Due	28 February 2025
Notification to authors	31 March 2025
Camera ready due	21 April 2025
SemEval workshop 2025	(co-located with ACL2025)

The task will be divided into three phases: Development, Evaluation, and Post-Evaluation. The following summarizes the phases and their timelines.

Development Phase:

This phase runs from 02 September 2024 to 15 January 2025.
Train (with gold labels) and dev data (without gold labels) will be released for this phase.
Train and evaluate your model on the dev set via Codabench.
Up to 999 submissions are allowed, and the leaderboard is open for you to view your results and those of others.

Evaluation Phase:

This phase runs from around 15 January to 28 January 2025 (tentative).
Test data will be released (without gold labels).
Participants will have the opportunity to evaluate their models on the test data.
Each team is allowed only three submissions. Only the final submission will be considered your official entry for the competition.
The leaderboard is disabled and will only be published after the submission deadline.

Post-Evaluation Phase:

Starts around 31 January 2025 and never ends.
In this phase, you can still submit and test your system even after the official competition ends. This way, you can keep improving your work.
We will make the leaderboard public again so you can see how you are doing compared to others.
You can use Codabench to talk with other participants, share ideas, and learn how to make your system better.

How to Participate

Register: Sign up on the CodaBench competition platform.
Track: Decide on the track(s) you want to participate in (Track A, B and/or C).
Download: Access to the datasets for each track will be provided in this repository.
Develop: Build your models using the provided data.
Submit: Submit your predictions on the CodaBench competition platform.

Please follow the guidelines shared here.

Competition Rules and Terms

1. Consent to Public Release of Scores

By submitting results, you consent to the public release of your scores on:
- the competition website,
- at the designated workshop,
- in associated proceedings.
Task organizers have discretion over the release and choice of metrics.
Scores may include:
- automatic and manual quantitative judgments,
- qualitative judgments,
- other metrics as deemed appropriate.

2. Score Release and Validity

Task organizers reserve the right to withhold scores for:
- incomplete submissions,
- erroneous submissions,
- deceptive submissions,
- rule-violating submissions.
Inclusion of a submission's scores does not constitute endorsement.

3. Team Participation Rules

Participants may be involved in only one team.
Exceptions may be granted with prior approval from organizers.

4. Account Management

Each team must create and use exactly one account on the designated platform.

5. Team Constitution

Team membership cannot be changed after the evaluation period begins.

6. Development Period Rules

Teams can submit up to 999 submissions.
Results are visible only to the submitting team.
Leaderboard is disabled.
Warnings and errors are visible for each submission.

7. Evaluation Period Rules

Teams are limited to 3 submissions.
Only the final submission will be considered official.
Warnings and errors are visible for each submission.

8. Post-Competition

The gold labels will be released after the competition.
The teams are encouraged to report results on all their system variants in their description paper.
The official submission results must be clearly indicated.

9. Public Release of Submissions

Final team submissions may be made public after the evaluation period.

10. Disclaimer about the Datasets

Organizers and affiliated institutions provide no warranties on dataset correctness or completeness.
They are not liable for dataset access or usage.

11. Peer Review Process

Each participant will review another team's system description paper.

12. Dataset Usage Restrictions

Datasets should only be used for scientific or research purposes.
Any other use is explicitly prohibited.
Datasets must not be redistributed or shared with third parties.
Interested parties should be directed to the official website.

13. Final ranking

To be included in the official task ranking, you **MUST** submit a system description paper.

Dataset paper

We will soon release a dataset paper that describes the data collection, annotation process, and baseline experiments. This paper will provide additional details and information that will be useful for the task participants.

Communication

Join our Discord Channel to ask questions and receive updates (coming soon).
If you have any questions or issues, please feel free to create an issue.
Contact organizers at: emotion-semeval-2025-organisers[at]googlegroups[dot]com

FAQs

Do I have to participate in all languages for a given track?

No, you can participate in one or more languages.

How will you verify my submitted model?

To be included in the final team rankings of our shared task, it is mandatory for participants to submit a system description paper describing their approaches and methodologies in detail, therefore ensuring scientific integrity.

When will you release the gold labels?

For the dev set, the gold labels will be released when the evaluation phase starts and the gold labels for the different test sets will be released after the competition is over.

Can I use LLMs in the different tracks?

Yes.

Can I use additional datasets (e.g., publicly provided ones from other sources)?

Yes. Please do cite them in the system description paper.

How was the data collected?

The data collection process is a standard one; you can check previous papers in the area to get an idea (e.g., https://aclanthology.org/S18-1001.pdf). We have data instances (text snippets) annotated by >3 annotators. The annotators decide whether some emotion is present in a given instance (text snippet). Details about the data sources, annotation guidelines, number of annotators per language, etc. will be shared in the dataset paper.

How was the data annotated and did you use LLMs to annotate it?

No. The data instances were annotated by (>=3) native speakers and no LLMs were involved in the process. The annotators labeled whole sentences, not individual words, and they were expected to have different opinions as (1) this is a subjective task and emotions are complex and (2) we were interested in the emotion(s) that they perceived. See the task definition for more details.

Will I be included in the final ranking if I do not write a system description paper?

No. You MUST write a system description paper to be included in the final ranking.

I have never written a system description paper. How can I write one?

We will have an online writing tutorial and share resources to help you write a system description paper.

Do I need to pay conference registration fees and/or attend SemEval for my paper to be published?

It is not required to attend the SemEval workshop for the paper to be published. You do not have to pay any registration fees if you do not attend the workshop. However, if you want to attend the workshop, you need to pay for attendance.

Our system did not perform very well, should I still write a system description paper?

We want to hear from **all** of you even if you did not outperform other systems! Write about the details of your system. (**Yes, we want your insights from any negative results!**)

Resources

SemEval 2025 Shared Tasks
Frequently Asked Questions about SemEval
Paper Submission Requirements
Guidelines for Writing Papers
Paper style files
Previous shared-tasks on emotion detection SemEval-2018 Task 1: Affect in Tweets
Resources for Beginners
- Starter kit (Note that you need to download the data from Codabench.)
- Writing tutorial: Blogpost
- Examples of additional datasets/lexicons:
  - Emotion lexicons: http://saifmohammad.com/WebPages/lexicons.html
  - SemEval-2018 Task 1: Affect in Tweets for arb, eng, esp.
  - Emotions in Drama for deu.
  - RESD, CEDR-M7, Dusha for rus.
Paper submission link (to be added)

References

Janyce Wiebe, Theresa Wilson, and Claire Cardie. "Annotating expressions of opinions and emotions in language." Language resources and evaluation 39 (2005): 165-210.

Saif M. Mohammad and Svetlana Kiritchenko. "Understanding Emotions: A Dataset of Tweets to Study Interactions between Affect Categories." Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018).

Saif M. Mohammad, Felipe Bravo-Marquez, Mohammad Salameh, and Svetlana Kiritchenko: SemEval-2018 Task 1: Affect in Tweets. In Proceedings of the International Workshop on Semantic Evaluation (SemEval-2018).

Lieve Van Woensel and Nissy Nevil. 2019. What if your emotions were tracked to spy on you? European Parliamentary Research Service, PE 634.415. https://www.europarl.europa.eu/RegData/etudes/ATAG/2019/634415/EPRS_ATA(2019)634415_EN.pdf.

Jane Wakefield. 2021. AI emotion-detection software tested on Uyghurs. BBC. https://www.bbc.com/news/technology-57101248.

Saif M. Mohammad. "Ethics sheet for automatic emotion recognition and sentiment analysis." Computational Linguistics 48.2 (2022): 239-278.

Saif M. Mohammad. "Best Practices in the Creation and Use of Emotion Lexicons." Findings of the Association for Computational Linguistics (EACL 2023).

Organizers

Shamsuddeen Hassan Muhammad, Seid Muhie Yimam, Nedjma Ousidhoum, Idris Abdulmumin, David Ifeoluwa Adelani, Ibrahim Said Ahmad, Alham Fikri Aji, Felermino Ali, Vladimir Araujo, Abinew Ali Ayele, Meriem Beloucif, Christine de Kock, Oana Ignat, Alexander Panchenko, Terry Ruas, Nirmal Surange, Daniela Teodorescu, Jan Philip Wahle, Yi Zhou, Saif M. Mohammad
