Shouldn't we have the right to know who wrote what we read? It is becoming increasingly difficult to distinguish human-written text from AI-generated text. This project aims to build a robust AI-text detector for Iberian languages: Spanish, Catalan, Basque, Galician, Portuguese and English (as spoken in Gibraltar). This repository is also where we track our progress in the IberAuTexTification contest.
Link to the contest: https://sites.google.com/view/iberautextification/home
IberAuTexTification is the second edition of the AuTexTification shared task, first held at IberLEF 2023 (Sarvazyan et al., 2023). This edition includes more models, more domains and more languages than the previous one.
To clone the repository, run the following command in the folder where you want to keep it:
git lfs clone https://github.com/WojciechNeuman/autextification.git
We keep the general code (covering the six languages) together with language-specific adjustments, which live in a separate branch for each language. In the future we will merge all of these branches.
We've divided the project team into four groups, each focusing on one language.
Author | Assigned language |
---|---|
Eurídice Corbí | Catalan |
Natalia Hernández | Spanish |
Nicolás Nebot | Catalan |
Wojciech Neuman | English |
Aitana Sebastià | Basque |
Carlos Torregrosa | Spanish |
Jose Valero | Basque |