
Norbench updates #5

Closed
sigdelina opened this issue Aug 23, 2022 · 5 comments

@sigdelina (Contributor)

This issue provides information about the models that have been implemented within the NorBench framework.

The information below will cover:

  • documentation for running the scripts for the current tasks (POS tagging, Binary Sentiment Analysis, NER)
  • the list of models that can currently be used for each task
  • future updates, newly available models, etc.
@sigdelina (Contributor, Author)

Early updates

The first version of the documentation was described here.

sigdelina reopened this on Aug 23, 2022
sigdelina added the documentation, enhancement, and good first issue labels on Aug 23, 2022
@sigdelina (Contributor, Author)

Updates in documentation

  • The second version of the documentation was uploaded to the repository
  • The documentation for the POS-tagging and Binary Sentiment Analysis tasks was extended: two extra sections (Evaluation and Available Models) were added for both tasks.
  • The scripts for the POS-tagging and Binary Sentiment Analysis tasks were updated, so more models can now be run on these tasks (the models for each task are listed in Available Models).

What's next:

  • Updating the scripts for NER fine-tuning (as mentioned in the early updates). Scores for XLM-RoBERTa from the first implementation attempt can be found here -- but the code overall will be improved

@sigdelina (Contributor, Author)

Updates

  • The documentation for NorBench was updated
  • Some bugs in the scripts were fixed
  • More models for POS-tagging, Binary Sentiment Analysis, and NER were evaluated (their scores can be found here)
  • To increase the number of models in the benchmark, it was decided to also use the models implemented in ScandEval -- the list of models and their scores is constantly updated

What should be done next:

  • Some models from the ScandEval benchmark have to be downloaded as a folder directly into the working directory (e.g. the ScandiBERT model). Automatic saving of such models is still being worked out (a sketch follows below).
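
A minimal sketch of how such a model could be stored as a local folder, assuming it is hosted on the Hugging Face Hub (the Hub id and target path below are assumptions, not the actual NorBench setup):

```python
# Hedged sketch: fetch a Hub-hosted model once and save it as a local folder,
# so the evaluation scripts can point to a directory instead of a Hub id.
from transformers import AutoModelForMaskedLM, AutoTokenizer

hub_id = "vesteinn/ScandiBERT"   # assumed Hub identifier for the ScandiBERT model
local_dir = "models/scandibert"  # assumed target folder inside the repository

tokenizer = AutoTokenizer.from_pretrained(hub_id)
model = AutoModelForMaskedLM.from_pretrained(hub_id)

tokenizer.save_pretrained(local_dir)
model.save_pretrained(local_dir)
# The task scripts can then load the model with from_pretrained("models/scandibert").
```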

@akutuzov (Member)

Two more things:

  1. Decouple evaluation code from data (data paths should not be hard-coded in the scripts)
  2. Create a single evaluation script which will run all the benchmarks for a given model (a sketch follows below).
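
A minimal sketch of what such a single entry point could look like; the script name, argument names, and the run_task helper are placeholders rather than the actual NorBench code:

```python
# norbench_eval.py (hypothetical): run all NorBench tasks for one model.
import argparse

TASKS = ["pos", "sentiment", "ner"]

def run_task(task, model_path, data_dir):
    """Placeholder for the task-specific fine-tuning and evaluation code."""
    raise NotImplementedError

def main():
    parser = argparse.ArgumentParser(description="Run all NorBench tasks for a given model")
    parser.add_argument("--model_path", required=True, help="Path or Hub id of the model")
    parser.add_argument("--data_dir", required=True, help="Root directory with the task datasets")
    args = parser.parse_args()

    for task in TASKS:
        score = run_task(task, args.model_path, args.data_dir)
        print(f"{task}: {score}")

if __name__ == "__main__":
    main()
```

Passing --data_dir on the command line (rather than hard-coding paths) would also cover point 1.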

@akutuzov (Member) commented Nov 13, 2022

Also:

  • Scripts should load the datasets in the original format (probably even pull them from the respective repositories if not found locally).
  • Every task should be "served" by two scripts: one which fine-tunes a model and produces predictions (saved in a separate file), and one which evaluates these predictions on the test set (sketched below).
  • Argument names must be more sensible (currently they are a bit odd, for example short_model_name, which is in fact the path to the model).
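
A minimal sketch of the second of the two scripts (the evaluation step), assuming predictions and gold labels are plain text files with one label per line; file and argument names are placeholders:

```python
# evaluate.py (hypothetical): score a saved predictions file against the gold test set.
import argparse

def read_labels(path):
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def main():
    parser = argparse.ArgumentParser(description="Evaluate saved predictions on the test set")
    parser.add_argument("--predictions", required=True, help="File with predicted labels, one per line")
    parser.add_argument("--gold", required=True, help="File with gold labels, one per line")
    args = parser.parse_args()

    predicted = read_labels(args.predictions)
    gold = read_labels(args.gold)
    if len(predicted) != len(gold):
        raise ValueError("Prediction and gold files must contain the same number of labels")

    accuracy = sum(p == g for p, g in zip(predicted, gold)) / len(gold)
    print(f"Accuracy: {accuracy:.4f}")

if __name__ == "__main__":
    main()
```

The companion fine-tuning script would take a self-explanatory --model_path argument (instead of short_model_name) and write its predictions to the file later passed to --predictions.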
