Norwegian Wordnet Bokmål. Statistics #2

Open
sigdelina opened this issue Jun 27, 2022 · 2 comments
Labels
documentation (Improvements or additions to documentation), good first issue (Good for newcomers)

Comments

@sigdelina
Contributor

This issue tracks statistics for the Norwegian Wordnet (Bokmål) from the National Library of Norway.

In the initial stage, statistics were computed over the official National Library dataset as a whole, showing the distribution of the number of examples, POS tags, and senses per lemma.

At this stage, more detailed statistics are being computed, covering the following points:

  1. The distribution of unique sentences per lemma: the selection focuses on lemmas that provide 5 or more sentences across the dataset.

  2. Words are to be divided into categories depending on the number of possible senses.
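The two points above could be sketched roughly as follows. This is only an illustration, not the script used for the actual statistics: the record format `(lemma, sense_id, sentence)`, the function names, and the category boundaries (1, 2-3, 4+ senses) are all assumptions made for the example. Only the threshold of 5 sentences comes from the description above.

```python
from collections import defaultdict


def lemma_statistics(records, min_sentences=5):
    """Count unique sentences and senses per lemma.

    `records` is a hypothetical iterable of (lemma, sense_id, sentence)
    tuples; the real Wordnet data would first have to be converted into
    this shape. Only lemmas with at least `min_sentences` unique
    sentences are kept.
    """
    sentences = defaultdict(set)
    senses = defaultdict(set)
    for lemma, sense_id, sentence in records:
        sentences[lemma].add(sentence.strip())
        senses[lemma].add(sense_id)

    sentence_counts = {
        lemma: len(sents)
        for lemma, sents in sentences.items()
        if len(sents) >= min_sentences
    }
    sense_counts = {lemma: len(senses[lemma]) for lemma in sentence_counts}
    return sentence_counts, sense_counts


def sense_category(n_senses):
    """Map a sense count to a category (boundaries are an assumption)."""
    if n_senses == 1:
        return "1 sense"
    if n_senses <= 3:
        return "2-3 senses"
    return "4+ senses"


def group_by_category(sense_counts):
    """Group the selected lemmas by their sense category."""
    groups = defaultdict(list)
    for lemma, n_senses in sense_counts.items():
        groups[sense_category(n_senses)].append(lemma)
    return {category: sorted(lemmas) for category, lemmas in groups.items()}
```

Calling `lemma_statistics(records)` and passing the second return value to `group_by_category` gives one list of lemmas per category, which can then be counted or plotted.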

@sigdelina added the "good first issue" label on Jun 27, 2022
@akutuzov added the "documentation" label on Jun 27, 2022
@sigdelina
Contributor Author

In progress

  • Modifying the .rdf files in the Wordnet. The mistakes in the early sample of modifications have been fixed. The corrected scripts can be found here.

@sigdelina
Contributor Author

Statistics

The statistics are provided at the current link.
