Masakhane Swahili NER data preprocessed to support training in the spaCy pipeline

Neurotech-HQ/spacy-ready-masakhane-ner-data

spacy-ready-masakhane-ner-data [Swahili]

This repo contains preprocessed Swahili Masakhane NER data, ready for training in a spaCy pipeline.

The script in this repository can also be used to preprocess Masakhane NER data for languages other than Swahili.
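
The script itself is not reproduced in this README. As a rough illustration of what this kind of preprocessing can look like, here is a minimal sketch that assumes the raw data is in the CoNLL-style layout Masakhane NER files use (one `token TAG` pair per line, BIO tags, blank line between sentences) and that the target is spaCy's `(text, {"entities": [(start, end, label)]})` tuples. The function names `read_conll` and `to_spacy_example` are hypothetical and may not match what `clean_ner_data.py` actually does.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # [(token, BIO tag), ...]
Example = Tuple[str, dict]        # (text, {"entities": [(start, end, label), ...]})


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group 'token TAG' lines into sentences separated by blank lines."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in lines:
        parts = line.split()
        if not parts:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        if len(parts) < 2:
            continue  # skip malformed lines that carry no tag
        current.append((parts[0], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


def to_spacy_example(sentence: Sentence) -> Example:
    """Join tokens with single spaces and turn BIO tags into character spans."""
    tokens: List[str] = []
    entities: List[Tuple[int, int, str]] = []
    offset = 0            # character offset of the current token in the joined text
    start, end, label = 0, 0, None
    for token, tag in sentence:
        continues = tag.startswith("I-") and tag[2:] == label
        if label is not None and not continues:
            # The open entity ends before this token.
            entities.append((start, end, label))
            label = None
        if tag != "O" and label is None:
            # B-XXX (or a stray I-XXX) opens a new entity here.
            start, label = offset, tag[2:]
        if label is not None:
            end = offset + len(token)
        tokens.append(token)
        offset += len(token) + 1  # +1 for the joining space
    if label is not None:
        entities.append((start, end, label))
    return " ".join(tokens), {"entities": entities}


# Tiny made-up sample, for illustration only.
sample = [("Samia", "B-PER"), ("Suluhu", "I-PER"), ("alitembelea", "O"), ("Dodoma", "B-LOC")]
print(to_spacy_example(sample))
# → ('Samia Suluhu alitembelea Dodoma', {'entities': [(0, 12, 'PER'), (25, 31, 'LOC')]})
```

Nothing in this sketch is specific to Swahili, which is why the same approach should carry over to other Masakhane languages that share the file layout.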

Getting started

There are a few things you should know before using the script. For now, you need to edit the clean_ner_data.py file manually to set the path the raw data is loaded from and the path where the preprocessed data is stored.

When you're done, run the script and it will preprocess the data for you:

python3 clean_ner_data.py
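
As far as this README states, the script takes no command-line options, so the paths have to be edited by hand. If you would rather pass them as arguments in your own fork, one possible approach is sketched below. The option names `--raw-dir` and `--out-dir`, the default folders, and the `parse_args` helper are all hypothetical and are not part of the existing script.

```python
import argparse
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """Read input and output locations from the command line (hypothetical options)."""
    parser = argparse.ArgumentParser(
        description="Preprocess Masakhane NER data for spaCy training."
    )
    parser.add_argument(
        "--raw-dir",
        type=Path,
        default=Path("raw"),      # assumed default, adjust to your layout
        help="folder holding the raw Masakhane NER files",
    )
    parser.add_argument(
        "--out-dir",
        type=Path,
        default=Path("clean"),    # assumed default, adjust to your layout
        help="folder where the preprocessed data is written",
    )
    return parser.parse_args(argv)


# Passing an explicit list here only to demonstrate; a real script would call parse_args().
args = parse_args(["--raw-dir", "masakhane/swa", "--out-dir", "clean/swa"])
print(args.raw_dir, args.out_dir)
```

With something like this in place, switching to another language's data would only mean changing the arguments rather than the source file.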

Issues?

If you run into any problem with the script or the NER data, please raise an issue so that we can fix it quickly.

Contributions

Contributions are very welcome, from typo fixes to code, documentation, and examples. Just fork it.

Give it a star

Did you find this repo useful? Give it a star so that more people can find it.

Credits

All credit goes to:

  1. Masakhane
  2. Kalebu
