configs
scripts
.gitignore
README.md
project.yml
test_parser_low_resource.py

README.md

🪐 Weasel Project: Training a POS tagger and dependency parser for a low-resource language

This project trains a part-of-speech tagger and dependency parser for a low-resource language such as Tagalog, using the TRG and Ugnayan treebanks. Because each corpus contains only a small number of sentences, the models are evaluated with 10-fold cross-validation. The split is implemented in `scripts/kfold.py`. The cross-validation results are shown below.
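As a rough illustration of the idea, a 10-fold split can be sketched as follows. This is not the code in `scripts/kfold.py`; the function name `kfold_splits`, the `seed` parameter, and the stand-in sentence list are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def kfold_splits(
    items: Sequence[T], n_folds: int = 10, seed: int = 0
) -> List[Tuple[List[T], List[T]]]:
    """Return one (train, dev) pair per fold; every item is in dev exactly once."""
    if not 2 <= n_folds <= len(items):
        raise ValueError("n_folds must be between 2 and len(items)")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Strided slicing yields folds whose sizes differ by at most one.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, dev in enumerate(folds):
        # Training data is everything outside the held-out fold.
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((train, dev))
    return splits


# Hypothetical stand-in for the sentences of a small treebank.
sentences = [f"sent_{i}" for i in range(55)]
splits = kfold_splits(sentences)
for train, dev in splits:
    print(len(train), len(dev))
```

Each fold would then be used once for evaluation while a model is trained on the remaining nine.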

10-fold Cross-validation results

| Treebank | TOKEN_ACC | POS_ACC | MORPH_ACC | TAG_ACC | DEP_UAS | DEP_LAS |
| --- | --- | --- | --- | --- | --- | --- |
| TRG | 1.000 | 0.843 | 0.749 | 0.833 | 0.846 | 0.554 |
| Ugnayan | 0.998 | 0.819 | 0.995 | 0.810 | 0.667 | 0.409 |
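For readers unfamiliar with the DEP_UAS and DEP_LAS columns: the unlabeled attachment score is the fraction of tokens whose predicted head is correct, and the labeled attachment score additionally requires the dependency label to match. The sketch below is a simplified version that counts every token; it is not the scorer used by this project, and the toy arcs are invented for the example.

```python
from typing import List, Tuple

Arc = Tuple[int, str]  # (head index, dependency label) for one token


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one sequence of aligned gold and predicted arcs."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")
    if not gold:
        return 0.0, 0.0
    # UAS: the head matches, regardless of label.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: both the head and the label match.
    labeled_hits = sum(g == p for g, p in zip(gold, pred))
    return head_hits / len(gold), labeled_hits / len(gold)


# Invented toy arcs: token 3 has the wrong label, token 4 the wrong head.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]
uas, las = attachment_scores(gold, pred)
print(uas, las)  # 0.75 0.5
```

LAS can never exceed UAS, since a labeled hit is also an unlabeled hit.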

📋 project.yml

The `project.yml` defines the data assets required by the project, as well as the available commands and workflows. For details, see the Weasel documentation.

⏯ Commands

The following commands are defined by the project. They can be executed using `weasel run [name]`. Commands are only re-run if their inputs have changed.

| Command | Description |
| --- | --- |
| `preprocess` | Convert the data to spaCy's format |
| `evaluate-kfold` | Evaluate using k-fold cross validation |
| `clean` | Remove intermediate files |
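One common way to decide whether a command's inputs have changed is to compare content hashes of its input files against hashes recorded after the last run. The sketch below illustrates that general idea only; it is not Weasel's implementation, and `needs_rerun`, `record_run`, and the JSON state file are hypothetical names chosen for the example.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def _hashes(deps: List[Path]) -> Dict[str, str]:
    """Map each input path to the SHA-256 digest of its contents."""
    return {str(p): hashlib.sha256(p.read_bytes()).hexdigest() for p in deps}


def _load(state_file: Path) -> Dict[str, Dict[str, str]]:
    return json.loads(state_file.read_text()) if state_file.exists() else {}


def needs_rerun(name: str, deps: List[Path], state_file: Path) -> bool:
    """True if the command never ran or any input differs from the recorded run."""
    return _load(state_file).get(name) != _hashes(deps)


def record_run(name: str, deps: List[Path], state_file: Path) -> None:
    """Store the current input hashes after the command has completed."""
    state = _load(state_file)
    state[name] = _hashes(deps)
    state_file.write_text(json.dumps(state, indent=2))


with tempfile.TemporaryDirectory() as tmp:
    dep = Path(tmp) / "input.conllu"
    state_file = Path(tmp) / "state.json"
    dep.write_text("version 1")
    first = needs_rerun("preprocess", [dep], state_file)   # never run before
    record_run("preprocess", [dep], state_file)
    second = needs_rerun("preprocess", [dep], state_file)  # inputs unchanged
    dep.write_text("version 2")
    third = needs_rerun("preprocess", [dep], state_file)   # input modified
    print(first, second, third)  # True False True
```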

⏭ Workflows

The following workflows are defined by the project. They can be executed using `weasel run [name]` and will run the specified commands in order. Commands are only re-run if their inputs have changed.

| Workflow | Steps |
| --- | --- |
| `all` | `preprocess` → `evaluate-kfold` |

🗂 Assets

The following assets are defined by the project. They can be fetched by running `weasel assets` in the project directory.

| File | Source | Description |
| --- | --- | --- |
| `assets/tl_trg-ud-test.conllu` | URL | Treebank data for UD_Tagalog-TRG |
| `assets/tl_ugnayan-ud-test.conllu` | URL | Treebank data for UD_Tagalog-Ugnayan |
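Both assets are CoNLL-U files: one token per line with ten tab-separated columns, `#` comment lines, and a blank line between sentences. A minimal reader might look like the sketch below. It is not part of this project (the `preprocess` command handles conversion to spaCy's format); the function name `read_conllu` and the toy English sentence are assumptions for illustration.

```python
from typing import Dict, List

FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(text: str) -> List[List[Dict[str, str]]]:
    """Parse CoNLL-U text into sentences, each a list of token dicts."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        current.append(dict(zip(FIELDS, cols)))
    if current:
        sentences.append(current)
    return sentences


# Invented toy sentence, not taken from either treebank.
rows = [
    ["1", "Dogs", "dog", "NOUN", "_", "Number=Plur", "2", "nsubj", "_", "_"],
    ["2", "bark", "bark", "VERB", "_", "_", "0", "root", "_", "_"],
]
sample = "# text = Dogs bark\n" + "\n".join("\t".join(r) for r in rows) + "\n\n"
sents = read_conllu(sample)
print([(t["form"], t["upos"], t["head"], t["deprel"]) for t in sents[0]])
```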
