Skip to content

Jamesetay1/SVA-Error-Detector

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Subject Verb Agreement Error Detection

This is a program designed to detect subject verb agreement errors.
The program specifically targets errors commonly found in the English Placement Test for Non-Native English speakers (EPT)

Background

This program was the final group project for LING 520: Computational Analysis of English at Iowa State University, Fall 2020 The associated research paper for this program is available here: https://github.com/Jamesetay1/520/blob/master/SVA/docs/research_paper.pdf

Program written by me (James Taylor, jamesetay1@gmail.com)
Corpus Research and Derivation of Rules by:
Ella Alhudithi (ella@iastate.edu)
Thomas Elliott (thomase@iastate.edu)
Sondoss Elnegahy (sondoss@iastate.edu)

How it works

After reading in a file that contains the sentences in question, coreNLP Python library Stanza is used to tokenize, tag, and dependency parse on a sentence by sentence basis.

After this is done we iterate through each sentence object and build a 'forward' dependency list. Each word object already contains the id of it's governor, so this information is simply reconstructed in a format that's easier to use.

For Example, coreNLP parses "He walked through the door ." to the following dependencies:
word.text = He, word.head = 2 (walked)
word.text = walked, word.head = 0 (ROOT)
word.text = through, word.head = 5 (door)
word.text = the, word.head = 5 (door)
word.text = door, word.head = 2 (walked)
word.text = ., word.head = 2 (walked)

Resulting in a forwards dependency list of word objects:
[ [] , [word.text = "he", word.text = "door", word.text = "."], [], [], [word.text = "through", word.text = "the"] ]
Note that this list is populated with word objects, not just their text. Also note that the root never appears (because it can only be a governor, and is never a dependent)

Relationships and Rules

Once we have completed our forward dependency list, we go through the sentence again and look for special relationships to test. We test against an agreement dictionary, which is determined by the matrix below: ![mtx](https://raw.githubusercontent.com/Jamesetay1/520/master/SVA/docs/agreement_matrix.png)

The relationships we are currently looking for are:
1: Main Verb --nsubj--> Noun: If the word is a verb AND it has a nsubj forward dependency AND it does NOT have any aux forward dependency

2: Aux <--aux-- Main Verb --nsubj--> Noun: If the word is a verb AND it has a nsubj forward dependency AND it DOES have any aux forward dependency

3A: Noun <--nsubj-- Subject Predicate --cop--> Verb: If the word has a copular forward dependency AND an nsubj forward dependency
3B: 3A + Subject Predicate --aux--> Aux: If the word has a copular forward dependency AND an nsubj forward dependency AND an aux nsubj forward depdency

Example

Given the sentence: I am happy that he have been a friend since we met last September.

The program will recognize three subject-verb relationships in this sentence:
I am happy (Relationship 3A)
he have been a friend (Relationship 3B)
we met (Relationship 1)

When we check these against our error matrix we find:
Correct: I (PRP) <--nsubj-- am (VBP)
Incorrect: he (PRP) <--nsubj-- have (VBP)
Correct: we (PRP) <--nsubj-- met (VBD)

Limitations

One more permanent limitation of this program surrounds the uncertainty of if there is a mismatch in number between subject and verb or if it is truly a compound noun. coreNLP will mark instances like "The man park his car" as: Man: NN, park: NN, as to say that the noun is 'man park'. Of course this is actually meant to be, "The man parks his car", but the agreement in number between subject and verb was incorrect. This case is currently counted as incorrect IF the head noun of the compound is the root of the sentence.

Future Additions

Future additions of SVA errors should be easy to add with the current framework Additionally, it would be ideal to have a GUI for this program in addition to what is already there.

About

Subject Verb Agreement Error Detector

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages