RelationExtraction

This framework is based on this paper: Relation Extraction: Perspective from Convolutional Neural Networks

Review of the paper:

Traditional approaches - relation extractor uses features generated by linguistic analysis modules.

This paper - departs from complicated feature engineering by introducing a convolutional neural network that automatically learns features from sentences and minimizes the dependence on external toolkits and resources.

model - multiple window sizes for filters

	  - pre-trained word embeddings as an initializer on a non-static architecture to improve the performance

	  - unbalanced corpus (non-relation examples far outnumber the examples of actual relations)

State-of-the-art

Techniques:

feature-based methods
kernel-based methods

Both use a large body of linguistic analysis and knowledge resources to transform relation mentions into a rich representation that is passed to a statistical classifier such as an SVM or Maximum Entropy model. The preprocessing relies on existing NLP modules:

tokenization
part-of-speech tagging
chunking
name tagging
parsing

Problem in State-of-the-art

Errors made by the supervised NLP toolkits propagate into the relation extractor.
Performance degrades on out-of-domain data.

Paper's approach

The system is provided with raw sentences marked with the two entities of interest.
Word embeddings are used as an initializer (they capture latent semantic and syntactic properties of words).
The CNN recognizes specific classes of n-grams and induces more abstract representations.
Various window sizes are used for the convolutional filters (to capture a wider range of n-grams).
Rather than initializing the word embeddings randomly, the model uses pre-trained word embeddings for initialization and optimizes both the word embeddings and the position embeddings as model parameters.
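The initialization step can be illustrated with a minimal sketch. It is not the code in CNN.py: the function name `build_embedding_table`, the toy `pretrained` vectors, and the random range of ±0.25 are all assumptions made for illustration.

```python
import random

def build_embedding_table(vocab, pretrained, dim, seed=0):
    """Return one vector per word in `vocab`.

    Words found in `pretrained` start from their pre-trained vector;
    all other words start from small random values.
    """
    rng = random.Random(seed)
    table = {}
    for word in vocab:
        if word in pretrained:
            # Copy, so later fine-tuning does not modify the source vectors.
            table[word] = list(pretrained[word])
        else:
            # Hypothetical fallback range; the source does not specify one.
            table[word] = [rng.uniform(-0.25, 0.25) for _ in range(dim)]
    return table

# Toy data (made up): two words have pre-trained vectors, two do not.
pretrained = {"president": [0.1, 0.3, -0.2], "visited": [0.0, -0.1, 0.4]}
vocab = ["president", "visited", "<pad>", "zorgle"]
table = build_embedding_table(vocab, pretrained, dim=3)
```

In a non-static architecture these vectors would then be updated by the optimizer together with the other model parameters; the sketch only covers the starting values.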

Framework

CNN layers:

look-up tables to encode words in sentences by real-valued vectors
convolutional layer to recognize n-grams
pooling layer to determine the most relevant features
a logistic regression layer (a fully connected neural network with a softmax at the end) to perform classification
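The four layers above can be sketched as one forward pass in plain Python. This is an illustrative sketch under stated assumptions, not the implementation in CNN.py: the `tanh` non-linearity, the omission of bias terms, and all names and toy weights are hypothetical.

```python
import math

def conv_max_pool(sentence, filt):
    """Slide one filter over the sentence and keep its strongest response.

    sentence: list of token vectors.
    filt: list of weight vectors, one per position in the window.
    Assumes the sentence is at least as long as the filter window.
    """
    window = len(filt)
    scores = []
    for start in range(len(sentence) - window + 1):
        total = 0.0
        for offset in range(window):
            weights = filt[offset]
            token = sentence[start + offset]
            total += sum(w * x for w, x in zip(weights, token))
        scores.append(math.tanh(total))  # assumed non-linearity
    return max(scores)  # max pooling over all window positions

def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    norm = sum(exps)
    return [e / norm for e in exps]

def forward(tokens, embeddings, filters, class_weights):
    sentence = [embeddings[t] for t in tokens]                  # look-up table
    features = [conv_max_pool(sentence, f) for f in filters]    # convolution + pooling
    logits = [sum(w * f for w, f in zip(row, features)) for row in class_weights]
    return softmax(logits)                                      # classification layer

# Toy weights (made up): one filter with window size 2, one with window size 3.
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
probs = forward(["a", "b", "c", "a"], embeddings, filters, class_weights)
```

Each filter contributes exactly one pooled feature, so mixing window sizes only changes which n-grams a filter can see, not the size of the feature vector passed to the classifier.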

First layer

Input: sentences marked with the two entity mentions
CNNs work with fixed-length inputs, so compute the maximal separation between entity mentions linked by a relation and choose an input width greater than this distance.
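A minimal sketch of this input preparation follows. It assumes each example is a `(tokens, e1, e2)` tuple with entity indices `e1 <= e2`, that shorter sentences are padded on the right with a `<pad>` token, and that longer ones are trimmed to a window containing both entities. The margin of 2 added to the maximal separation, the padding and trimming strategy, and all function names are hypothetical. The relative positions it returns are the kind of index a position embedding table could be keyed on.

```python
def max_entity_separation(examples):
    """Largest token distance between the two entities over all examples."""
    return max(e2 - e1 for _, e1, e2 in examples)

def to_fixed_width(tokens, e1, e2, width, pad="<pad>"):
    """Trim or pad `tokens` to exactly `width`, keeping both entities.

    Returns the new token list and the shifted entity indices.
    Requires e1 <= e2 and e2 - e1 < width.
    """
    if e2 - e1 >= width:
        raise ValueError("width must exceed the entity separation")
    start = 0
    if len(tokens) > width:
        # Start at the first entity, unless that would run past the end.
        start = min(e1, len(tokens) - width)
    window = tokens[start:start + width]
    window = window + [pad] * (width - len(window))
    return window, e1 - start, e2 - start

def relative_positions(width, e1, e2):
    """Distance of every position to each of the two entities."""
    return [(i - e1, i - e2) for i in range(width)]

# Toy examples (made up): (tokens, index of entity 1, index of entity 2).
examples = [
    (["Obama", "visited", "Paris", "yesterday"], 0, 2),
    (["The", "CEO", "of", "Acme", "Corp", "resigned", "on", "Monday",
      "after", "the", "merger"], 1, 4),
]
width = max_entity_separation(examples) + 2  # assumed safety margin
short = to_fixed_width(*examples[0], width)
long_ = to_fixed_width(*examples[1], width)
positions = relative_positions(width, long_[1], long_[2])
```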
