
Experiments on Postagger


Detailed experimental results for the Postagger are recorded below:

##Dataset

We use PKU-WEIBO as the training, development, and test data. Details are as follows:

| dataset | sentence number | word-tag pairs (tokens) |
| --- | --- | --- |
| Training | 337,422 | 7,720,621 |
| Developing | 8,000 | 172,054 |
| Test | 12,500 | 271,786 |

##Model Structure Info

| KEY | VALUE | notes |
| --- | --- | --- |
| word dict size | 176,047 | built from training data |
| word embedding dimension | 50 | varies across experiments |
| BI-LSTM hidden layer dimension | 100 | - |
| BI-LSTM stacked layers number | 1 | - |
| Tag hidden dimension | 32 | - |
| Tag output dimension (tag number) | 28 | built from training data |
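
To make the structure concrete, here is a minimal PyTorch sketch of a tagger with these dimensions. The framework, the `tanh` activation, and all class and parameter names are assumptions for illustration; only the sizes come from the table above:

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Sketch of the tagger structure above (dimensions from the table)."""
    def __init__(self, vocab_size=176_047, emb_dim=50,
                 lstm_hidden=100, num_layers=1, tag_hidden=32, num_tags=28):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # bidirectional=True gives lstm_hidden units per direction
        self.bilstm = nn.LSTM(emb_dim, lstm_hidden, num_layers=num_layers,
                              bidirectional=True, batch_first=True)
        self.tag_hidden = nn.Linear(2 * lstm_hidden, tag_hidden)
        self.tag_output = nn.Linear(tag_hidden, num_tags)

    def forward(self, word_ids):
        emb = self.embedding(word_ids)      # (batch, seq, 50)
        lstm_out, _ = self.bilstm(emb)      # (batch, seq, 200)
        hidden = torch.tanh(self.tag_hidden(lstm_out))
        return self.tag_output(hidden)      # per-token tag scores
```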

##Experiments

###word embedding with Randomized Initialization

####Train

| Training Epoch | Training Acc | Validation Acc | Speed (seconds / 10k samples) | Memory cost |
| --- | --- | --- | --- | --- |
| 1 | 94.07% | 94.75% | 120.71* | - |
| 2 | 96.78% | 95.44% | 119.58 | - |
| 3 | 97.32% | 95.62% | 117.39 | - |
| 4 | 97.62% | 95.96% | 115.05 | - |
| 5 | 97.82% | 96.04% | 116.06 | - |
| 6 | 97.95% | 96.02% | 113.92 | - |
| 7 | 98.06% | 96.12% | 114.01 | - |

* running on Node01

####Test

ACC = 96.0925 %

####Appending Test

Using the model that performed best on the development set, we measured accuracy on the related datasets:

@PKU

develop acc : 97.8118 %

test acc : 97.8933 %

@WEIBO

develop acc : 92.7685 %

test acc : 92.8434 %


###word embedding loaded from gigawords

####Train

Hit Rate

| gigawords embedding number | model word dict size | hit rate (loaded successfully) |
| --- | --- | --- |
| 335,696 (0.34 M) | 176,047 (0.18 M) | 95,359/176,047 (54.17%) |
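
A hit rate like this can be computed while loading the pre-trained vectors. The following is a minimal sketch, assuming a plain-text embedding file with one `word v1 ... v50` entry per line (the file format, function name, and `scale` parameter are assumptions); a word "hits" if it appears in the file, and misses keep their random initialization:

```python
import numpy as np

def load_pretrained(emb_path, word_dict, dim=50, scale=0.1):
    """Load pre-trained vectors for words in word_dict; random-init the rest.

    word_dict maps word -> row index. Returns the embedding matrix.
    """
    # Random initialization first (the fallback for un-hit words).
    emb = np.random.uniform(-scale, scale, (len(word_dict), dim))
    hits = 0
    with open(emb_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) != dim + 1:
                continue  # skip header / malformed lines
            word = parts[0]
            if word in word_dict:
                emb[word_dict[word]] = np.array(parts[1:], dtype=np.float32)
                hits += 1
    # e.g. 95,359/176,047 (54.17%) for the gigawords embeddings above
    print(f"hit rate: {hits}/{len(word_dict)} "
          f"({100.0 * hits / len(word_dict):.2f}%)")
    return emb
```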

Result

| Training Epoch | Training Acc | Validation Acc | Speed (seconds / 10k samples) | Memory cost |
| --- | --- | --- | --- | --- |
| 1 | 93.97% | 93.60% | 94.01* | - |
| 2 | 95.65% | 94.45% | 98.13 | - |
| 3 | 96.23% | 94.67% | 111.09 | - |
| 4 | 95.56% | 95.04% | 102.25 | - |
| 5 | 96.80% | 95.20% | 102.04 | - |
| 6 | 96.95% | 95.16% | 101.56 | - |
| 7 | 97.07% | 95.28% | 100.76 | - |

* running on Node05

####Test

ACC = 95.3254 %

####Appending Test

@PKU

devel acc : 97.1792 %

test acc : 97.2476 %

@WEIBO

devel acc : 91.5375 %

test acc : 91.8570 %

Analysis

The accuracy is consistently lower than that of the model with randomized initialization.

Perhaps the roughly 50% embedding hit rate causes this unexpected result: mixing pre-trained vectors for hit words with randomized initialization for the misses may create a conflict between the two embedding populations.
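
One way to probe the suspected conflict is to compare the scale of the two vector populations. A quick check like the following sketch (the `hit_mask` argument is hypothetical; the loading sketch above would need to return which rows were actually loaded) would show whether the pre-trained and randomly initialized rows live on different scales:

```python
import numpy as np

def compare_scales(emb, hit_mask):
    """Compare mean L2 norms of pre-trained vs. randomly initialized rows.

    emb: embedding matrix from load_pretrained above.
    hit_mask: boolean array marking rows loaded from the pre-trained file.
    A large gap between the two norms would support the mixed-initialization
    conflict suspected in the analysis above.
    """
    pretrained_norm = np.linalg.norm(emb[hit_mask], axis=1).mean()
    random_norm = np.linalg.norm(emb[~hit_mask], axis=1).mean()
    print(f"pre-trained rows: mean norm {pretrained_norm:.4f}")
    print(f"random rows:      mean norm {random_norm:.4f}")
```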


###word embedding loaded from Sogou-news-corpus

####Train

Hit Rate

| Sogou-news embedding number | model word dict size | hit rate (loaded successfully) |
| --- | --- | --- |
| 1,354,247 (1.35 M) | 176,047 (0.18 M) | 119,245/176,047 (67.73%) |

Result

| Training Epoch | Training Acc | Validation Acc | Speed (seconds / 10k samples) | Memory cost |
| --- | --- | --- | --- | --- |
| 1 | 93.70% | 93.02% | 177.61* | - |
| 2 | 95.14% | 93.81% | 202.71 | - |
| 3 | 95.68% | 94.17% | 213.92 | - |
| 4 | 96.03% | 94.41% | 215.28 | - |
| 5 | 96.27% | 94.52% | 220.29 | - |
| 6 | 96.46% | 94.55% | 220.76 | - |
| 7 | 96.61% | **94.68%** | 215.25 | - |

* running on Node06

####Test

ACC = 94.6491 %

####Appending Test

@PKU

devel acc : 96.7085 %

test acc : 96.7084 %

@WEIBO

devel acc : 90.6736 %

test acc : 90.9336 %

Analysis

Training ran very slowly on Node06, and the accuracy was correspondingly poor.


##Result Summary

| word embedding initialization method | hit rate | train acc* | best devel acc (@PKU-WEIBO) | test acc (@PKU-WEIBO) | devel acc (@PKU) | test acc (@PKU) | devel acc (@WEIBO) | test acc (@WEIBO) | speed (s/10k) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| random | - | 98.06% | 96.12% | 96.10% | 97.8118% | 97.8933% | 92.7685% | 92.8434% | 114.01 (CPU@node01) |
| gigawords | 54.17% | 97.07% | 95.28% | 95.33% | 97.1792% | 97.2476% | 91.5375% | 91.8570% | 100.76 (CPU@node05) |
| sogou-news | 67.73% | 96.61% | 94.68% | 94.65% | 96.7085% | 96.7084% | 90.6736% | 90.9336% | 215.25 (CPU@node06) |

* The train acc is taken from the epoch with the best validation result; the same applies to the test acc and speed.

Results on the PKU and WEIBO devel/test sets are kept to 4 digits after the decimal point, because LTP uses this precision.

###Next

  1. more epochs should be run

  2. smaller validation granularity (for example, validate not just after every epoch, but after every 50,000 training samples)

  3. collect the un-hit words (see the sketch below) and decide how to handle them when loading external word embeddings
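
For item 3, a minimal sketch of collecting the un-hit words, assuming the same one-word-per-line text format and `word_dict` structure as the loading sketch earlier on this page (the function name is an assumption):

```python
def unhit_words(emb_path, word_dict):
    """Return dictionary words absent from the pre-trained embedding file."""
    pretrained = set()
    with open(emb_path, encoding="utf-8") as f:
        for line in f:
            parts = line.split(maxsplit=1)
            if parts:
                pretrained.add(parts[0])  # first field is the word itself
    return [w for w in word_dict if w not in pretrained]
```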