Hi,
I've tried to build my own POS model under Linux. I first tried with a corpus of
1,600,000 words (and tags) and got a Stack Overflow error, so I tried with a
much smaller corpus (100,000 words and tags). The program tells me it's reading
the training corpus, then compiling probabilities, and then it fails with Fatal
error: exception Failure("empty context_trie").
What am I doing wrong? My corpus is just a file with one word and one tag per
line, with LF line endings.
TIA for the answer
Original issue reported on code.google.com by dwight...@gmail.com on 25 Nov 2010 at 4:09
I'm having the same issue. I thought it might be a 32-bit/64-bit problem, since
the binaries in the download section are 32-bit and I'm on a 64-bit system, but
I get the same error with binaries I compiled myself. I've also tried fiddling
with the -s and -f parameters to hunpos-train, but it doesn't seem to help.
Original comment by arnsh...@gmail.com on 13 Jan 2011 at 10:19
I would like to compile a model for Portuguese. The training corpus is UTF-8
encoded (see attachment). However, I get the same error under Mac OS X 10.6.3:
$ cat port.corpus | ./hunpos-train -t 3 -e 2 -s 3 port-model
reading training corpus
compiling probabilities
Fatal error: exception Failure("empty context_trie")
The same error also occurs when I don't specify any options and the default
values are used.
Original comment by Leonel.Figueiredo.de.Alencar@gmail.com on 21 Jan 2011 at 2:07
The Fatal error: exception Failure("empty context_trie") is caused by the
following bug: when building the model, a separate estimate of the emission
probabilities for word forms consisting of digits only is created, and it is a
fatal error if no such words occur in the corpus.
To avoid this error, the training corpus must contain at least one word form
composed of digits only.
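A quick way to check a corpus before training is to scan it for a digit-only word form. The sketch below assumes the format described above (one word and one tag per line, whitespace-separated, LF line endings); the sample tags are illustrative, not from the original corpus.

```python
def has_digit_only_token(lines):
    """Return True if any word form (first field of a line)
    consists of digits only."""
    for line in lines:
        fields = line.strip().split()
        if fields and fields[0].isdigit():
            return True
    return False

# Illustrative corpus lines in "word<TAB>tag" format.
corpus = ["o\tDET", "ano\tN", "1984\tNUM", "foi\tV"]
print(has_digit_only_token(corpus))  # True: "1984" is digits-only
```

If the check returns False, appending a line such as a tagged year or number to the corpus should work around the bug until it is fixed in hunpos-train itself.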
Original comment by nova...@gmail.com on 3 Apr 2011 at 11:22