Rewrite of sentence tokenizers
Hugo-ter-Doest committed Aug 8, 2024
1 parent ed002eb commit 31ed8f5
Showing 9 changed files with 364 additions and 1,950 deletions.
5 changes: 1 addition & 4 deletions lib/natural/tokenizers/index.d.ts
@@ -136,9 +136,6 @@ export class TokenizerJa extends Tokenizer {
 }
 
 export class SentenceTokenizer extends Tokenizer {
-  tokenize (text: string): string[]
-}
-
-export class SentenceTokenizerNew extends Tokenizer {
+  constructor(abbreviations: string[], sentenceDemarkers?: string[])
   tokenize (text: string): string[]
 }
2 changes: 1 addition & 1 deletion lib/natural/tokenizers/index.js
@@ -46,4 +46,4 @@ exports.WordPunctTokenizer = require('./regexp_tokenizer').WordPunctTokenizer
 exports.TreebankWordTokenizer = require('./treebank_word_tokenizer')
 exports.TokenizerJa = require('./tokenizer_ja')
 exports.SentenceTokenizer = require('./sentence_tokenizer')
-exports.SentenceTokenizerNew = require('./sentence_tokenizer_parser')
+exports.SentenceTokenizerNew = require('./sentence_tokenizer')
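
Note on the two changes above: after this commit both export names load the same module, and the declared constructor takes a list of abbreviations plus optional sentence demarkers. The sketch below illustrates how callers might see this. The class body is a hypothetical stand-in written for this note, not the library's implementation. The default demarkers, the whitespace-based splitting, and the convention that abbreviations carry their trailing period are all assumptions. Only the constructor and `tokenize` signatures come from index.d.ts.

```javascript
// Hypothetical stand-in, NOT the library's implementation: it only mirrors
// the constructor and tokenize signatures declared in index.d.ts.
class SentenceTokenizer {
  // Default demarkers are an assumption made for this sketch.
  constructor (abbreviations, sentenceDemarkers = ['.', '!', '?']) {
    this.abbreviations = new Set(abbreviations)
    this.sentenceDemarkers = sentenceDemarkers
  }

  tokenize (text) {
    const sentences = []
    let current = []
    for (const word of text.split(/\s+/).filter(Boolean)) {
      current.push(word)
      const endsWithDemarker = this.sentenceDemarkers.some(d => word.endsWith(d))
      // A listed abbreviation such as "Dr." ends with a demarker
      // but does not close the sentence.
      if (endsWithDemarker && !this.abbreviations.has(word)) {
        sentences.push(current.join(' '))
        current = []
      }
    }
    // Keep trailing text that has no closing demarker.
    if (current.length > 0) sentences.push(current.join(' '))
    return sentences
  }
}

// Mirrors the two export lines in index.js after this commit:
// both names resolve to the same class.
const tokenizers = {
  SentenceTokenizer,
  SentenceTokenizerNew: SentenceTokenizer
}

const tokenizer = new tokenizers.SentenceTokenizerNew(['Dr.', 'e.g.'])
const result = tokenizer.tokenize('Dr. Smith arrived. He was late!')
console.log(result)
```

Keeping `SentenceTokenizerNew` as an alias presumably lets existing callers keep working while the old name picks up the new constructor signature.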