Library for extracting significant keywords, bigrams, and trigrams

elegans-io/manaus

Keyword Extractor

Temporary project to implement a topic extractor.

In a nutshell,...

Compile

sbt compile

Generate the zip package

sbt dist

Run from sbt

sbt "run-main com.getjenny.manaus.commands.CalculateKeywordsForSentences --raw_conversation data/conversations.txt --word_frequencies data/word_frequency.tsv --output_file data/output.csv"

The input is a semicolon-separated file with the following fields: sentence, tokenized_sentence, type, conv_id, sentence_id

The output is a semicolon-separated file with the following fields: sentence, tokenized_sentence, type, conv_id, sentence_id, keywords
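
To make the two field lists concrete, here is a minimal Scala sketch of reading one input line and writing one output line. It is not taken from the manaus code base: the names `SentenceRecord` and `SentenceCsv` are hypothetical, and it assumes that fields contain no quoted or escaped semicolons and that keywords are joined with spaces in the last column, none of which this README specifies.

```scala
// Hypothetical record type mirroring the five input fields listed above.
final case class SentenceRecord(
  sentence: String,
  tokenizedSentence: String,
  sentenceType: String,
  convId: String,
  sentenceId: String
)

object SentenceCsv {
  // Splits one line on ';'. The -1 limit keeps trailing empty fields.
  // Assumption: no quoting or escaped semicolons inside a field.
  def parse(line: String): Option[SentenceRecord] =
    line.split(";", -1) match {
      case Array(s, t, ty, c, id) => Some(SentenceRecord(s, t, ty, c, id))
      case _                      => None // wrong number of fields
    }

  // Builds an output line: the five input fields plus a keywords column.
  // Assumption: keywords are space-separated within that column.
  def toOutputLine(r: SentenceRecord, keywords: Seq[String]): String =
    Seq(
      r.sentence,
      r.tokenizedSentence,
      r.sentenceType,
      r.convId,
      r.sentenceId,
      keywords.mkString(" ")
    ).mkString(";")
}
```

Returning `Option` from `parse` lets a caller skip malformed lines instead of failing on the first one; whether the real tool behaves that way is not stated here.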

Run from the zip package

./manaus-0.1/bin/calculate-keywords-for-sentences --raw_conversation data/conversations.txt --word_frequencies data/word_frequency.tsv
