GSDPMM

The datasets are in JSON format, with one document per line, as follows:
{"text": "centrepoint winter white gala london", "cluster": 65}
{"text": "mourinho seek killer instinct", "cluster": 96}
{"text": "roundup golden globe won seduced johansson voice", "cluster": 72}
{"text": "travel disruption mount storm cold air sweep south florida", "cluster": 140}
{"text": "wes welker blame costly turnover", "cluster": 89}
......
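
As a minimal sketch of how such a file could be read, the Python snippet below parses one JSON object per line into token lists and ground-truth cluster labels. The helper name `load_dataset` and the whitespace tokenization are assumptions made for illustration; they are not part of GSDPMM itself.

```python
import json


def load_dataset(lines):
    """Parse an iterable of JSON lines into (texts, labels).

    Each non-empty line is assumed to be an object with a "text" string
    and an integer "cluster" field, as in the sample above.
    """
    texts, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        texts.append(record["text"].split())  # assumption: whitespace tokens
        labels.append(record["cluster"])
    return texts, labels


# Two records copied from the sample above.
sample = [
    '{"text": "centrepoint winter white gala london", "cluster": 65}',
    '{"text": "mourinho seek killer instinct", "cluster": 96}',
]
texts, labels = load_dataset(sample)
```

To read from disk, pass an open file handle instead of the `sample` list, since iterating a text file yields one line at a time.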

The output of GSDPMM is D lines, where D is the number of documents in the dataset. Each line contains the estimated cluster for the corresponding document.
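
Because the output has one line per document, it can be aligned with the ground-truth `cluster` labels by position. The sketch below assumes each output line holds a single integer cluster id (the exact line format is not specified here) and uses purity purely as an illustrative metric. The function names `read_predictions` and `purity` are hypothetical.

```python
from collections import Counter, defaultdict


def read_predictions(lines):
    """Read one estimated cluster id per non-empty line (assumed integer)."""
    return [int(line.strip()) for line in lines if line.strip()]


def purity(true_labels, predicted):
    """Fraction of documents that carry the majority true label of their
    predicted cluster."""
    if not true_labels or len(true_labels) != len(predicted):
        raise ValueError("expected one prediction per document (D lines)")
    by_cluster = defaultdict(Counter)
    for true_label, cluster in zip(true_labels, predicted):
        by_cluster[cluster][true_label] += 1
    majority = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return majority / len(true_labels)


# Toy example: four documents, two true clusters, two estimated clusters.
true_labels = [65, 65, 96, 96]
predicted = read_predictions(["0", "0", "0", "1"])
score = purity(true_labels, predicted)  # 3 of 4 documents match the majority
```

Estimated cluster ids do not need to match the ground-truth ids numerically, since purity only looks at how documents are grouped.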

Citation

Please cite the following paper if you use the data:

@article{chen2019nonparametric,
  title={A nonparametric model for online topic discovery with word embeddings},
  author={Chen, Junyang and Gong, Zhiguo and Liu, Weiwen},
  journal={Information Sciences},
  volume={504},
  pages={32--47},
  year={2019},
  publisher={Elsevier}
}
