clustering algorithm evaluation #43
Last week, I read the MongoDB documentation and started writing the test data. I've finished generating the User and Statements data.
Last week, I finished generating test data for the clustering algorithm, mainly working on the groups data. I selected 20 statements for each user according to the rule, then grouped the statements according to the user type and saved them in the groups data. I also checked the generated test data and made sure it meets the requirements. For the rest of the week, I will write the clustering file, try several different clustering algorithms, and run them on the test data to get the final result. Then we can evaluate and compare the results of the different algorithms and select the most suitable one for this project.
Last week I wrote the data generation and clustering functions and called them in the clustering_algorithm_evaluation file, which generates the data and uses it to produce clustering results. At present, I use three clustering methods, among which DBSCAN and OPTICS are density-based clustering algorithms. There are still some problems in the implementation of the hierarchical clustering algorithm: it causes the function to loop indefinitely.
Last week I found a suitable package for implementing hierarchical clustering. I also tried changing the input of the clustering algorithm to the groupings data, but there are still some problems with the clustering results.
Last week, I researched how to map the groupings data into the input format required by the clustering algorithm but didn't find an effective way to do it.
Finish the pair statements data and assign agreed pairs to the same group, then print the group result. A pair is agreed when more than 50% of the users who were assigned both statements agree that they should be in the same group.
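The agreed-pair rule above can be sketched roughly as follows. This is only an illustration under assumptions: the shape of each grouping entry (`shown` and `groups`), the function names `agreedPairs` and `mergePairs`, and the choice to merge agreed pairs transitively with a union-find are all hypothetical, not taken from the project code. It also assumes statement ids never contain the `|` character.

```javascript
// groupings: one entry per user (hypothetical shape):
//   { shown: [ids the user was assigned], groups: [[ids placed together], ...] }
function agreedPairs(groupings) {
  const shownCount = new Map();    // "a|b" -> users assigned both statements
  const togetherCount = new Map(); // "a|b" -> users who put them in one group
  const key = (a, b) => (a < b ? `${a}|${b}` : `${b}|${a}`);
  const bump = (map, k) => map.set(k, (map.get(k) || 0) + 1);
  const forEachPair = (ids, fn) => {
    for (let i = 0; i < ids.length; i++) {
      for (let j = i + 1; j < ids.length; j++) fn(key(ids[i], ids[j]));
    }
  };

  for (const { shown, groups } of groupings) {
    forEachPair(shown, (k) => bump(shownCount, k));
    for (const group of groups) forEachPair(group, (k) => bump(togetherCount, k));
  }

  const pairs = [];
  for (const [k, together] of togetherCount) {
    // Agreed: strictly more than 50% of the users who saw both grouped them together.
    if (together / shownCount.get(k) > 0.5) pairs.push(k.split('|'));
  }
  return pairs;
}

// Merge agreed pairs into groups with a small union-find (assumed merge strategy).
function mergePairs(pairs) {
  const parent = new Map();
  const find = (x) => {
    if (!parent.has(x)) parent.set(x, x);
    while (parent.get(x) !== x) {
      parent.set(x, parent.get(parent.get(x))); // path halving
      x = parent.get(x);
    }
    return x;
  };
  for (const [a, b] of pairs) parent.set(find(a), find(b));

  const groups = new Map();
  for (const id of parent.keys()) {
    const root = find(id);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root).push(id);
  }
  return [...groups.values()];
}

// Made-up example data, not real test data from the project.
const sampleGroupings = [
  { shown: ['a', 'b', 'c'], groups: [['a', 'b'], ['c']] },
  { shown: ['a', 'b', 'c'], groups: [['a', 'b', 'c']] },
  { shown: ['a', 'b', 'd'], groups: [['a'], ['b', 'd']] },
];
const samplePairs = agreedPairs(sampleGroupings);
const sampleGroups = mergePairs(samplePairs);
console.log(samplePairs, sampleGroups);
```

Note that transitive merging can place two statements in the same group even when that particular pair did not reach the 50% threshold, as long as a chain of agreed pairs connects them. Whether that is acceptable depends on how the groups are evaluated.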
Create a standalone Node program that generates Mongo-document-like test data into an array.
Build a clustering algorithm and run it on the data.
Evaluate the results.
Mongo-document-like means each object has an _id property that is a unique string.
import ObjectID from 'isomorphic-mongo-objectid/src/isomorphic-mongo-objectid'
Use this to generate the ObjectID.
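A minimal sketch of the data generation step might look like the following. To stay self-contained, it does not call the ObjectID package; `makeId` is a hypothetical stand-in that produces a random 24-character hex string, and `generateDocs`, the `name` and `text` fields, and the document counts are assumptions for illustration only.

```javascript
// Stand-in for ObjectID (assumption): a random 24-character hex string,
// checked against a Set so every generated _id is unique.
function makeId(used) {
  let id;
  do {
    id = Array.from({ length: 24 }, () =>
      Math.floor(Math.random() * 16).toString(16)
    ).join('');
  } while (used.has(id)); // retry on the very unlikely collision
  used.add(id);
  return id;
}

// Build `count` Mongo-document-like objects; makeFields(i) supplies
// the (hypothetical) fields other than _id.
function generateDocs(count, makeFields, used = new Set()) {
  const docs = [];
  for (let i = 0; i < count; i++) {
    docs.push({ _id: makeId(used), ...makeFields(i) });
  }
  return docs;
}

// Share one Set so ids are unique across both collections.
const usedIds = new Set();
const users = generateDocs(5, (i) => ({ name: `user-${i}` }), usedIds);
const statements = generateDocs(10, (i) => ({ text: `statement ${i}` }), usedIds);
console.log(users[0], statements[0]);
```

In the real program, `makeId` would presumably be replaced by a string produced from the imported ObjectID.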