Skip to content

kudea/google-play-rating-prediction

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Google play rating prediction

Dataset

From Kaggle googleplaystore.csv

algorithm SVM vs SVR

Support vector machines (SVM) is a supervised machine learning classification algorithm, which can be used for classification problem in multi-dimension space. It is widely used in mining researches, such as text mining and opinion mining, and it has a great result. Comparing to other classification algorithms, SVM algorithm usually has better result in higher dimension, where all number of features is quite large and the data is sparse. SVM uses g(x)=w^T x+b as the linear separation hyperplane, where w is the weight vector, b is the bias, x is a set of high dimensional non-linear transformation function, w and b is determined by training data that optimize the following formulas:

image

whereε_iis the slack variables, C is the penalty coefficient, for all the training samples (x_i,y_i).

Support Vector Regression (SVR) is using the SVM algorithm on regression problem. The goal of SVM is to find the separation hyperplane; and the goal of SVR is to find the regression hyperplane.

For the given training set: {(x_1,z_1), …, (x_i,z_i)}

where x_i∈Rn is a feature vector, and z_i∊ R1 is the target output. In order to find the hyperplane, two parameters C>0 and ε>0 must be given and the support vector regression can be defined:

image

In our experiment, we use a free SVM toolkit, sklearn, C = 1to train the SVR model.

Method

image

Result

image

Conclusion

The comparison of the three proposed methods are shown in Table I. Among the three mothods, KNN with K = 15 has the best accuracy against the other two, and Decision Tree method had the worst performance. Hence, to do such rating prediction on Google Play apps, SVR and KNN algorithm is more acceptable. With the trained model, we can predict the rating when a app is given with the corresponding features, so that improve the user experience when surfing the Apps market and provide a early evaluation for developing the potential products.

Releases

No releases published

Packages

No packages published

Languages