In this project, PySpark was used to build a Naive Bayes classifier that determines the sentiment of a comment. The dataset is approximately 1.5 GB and contains 3.6 million comments; Spark was used to handle this volume of data efficiently.
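To make the classification step concrete, here is a minimal sketch of a multinomial Naive Bayes sentiment classifier. It is not the repository's code. It uses plain Python rather than PySpark so it runs without a Spark installation. The function names `train` and `predict`, the whitespace tokenizer, the Laplace smoothing value `alpha=1.0`, and the toy comments are all assumptions made for illustration. On a dataset of this size, the per-label word counting would presumably be distributed, for example with an RDD `flatMap` followed by `reduceByKey`.

```python
import math
from collections import Counter, defaultdict


def train(samples, alpha=1.0):
    """Fit a multinomial Naive Bayes model from (text, label) pairs.

    alpha is the Laplace smoothing constant (an assumed default).
    """
    word_counts = defaultdict(Counter)  # label -> word -> occurrences
    doc_counts = Counter()              # label -> number of comments

    # Counting step: in Spark this is the part that would be distributed.
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())

    vocab = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(doc_counts.values())

    log_prior = {}
    log_likelihood = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        denominator = total_words + alpha * len(vocab)
        log_prior[label] = math.log(doc_counts[label] / total_docs)
        # Smoothed log P(word | label) for every word in the vocabulary.
        log_likelihood[label] = {
            word: math.log((word_counts[label][word] + alpha) / denominator)
            for word in vocab
        }
    return {"log_prior": log_prior, "log_likelihood": log_likelihood}


def predict(model, text):
    """Return the label with the highest log posterior score."""
    scores = {}
    for label, prior in model["log_prior"].items():
        likelihood = model["log_likelihood"][label]
        # Words never seen during training are ignored.
        scores[label] = prior + sum(
            likelihood[word] for word in text.lower().split() if word in likelihood
        )
    return max(scores, key=scores.get)


# Toy data, invented for this example only.
toy_comments = [
    ("great product love it", "positive"),
    ("love this great value", "positive"),
    ("terrible waste of money", "negative"),
    ("awful product terrible quality", "negative"),
]
model = train(toy_comments)
print(predict(model, "great value love"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together, which matters once comments get long.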
RicardoRibeiroRodrigues/SparkNayveBayes
About
Naive Bayes implementation using Spark, intended for handling big data.