Project flow:
1.Web crawling:
•Create a spider using Scrapy (a web crawling framework).
Input: URLs entered by the user, or a list of URLs passed to the spider as input parameters.
•Execute the spider.
•The feedback will be extracted from the scraped data. This forms the dataset for the project.
2.Topic modelling:
•Latent Dirichlet allocation (LDA) and non-negative matrix factorization (NMF) will be used to identify the most important topics in the feedback.
Input: Feedback Dataset.
Output: List of topics based on the frequencies of the words.
•For each topic obtained, retrieve the feedback entries (tuples) that correspond to that topic.
Output: Feedback categorized by topic.
3.Topic summarization:
•The categorized feedback will be summarized using TextRank (a graph-based ranking model for text processing and summarization).
Input: Categorized feedback.
Output: Topic-wise summary.
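As a rough illustration of the summarization step, the following is a simplified, pure-Python variant of TextRank for sentence extraction. Sentences are graph nodes, edges are weighted by word overlap, and a PageRank-style iteration scores each sentence. The sentence-splitting regex, the damping factor of 0.85, the fixed 50 iterations, and all function names are assumptions of this sketch.

```python
import math
import re


def split_sentences(text):
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Word overlap between two word sets, normalised by their log sizes."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    return len(a & b) / denom if denom > 0 else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank over the similarity graph."""
    words = [set(re.findall(r"\w+", s.lower())) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(words[i], words[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Each sentence receives score from its neighbours in proportion
        # to the edge weight relative to the neighbour's total weight.
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, n_sentences=2):
    """Return the top-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return " ".join(sentences[i] for i in sorted(top[:n_sentences]))


# Hypothetical feedback for one topic.
topic_feedback = (
    "The lectures were clear and well paced. "
    "The lectures covered too much material. "
    "Clear lectures made the material easy. "
    "Parking was awful."
)
print(summarize(topic_feedback, n_sentences=2))
```

Running `summarize` once per topic on the concatenated feedback for that topic would give a topic-wise summary.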
4.Sentiment analysis:
Opinion mining will be performed using two approaches:
•Bag of words: Each word in the feedback is assigned a sentiment polarity, and the feedback is then classified as positive or negative based on its overall polarity.
Dictionary: A dictionary of opinion-oriented words, along with their sentiment scores, is used to determine the polarity.
Input: Categorized feedback.
Output: Sentiment of each topic.
•Classifier: Support vector machine classifier (Multikernel)
Testing: Cross-validation.
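Both sentiment approaches above could be sketched as follows. The lexicon, its scores, the sample data, and the function names are hypothetical. Treating a total score of zero as positive is an arbitrary choice. "Multikernel" is interpreted here simply as evaluating an SVM with several kernels under cross-validation using scikit-learn, which is an assumption about the intended method rather than a multiple-kernel-learning implementation.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical opinion dictionary: word -> sentiment score.
LEXICON = {
    "good": 1.0, "great": 2.0, "clear": 1.0, "helpful": 1.0,
    "bad": -1.0, "boring": -1.5, "confusing": -1.5, "awful": -2.0,
}


def polarity(text, lexicon=LEXICON):
    """Bag of words: sum the dictionary scores of all words in the text."""
    return sum(lexicon.get(w, 0.0) for w in re.findall(r"[a-z']+", text.lower()))


def topic_sentiment(categorized):
    """Label each topic positive or negative from its summed polarity."""
    return {
        topic: "positive" if sum(polarity(t) for t in texts) >= 0 else "negative"
        for topic, texts in categorized.items()
    }


def kernel_scores(texts, labels, kernels=("linear", "rbf", "poly"), folds=3):
    """Mean cross-validated accuracy of a TF-IDF + SVM pipeline per kernel."""
    return {
        k: cross_val_score(
            make_pipeline(TfidfVectorizer(), SVC(kernel=k)), texts, labels, cv=folds
        ).mean()
        for k in kernels
    }


# Hypothetical categorized feedback (topic -> feedback entries).
categorized = {
    "lectures": ["The lectures were great and clear", "Helpful examples"],
    "labs": ["The labs were boring", "Confusing and awful instructions"],
}
print(topic_sentiment(categorized))

# Hypothetical labelled feedback for the classifier (1 = positive, 0 = negative).
texts = [
    "great lectures", "clear and helpful slides", "good examples",
    "helpful lab assistants", "great course overall", "clear explanations",
    "boring lectures", "confusing slides", "bad examples",
    "awful lab equipment", "boring course overall", "confusing explanations",
]
labels = [1] * 6 + [0] * 6
print(kernel_scores(texts, labels))
```

With a dataset this small the cross-validation scores are not meaningful. The sketch only shows how the pieces fit together.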
Technology stack:
•Frontend development: ReactJS
•Analytics: Python
•Backend development: Django, PostgreSQL (SQL database).