Sentiment Analysis

In sentiment analysis, sentiment refers to the underlying emotion expressed in a piece of text. Through the analysis we aim to determine whether a given text expresses a positive, negative, or neutral sentiment, with the ultimate goal of gaining insight into public opinion. The process can rely on various techniques, ranging from looking for specific keywords to analyzing linguistic patterns or contextual information. The model we deployed here, however, relies on a neural network architecture that learns from documents and classifies their sentiment.

Our first step was to gauge the public sentiment based on the comments. The model we used, developed here in UNIC’s AI Lab, is the perfect candidate for that task: it reads textual documents and deciphers whether the sentiment hidden behind the words is positive, negative or neutral. We deployed the model on the comments, and the results are certainly intriguing. As you can see in the following figure, the public appears to be mostly split between positive and negative sentiments regarding the article’s content. While some are happy about what they read, others view the news in a negative light.
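Since the lab’s model itself is not included in this repository, the snippet below is only a minimal sketch of how a comparable three-class classifier could be run over the comments; the Hugging Face checkpoint named here is purely a stand-in, not the model we actually deployed.

```python
# Sketch: applying an off-the-shelf 3-class sentiment model to the comments.
# The UNIC AI Lab model is not public; this checkpoint is a stand-in only.
from transformers import pipeline

comments = [
    "This is genuinely exciting technology.",
    "Another step towards making humans obsolete.",
]

sentiment = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",  # labels: negative/neutral/positive
)

for comment, result in zip(comments, sentiment(comments)):
    print(f"{result['label']:>8}  ({result['score']:.2f})  {comment}")
```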

The comments section of an article is often the ground for heated discussions between users. Based on the sentiment each comment echoed, we analyzed the number of replies the comments received within the first couple of days after the article went live. We can see how the expectedly increased activity of the first hour was strongly polarized.
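For illustration, that reply activity could be tallied along the following lines; the `replies.csv` file, its column names and the publication timestamp are hypothetical placeholders rather than the actual data.

```python
# Sketch: count replies per hour since publication, split by the parent comment's sentiment.
# The file name, columns (parent_sentiment, created_at) and timestamp are hypothetical.
import pandas as pd

replies = pd.read_csv("replies.csv", parse_dates=["created_at"])
article_published = pd.Timestamp("2024-01-01 09:00")  # placeholder publication time

replies["hours_after_publication"] = (
    (replies["created_at"] - article_published).dt.total_seconds() // 3600
).astype(int)

activity = (
    replies[replies["hours_after_publication"] <= 48]   # first couple of days
    .groupby(["hours_after_publication", "parent_sentiment"])
    .size()
    .unstack(fill_value=0)
)
print(activity.head())
```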



Topic Modeling

An interesting first insight, although what would be even more interesting is finding the discussions those comments revolve around. A great way to achieve this, bar reading 1500 comments, is topic modeling. Topic modeling allows us to cluster the discussion under different categories based on their content. The algorithm we followed yielded five distinct topics. The comments were then assigned to their respective topic and the connections to their replies were established. In the following interactive map one can explore the various comments, how they are connected and what topic they belong to.
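The write-up does not name the specific topic-modeling algorithm we followed, so the sketch below uses LDA from scikit-learn as one common, purely illustrative way of clustering the comments into five topics.

```python
# Illustrative sketch of topic modeling with LDA; not necessarily the algorithm used here.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

comments = [
    "AI will take every job eventually",
    "Regulation is the only way to keep AI safe",
    "Machines can never replace human creativity",
    # ... the remaining ~1,500 comments
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(comments)

lda = LatentDirichletAllocation(n_components=5, random_state=0)  # five topics
doc_topics = lda.fit_transform(doc_term)

# Assign each comment to its most probable topic
topic_assignments = doc_topics.argmax(axis=1)
print(topic_assignments[:10])
```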



Stance Detection
In order to better understand not only the significance of these words but also the overarching feelings that run through these five very distinct topics, we needed a way to reveal each commenter’s stance towards the topic at hand. To achieve this we trained a model to perform Stance Detection on the comments, classifying each one as supporting the notion that “AI is a Threat to Humanity”, supporting the notion that “AI will benefit Humanity”, or remaining Ambivalent on the matter. But how exactly does Stance Detection work, and how does it differ from the Sentiment Analysis mentioned above? While Sentiment Analysis scans the text to discover its sentiment, in other words the emotional charge that permeates the sentences, Stance Detection analyzes the text to decide which stance, or side, it takes in a given argument. The results, given the overall tone and subject matter of the article, were not shocking at all: the vast majority of the comments support the notion that AI is a threat to humanity as a whole.
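The stance model we trained is not reproduced here. As a lightweight stand-in, the sketch below uses an off-the-shelf zero-shot classifier with the three stance labels; this is a different technique from training a dedicated model, but it illustrates the same classification task.

```python
# Sketch: zero-shot stance tagging as a stand-in for the trained stance model.
from transformers import pipeline

stance_labels = [
    "AI is a Threat to Humanity",
    "AI will benefit Humanity",
    "Ambivalent",
]

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

comment = "If we keep going down this road, there won't be any jobs left for people."
result = classifier(comment, candidate_labels=stance_labels)
print(result["labels"][0], round(result["scores"][0], 2))
```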


Could it be that everyone really is against AI, or is it a case of people who have already adopted an anti-AI narrative being drawn to an article that depicts AI in a relatively negative light? Whatever the case may be, we were sure that the data contained more insights yet to be uncovered, which is exactly why we decided to perform further analyses to extract as much information as possible. Before taking an even deeper dive into the data, though, we figured this was a good opportunity to supplement our previous understanding of the vocabulary by looking at the most frequently used words, regardless of topic, and how they stack up for each respective stance.
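One simple way such a word-frequency breakdown could be computed is sketched below, assuming a hypothetical DataFrame with `text` and `stance` columns.

```python
# Sketch: top words overall and per stance; the DataFrame and stop-word list are illustrative.
from collections import Counter

import pandas as pd

df = pd.DataFrame(
    {
        "text": ["ai will replace us all", "ai will help doctors", "not sure about ai"],
        "stance": ["Anti-AI", "Pro-AI", "Ambivalent"],
    }
)

stop_words = {"the", "a", "an", "will", "to", "and", "of", "about", "not", "us", "all"}

def top_words(texts, n=10):
    words = (w for t in texts for w in t.lower().split() if w not in stop_words)
    return Counter(words).most_common(n)

print("Overall:", top_words(df["text"]))
for stance, group in df.groupby("stance"):
    print(stance, top_words(group["text"]))
```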


Sarcasm Detection
After taking a closer look at the comments and weighing them against their classified stances and sentiments, we couldn’t help but notice that quite a few of them made ample use of sarcasm, with comments being either fully ironic or hiding an ironic remark or two somewhere in the almost paragraph-length string of sentences that made them up. So, in similar fashion to the Stance Detection step, we trained another model, this time to detect instances of sarcasm in the comments and classify each one as either containing sarcasm or not. Again, the results were not very surprising, with most sarcastic or sarcastically-leaning comments being anti-AI. This could also explain most, if not all, of the anti-AI comments that carry a positive sentiment.
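As before, the sarcasm model trained for this analysis is not included here; the snippet below is only a rough stand-in that tags comments with a publicly available irony classifier.

```python
# Sketch: binary sarcasm/irony tagging with an off-the-shelf irony model,
# standing in for the sarcasm model trained for this analysis.
from transformers import pipeline

irony = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-irony")

comments = [
    "Great, can't wait for a chatbot to do my job better than I do.",
    "This article raises some valid concerns about automation.",
]

for comment, result in zip(comments, irony(comments)):
    print(f"{result['label']:>10}  ({result['score']:.2f})  {comment}")
```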
Effort and Interactions
But what about the overall effort that has been put into these comments? Surely not everyone has the time, or the way with words, to “pen” down eloquent sentences that carry across exactly what they’re thinking. Luckily for us, there is a way to gain insight into this as well: examining the number of words in each comment and grouping the comments into meaningful size ranges, or “bins” as they’re more commonly called.
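As a sketch, this binning could be done with `pandas.cut`; the column name and bin edges below are illustrative assumptions rather than the exact ones we used.

```python
# Sketch: binning comments by word count; column name and bin edges are assumptions.
import pandas as pd

df = pd.DataFrame({"text": ["Short one.", "A somewhat longer comment " * 10, "Meh."]})

df["word_count"] = df["text"].str.split().str.len()

bins = [0, 25, 50, 100, 200, float("inf")]
labels = ["1-25", "26-50", "51-100", "101-200", "200+"]
df["length_bin"] = pd.cut(df["word_count"], bins=bins, labels=labels)

print(df["length_bin"].value_counts().sort_index())
```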


This reveals that the vast majority of comments are on the shorter side, with only a tiny minority breaching the 1,000 character threshold. But what exactly does this tell us? Research has shown that humans, when presented with multiple choices, are more likely to perform the action that seems the easiest and requires the least effort. The analysis above seems to be in line with that observation, but are comments with only a few characters really the option that requires the least effort? While a low character count can certainly indicate how much effort went into a comment, it turns out there’s an even easier action readers can take, one that requires no typing at all: liking the already-posted comments they agree with. As liking a comment you agree with is quicker and easier than typing out a comment that, if anything, only paraphrases what has already been said, it was no surprise to see that the comments we analyzed have collectively gathered a whopping 23,042 likes. Adding likes into the equation therefore allows us to gain an additional perspective on the stances displayed in the data.
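A minimal sketch of that like-weighted view, assuming hypothetical `stance` and `likes` columns, might look as follows.

```python
# Sketch: raw comment counts vs like-weighted counts per stance (hypothetical columns).
import pandas as pd

df = pd.DataFrame(
    {
        "stance": ["Anti-AI", "Anti-AI", "Pro-AI", "Ambivalent"],
        "likes": [120, 45, 30, 5],
    }
)

summary = df.groupby("stance").agg(
    comments=("stance", "size"),
    total_likes=("likes", "sum"),
)
print(summary)
```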

Conclusion and Results
Overall, we see that the most interacted-with comments dealt with the subjects of Humanity and Humanity vs AI/Machines, and judging by our prior analysis of stance and sentiment, the majority of commenters don’t seem to share the brightest outlook on the immediate future of humanity in a post-AI-boom world. So, after all this, you might be wondering, “What does an average comment from that article look like?” Well, we have an answer for that as well! While we can’t share the comments as they are, due to privacy-related concerns, we thought it would be interesting to “feed” them into an LLM, GPT-4 to be precise, and ask it to return comments that resemble the ones we gave it. The results are surprisingly accurate, with comments such as:

1. "AI is a double-edged sword, and we need to be careful how we wield it."