Development #70

Merged
merged 55 commits into from
Jun 13, 2024
Changes from 20 commits
f62113b
feat: Added actions matrix creation!
amindadgar Mar 5, 2024
60f668c
feat: zeroing self interactions!
amindadgar Mar 5, 2024
637fe3e
fix: isort linter issue!
amindadgar Mar 5, 2024
229c9b5
fix: linter issue again!
amindadgar Mar 5, 2024
5586800
fix: linter issue again!
amindadgar Mar 5, 2024
5d2c1d9
feat: added ignore received interactions!
amindadgar Mar 6, 2024
65cf2ec
feat: Added user action test cases!
amindadgar Mar 6, 2024
86b3e24
fix: isort linter!
amindadgar Mar 6, 2024
13e4a00
feat: Added test case for other activities than active!
amindadgar Mar 6, 2024
809ead9
fix: we were ignoring interactions wrong!
amindadgar Mar 6, 2024
f32d482
update: remove TODO comment!
amindadgar Mar 6, 2024
3180fdb
feat: more modularizing code!
amindadgar Mar 6, 2024
2e92964
fix: linter issues!
amindadgar Mar 6, 2024
e6d7fcd
fix: reactions weren't being ignored!
amindadgar Mar 6, 2024
00720f3
fix: isort linter error!
amindadgar Mar 6, 2024
cb74ee3
feat: delete comments!
amindadgar Mar 7, 2024
e2566b7
feat: update dependency version!
amindadgar Mar 7, 2024
0ed7861
Merge pull request #68 from TogetherCrew/feat/interaction-to-action
cyri113 Mar 11, 2024
315497c
feat: update test case command!
amindadgar Mar 12, 2024
79b43c0
Merge pull request #69 from TogetherCrew/feat/update-CI
amindadgar Mar 18, 2024
edde9c9
update: dependency lib version!
amindadgar Apr 17, 2024
cb129fa
Merge pull request #71 from TogetherCrew/feat/update-core-analyzer-li…
amindadgar Apr 18, 2024
0b5d072
fix: code cleaning + removing some unnecessary codes!
amindadgar May 20, 2024
c815dd4
fix: update import names based on changes!
amindadgar May 20, 2024
2ad361a
feat: updating workflows based on new changes!
amindadgar May 20, 2024
a5d4c75
fix: more test cases based on updates!
amindadgar May 20, 2024
4ff3cf1
fix: test case!
amindadgar May 21, 2024
68760e9
fix: isort linter issue!
amindadgar May 21, 2024
11049d8
fix: more cleaning for redis, and mongo clients!
amindadgar May 21, 2024
20e682a
Merge pull request #81 from TogetherCrew/wip-80
amindadgar May 21, 2024
8429bc2
fix: wrong variable call!
amindadgar May 21, 2024
5d607aa
fix: black linter issue!
amindadgar May 21, 2024
f67715e
fix: isort linter issues!
amindadgar May 21, 2024
8b380e2
fix: black linter issue!
amindadgar May 21, 2024
27e3980
fix: test case with latest code updates!
amindadgar May 22, 2024
fb00b17
fix: variable refrenced before assignment!
amindadgar May 22, 2024
f3051c8
fix: wrong param reading discord_utils.py!
amindadgar May 22, 2024
98d62ee
feat: using our redis singletone instance!
amindadgar May 22, 2024
9c0d695
feat: disabling logs for neo4j and mongodb!
amindadgar May 22, 2024
1cc46c7
fix: update test case to be like normal situation!
amindadgar May 22, 2024
52b6a3f
fix: trying more to detach neo4j & mongodb logs!
amindadgar May 22, 2024
67d885d
fix: black linter issue
amindadgar May 22, 2024
7324f1d
Merge pull request #84 from TogetherCrew/wip-83
cyri113 May 22, 2024
2771973
feat: updating usage of Neo4jOps!
amindadgar May 22, 2024
0f677d3
fix: import syntax error!
amindadgar May 22, 2024
7e629ec
fix: removing the manually creation of neo4j instance!
amindadgar May 23, 2024
f726298
feat: increasing the neo4j backend lib version!
amindadgar May 23, 2024
82b7ec3
fix: wrong branch to increase the version!
amindadgar May 23, 2024
f1ca0d6
fix: increase neo4j lib version!
amindadgar May 23, 2024
4c4a2c8
fix: adding logs to the check the status!
amindadgar May 23, 2024
a83c817
fix: lint issues!
amindadgar May 23, 2024
9f8ee96
fix: lint issues!
amindadgar May 23, 2024
2a78d09
Merge pull request #88 from TogetherCrew/bugfix-87
amindadgar May 23, 2024
52ec08e
fix: trying to comment the decode response!
amindadgar May 23, 2024
84f5726
Merge pull request #86 from TogetherCrew/feat/use-neo4jops-singleton
cyri113 May 23, 2024
2 changes: 1 addition & 1 deletion discord_analyzer/analysis/analytics_interactions_script.py
@@ -67,7 +67,7 @@ def per_account_interactions(
     # flatten the list
     samples_flattened = list(itertools.chain(*samples))

-    for i, sample in enumerate(samples_flattened):
+    for _, sample in enumerate(samples_flattened):
         account_name = sample[0]["account"]
         interaction_count = sample[0]["count"]

99 changes: 81 additions & 18 deletions discord_analyzer/analysis/compute_interaction_matrix_discord.py
@@ -1,14 +1,10 @@
-#!/usr/bin/env python
-# -*- coding: utf-8 -*-
-#
-# compute_interaction_matrix_discord.py
-#
-# Author Ene SS Rawa / Tjitse van der Molen
-
-from discord_analyzer.analysis.utils.activity import Activity
+import copy
+from typing import Any
+
 from discord_analyzer.DB_operations.mongodb_access import DB_access
 from discord_analyzer.DB_operations.mongodb_query import MongodbQuery
-from numpy import ndarray
+from numpy import diag_indices_from, ndarray
+from tc_core_analyzer_lib.utils.activity import DiscordActivity

 from .utils.compute_interaction_mtx_utils import (
     generate_interaction_matrix,
@@ -21,7 +17,7 @@ def compute_interaction_matrix_discord(
     dates: list[str],
     channels: list[str],
     db_access: DB_access,
-    activities: list[str] = [Activity.Mention, Activity.Reply, Activity.Reaction],
+    **kwargs,
 ) -> dict[str, ndarray]:
     """
     Computes interaction matrix from discord data
@@ -32,21 +28,30 @@ def compute_interaction_matrix_discord(
     dates - [str] : list of all dates to be considered for analysis
     channels - [str] : list of all channel ids to be considered for analysis
     db_access - obj : database access object
-    activities - list[Activity] :
-        the list of activities to generate the matrix for
-        default is to include all 3 `Activity` types
-        minimum length is 1
+    **kwargs :
+        activities - list[Activity] :
+            the list of activities to generate the matrix for
+            default is to include all activity types
+            minimum length is 1

     Output:
     ---------
     int_mtx : dict[str, np.ndarray]
         keys are representative of an activity
         and the 2d matrix representing the interactions for the activity
     """
+
+    activities = kwargs.get(
+        "activities",
+        [
+            DiscordActivity.Mention,
+            DiscordActivity.Reply,
+            DiscordActivity.Reaction,
+            DiscordActivity.Lone_msg,
+            DiscordActivity.Thread_msg,
+        ],
+    )
     feature_projection = {
-        "thr_messages": 0,
-        "lone_messages": 0,
-        "channelId": 0,
         "replier": 0,
         "replied": 0,
         "mentioner": 0,
@@ -77,15 +82,73 @@ def compute_interaction_matrix_discord(
     db_results = list(cursor)

     per_acc_query_result = prepare_per_account(db_results=db_results)
+    per_acc_interaction = process_non_reactions(per_acc_query_result)

     # And now compute the interactions per account_name (`acc`)
     int_mat = {}
     # computing `int_mat` per activity
     for activity in activities:
         int_mat[activity] = generate_interaction_matrix(
-            per_acc_interactions=per_acc_query_result,
+            per_acc_interactions=per_acc_interaction,
             acc_names=acc_names,
             activities=[activity],
         )
+        # a person interacting to themselves is not counted as activity
+        if activity in [
+            DiscordActivity.Reply,
+            DiscordActivity.Reaction,
+            DiscordActivity.Mention,
+        ]:
+            int_mat[activity][diag_indices_from(int_mat[activity])] = 0

     return int_mat


+def process_non_reactions(
+    heatmaps_data_per_acc: dict[str, list[dict[str, Any]]],
+    skip_fields: list[str] = [
+        "reacted_per_acc",
+        "mentioner_per_acc",
+        "replied_per_acc",
+        "account_name",
+        "date",
+    ],
+) -> dict[str, list[dict[str, Any]]]:
+    """
+    process the non-interactions heatmap data to be like interaction
+    we will make it self interactions
+
+    Parameters
+    -----------
+    heatmaps_data_per_acc : dict[str, list[dict[str, Any]]]
+        heatmaps data per account
+        the keys are accounts
+        and the values are the list of heatmaps documents related to them
+    skip_fields : list[str]
+        the part of heatmaps document that we don't need to make them like interaction
+        can be interactions itself and account_name, and date
+
+    Returns
+    --------
+    heatmaps_interactions_per_acc : dict[str, list[dict[str, Any]]]
+        the same as before but we have changed the non interaction ones to self interaction
+    """
+    heatmaps_interactions_per_acc = copy.deepcopy(heatmaps_data_per_acc)
+
+    for account in heatmaps_interactions_per_acc.keys():
+        # for each heatmaps document
+        for document in heatmaps_interactions_per_acc[account]:
+            activities = document.keys()
+            actions = set(activities) - set(skip_fields)
+
+            for action in actions:
+                action_count = sum(document[action])
+                if action_count:
+                    document[action] = [
+                        [{"account": account, "count": sum(document[action])}]
+                    ]
+                else:
+                    # action count was zero
+                    document[action] = []
+
+    return heatmaps_interactions_per_acc
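
Reviewer note: `process_non_reactions` turns message counts into self-interactions, which end up on the diagonal of the interaction matrix. That appears to be why the diagonal is zeroed only for Reply, Reaction and Mention in `compute_interaction_matrix_discord`. The following is a minimal sketch of that idea with plain NumPy, not the project's code: the account names, the counts and the helper `self_interaction_matrix` are all hypothetical.

```python
import numpy as np

# Hypothetical accounts and per-account message counts (illustrative only).
acc_names = ["alice", "bob", "carol"]
lone_messages = {"alice": 3, "bob": 0, "carol": 5}


def self_interaction_matrix(counts: dict[str, int], names: list[str]) -> np.ndarray:
    """Place each account's own count on the diagonal (account -> itself)."""
    matrix = np.zeros((len(names), len(names)), dtype=np.uint16)
    for idx, name in enumerate(names):
        matrix[idx, idx] = counts.get(name, 0)
    return matrix


# Message-style activity: the diagonal carries the data, so it is kept.
lone_mat = self_interaction_matrix(lone_messages, acc_names)

# A made-up reply matrix with a self-reply count at position [0, 0].
reply_mat = np.array([[2, 1, 0], [0, 0, 4], [1, 0, 0]], dtype=np.uint16)
# Interaction-style activity: drop self-interactions, keep everything else.
reply_mat[np.diag_indices_from(reply_mat)] = 0

print(lone_mat.diagonal())
print(reply_mat.diagonal())
```

Zeroing the diagonal for every activity, as the old call site in `compute_member_activity` did, would wipe out the lone and thread message counts in this representation.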
46 changes: 10 additions & 36 deletions discord_analyzer/analysis/compute_member_activity.py
@@ -11,14 +11,12 @@
 import networkx as nx
 import numpy as np
 from dateutil.relativedelta import relativedelta
-from discord_analyzer.analysis.compute_interaction_matrix_discord import (
-    compute_interaction_matrix_discord,
-)
 from discord_analyzer.analysis.member_activity_history import check_past_history
 from discord_analyzer.analysis.utils.member_activity_history_utils import (
     MemberActivityPastUtils,
 )
 from discord_analyzer.analysis.utils.member_activity_utils import (
+    assess_engagement,
     convert_to_dict,
     get_joined_accounts,
     get_latest_joined_users,
@@ -27,8 +25,6 @@
     update_activities,
 )
 from discord_analyzer.DB_operations.mongodb_access import DB_access
-from tc_core_analyzer_lib.assess_engagement import EngagementAssessment
-from tc_core_analyzer_lib.utils.activity import DiscordActivity


 def compute_member_activity(
@@ -215,16 +211,6 @@ def compute_member_activity(

     # # # ACTUAL ANALYSIS # # #

-    assess_engagment = EngagementAssessment(
-        activities=[
-            DiscordActivity.Mention,
-            DiscordActivity.Reply,
-            DiscordActivity.Reaction,
-        ],
-        activities_ignore_0_axis=[DiscordActivity.Mention],
-        activities_ignore_1_axis=[],
-    )
-
     # for every window index
     max_range = int(np.floor(last_start.days / window_param["step_size"]) + 1)
     # if max range was chosen negative,
@@ -283,28 +269,16 @@ def compute_member_activity(
             # we could have empty outputs
             acc_names = get_latest_joined_users(db_access, count=5)

-        # obtain interaction matrix
-        int_mat = compute_interaction_matrix_discord(
-            acc_names, date_list_w_str, channels, db_access
-        )
-
-        # for each int_mat type
-        for key in list(int_mat.keys()):
-            # remove interactions with self
-            int_mat[key][np.diag_indices_from(int_mat[key])] = 0
-
         # assess engagement
-        (graph_out, *activity_dict) = assess_engagment.compute(
-            int_mat=int_mat,
+        graph_out, activity_dict = assess_engagement(
             w_i=new_window_i,
-            acc_names=np.asarray(acc_names),
-            act_param=act_param,
-            WINDOW_D=window_param["period_size"],
-            **activity_dict,
-        )
-
-        activity_dict = convert_to_dict(
-            data=list(activity_dict), dict_keys=activities_name
+            accounts=acc_names,
+            action_params=act_param,
+            period_size=window_param["period_size"],
+            db_access=db_access,
+            channels=channels,
+            analyze_dates=date_list_w_str,
+            activities_name=activities_name,
+            activity_dict=activity_dict,
         )

         # make empty dict for node attributes
21 changes: 10 additions & 11 deletions discord_analyzer/analysis/utils/compute_interaction_mtx_utils.py
@@ -5,7 +5,7 @@
 from discord_analyzer.analysis.analytics_interactions_script import (
     per_account_interactions,
 )
-from discord_analyzer.analysis.utils.activity import Activity
+from tc_core_analyzer_lib.utils.activity import DiscordActivity


 def prepare_per_account(db_results: list) -> dict[str, list[dict]]:
@@ -29,13 +29,9 @@ def prepare_per_account(db_results: list) -> dict[str, list[dict]]:

     # a dictionary for results of each account
     for db_record in db_results:
-        # if the data for a specific account was not created before, create one as list
         acc_name = db_record["account_name"]
-        if acc_name not in per_acc_query_result.keys():
-            per_acc_query_result[acc_name] = [db_record]
-        # else, append
-        else:
-            per_acc_query_result[acc_name].append(db_record)
+        per_acc_query_result.setdefault(acc_name, [])
+        per_acc_query_result[acc_name].append(db_record)

     return per_acc_query_result

@@ -66,7 +62,6 @@ def generate_interaction_matrix(
         an array of integer values
         each row and column are representative of account interactions
     """
-
     int_matrix = np.zeros((len(acc_names), len(acc_names)), dtype=np.uint16)

     for acc in per_acc_interactions.keys():
@@ -117,12 +112,16 @@ def prepare_interaction_field_names(activities: list[str]) -> list[str]:
     """
     field_names = []
     for activity in activities:
-        if activity == Activity.Mention:
+        if activity == DiscordActivity.Mention:
             field_names.append("mentioner_per_acc")
-        elif activity == Activity.Reply:
+        elif activity == DiscordActivity.Reply:
             field_names.append("replied_per_acc")
-        elif activity == Activity.Reaction:
+        elif activity == DiscordActivity.Reaction:
             field_names.append("reacted_per_acc")
+        elif activity == DiscordActivity.Thread_msg:
+            field_names.append("thr_messages")
+        elif activity == DiscordActivity.Lone_msg:
+            field_names.append("lone_messages")
         else:
             logging.warning("prepare_interaction_field_names: Wrong activity given!")
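
Reviewer note: the `if`/`elif` chain in `prepare_interaction_field_names` is effectively a lookup from activity to heatmaps field name, so it could also be expressed as a table. This is only a sketch under stated assumptions: `ActivityKind`, `ACTIVITY_FIELD_NAMES` and `field_names_for` are hypothetical names, and the enum is a stand-in because the definition of `DiscordActivity` is not part of this diff. The field name strings are the ones used in the diff.

```python
import logging
from enum import Enum


class ActivityKind(Enum):
    """Hypothetical stand-in for DiscordActivity; the values are assumptions."""

    Mention = "mention"
    Reply = "reply"
    Reaction = "reaction"
    Lone_msg = "lone_msg"
    Thread_msg = "thread_msg"


# Field names copied from the diff; the mapping itself is the sketch.
ACTIVITY_FIELD_NAMES = {
    ActivityKind.Mention: "mentioner_per_acc",
    ActivityKind.Reply: "replied_per_acc",
    ActivityKind.Reaction: "reacted_per_acc",
    ActivityKind.Thread_msg: "thr_messages",
    ActivityKind.Lone_msg: "lone_messages",
}


def field_names_for(activities: list) -> list[str]:
    """Translate activities to field names, warning on unknown entries."""
    field_names = []
    for activity in activities:
        field = ACTIVITY_FIELD_NAMES.get(activity)
        if field is None:
            # Same behaviour as the else branch in the diff: warn and skip.
            logging.warning("field_names_for: unknown activity %r", activity)
            continue
        field_names.append(field)
    return field_names


print(field_names_for([ActivityKind.Reply, ActivityKind.Lone_msg]))
```

With a table, adding a new activity type becomes a one-line change, and the warning branch stays in one place.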

76 changes: 75 additions & 1 deletion discord_analyzer/analysis/utils/member_activity_utils.py
@@ -3,7 +3,13 @@

 import numpy as np
 import pymongo
+from discord_analyzer.analysis.compute_interaction_matrix_discord import (
+    compute_interaction_matrix_discord,
+)
 from discord_analyzer.DB_operations.mongodb_access import DB_access
+from networkx import DiGraph
+from tc_core_analyzer_lib.assess_engagement import EngagementAssessment
+from tc_core_analyzer_lib.utils.activity import DiscordActivity


 def get_joined_accounts(db_access: DB_access, date_range: tuple[datetime, datetime]):
@@ -41,7 +47,7 @@ def store_based_date(
     analytics_day_range,
     joined_acc_dict,
     load_past,
-    **kwargs
+    **kwargs,
 ):
     """
     store the activities (`all_*`) in a dictionary based on their ending analytics date
@@ -249,3 +255,71 @@ def get_latest_joined_users(db_access: DB_access, count: int = 5) -> list[str]:
     usersId = list(map(lambda x: x["discordId"], usersId))

     return usersId
+
+
+def assess_engagement(
+    w_i: int,
+    accounts: list[str],
+    action_params: dict[str, int],
+    period_size: int,
+    db_access: DB_access,
+    channels: list[str],
+    analyze_dates: list[str],
+    activities_name: list[str],
+    activity_dict: dict[str, dict],
+    **kwargs,
+) -> tuple[DiGraph, dict[str, dict]]:
+    """
+    assess engagement of a window index for users
+
+    """
+    activities_to_analyze = kwargs.get(
+        "activities_to_analyze",
+        [
+            DiscordActivity.Mention,
+            DiscordActivity.Reply,
+            DiscordActivity.Reaction,
+            DiscordActivity.Lone_msg,
+            DiscordActivity.Thread_msg,
+        ],
+    )
+    ignore_axis0 = kwargs.get(
+        "ignore_axis0",
+        [
+            DiscordActivity.Mention,
+        ],
+    )
+    ignore_axis1 = kwargs.get(
+        "ignore_axis1",
+        [
+            DiscordActivity.Reply,
+            DiscordActivity.Reaction,
+        ],
+    )
+
+    assess_engagment = EngagementAssessment(
+        activities=activities_to_analyze,
+        activities_ignore_0_axis=ignore_axis0,
+        activities_ignore_1_axis=ignore_axis1,
+    )
+    # obtain interaction matrix
+    int_mat = compute_interaction_matrix_discord(
+        accounts,
+        analyze_dates,
+        channels,
+        db_access,
+        activities=activities_to_analyze,
+    )
+
+    # assess engagement
+    (graph_out, *activity_dict) = assess_engagment.compute(
+        int_mat=int_mat,
+        w_i=w_i,
+        acc_names=np.asarray(accounts),
+        act_param=action_params,
+        WINDOW_D=period_size,
+        **activity_dict,
+    )
+
+    activity_dict = convert_to_dict(data=list(activity_dict), dict_keys=activities_name)
+    return graph_out, activity_dict
Comment on lines +261 to +325

Refactored assess_engagement function to use new parameters and structures.

Consider simplifying the function by breaking it down into smaller, more manageable functions. This can improve readability and maintainability.
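
One possible way to act on this suggestion is to pull the keyword-default handling out of `assess_engagement` into its own small function. The sketch below is an assumption-laden illustration rather than the project's code: `ActivityKind` is a hypothetical stand-in for `DiscordActivity`, and `EngagementOptions` and `resolve_engagement_options` are invented names. Only the default values mirror what the diff reads through `kwargs.get(...)`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ActivityKind(Enum):
    """Hypothetical stand-in for DiscordActivity; the values are assumptions."""

    Mention = "mention"
    Reply = "reply"
    Reaction = "reaction"
    Lone_msg = "lone_msg"
    Thread_msg = "thread_msg"


@dataclass(frozen=True)
class EngagementOptions:
    """Resolved keyword options for one engagement assessment (hypothetical)."""

    activities_to_analyze: tuple
    ignore_axis0: tuple
    ignore_axis1: tuple


def resolve_engagement_options(**kwargs: Any) -> EngagementOptions:
    """Apply the same defaults that the diff applies via kwargs.get(...)."""
    return EngagementOptions(
        # Enum iteration follows definition order, matching the diff's default list.
        activities_to_analyze=tuple(
            kwargs.get("activities_to_analyze", list(ActivityKind))
        ),
        ignore_axis0=tuple(kwargs.get("ignore_axis0", [ActivityKind.Mention])),
        ignore_axis1=tuple(
            kwargs.get("ignore_axis1", [ActivityKind.Reply, ActivityKind.Reaction])
        ),
    )


defaults = resolve_engagement_options()
custom = resolve_engagement_options(ignore_axis1=[])
print(defaults.ignore_axis0, custom.ignore_axis1)
```

With the options resolved separately, the body of `assess_engagement` would be left with three steps: build the `EngagementAssessment`, compute the interaction matrix, and convert the result with `convert_to_dict`. The defaults could then be unit-tested without a database connection.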
