feat: populate the last-known-event table for an email from past events #896

Open
wants to merge 1 commit into
base: 893-enregistrer-le-dernier-evenement-pour-un-email-dans-emaillastseen


@vincentporte vincentporte commented Jan 28, 2025

Description

🎸 Collect User.last_login, Event, DSP, UpVote, ForumRating, anonymous and authenticated Post, and visited Notification records (#891)
🎸 Deduplicate emails, keeping the most recent event
🎸 Skip emails already recorded in EmailLastSeen, on the assumption that the EmailLastSeen record is the most recent
🎸 Insert the result into EmailLastSeen
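
The deduplication and filtering steps listed above can be sketched in plain Python. This is a minimal illustration, not the PR's code: the bodies of `deduplicate` and `remove_known_last_seen` are not shown in this conversation, so the versions below are hypothetical stand-ins. In particular, the `known_emails` parameter is an assumption (the PR's function takes a single argument and presumably queries EmailLastSeen itself), and the kind strings are placeholders.

```python
from datetime import datetime


def deduplicate(last_seen):
    """Return {email: (seen_at, kind)}, keeping the most recent event per email."""
    latest = {}
    for email, seen_at, kind in last_seen:
        current = latest.get(email)
        if current is None or seen_at > current[0]:
            latest[email] = (seen_at, kind)
    return latest


def remove_known_last_seen(latest, known_emails):
    """Drop emails that already have a row (passed in here as a plain set, an assumption)."""
    return {email: value for email, value in latest.items() if email not in known_emails}


# Placeholder rows: (email, timestamp, kind). The kind strings are illustrative only.
rows = [
    ("a@example.com", datetime(2025, 1, 10), "LOGGED"),
    ("a@example.com", datetime(2025, 1, 20), "POST"),
    ("b@example.com", datetime(2025, 1, 5), "UPVOTE"),
]

latest = deduplicate(rows)
to_insert = remove_known_last_seen(latest, known_emails={"b@example.com"})
print(to_insert)
```

Only the newest event for each email survives deduplication, and any email already known is then dropped before insertion, which matches the stated assumption that an existing EmailLastSeen record is the most recent one.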

Type of change

🚧 technical

Points of attention

🦺 test_collect_clicked_notifs fails by design until #891 lands
🦺 prerequisites: #892 & #894

Simulation on data from January 27, 2025

$ python manage.py populate_emaillastseen

users logged in: collected 18757
events: collected 19067
DSP: collected 21938
UpVotes: collected 22770
forum ratings: collected 22994
posts: collected 27634
collect_clicked_notifs: pending #891
clicked notifications: collected 27634
deduplication: 19687
remove known last seen: 19687
insert last seen: 19687
that's all folks!

@vincentporte vincentporte added the python Pull requests that update Python code label Jan 28, 2025
@vincentporte vincentporte self-assigned this Jan 28, 2025
Comment on lines +60 to +65
.values_list("poster__email", "created", "kind")
)
qs_anonymous = (
Post.objects.filter(poster=None)
.annotate(kind=Value(EmailLastSeenKind.POST))
.values_list("username", "created", "kind")

Comment on lines +94 to +125
last_seen = collect_users_logged_in()
sys.stdout.write(f"users logged in: collected {len(last_seen)}\n")

last_seen += collect_event()
sys.stdout.write(f"events: collected {len(last_seen)}\n")

last_seen += collect_DSP()
sys.stdout.write(f"DSP: collected {len(last_seen)}\n")

last_seen += collect_upvote()
sys.stdout.write(f"UpVotes: collected {len(last_seen)}\n")

last_seen += collect_forum_rating()
sys.stdout.write(f"forum ratings: collected {len(last_seen)}\n")

last_seen += collect_post()
sys.stdout.write(f"posts: collected {len(last_seen)}\n")

last_seen += collect_clicked_notifs()
sys.stdout.write(f"clicked notifications: collected {len(last_seen)}\n")

dedup_last_seen_dict = deduplicate(last_seen)
sys.stdout.write(f"deduplication: {len(dedup_last_seen_dict)}\n")

dedup_last_seen_dict = remove_known_last_seen(dedup_last_seen_dict)
sys.stdout.write(f"remove known last seen: {len(dedup_last_seen_dict)}\n")

res = insert_last_seen(dedup_last_seen_dict)
sys.stdout.write(f"insert last seen: {len(res)}\n")

sys.stdout.write("that's all folks!\n")
sys.stdout.flush()

Loading everything at once is brutal memory-wise, but given the size of the community 🤷

I would rather have used the user identifier as the key of a dict[username, namedtuple(last_seen, kind)] and iterated over the items incrementally.
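
A rough sketch of what this suggestion could look like, under stated assumptions: `LastSeen`, `merge_latest`, `fake_logins` and `fake_posts` are hypothetical names, the generators stand in for the PR's `collect_*` functions, and the kind strings are placeholders. The idea is to fold rows into the dict one at a time instead of concatenating every source into one list first.

```python
from collections import namedtuple
from datetime import datetime
from itertools import chain

# Hypothetical record type matching the namedtuple(last_seen, kind) idea.
LastSeen = namedtuple("LastSeen", ["last_seen", "kind"])


def merge_latest(rows, latest=None):
    """Fold (key, seen_at, kind) rows into a dict, keeping the newest entry per key."""
    latest = {} if latest is None else latest
    for key, seen_at, kind in rows:
        current = latest.get(key)
        if current is None or seen_at > current.last_seen:
            latest[key] = LastSeen(seen_at, kind)
    return latest


# Stand-in sources that yield rows lazily instead of building full lists.
def fake_logins():
    yield ("alice", datetime(2025, 1, 3), "LOGGED")
    yield ("bob", datetime(2025, 1, 8), "LOGGED")


def fake_posts():
    yield ("alice", datetime(2025, 1, 15), "POST")
    yield ("bob", datetime(2025, 1, 2), "POST")


latest = merge_latest(chain(fake_logins(), fake_posts()))
print(latest)
```

With this shape, memory grows with the number of distinct keys rather than with the total number of collected rows. In a Django setting, each source could plausibly stream rows with `QuerySet.iterator()`, though whether that fits the PR's collectors is not something this conversation confirms.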
