
process events from Redis streams #201

Open
elfkuzco wants to merge 6 commits into main from mill-streams-processor

Conversation

@elfkuzco
Contributor

@elfkuzco elfkuzco commented Mar 5, 2026

Rationale

This PR uses Redis Streams and consumer groups to publish an event whenever a title is created or modified. A processor picks up these events, acknowledges them, and attaches the books that match the title to the new title.

Changes

  • publish a stream event when a title is created or modified
  • add a processor to read, process and acknowledge events. Redis Streams and consumer groups let us recover from failures/crashes and re-process messages that have not been acknowledged.
  • add an env variable for setting the name of the consumer, so that several instances can be spun up with different consumer names. This is also why the streams processor is a standalone script rather than another background task.
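
As a rough illustration of the publish / read / acknowledge cycle described in the list above, here is a minimal sketch. It is not the code in this PR: the stream name, group name, field names and function names are all hypothetical, and `client` is assumed to be a `redis.Redis` instance from redis-py whose consumer group already exists (for example, created with `xgroup_create(..., mkstream=True)`).

```python
from typing import Any, Callable, Dict

STREAM = "title-events"      # hypothetical stream name
GROUP = "title-processors"   # hypothetical consumer group name


def publish_title_event(client: Any, title_id: str, action: str) -> Any:
    """Append one event to the stream (XADD); Redis assigns the entry ID."""
    return client.xadd(STREAM, {"title_id": title_id, "action": action})


def process_batch(
    client: Any,
    consumer: str,
    handler: Callable[[Dict[str, str]], None],
    start_id: str = ">",
    count: int = 10,
) -> int:
    """Read entries for this consumer, handle each one, then acknowledge it.

    start_id ">" asks for entries never delivered to the group; "0" replays
    entries already delivered to this consumer but not yet acknowledged,
    which is how work can be resumed after a crash.
    """
    response = client.xreadgroup(GROUP, consumer, {STREAM: start_id}, count=count)
    handled = 0
    for _stream_name, entries in response or []:
        for entry_id, fields in entries:
            handler(fields)
            # Acknowledge only after the handler returned without raising,
            # so a failed entry stays pending and can be retried.
            client.xack(STREAM, GROUP, entry_id)
            handled += 1
    return handled
```

A processor instance would typically call `process_batch` once with `start_id="0"` at startup to drain its own pending entries, then loop with `">"`; the consumer name passed in is what the env variable mentioned above would supply.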

@elfkuzco elfkuzco self-assigned this Mar 5, 2026
@elfkuzco elfkuzco requested a review from benoit74 March 5, 2026 02:48
@benoit74
Contributor

benoit74 commented Mar 5, 2026

Looking at the code, I'm not at all comfortable deploying this significant architecture change without a proper discussion with the rest of the team. An inappropriate choice might lead to production issues that are complex to handle.

I've opened openzim/overview#77 to discuss this properly before diving into this PR's implementation details.

Since this discussion might take time, and since we probably need the feature to reprocess books without titles in the CMS sooner than the central discussion will settle, I would like us to move forward now with a "simpler" plan B. I propose that you implement this as a periodic (every minute by default) reprocessing of books without titles. The logic should first check whether there is now a matching title and, if so, trigger the usual process_book function. This is not really elegant or efficient, but it is easier for me to move forward with. WDYT?
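
For illustration only, plan B could look roughly like the sketch below. Apart from `process_book`, which is named above (its signature here is assumed), every name is hypothetical: `find_matching_title` stands in for whatever lookup the CMS already has, and `books` for the query returning books without titles.

```python
import time
from typing import Callable, Iterable, Optional, TypeVar

Book = TypeVar("Book")


def reprocess_books_without_titles(
    books: Iterable[Book],
    find_matching_title: Callable[[Book], Optional[object]],
    process_book: Callable[[Book], None],
) -> int:
    """Run process_book on every title-less book that now has a matching title."""
    reprocessed = 0
    for book in books:
        # Check first whether a matching title exists now; skip the book otherwise.
        if find_matching_title(book) is not None:
            process_book(book)
            reprocessed += 1
    return reprocessed


def run_periodically(
    task: Callable[[], object],
    interval_seconds: float = 60.0,  # "every minute by default"
    sleep: Callable[[float], None] = time.sleep,
    max_runs: Optional[int] = None,  # None means run forever
) -> int:
    """Call task, wait interval_seconds, and repeat."""
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        sleep(interval_seconds)
    return runs
```

The `sleep` and `max_runs` parameters exist only so the loop can be exercised in tests without waiting; an existing background-task scheduler would replace `run_periodically` entirely.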

@elfkuzco
Contributor Author

elfkuzco commented Mar 5, 2026

Okay. My initial plan was to create an Event table and insert an event whenever a title is created or modified. The background task would notice the event the next time it runs, act on it, and then delete processed events from the DB so the table doesn't fill up with useless entries. WDYT of this approach?
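
To make the Event table idea concrete, here is a small sketch using the standard-library `sqlite3` module purely for illustration. The table layout, column names and function names are assumptions, and the real CMS database and ORM would differ.

```python
import sqlite3
from typing import Callable


def create_event_table(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) event table if it does not exist yet."""
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS event ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "kind TEXT NOT NULL, "
            "title_id TEXT NOT NULL)"
        )


def record_title_event(conn: sqlite3.Connection, kind: str, title_id: str) -> None:
    """Insert one event; called when a title is created or modified."""
    with conn:
        conn.execute(
            "INSERT INTO event (kind, title_id) VALUES (?, ?)", (kind, title_id)
        )


def process_pending_events(
    conn: sqlite3.Connection, handler: Callable[[str, str], None]
) -> int:
    """Handle every stored event in insertion order, deleting each once handled."""
    rows = conn.execute("SELECT id, kind, title_id FROM event ORDER BY id").fetchall()
    for event_id, kind, title_id in rows:
        handler(kind, title_id)
        # Delete only after the handler succeeded, so a failure leaves the
        # event in the table for the next run of the background task.
        with conn:
            conn.execute("DELETE FROM event WHERE id = ?", (event_id,))
    return len(rows)
```

Deleting each row only after its handler returns means a crash mid-run leaves unprocessed events in place, at the cost of possibly handling an event twice if the crash happens between the handler and the delete.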

@benoit74
Contributor

benoit74 commented Mar 5, 2026

I'm fine with this Event table as well.

2 participants