feat(streaming): add redis pubsub in NotifBrokerActor, supports data AES256 encryption through the channel
wildonion · Sep 19, 2024 · 08742ac
1 parent 8cf0203
Showing 11 changed files with 2,364 additions and 66 deletions.
diff --git a/README.md b/README.md
@@ -4,7 +4,7 @@
 
 ## ᝰ.ᐟ What am i?
 
-i'm hoopoe, the social event platform allows your hoop get heard!
+i'm hoopoe, a realtime social event platform that lets your hoop get heard!
 
 ## Execution flow & system design?
 
@@ -19,7 +19,7 @@ i'm hoopoe, the social event platform allows your hoop get heard!
 
 - **step3)** instance of `NotifData` is cached on redis and stored in db.
 
-- **step4)** client invokes `/notif/get/owner/` api to get its notification during the app execution in a short polling manner.
+- **step4)** client invokes `/notif/get/owner/` api to get its notification during the app execution in a short polling manner or through ws streaming.
 
 ```
    ------------------ server1/node1 actor -----------------                                         ___________

diff --git a/src/apis/v1/ws/notif.rs b/src/apis/v1/ws/notif.rs
@@ -21,7 +21,9 @@ use crate::*;
     the notif broker however stores data on redis and db allows 
     the client to fetch notifs for an owner in a short polling manner
     this way is used to fetch all notifs for an owern in realtime as
-    they're receiving by the RMQ consumer.
+    they're received by the RMQ consumer. a jobq mpsc channel is used
+    to forward each notif received through RMQ, and we receive it in
+    here using the rx side of that mpsc channel. 
     addr: localhost:2344/v1/stream/notif/consume/?owner=100&room=notif_room
     owner is the notification owner which must be equal to the `receiver_info`
     field inside the notif_data instance received by the consumer. 

diff --git a/src/interfaces/crypter.rs b/src/interfaces/crypter.rs
@@ -34,13 +34,13 @@ use crate::*;
     the async keywords.
 */
 pub trait Crypter{
-    fn encrypt(&self, secure_cell_config: &mut SecureCellConfig);
-    fn decrypt(&self, secure_cell_config: &mut SecureCellConfig);
+    fn encrypt(&mut self, secure_cell_config: &mut SecureCellConfig);
+    fn decrypt(&mut self, secure_cell_config: &mut SecureCellConfig);
 }
 
 // used for en(de)crypting image in form of Vec<u8> slice or &[u8]
 impl Crypter for &[u8]{
-    fn encrypt(&self, secure_cell_config: &mut wallexerr::misc::SecureCellConfig){
+    fn encrypt(&mut self, secure_cell_config: &mut wallexerr::misc::SecureCellConfig){
         match wallexerr::misc::Wallet::secure_cell_encrypt(secure_cell_config){ // passing the redis secure_cell_config instance
             Ok(data) => {
                 secure_cell_config.data = data
@@ -64,7 +64,7 @@ impl Crypter for &[u8]{
         };
     }
 
-    fn decrypt(&self, secure_cell_config: &mut wallexerr::misc::SecureCellConfig){
+    fn decrypt(&mut self, secure_cell_config: &mut wallexerr::misc::SecureCellConfig){
         match wallexerr::misc::Wallet::secure_cell_decrypt(secure_cell_config){
             Ok(encrypted) => {
 
@@ -101,7 +101,7 @@ impl Crypter for &[u8]{
 
 // used for en(de)crypting image in form of Vec<u8>
 impl Crypter for Vec<u8>{
-    fn encrypt(&self, secure_cell_config: &mut wallexerr::misc::SecureCellConfig){
+    fn encrypt(&mut self, secure_cell_config: &mut wallexerr::misc::SecureCellConfig){
         match wallexerr::misc::Wallet::secure_cell_encrypt(secure_cell_config){ // passing the redis secure_cell_config instance
             Ok(data) => {
                 secure_cell_config.data = data
@@ -125,7 +125,7 @@ impl Crypter for Vec<u8>{
         };
     }
 
-    fn decrypt(&self, secure_cell_config: &mut wallexerr::misc::SecureCellConfig){
+    fn decrypt(&mut self, secure_cell_config: &mut wallexerr::misc::SecureCellConfig){
         match wallexerr::misc::Wallet::secure_cell_decrypt(secure_cell_config){
             Ok(encrypted) => {
 
@@ -161,9 +161,18 @@ impl Crypter for Vec<u8>{
 
 // used for en(de)crypting data in form of string
 impl Crypter for String{
-    fn decrypt(&self, secure_cell_config: &mut SecureCellConfig){
+    fn decrypt(&mut self, secure_cell_config: &mut SecureCellConfig){
+
+        // encrypt converts the raw string into a hex encoded ciphertext,
+        // thus calling the decrypt method on the hex string returns the 
+        // raw string
+        secure_cell_config.data = hex::decode(&self).unwrap();
         match Wallet::secure_cell_decrypt(secure_cell_config){ // passing the redis secure_cell_config instance
             Ok(data) => {
+
+                // update self with the decrypted bytes decoded as a utf8 string
+                *self = std::str::from_utf8(&data).unwrap().to_string();
+
                 secure_cell_config.data = data
             },
             Err(e) => {
@@ -185,11 +194,19 @@ impl Crypter for String{
         };
 
     }
-    fn encrypt(&self, secure_cell_config: &mut SecureCellConfig){
-       match Wallet::secure_cell_encrypt(secure_cell_config){
+    fn encrypt(&mut self, secure_cell_config: &mut SecureCellConfig){
+
+        // use the self as the input data to be encrypted
+        secure_cell_config.data = self.as_bytes().to_vec();
+
+        match Wallet::secure_cell_encrypt(secure_cell_config){
             Ok(encrypted) => {
 
                 let stringified_data = hex::encode(&encrypted);
+
+                // update the self or the string with the hex encrypted data
+                *self = stringified_data;
+
                 // update the data field with the encrypted content bytes
                 secure_cell_config.data = encrypted; 
 

diff --git a/src/server/mod.rs b/src/server/mod.rs
@@ -177,8 +177,9 @@ impl HoopoeServer{
         let connection = sea_orm::Database::connect(
             db_url
         ).await.unwrap();
-        let fresh = args.fresh;
+
         // migration process at runtime
+        let fresh = args.fresh;
         // if fresh{
         //     log::info!("fresh db...");
         //     Migrator::fresh(connection).await.unwrap();

diff --git a/src/workers.rs b/src/workers.rs
@@ -16,4 +16,5 @@
 pub mod cqrs; // cqrs actor components
 pub mod notif; // broker actor component
 pub mod zerlog; // zerlog actor component
-pub mod scheduler; // hoop scheduler actor component
+pub mod scheduler; // hoop scheduler actor component
+pub mod actor;
diff --git a/src/workers/Kafka.md b/src/workers/Kafka.md
@@ -0,0 +1,112 @@
+
+### **1. Kafka Topics**
+A **topic** in Kafka is a logical channel to which producers send messages and from which consumers read messages. Think of a topic as a named category or feed: each message published to Kafka is categorized under a topic. It's like an exchange in RMQ, in that the topic is where all the messages get collected.
+
+- **Example**: If you are working with a log aggregation system, you might have topics like `application_logs`, `error_logs`, and `event_logs`.
+
+- **Durability**: Kafka topics store data for a specified retention period, even after messages are consumed. This makes it possible to replay or reprocess the data. RMQ, by contrast, removes a message from the queue once the consumer has received it, which forces us to use a separate queue for each consumer.
+
+### **2. Partitions**
+A **topic** is divided into multiple **partitions** to enable parallelism and scalability.
+
+- **Partitions** allow Kafka to scale horizontally, meaning that more partitions allow more consumers to consume in parallel, leading to higher throughput.
+
+- **Message Order**: Within a single partition, Kafka guarantees the order of messages (i.e., messages are read in the order they are written). However, across multiple partitions, Kafka doesn't guarantee message ordering.
+
+- **Partition Key**: When producing messages, you can specify a **partition key** to control which partition the message is routed to. If no key is specified, Kafka spreads messages across partitions itself (round-robin in older clients; newer clients fill a batch for one partition before moving on to the next). It's like a routing key in RMQ.
+
+- **Example**: A topic `sensor_readings` could have 10 partitions. If you send messages with the same sensor ID as a partition key, all readings from that sensor would go to the same partition, maintaining their order.
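
The key-to-partition routing described above can be sketched in a few lines of Rust. This is a minimal illustration, not Kafka's actual partitioner: the function name `partition_for_key` is hypothetical, and `DefaultHasher` from the standard library stands in for whatever hash function a real Kafka client uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Maps a message key to a partition index in `0..num_partitions`.
/// Hypothetical helper: real Kafka clients use their own hash functions,
/// `DefaultHasher` is only a stand-in for illustration.
fn partition_for_key(key: &str, num_partitions: u32) -> u32 {
    assert!(num_partitions > 0, "a topic needs at least one partition");
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    // The same key always produces the same hash, hence the same partition.
    (hasher.finish() % u64::from(num_partitions)) as u32
}
```

Calling `partition_for_key("sensor-42", 10)` twice returns the same index, which is why all readings from one sensor land in one partition and keep their order.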
+
+### **3. Messages**
+A **message** in Kafka is the basic unit of data. It consists of:
+   - **Key** (optional): Helps to determine which partition the message is routed to.
+   - **Value**: The actual data (the payload) that is being sent (e.g., a JSON object, string, etc.).
+   - **Timestamp**: When the message was created.
+
+   Kafka **messages** are stored in **topics** and are read by consumers.
+
+- **Message Structure**: Messages can be serialized in different formats (e.g., JSON, Avro, Protobuf), depending on how the data is intended to be consumed.
+
+### **4. Batch Sending**
+To optimize performance, Kafka producers can send messages in **batches** rather than individually.
+
+- **Batching** reduces the number of network requests, as multiple messages are sent in a single request. This leads to higher throughput and better utilization of Kafka brokers.
+
+- Producers buffer messages in memory and send them as a batch when either:
+  - A certain batch size limit is reached.
+  - A certain time limit is exceeded.
+
+- **Trade-off**: Sending in batches reduces network overhead but can increase latency, as messages are delayed in memory until the batch is full or the time limit is reached.
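
The two flush conditions above (size limit and time limit) can be sketched as a small in-memory buffer. This is an illustrative model under assumptions, not a real producer: `BatchBuffer`, `max_messages`, and `max_wait` are hypothetical names, and the caller passes the current time explicitly so the behavior is easy to follow.

```rust
use std::time::{Duration, Instant};

/// Hypothetical in-memory batch buffer modeling a producer's batching.
struct BatchBuffer {
    pending: Vec<String>,
    max_messages: usize,
    max_wait: Duration,
    /// Arrival time of the oldest message still waiting in `pending`.
    oldest: Option<Instant>,
}

impl BatchBuffer {
    fn new(max_messages: usize, max_wait: Duration) -> Self {
        Self { pending: Vec::new(), max_messages, max_wait, oldest: None }
    }

    /// Buffers one message and returns the batch if a limit was reached.
    fn push(&mut self, msg: String, now: Instant) -> Option<Vec<String>> {
        if self.pending.is_empty() {
            self.oldest = Some(now);
        }
        self.pending.push(msg);
        self.flush_if_due(now)
    }

    /// Returns the buffered batch when it is full or has waited too long.
    fn flush_if_due(&mut self, now: Instant) -> Option<Vec<String>> {
        if self.pending.is_empty() {
            return None;
        }
        let full = self.pending.len() >= self.max_messages;
        let timed_out = self
            .oldest
            .map_or(false, |t| now.saturating_duration_since(t) >= self.max_wait);
        if full || timed_out {
            self.oldest = None;
            Some(std::mem::take(&mut self.pending))
        } else {
            None
        }
    }
}
```

With `BatchBuffer::new(3, Duration::from_millis(100))`, the third `push` returns a batch of three messages, while a single buffered message is only released once `flush_if_due` is called at least 100 ms after it arrived. That waiting time is the latency cost mentioned in the trade-off.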
+
+### **5. Offset**
+An **offset** is a unique identifier that Kafka assigns to each message within a partition. It indicates the position of a message within that partition and allows consumers to resume consuming from where they left off.
+
+- **Message Offset**: Kafka keeps track of each message using its offset within the partition. Each partition has its own sequence of offsets starting from `0` and increasing as more messages are produced.
+
+- **Consumer Offset**: Consumers use offsets to track which messages have been read. When a consumer reads a message from a partition, it can store the offset to know where to resume if it needs to continue later (e.g., after a crash or restart).
+
+- **Offset Example**: In a partition, the first message might have offset `0`, the next one `1`, and so on.
+
+### **6. Single Consumers vs. Consumer Groups**
+Kafka's flexibility comes from how it handles consumers, and it supports two main models: **single consumers** and **consumer groups**.
+
+#### **Single Consumer**
+A **single consumer** consumes data from one or more partitions of a topic. Each partition is assigned to one consumer, and only one consumer processes messages from each partition.
+
+- **Example**: If a topic has 3 partitions and you have 1 consumer, that consumer will read from all 3 partitions sequentially.
+
+#### **Consumer Groups**
+A **consumer group** is a group of consumers that work together to consume messages from a topic. Kafka ensures that each partition is consumed by **only one consumer within the group**.
+
+- **Parallel Processing**: Kafka distributes partitions among consumers in the group, allowing messages to be processed in parallel.
+
+- **Example**: If you have a topic with 4 partitions and 2 consumers in the same group, Kafka will assign 2 partitions to each consumer. If a consumer crashes, Kafka will rebalance and reassign the partitions to the remaining consumers.
+
+- **Multiple Consumer Groups**: Multiple consumer groups can consume the same topic independently. Each group maintains its own offsets, so they don't interfere with one another.
+  - **Example**: You could have two different consumer groups, one for real-time processing (group A) and another for batch processing (group B), both consuming from the same topic but handling the data in different ways.
+
+- Consumer groups are like multiple queues bound to multiple exchanges, each of which receives the related messages from those exchanges, e.g. consumer1 binds its queue to exchange1 and exchange2.
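
The partition distribution within a group can be sketched as a simple round-robin assignment. This is only a model under assumptions: `assign_partitions` is a hypothetical function, and Kafka's real assignors follow their own strategies, but the invariant shown (each partition has exactly one owner in the group) is the one described above.

```rust
use std::collections::BTreeMap;

/// Hypothetical round-robin assignment of partitions to the consumers
/// of one group. Every partition gets exactly one owner.
fn assign_partitions(num_partitions: u32, consumers: &[&str]) -> BTreeMap<String, Vec<u32>> {
    // Start every consumer with an empty assignment.
    let mut assignment: BTreeMap<String, Vec<u32>> = consumers
        .iter()
        .map(|c| (c.to_string(), Vec::new()))
        .collect();
    if consumers.is_empty() {
        return assignment;
    }
    for partition in 0..num_partitions {
        let owner = consumers[partition as usize % consumers.len()];
        assignment
            .get_mut(owner)
            .expect("owner was inserted above")
            .push(partition);
    }
    assignment
}
```

For 4 partitions and consumers `["c1", "c2"]`, this gives `c1` the partitions `[0, 2]` and `c2` the partitions `[1, 3]`. Re-running it with only `["c1"]` models a rebalance after `c2` crashes: `c1` then owns all four partitions.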
+
+### **7. Committing Messages**
+Committing a message in Kafka means recording that the message has been successfully processed by a consumer, like an ack in RMQ, which allows Kafka to manage offsets properly.
+
+- **Auto-Commit**: By default, Kafka can automatically commit offsets periodically, meaning that the consumer keeps track of the last message it read. However, this can lead to issues if the consumer crashes after reading a message but before processing it.
+
+- **Manual Commit**: With **manual commit**, consumers can explicitly commit offsets after processing each message (or batch of messages). This gives consumers control over when to mark a message as processed, ensuring better fault tolerance.
+
+  - **Example**: Suppose a consumer reads a message, processes it, and then manually commits the offset for that message. If the consumer crashes before committing the offset, it will re-read the message upon recovery.
+
+- **Offset Committing in Consumer Groups**: Each consumer in a group commits offsets independently. If a consumer fails, the new consumer taking over the partition will resume from the last committed offset.
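
The manual-commit behavior above can be modeled without a broker. In this sketch a partition is just a slice of strings and `Consumer` is a hypothetical type whose `committed` field holds the next offset to read; it is an illustration of the replay semantics, not a Kafka client API.

```rust
/// Hypothetical consumer state: `committed` is the next offset to read,
/// i.e. everything before it has been processed and committed.
struct Consumer {
    committed: usize,
}

impl Consumer {
    /// Processes messages starting at the committed offset and commits
    /// after each success. If the handler fails, the offset of the failed
    /// message is not committed, so the next call reads it again.
    fn poll_and_process<F>(&mut self, log: &[String], mut handler: F) -> Result<(), String>
    where
        F: FnMut(usize, &str) -> Result<(), String>,
    {
        for offset in self.committed..log.len() {
            handler(offset, &log[offset])?;
            // Manual commit: only reached when processing succeeded.
            self.committed = offset + 1;
        }
        Ok(())
    }
}
```

If the handler fails on offset `1` of a three-message partition, `committed` stays at `1`, and the next call starts from offset `1` again. The message at offset `1` is therefore seen twice, which is the re-read described in the example above.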
+
+---
+
+### **How These Components Work Together in Kafka**:
+
+1. **Topic Creation**:
+   - Let's say you have a Kafka topic `user_activity_logs` with 6 partitions.
+
+2. **Producer Sends Messages**:
+   - A producer sends log messages (user interactions) to Kafka. If the producer specifies a **key** (e.g., `userID`), Kafka uses the key to route messages to specific partitions. Messages are sent in **batches** to reduce network overhead.
+
+3. **Consumers in a Group**:
+   - You have 3 consumers (in **consumer group A**) reading from `user_activity_logs`. Kafka assigns each consumer 2 partitions to read from (since the topic has 6 partitions and there are 3 consumers).
+
+4. **Message Offset Tracking**:
+   - Each message in the partition is assigned an **offset** (e.g., 0, 1, 2...). Consumers track these offsets so they know where to resume in case of failure.
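+   - The consumers can **commit offsets** manually or let Kafka handle it automatically, so that after a restart they resume from the last committed offset. Note that a message can still be processed more than once if a consumer crashes after processing it but before committing its offset.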
+
+5. **Rebalancing and Scaling**:
+   - If a new consumer is added to the group, Kafka automatically **rebalances** the partitions among all consumers, ensuring that no two consumers in the group read from the same partition. Conversely, if a consumer crashes, Kafka redistributes its partitions to the remaining consumers.
+
+---
+
+### **Summary**:
+- **Kafka topics** are high-level channels for organizing data.
+- **Partitions** allow parallelism by splitting a topic into substreams.
+- **Messages** are the individual data units that are published and consumed.
+- **Batch sending** improves producer performance by sending multiple messages in a single network call.
+- **Offsets** track message positions within partitions, both for Kafka's storage and consumers' progress.
+- **Single consumers** read data from specific partitions, while **consumer groups** allow partitioned, parallel processing.
+- **Committing messages** is how Kafka ensures data processing reliability by keeping track of the last successfully processed message per consumer.
+
+This design enables Kafka to scale, handle large volumes of data, and maintain reliability in data processing systems.
diff --git a/src/workers/README.md → src/workers/Rmq.md b/src/workers/README.md → src/workers/Rmq.md