Log Realtime Analysis is a robust real-time log aggregation and visualization system designed to handle high-throughput logs using a Kafka-Spark ETL pipeline. For example, it can process application logs tracking user requests, error rates, and API response times in real time. It integrates with DynamoDB for real-time metrics storage and visualizes key system insights using Python and the Plotly Dash library. The setup uses Docker for containerized deployment, ensuring seamless development and deployment workflows.
- Log Ingestion: High-throughput log streaming with Kafka.
- Real-Time Aggregation: Spark processes logs per minute for metrics like request counts, error rates, and response times.
- Metrics Storage: Aggregated metrics stored in DynamoDB for fast querying. DynamoDB is optimized for low-latency, high-throughput queries, making it ideal for real-time dashboard applications.
- Data Storage: Historical logs saved in HDFS as Parquet files for long-term analysis.
- Interactive Dashboard: Dash application with real-time updates and SLA metrics visualization.
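For context, the ingestion side publishes JSON log events to Kafka. The actual schema and simulation logic live in `kafka/kafka_producer.py`; the sketch below is only an illustration of the pattern, assuming the `kafka-python` client and made-up field names (`timestamp`, `level`, `api`, `response_time_ms`, `status_code`).

```python
# Illustrative only: field names and values are assumptions, not the schema
# used by kafka/kafka_producer.py. Assumes the kafka-python package.
import json
import time

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

log_event = {
    "timestamp": time.time(),          # event time in epoch seconds
    "level": "ERROR",                  # e.g. INFO / WARN / ERROR
    "api": "/api/v1/bookings",         # endpoint that handled the request
    "response_time_ms": 182,           # observed latency
    "status_code": 500,
}

# Publish one simulated log event to the ingestion topic.
producer.send("logging_info", value=log_event)
producer.flush()
```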
- Log Ingestion with Kafka
  - Input Topic: `logging_info` for real-time log ingestion.
  - Purpose: High-throughput, fault-tolerant log streaming.
- Real-Time Aggregation with Spark (see the sketch after this list)
  - Processing Logic: Aggregates logs per minute for metrics like request counts, error rates, and response times.
  - Output Topic: `agg_logging_info` with structured metrics.
- Downstream Processing
  - DynamoDB: Stores real-time metrics for dashboards with low-latency queries.
  - HDFS: Stores aggregated logs in Parquet format for long-term analysis.
- Visualization with Python Dash
  - Purpose: Auto-refreshing dashboards show live system metrics, request rates, error types, and performance insights.
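The per-minute aggregation itself is implemented in `spark/spark-portfolio.ipynb`. The snippet below is a minimal Structured Streaming sketch of that pattern, assuming the illustrative event fields from the producer example above, a broker on `localhost:9092`, and a local checkpoint path; it is not the notebook's actual code.

```python
# Minimal sketch of per-minute log aggregation with Spark Structured Streaming.
# Field names, checkpoint path, and options are assumptions.
# Requires the spark-sql-kafka-0-10 connector on the Spark classpath.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType, IntegerType, StringType, StructType

spark = SparkSession.builder.appName("log-aggregation").getOrCreate()

schema = (StructType()
          .add("timestamp", DoubleType())
          .add("level", StringType())
          .add("api", StringType())
          .add("response_time_ms", DoubleType())
          .add("status_code", IntegerType()))

raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "localhost:9092")
       .option("subscribe", "logging_info")
       .load())

events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(F.from_json("json", schema).alias("e"))
          .select("e.*")
          .withColumn("event_time", F.col("timestamp").cast("timestamp")))

# Tumbling one-minute window: request counts plus response-time statistics.
agg = (events
       .withWatermark("event_time", "2 minutes")
       .groupBy(F.window("event_time", "1 minute"), F.col("api"))
       .agg(F.count("*").alias("request_count"),
            F.avg("response_time_ms").alias("avg_response_ms"),
            F.max("response_time_ms").alias("max_response_ms"),
            F.min("response_time_ms").alias("min_response_ms")))

# Publish the aggregates as JSON to the downstream topic.
query = (agg.select(F.to_json(F.struct(*agg.columns)).alias("value"))
         .writeStream.format("kafka")
         .option("kafka.bootstrap.servers", "localhost:9092")
         .option("topic", "agg_logging_info")
         .option("checkpointLocation", "/tmp/checkpoints/agg_logging_info")
         .outputMode("update")
         .start())
```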
Docker services (defined in `docker-compose.yml`):
- Zookeeper
  - Image: `bitnami/zookeeper:latest`
  - Ports: `2181:2181`
  - Volume: `${HOST_SHARED_DIR}/zookeeper:/bitnami/zookeeper`
- Kafka
  - Image: `bitnami/kafka:latest`
  - Ports: `9092:9092`, `29092:29092`
  - Volume: `${HOST_SHARED_DIR}/kafka:/bitnami/kafka`
- DynamoDB Local (see the connectivity check after this list)
  - Image: `amazon/dynamodb-local:latest`
  - Ports: `8000:8000`
  - Volume: `${HOST_SHARED_DIR}/dynamodb-local:/data`
- DynamoDB Admin
  - Image: `aaronshaf/dynamodb-admin`
  - Ports: `8001:8001`
- Spark / Jupyter Notebook
  - Image: `jupyter/all-spark-notebook:python-3.11.6`
  - Ports: `8888:8888`, `4040:4040`
  - Volume: `${HOST_SHARED_DIR}/spark-jupyter-data:/home/jovyan/data`
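Once the containers are up, DynamoDB Local is reachable on port 8000 per the mapping above. A quick way to sanity-check it is to point boto3 at that endpoint; the region and credentials below are dummies, since DynamoDB Local does not validate them.

```python
# Quick connectivity check against DynamoDB Local (port 8000 mapped above).
# Credentials and region are dummies; DynamoDB Local accepts any values.
import boto3

dynamodb = boto3.client(
    "dynamodb",
    endpoint_url="http://localhost:8000",
    region_name="us-east-1",
    aws_access_key_id="local",
    aws_secret_access_key="local",
)

# Lists whatever tables exist; a fresh instance simply returns an empty list.
print(dynamodb.list_tables()["TableNames"])
```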
The Python Dash application provides an intuitive interface for monitoring real-time metrics and logs (a minimal sketch of its auto-refresh pattern follows this list). Key features include:
- SLA Gauge: Visualizes the system's SLA percentage.
- Log Level Distribution: Pie chart showing the proportion of different log levels.
- Average Response Time: Bar chart showing average response times by API.
- Top Error-Prone APIs: Table listing the APIs with the highest error counts.
- Log Counts Over Time: Real-time line chart of log counts aggregated by log level.
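The full dashboard lives in `ui/ui-prod.py`; the sketch below only illustrates the auto-refresh pattern behind it, using `dcc.Interval` to rebuild a figure on a timer. The `fetch_sla_percentage` helper and the 10-second interval are assumptions for illustration, not the app's actual code.

```python
# Minimal sketch of the Dash auto-refresh pattern; ui/ui-prod.py is the real app.
import random

import plotly.graph_objects as go
from dash import Dash, Input, Output, dcc, html

app = Dash(__name__)

app.layout = html.Div([
    html.H3("Log Realtime Analysis"),
    dcc.Graph(id="sla-gauge"),
    # Fires every 10 seconds, triggering the callback below.
    dcc.Interval(id="refresh", interval=10_000, n_intervals=0),
])


def fetch_sla_percentage() -> float:
    """Placeholder: the real app would query DynamoDB for the latest metrics."""
    return random.uniform(95, 100)


@app.callback(Output("sla-gauge", "figure"), Input("refresh", "n_intervals"))
def update_sla_gauge(_n):
    # Rebuild the SLA gauge on every tick of the interval component.
    return go.Figure(go.Indicator(
        mode="gauge+number",
        value=fetch_sla_percentage(),
        title={"text": "SLA %"},
        gauge={"axis": {"range": [0, 100]}},
    ))


if __name__ == "__main__":
    app.run(debug=True)  # serves on http://127.0.0.1:8050 by default
```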
- Docker and Docker Compose installed.
- Shared directory setup for volume bindings.
- Replace `${HOST_SHARED_DIR}` with a directory on your host machine.
- Replace `${IP_ADDRESS}` with your host machine's IP address.
- Start the Services:
  `docker-compose up -d`
- Access Jupyter Notebook:
  Open http://localhost:8888, or check the notebook container's logs in Docker for the full URL (including the access token).
- Run the Dash App:
  `python ui/ui-prod.py`, then access the dashboard at http://127.0.0.1:8050.
- Kafka Setup (see the topic-creation sketch below):
  - Create the topics: `python kafka/kafka_topic.py`
  - Simulate and stream logs: `python kafka/kafka_producer.py`
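`kafka/kafka_topic.py` is the repository's topic-creation script. The snippet below is only a sketch of the same idea using `kafka-python`'s admin client; the partition and replication settings are assumptions.

```python
# Sketch of topic creation with kafka-python's admin client; partition and
# replication settings are assumptions, not the values used by kafka_topic.py.
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

topics = [
    NewTopic(name="logging_info", num_partitions=3, replication_factor=1),
    NewTopic(name="agg_logging_info", num_partitions=3, replication_factor=1),
]

# Raises TopicAlreadyExistsError if the topics were created earlier.
admin.create_topics(new_topics=topics)
admin.close()
```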
- Log Generation: Logs are streamed to Kafka's `logging_info` topic.
- Spark Processing: Spark consumes the logs, aggregates them, and produces structured metrics to `agg_logging_info`.
- Metrics Storage: Aggregated data is stored in DynamoDB for real-time querying.
- Long-Term Storage: Historical logs are stored in HDFS in Parquet format (a Parquet sink sketch follows this list).
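For the long-term storage step, a Structured Streaming Parquet sink is the usual pattern. The sketch below reuses the aggregated `agg` DataFrame from the Spark example earlier; the HDFS path and trigger interval are purely illustrative assumptions.

```python
# Sketch of the HDFS Parquet sink for long-term storage; path and trigger
# interval are illustrative assumptions.
parquet_query = (agg.writeStream
                 .format("parquet")
                 .option("path", "hdfs://namenode:9000/logs/aggregated")
                 .option("checkpointLocation",
                         "hdfs://namenode:9000/checkpoints/aggregated")
                 .outputMode("append")
                 .trigger(processingTime="1 minute")
                 .start())
```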
- `docker-compose.yml`: Docker configuration for the services.
- `ui/ui-prod.py`: Dash application for visualizing logs and metrics.
- `kafka/kafka_topic.py`: Creates the Kafka topics, one for granular logs and one for the aggregated logs from Spark.
- `kafka/kafka_producer.py`: Simulates logs and streams them to Kafka.
- `spark/spark-portfolio.ipynb`: Consumes granular logs from the `logging_info` topic, aggregates the log data by minute intervals, computes statistics (count, average, max, and min response times), and streams the results as JSON to the Kafka topic `agg_logging_info`.
- `spark/spark_kafka.py`: Consumes aggregated log messages from a Kafka topic, parses them, and stores the metrics in a DynamoDB table (see the sketch below).
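As a rough illustration of what `spark/spark_kafka.py` does, the sketch below consumes the aggregated topic and writes each micro-batch to DynamoDB with `foreachBatch`. The table name `log_metrics`, its key layout, the checkpoint path, and the local endpoint are assumptions, not the repository's actual configuration.

```python
# Hedged sketch of the spark_kafka.py pattern: read aggregated metrics from
# Kafka and upsert them into DynamoDB per micro-batch. Table name, key schema,
# checkpoint path, and endpoint are assumptions.
from decimal import Decimal

import boto3
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType, LongType, StringType, StructType

spark = SparkSession.builder.appName("metrics-to-dynamodb").getOrCreate()

agg_schema = (StructType()
              .add("window", StructType()
                   .add("start", StringType())
                   .add("end", StringType()))
              .add("api", StringType())
              .add("request_count", LongType())
              .add("avg_response_ms", DoubleType())
              .add("max_response_ms", DoubleType())
              .add("min_response_ms", DoubleType()))

metrics = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", "localhost:9092")
           .option("subscribe", "agg_logging_info")
           .load()
           .selectExpr("CAST(value AS STRING) AS json")
           .select(F.from_json("json", agg_schema).alias("m"))
           .select("m.*"))


def write_to_dynamodb(batch_df, batch_id):
    """Write one micro-batch of aggregated metrics into DynamoDB Local."""
    table = boto3.resource(
        "dynamodb",
        endpoint_url="http://localhost:8000",
        region_name="us-east-1",
        aws_access_key_id="local",
        aws_secret_access_key="local",
    ).Table("log_metrics")  # assumed table keyed on (api, window_start)
    for row in batch_df.collect():  # batches are small: one row per api/minute
        table.put_item(Item={
            "api": row["api"],
            "window_start": row["window"]["start"],
            "request_count": int(row["request_count"]),
            "avg_response_ms": Decimal(str(row["avg_response_ms"])),
            "max_response_ms": Decimal(str(row["max_response_ms"])),
            "min_response_ms": Decimal(str(row["min_response_ms"])),
        })


query = (metrics.writeStream
         .foreachBatch(write_to_dynamodb)
         .option("checkpointLocation", "/tmp/checkpoints/dynamodb_writer")
         .start())
```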
- Integrate machine learning for anomaly detection.
- Add support for multiple regions in DynamoDB.
- Implement alerting (SMS and email) for SLA breaches.
- Enhance the dashboard with customizable user settings.
- Ronald Nyasha Kanyepi - GitHub. For any inquiries, please contact kanyepironald@gmail.com.