The idea of this project is to build an end-to-end web application for log ingestion that can efficiently handle vast volumes of log data and offer a simple interface for querying this data using full-text search or specific field filters. Logs are ingested over HTTP on port 3000. I have tried my best to ensure scalability to high log volumes by using an event-driven, distributed architecture.
Sample Log Data Format:
```json
{
  "level": "error",
  "message": "Failed to connect to DB",
  "resourceId": "server-1234",
  "timestamp": "2023-09-15T08:00:00Z",
  "traceId": "abc-xyz-123",
  "spanId": "span-456",
  "commit": "5e5342f",
  "metadata": {
    "parentResourceId": "server-0987"
  }
}
```
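For reference, a log entry of this shape can be modeled as a plain Java class that Jackson can bind directly. A minimal sketch follows; the class and nested type names are illustrative, not necessarily the ones used in this repo:

```java
// Minimal POJO mirroring the sample log JSON above; field names match the
// JSON keys so Jackson can bind it without annotations.
public class LogEntry {
    public String level;
    public String message;
    public String resourceId;
    public String timestamp;   // ISO-8601, e.g. "2023-09-15T08:00:00Z"
    public String traceId;
    public String spanId;
    public String commit;
    public Metadata metadata;

    public static class Metadata {
        public String parentResourceId;
    }
}
```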
The project is built with:
- Spring Boot
- Apache Kafka
- Elasticsearch
Instructions on setting up the project locally. To get a local copy up and running, follow the steps below.
The following need to be set up locally to run the application:
- Java 1.8
- Spring Boot 2.7.17
- Apache Kafka 3.2.3
- Elasticsearch 8.11.1
- Clone the repository
  ```
  git clone https://github.com/dyte-submissions/november-2023-hiring-kbnewbee
  ```
- Run ZooKeeper
  ```
  zookeeper-server-start.bat ..\..\config\zookeeper.properties
  ```
- Run the Kafka server
  ```
  kafka-server-start.bat ..\..\config\server.properties
  ```
- Create the Kafka topic
  ```
  kafka-topics.bat --create --topic topic-franky --bootstrap-server localhost:9092 --replication-factor 1 --partitions 3
  ```
- Disable security (SSL) in `elasticsearch-8.11.1\config\elasticsearch.yml`
  ```yaml
  xpack.security.enabled: false
  xpack.security.enrollment.enabled: false
  xpack.security.transport.ssl:
    enabled: false
  xpack.security.http.ssl:
    enabled: false
  ```
- Run Elasticsearch
  ```
  cd elasticsearch-8.11.1\bin
  elasticsearch.bat
  ```
- Run the Spring Boot application from your IDE or build tool (e.g., `mvn spring-boot:run` for a Maven build)
- The application runs on port 3000.
- It provides two APIs for log ingestion:
This API can be used to insert one log JSON at a time: `POST http://localhost:3000/api/logs/ingest`

Request body:

```json
{
  "level": "info",
  "message": "Failed to connect to DB",
  "resourceId": "server-11",
  "timestamp": "2023-09-15T08:00:00Z",
  "traceId": "abc-xyz-17",
  "spanId": "span-16",
  "commit": "5e5342f",
  "metadata": {
    "parentResourceId": "server1-0395"
  }
}
```
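Any HTTP client (curl, Postman, etc.) can exercise this endpoint. As a minimal sketch, here is a Java call using the JDK's built-in `java.net.http.HttpClient` (requires JDK 11+ on the caller's side; the class name `IngestExample` and the payload values are illustrative):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class IngestExample {
    public static void main(String[] args) throws Exception {
        // Sample payload matching the log format above.
        String body = "{"
                + "\"level\": \"info\","
                + "\"message\": \"Failed to connect to DB\","
                + "\"resourceId\": \"server-11\","
                + "\"timestamp\": \"2023-09-15T08:00:00Z\","
                + "\"traceId\": \"abc-xyz-17\","
                + "\"spanId\": \"span-16\","
                + "\"commit\": \"5e5342f\","
                + "\"metadata\": {\"parentResourceId\": \"server1-0395\"}"
                + "}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:3000/api/logs/ingest"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // A 2xx status indicates the log was accepted for ingestion.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}
```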
This API can be used to insert multiple log JSONs at a time: `POST http://localhost:3000/api/logs/ingest/batch`

Request body:

```json
[
  {
    "level": "error",
    "message": "Failed to connect to DB",
    "resourceId": "server-1234",
    "timestamp": "2021-09-09T08:00:00Z",
    "traceId": "abc-xyz-40",
    "spanId": "span-1",
    "commit": "5f5342f",
    "metadata": {
      "parentResourceId": "server1-0395"
    }
  },
  {
    "level": "debug",
    "message": "Failed to connect to DB",
    "resourceId": "server-1234",
    "timestamp": "2021-09-10T08:00:00Z",
    "traceId": "abc-xyz-41",
    "spanId": "span-2",
    "commit": "5f5342f",
    "metadata": {
      "parentResourceId": "server2-0495"
    }
  }
]
```
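Under the hood, the event-driven flow amounts to this: the ingest endpoints publish each log to the Kafka topic created above, and a listener consumes from that topic and indexes the document into Elasticsearch. The Spring Kafka sketch below shows that shape only; the class names and exact wiring are illustrative, not copied from this repo:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Illustrative shape of the ingest path: HTTP -> Kafka -> Elasticsearch.
// Assumes a String-serializing KafkaTemplate bean is configured.
@RestController
@RequestMapping("/api/logs")
class LogIngestController {

    private final KafkaTemplate<String, String> kafka;

    LogIngestController(KafkaTemplate<String, String> kafka) {
        this.kafka = kafka;
    }

    // Accept the raw JSON and hand it to Kafka; the HTTP call returns
    // immediately, indexing happens asynchronously downstream.
    @PostMapping("/ingest")
    void ingest(@RequestBody String logJson) {
        kafka.send("topic-franky", logJson);
    }
}

@Component
class LogIndexer {

    // Consume from the same topic and write each document to Elasticsearch
    // (indexing call omitted; e.g. via the Elasticsearch Java client).
    @KafkaListener(topics = "topic-franky", groupId = "log-indexer")
    void onLog(String logJson) {
        // elasticsearchClient.index(...)
    }
}
```

Decoupling ingestion from indexing this way keeps the HTTP endpoints fast under bursts: Kafka absorbs the backlog while Elasticsearch indexes at its own pace.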
- Once the logs are pushed, they can be queried through a simple query interface at http://localhost:3000/index.html
- Use one or more filters to search the logs.
- Offers a user interface (Web UI) for full-text search across logs.
- Includes filters based on the following fields (see the query sketch after this list):
- level
- message
- resourceId
- timestamp
- traceId
- spanId
- commit
- metadata.parentResourceId
- Bonus features:
  - Search within specific date ranges
  - Combining multiple filters
  - Real-time log ingestion and searching
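To illustrate how field filters, date ranges, and filter combinations map onto Elasticsearch, here is a hedged sketch using the official Elasticsearch Java API client (8.x) and the `LogEntry` class sketched earlier. The index name `logs` and the client wiring are assumptions; the web UI issues an equivalent query:

```java
import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.json.JsonData;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;

public class QueryExample {
    public static void main(String[] args) throws Exception {
        // Plain HTTP client; security was disabled in elasticsearch.yml above.
        RestClient http = RestClient.builder(new HttpHost("localhost", 9200)).build();
        ElasticsearchClient es = new ElasticsearchClient(
                new RestClientTransport(http, new JacksonJsonpMapper()));

        // Full-text match on "message" combined with an exact "level" filter
        // and a timestamp range: multiple filters plus a date range.
        SearchResponse<LogEntry> resp = es.search(s -> s
                .index("logs") // assumed index name
                .query(q -> q.bool(b -> b
                        .must(m -> m.match(mq -> mq.field("message").query("connect")))
                        .filter(f -> f.term(tq -> tq.field("level").value("error")))
                        .filter(f -> f.range(r -> r.field("timestamp")
                                .gte(JsonData.of("2023-09-15T00:00:00Z"))
                                .lte(JsonData.of("2023-09-16T00:00:00Z")))))),
                LogEntry.class);

        resp.hits().hits().forEach(h -> System.out.println(h.source().message));
        http.close();
    }
}
```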
Kallol Bairagi - @kallob14 - kallolb22@gmail.com