To simulate multiple apps submitting events with GPS coordinates, we created a mock notebook that generates simulated data and pushes it to the Event Hub. The main stream then watches for events arriving on that Event Hub and consumes them as they become available. It checks whether the GPS coordinates are near a given store and, if so, adds a promotion and writes the result to another Event Hub. Downstream consumers can then listen to those events and push a notification back to the app, allowing end users to receive promotions tailored to their location.
- Mock GPS data from the app and write it to the dev-readfrom Event Hub, in the mockgpsdata notebook (as sketched below)
- Ingest GPS data from dev-readfrom, transform it, and write the results back to the dev-writeto Event Hub, in the eventdrivenstreaming notebook
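A minimal sketch of what the mockgpsdata notebook might look like. It assumes a Databricks notebook (so `spark`, `sc` and `dbutils` already exist), a Key Vault-backed secret scope called dev-scope, and an illustrative GPS event schema; none of these names come from the repo itself.

```python
# Illustrative sketch only: generate simulated GPS events for a few app users and push
# them to the dev-readfrom Event Hub. Runs in a Databricks notebook, where spark, sc
# and dbutils already exist. Secret scope/key names and coordinates are placeholders.
import json
import random
from datetime import datetime, timezone

# Connection string for the Event Hub, stored in a Key Vault-backed secret scope.
connection_string = dbutils.secrets.get(scope="dev-scope", key="eventhub-readfrom-connection-string")

ehConf = {
    # Recent versions of the azure-eventhubs-spark connector expect the connection string encrypted.
    "eventhubs.connectionString":
        sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connection_string)
}

# Simulate a handful of app users moving around a city centre.
events = [
    {
        "userId": f"user-{i}",
        "latitude": 51.5074 + random.uniform(-0.05, 0.05),
        "longitude": -0.1278 + random.uniform(-0.05, 0.05),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    for i in range(100)
]

# The Event Hubs sink expects the payload in a column called "body".
df = spark.createDataFrame([(json.dumps(e),) for e in events], ["body"])
df.write.format("eventhubs").options(**ehConf).save()
```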
- Mock GPS data from multiple app users
- Consume those events and, when the GPS coordinates are in the proximity of a given store that has a promotion running, write a promotion event back to the WriteTo Event Hub (see the sketch after this list)
- Downstream apps can then listen to the promotion Event Hub (WriteTo) and forward push notifications. In this example, a Logic App could act as a consumer of the Event Hub (we left this out of scope for the PoC, but show how it could be done).
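A sketch of the core of the eventdrivenstreaming notebook under the same assumptions. The secret scope, the `dev.stores_with_promotions` lookup table, the event schema and the 200 m radius are all illustrative placeholders, not the actual implementation.

```python
# Illustrative sketch only: consume GPS events from dev-readfrom, flag users within
# ~200 m of a store that has a promotion running, and publish a promotion event to
# dev-writeto. Table names, schema, secret names and the radius are assumptions.
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

gps_schema = StructType([
    StructField("userId", StringType()),
    StructField("latitude", DoubleType()),
    StructField("longitude", DoubleType()),
    StructField("timestamp", StringType()),
])

def eh_conf(secret_key):
    # Encrypt the connection string as required by recent connector versions.
    conn = dbutils.secrets.get(scope="dev-scope", key=secret_key)
    return {"eventhubs.connectionString":
            sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(conn)}

# Static lookup of stores with an active promotion (e.g. a Delta table in the lake).
stores = spark.table("dev.stores_with_promotions")  # storeId, storeLat, storeLon, promotion

# Parse the JSON body of each incoming event.
gps = (spark.readStream.format("eventhubs")
       .options(**eh_conf("eventhub-readfrom-connection-string")).load()
       .select(F.from_json(F.col("body").cast("string"), gps_schema).alias("e"))
       .select("e.*"))

# Haversine distance (metres) between the user and the store.
dist = 6371000 * 2 * F.asin(F.sqrt(
    F.pow(F.sin(F.radians(F.col("storeLat") - F.col("latitude")) / 2), 2)
    + F.cos(F.radians("latitude")) * F.cos(F.radians("storeLat"))
    * F.pow(F.sin(F.radians(F.col("storeLon") - F.col("longitude")) / 2), 2)))

# Stream-static join on a coarse bounding box, then filter on the precise distance.
promotions = (gps.join(stores,
                       (F.abs(gps["latitude"] - stores["storeLat"]) < 0.01)
                       & (F.abs(gps["longitude"] - stores["storeLon"]) < 0.01))
              .withColumn("distanceMetres", dist)
              .where(F.col("distanceMetres") <= 200)
              .select(F.to_json(F.struct("userId", "storeId", "promotion", "timestamp")).alias("body")))

# Write the matched promotions to the dev-writeto Event Hub.
(promotions.writeStream
 .format("eventhubs")
 .options(**eh_conf("eventhub-writeto-connection-string"))
 .option("checkpointLocation", "/mnt/lake/checkpoints/promotions")
 .start())
```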
Events are captured from the Event Hubs into the Bronze layer in ADLS, where they are stored in Avro format. The first step is to extract the body to get the raw payload. We then enrich the data by joining it onto the store and promotion datasets, giving a richer dataset to analyse, and write the result to Silver.
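A sketch of the Bronze-to-Silver step, assuming Event Hubs Capture lands Avro files under /mnt/lake/bronze/eventhub-capture/, that the captured payload carries a storeId, and that the reference data lives in hypothetical dev.stores and dev.promotions tables.

```python
# Illustrative sketch only: read the Avro files captured from Event Hubs, extract the
# JSON payload from the binary Body column, enrich with stores and promotions, and
# write the result to the Silver layer. Paths, schema and table names are assumptions.
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

event_schema = StructType([
    StructField("userId", StringType()),
    StructField("storeId", StringType()),
    StructField("latitude", DoubleType()),
    StructField("longitude", DoubleType()),
    StructField("timestamp", StringType()),
])

# Event Hubs Capture writes Avro files with the payload in a binary "Body" column.
bronze = spark.read.format("avro").load("/mnt/lake/bronze/eventhub-capture/")

events = (bronze
          .select(F.from_json(F.col("Body").cast("string"), event_schema).alias("e"),
                  F.col("EnqueuedTimeUtc").alias("enqueuedTime"))
          .select("e.*", "enqueuedTime"))

# Enrich with the store and promotion reference data before landing in Silver.
stores = spark.table("dev.stores")          # storeId, storeName, storeLat, storeLon
promotions = spark.table("dev.promotions")  # storeId, promotion, validFrom, validTo

silver = events.join(stores, "storeId", "left").join(promotions, "storeId", "left")

silver.write.format("delta").mode("append").save("/mnt/lake/silver/gps_promotions")
```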
To provide an analytics layer, we take the Silver data and aggregate it into information useful to our users. We use Spark Structured Streaming so the resulting tables are refreshed in near real time and stay up to date.
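A sketch of the Silver-to-Gold streaming aggregation, reusing the illustrative paths from the sketch above; the window size, watermark and column names are assumptions.

```python
# Illustrative sketch only: a streaming query that keeps a near real-time count of
# promotions per store in 5-minute windows, written to the Gold layer as Delta.
from pyspark.sql import functions as F

silver = spark.readStream.format("delta").load("/mnt/lake/silver/gps_promotions")

gold = (silver
        .withColumn("eventTime", F.to_timestamp("timestamp"))
        .withWatermark("eventTime", "10 minutes")
        .groupBy(F.window("eventTime", "5 minutes"), "storeId", "promotion")
        .agg(F.count("userId").alias("promotionsSent")))

(gold.writeStream
 .format("delta")
 .outputMode("append")
 .option("checkpointLocation", "/mnt/lake/checkpoints/gold_promotions")
 .start("/mnt/lake/gold/promotions_per_store"))
```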
When the analytics layer (Gold) is ready, analysts can query it for results.
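For example, an analyst could read the Gold table from a notebook or Databricks SQL (the path is the illustrative one used above):

```python
# Illustrative query against the Gold table (path from the sketches above).
from pyspark.sql import functions as F

(spark.read.format("delta")
 .load("/mnt/lake/gold/promotions_per_store")
 .orderBy(F.col("window").desc())
 .show(truncate=False))
```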
clone repo
git clone https://github.com/magrathj/event-driven-hackathon.git
cd into directory
cd event-driven-hackathon
Run the workspace inside a remote container (defined in the .devcontainer folder) so that Terraform is already installed for you.
You will need the Remote - Containers extension for VS Code: https://code.visualstudio.com/docs/remote/containers
cd into devops/environments/dev
cd devops/environments/dev
terraform init
terraform plan
Run the following notebook to mount your ADLS storage into the workspace
/EventDrivenPromotions/mountlake
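The notebook itself isn't reproduced here, but a minimal mount along these lines is the usual pattern, assuming a service principal whose credentials are stored in a Key Vault-backed secret scope; the scope, key, storage account and container names below are placeholders.

```python
# Illustrative sketch only: mount an ADLS Gen2 container using OAuth with a service
# principal. Secret scope/key names, storage account and container are placeholders.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": dbutils.secrets.get("dev-scope", "sp-client-id"),
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("dev-scope", "sp-client-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/"
        + dbutils.secrets.get("dev-scope", "tenant-id") + "/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://lake@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/lake",
    extra_configs=configs)
```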
Read/Write to eventhub https://docs.databricks.com/spark/latest/structured-streaming/streaming-event-hubs.html
Mock data to eventhub and read back https://docs.microsoft.com/en-gb/azure/databricks/scenarios/databricks-stream-from-eventhubs
Connect keyvault to Azure Databricks https://docs.microsoft.com/en-us/azure/databricks/security/secrets/secret-scopes
Run inside of dev container https://code.visualstudio.com/docs/remote/containers