Commit

Update README.md
Updated README to include changes in training.
PRodriguezFlores authored Jun 26, 2024
1 parent b9b3ef9 commit 8cafd5d
Showing 1 changed file with 15 additions and 3 deletions.
18 changes: 15 additions & 3 deletions README.md
@@ -30,11 +30,21 @@ bash resources/src/example/run_example.sh
2. yum install rb-aioutliers

## Model design
Initially, data is extracted from a designated druid datasource in timeseries format, with configurable metrics and settings. After rescaling from zero to one and segmentation, an autoencoder reconstructs the data, enabling anomaly detection through k-sigma thresholding.
rb-aioutliers includes two types of artificial intelligence-based models: a shallow one and deep ones. In both cases the idea is the same: reconstruct a "normal" time series from the original and detect anomalies by comparing the two.

The shallow model is a static model for any kind of numerical time series. The series is reconstructed by a special kind of convolution, and anomaly detection uses a modified isolation forest algorithm. This model is intended for anomaly detection on data for which there is no prior information, so its lightweight training is done at runtime.

The deep models are TensorFlow-based deep learning models that automatically learn patterns in historical data selected by any kind of Druid filter. The reconstruction is done with an autoencoder, which forces the reconstruction to lose the information of the less common patterns. The anomalies are then detected through regular k-sigma thresholding, with the mean and standard deviation recovered from previous training sessions. This training is generally expensive and should be done through the corresponding training service.
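
As a rough illustration of the k-sigma step, the sketch below flags points whose reconstruction error exceeds `mean + k * std`. It assumes the threshold is applied to the absolute difference between the original and reconstructed series; the function name `k_sigma_anomalies` and its parameters are hypothetical and not taken from the rb-aioutliers code.

```python
def k_sigma_anomalies(original, reconstructed, mean, std, k=3.0):
    """Return the indices of points whose reconstruction error is above mean + k * std.

    `mean` and `std` stand in for the error statistics saved by a previous
    training session (an assumption about what those statistics describe).
    """
    threshold = mean + k * std
    anomalies = []
    for index, (observed, rebuilt) in enumerate(zip(original, reconstructed)):
        error = abs(observed - rebuilt)
        if error > threshold:
            anomalies.append(index)
    return anomalies


if __name__ == "__main__":
    series = [1.0, 1.1, 0.9, 5.0, 1.0]
    rebuilt = [1.0, 1.0, 1.0, 1.0, 1.0]
    print(k_sigma_anomalies(series, rebuilt, mean=0.1, std=0.1, k=3.0))
```

A larger `k` makes the detector more conservative, since fewer points exceed the threshold.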

The anomalies are output in JSON format together with the data reconstructed by the autoencoder.

## Model execution
rb-aioutliers utilizes the Flask framework to create an HTTP server. Users can send Druid queries via POST requests to the /calculate endpoint. When rb-aioutliers receives the Druid query, it sends a request to the Druid broker, retrieves the necessary data, and then proceeds to execute the anomaly detection model.
rb-aioutliers uses the Flask framework to create an HTTP server. Users can send Druid queries along with the desired model via POST requests to the /calculate endpoint. When rb-aioutliers receives the Druid query, it sends a request to the Druid broker, retrieves the necessary data, and then executes the selected anomaly detection model. The request should look like this:
```
query=<base64_encoded_druid_query>&model=<base64_encoded_model_name>
```
If the model is left empty, is not recognized, or is set to 'default', the shallow outliers model is executed. The Druid query should use the native Druid JSON format.
Additionally, a user can request anomaly detection on a set of data directly. To do so, send a 'data' parameter instead of a 'query' parameter; this data should be a JSON document containing a Druid timeseries response.
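
A minimal sketch of building such a request body in Python follows. It assumes the body is form-encoded and that standard (not URL-safe) base64 is expected; the helper `build_calculate_body` and the datasource name `example_datasource` are hypothetical.

```python
import base64
import json
from urllib.parse import parse_qs, urlencode


def build_calculate_body(druid_query, model="default"):
    """Encode a native Druid query and a model name as a form-encoded body."""
    query_b64 = base64.b64encode(json.dumps(druid_query).encode("utf-8")).decode("ascii")
    model_b64 = base64.b64encode(model.encode("utf-8")).decode("ascii")
    # urlencode percent-escapes the '+', '/' and '=' characters base64 can produce.
    return urlencode({"query": query_b64, "model": model_b64})


if __name__ == "__main__":
    # Hypothetical native Druid timeseries query, for illustration only.
    query = {
        "queryType": "timeseries",
        "dataSource": "example_datasource",
        "granularity": "minute",
        "intervals": ["2024-06-01T00:00:00Z/2024-06-02T00:00:00Z"],
        "aggregations": [{"type": "longSum", "name": "bytes", "fieldName": "sum_bytes"}],
    }
    body = build_calculate_body(query)
    print(body)
    # Round trip: decoding the body recovers the original query.
    decoded = parse_qs(body)
    assert json.loads(base64.b64decode(decoded["query"][0])) == query
```

The resulting string can be sent as the body of a POST request to the /calculate endpoint with any HTTP client.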

![img](https://lh3.googleusercontent.com/fife/AK0iWDzo4KPabRCLUNYy9rYH6MyVZeihkZsxeJ27o9FMmKfgCr8h5HOG4o-KxfhxkITIS3a5uk1IgP344kL8bMBxi8GakpIaO9_Gwh8GNPABIpJIkn-CAoBa1pOjg3sWGPKpdc3GetxIq1PINzlbXABCxLUWUpm4EVpZymlCh7MLdP5U8wjjXw_dTLCvyRHbZ2axvNuRTDa7D-uWQ-Fw0AtdVKF66hH4otUm9x7Z6jwP-PhVRD0uZMuWY6gMEw9aoopRbCw0C9s77Y10OM9qMqx39nkQbfBSkpihBy_5CDTnrdGgLsw3JslkB8kDNl1B9_SE725tVonH3Uda2n9QmosY38rKwRonNUuDAhe8gMaXAZhrk1JACZHrzRzgS_8C3GTU8WR0y8Rx0o-aH8ZaPCU_BNPtyLfX600KPBI-va5H2p4V4uV0fH5DPFZoZp3txGSOUFNKein6AWrLHg2b6YNN5tu7ReyNDP5jgWartMd2fxY4VBC9uW2JaPMVbaexEfXO1_yYiBq-rZWYJolYl4SHWuGEAQanp6qeUPZeBBxfmPdpXsUo5Yvj6UIzCv5Pab3tqUmt7TzYWdCESdPAxjSZM37IW-uPeoaMufyBvwz7ygvDzOh66DBXuj40MQDJLcsBZpQY-RlqUT81Y0PgT-kdl-S-Ik-P27trXw2gozIE7eTfhFsLv1RQWlB8SvnpbHVdtff9iwGpjLXxyFwSgzpUwugpfvcgvHBdxBxpD6yFazrXwfM0GKjoMrijxLy06nBk5tbwLiECZbZ25Efx_OnE6zs7ucTwizKwcx4ij7cxGljIj6r4QWA_Odmk_CcAcuIB--dzmdlTIgll-Vktyh-tZaY1m4IyxtWO3E8iikQb-_ZD02RgzxUk85ufJZB7CveeL-o1MXzeq78_oMkMjb-Rlz3RN2G8OXWmSnpFNnLnSq1L7rysqZCEwoYAhAoCz_3GyX_LgelLXYwvzEF7Uf-mGR7d6PyihRjJ6jEWpRFZ2AtkoIOzuvhRMFOimUmUhezFVCy1i7aVWMBqclWTUj3qyM43rAjLAUuCRvKyTYwjDlGlbz5GVbzm5VsQ8uUgLGnRSGkjf-UHdx-Nqn7EBHHr2_YOjVSRD8wdgpp_gLymEWoBF-llMStK2GOV0LhWjkkblDQGEiFex2xTrmin3KAqkOqlMrnrfMRRSqS0WPi-IA_H_0ljXL5QPeVRmocLHqh72Xz3UnHihdLGuk9-bYXtOFw9orthABjp5PqOTl1Aw3QdwQclFHO_8X13jYhihAVGPN01IXYrmzB01l6Fz1LS5ZP4yXw9F6UUsOF8UQxD_3e7_WqJVt4mEKQZojiNPV1OTiu8GFtj1rRXMgXBv_2dIIyKXwnaQSjxRTnk-SFLXvXLxHK4oW4Uu0IS1jXnrSiOijHGeH4TG2adaHEZRM9zlv1OXepXO-V4ytXaYHasZpmFfPwj43lxyyO78r3ax_Vpp1ZR3PpXVdrTlT9oT7_0ilXbIjXTb8feminLsow0aSdXcisTWiMhvhkqpplAOh-U_XMTqZSfJpwlYm_Vcq-YRFJ_7FSGjgN69ZUAQOdkJgoRI0-rfUq-Ev0iEg=w800-h600)

@@ -81,7 +91,9 @@ For the training service, the Chef client will create individual configuration f

# Trainer Jobs

The trainer nodes send job data to a Redis server (this is done by the ```Trainer service```). Each node also requests the job queue. The RQ worker processes these jobs in the background. If any node goes down, there will be no problem, as the sensor information for training is stored in Redis, and another node can take over its tasks.
The trainer job is organized through ZooKeeper, which divides nodes into a leader node and one or more followers. The leader's job is to put training jobs in a queue, where models designated in S3 for training are queued at regular intervals. If the leader is lost for any reason, the remaining nodes elect a new leader.

The followers, or trainers, consume from the queue and train the models. Should any follower fail during training (for example, because its trainer lost connection), the leader notices it and requeues the job.
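
The queue and requeue behaviour described above can be sketched with a small in-memory model. This is only an illustration of the bookkeeping: it does not use ZooKeeper, and the class `TrainingQueue` and its methods are hypothetical names rather than part of rb-aioutliers.

```python
from collections import deque


class TrainingQueue:
    """In-memory stand-in for the leader-managed training queue."""

    def __init__(self):
        self.pending = deque()   # models waiting to be trained
        self.in_progress = {}    # model name -> follower currently training it

    def enqueue(self, model):
        """Leader side: queue a model unless it is already queued or being trained."""
        if model not in self.pending and model not in self.in_progress:
            self.pending.append(model)

    def claim(self, follower):
        """Follower side: take the next model, or None if the queue is empty."""
        if not self.pending:
            return None
        model = self.pending.popleft()
        self.in_progress[model] = follower
        return model

    def complete(self, model):
        """Follower side: mark a model as trained."""
        self.in_progress.pop(model, None)

    def requeue_lost(self, alive_followers):
        """Leader side: requeue jobs whose follower is no longer alive."""
        lost = [m for m, f in self.in_progress.items() if f not in alive_followers]
        for model in lost:
            del self.in_progress[model]
            self.pending.append(model)
        return lost


if __name__ == "__main__":
    queue = TrainingQueue()
    queue.enqueue("model_a")
    queue.enqueue("model_b")
    queue.claim("follower_1")
    queue.claim("follower_2")
    # follower_1 disappears, so the leader puts its job back in the queue.
    print(queue.requeue_lost({"follower_2"}))
```

In a real deployment the set of live followers would come from the coordination service rather than being passed in by hand.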

![img](https://lh3.googleusercontent.com/fife/AK0iWDw5h78LX5D8lkgspzSvhWiKvl81nPJI7Cmaz1yJGvVo13PqydmNWpwVfwJ_wvtx_4xFkItgPqBFQY0ft2LaM7i_HIpO3iokK2gTlX_v7TzRuOEqz14J-DmN5PceuoxzNoaEEN6tP6XBP1eMxqI3SnF3kc8e6tv4aO9uAmQEXo6queFD4LFCzXfPmkuOrQIjtqwRfrhO8Znn5w9AOwA93wIMTyxnEPbrsSMwDNmD0lN7VNfLI5hwGcbDkV70v87-u8JhBdEbtcHFaG57jY0o9AyUzKFLJW72ZwulMLYFIqndzcMpUU3XUTM0_3U2A2C_2JvvDRmt91AFOgEN70b2fpH5Tm06wUeENAQ6d9pwdQi1ZIs8wobl4Ijhmi9R-cKo5LJYitxFcgcOHIun4Mk52Vl1fC1ErT6vJzt091lr0lAHcf4wGQDMuSoMBXtxSm4JZUi6ahxmH5Fj5n8BNFRxhJH1Pc_aW99G-OxXWL8EyjlDGwj4X9lfuWMEtFqqBgEMzX6KZuQzBLBhNLTOGIfllja5Oghhqd1HYXQHiEWnfWp02oS2O9K-Q7OYMmtm1RggoYQVHA_YDf1pXPMPx_vqoFpiZAD5_7HyTSJaWb_baq67rQazjNQTrn3ihbY-rm3YZ0YUId-8yi1Z3cf-t21uxUTACvqhaeA_TH47jJCnHs3UHrHr5pVU0_YvKZuqRla-SD9IX9lbLrdX0kdqAP3Jsf0u_bY2157ouLvs0Wryn1vs6Q7hThkuLfA6AC8hUDECO6V1IhY1o4g41oXpjjr_urr7_ubfYaIj5EEJi2HTw7pTwBM_cLXbAgdi-e4R63YMT2O15H67hVV436iMEFELjCFEd_PaZDPhI-nWGrWfca-GYm6wI44V2cijE8upM8l2lnGiup5VGaVuGvisSYQB-e2RwSZImOD5q0RGfms6Zmi6JatdJ2obOTgXtdhnvmCLfdSYItMi3ftvyjNJHmCPpVSIeepD8A_JAm5oOciBEehqOkW01Q3qIrbtWQnPMqa_LP-SxnDSaSrCQPRNnlGZ7TrMCHCXJ0y_mjh5UhUK3OOMAq5Yq89Ha7X0E-NJOf__B9YNh7Rq3A9wQpyRa8TjdoxwRpRbCh1WtaQga-znsDeJmg3gOI7irvaMA1jQmeQibLt_gFXg6ePfqJNKdof9aKwpiN9XMAtQPStbmuPpFqwopzZn5mzmkxYuV4k2Mhbsb96-Z-rU6US9l0BMPRbnmtPAeyFOi1eLkEX7Zvlzp5FwnKiweVoQyL2L7Oo80VVY5fuwHUC53wSPm1T-2mLLzcnky-xAlFmiF9t0J9MwHHKQcSugbSo8l_PWbDKnZ0jq5yKMucF0RWXrMGe7jLksv3A9eDTIwHHCTQM0xTSUjPH9QltPhbkFov9kOzRISceVeHhJjT3dxFXNeetA_Ry5PFIeC_l7XACyOh0u_ykbc6q_yNfz1iQgUpzwkRTIeSxC_b3ohCgqQIBJ7ohzqM_2KnXdIBqmsRum4C71XImCW-jDBMiOKPODAZaUq2WCD4jM_GIihD6UITbDkGAfJpkphQXP8MmGAsg80kVs1MFjoLkJcFF8IhNh_04Fow=w918-h986)
