
AndrewKuzmin/Analytics-For-IoT-Devices-Using-Spark


Analytics-For-IoT-Devices-Using-Spark

Analytics for IoT devices using Apache Spark Structured Streaming 2.4.0.

Use cases for processing modes (trigger modes)

  1. Default;
  2. Fixed interval micro-batches;
  3. One-time micro-batch;
  4. Continuous with fixed checkpoint interval;
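As a rough illustration, the four modes above can be mapped onto the keyword arguments of PySpark's `DataStreamWriter.trigger`. This is a minimal sketch, not code from this repository: the mode names, the interval values, and the `start_console_query` helper are assumptions made for the example.

```python
# Hypothetical mode names mapped to DataStreamWriter.trigger keyword arguments.
# The interval strings are illustrative values, not settings from this repository.
TRIGGER_KWARGS = {
    # No trigger set: the next micro-batch starts as soon as the previous one ends.
    "default": {},
    # Fixed interval micro-batches.
    "fixed_interval": {"processingTime": "10 seconds"},
    # One-time micro-batch: process the available data once, then stop.
    "one_time": {"once": True},
    # Continuous processing; the value is the checkpoint interval.
    "continuous": {"continuous": "1 second"},
}


def start_console_query(stream_df, mode):
    """Start a console-sink query on a streaming DataFrame with the given trigger mode."""
    kwargs = TRIGGER_KWARGS[mode]
    writer = stream_df.writeStream.format("console")
    if kwargs:
        # trigger() raises ValueError when called without arguments,
        # so it is skipped for the default mode.
        writer = writer.trigger(**kwargs)
    return writer.start()
```

With a `SparkSession` named `spark`, a call such as `start_console_query(spark.readStream.format("rate").load(), "fixed_interval")` would start a query. Continuous mode is experimental in Spark 2.4 and supports only map-like operations and a limited set of sources and sinks, so not every query that runs in micro-batch mode can be switched to it.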

Optimizations

  1. Tungsten execution engine;
  2. Catalyst query optimizer;
  3. Cost-based optimizer;
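The optimizations above are mostly controlled through Spark SQL configuration. The sketch below collects a few configuration keys that exist in Spark 2.4 and applies them to a session builder. The chosen values and the `apply_optimizer_settings` helper are assumptions for illustration; the settings this repository actually uses are not shown here.

```python
# Spark SQL settings related to the optimizations listed above.
# Values are illustrative, not taken from this repository.
OPTIMIZER_SETTINGS = {
    # Tungsten whole-stage code generation (enabled by default).
    "spark.sql.codegen.wholeStage": "true",
    # Cost-based optimizer (disabled by default).
    "spark.sql.cbo.enabled": "true",
    # Allow the cost-based optimizer to reorder joins.
    "spark.sql.cbo.joinReorder.enabled": "true",
}


def apply_optimizer_settings(builder, settings=OPTIMIZER_SETTINGS):
    """Apply each setting to a SparkSession builder and return the builder."""
    for key, value in settings.items():
        builder = builder.config(key, value)
    return builder
```

A session could then be created with `apply_optimizer_settings(SparkSession.builder.appName("iot")).getOrCreate()`. The Catalyst optimizer is always active and needs no switch; calling `df.explain(True)` prints the logical plans it produces along with the physical plan. The cost-based optimizer relies on table statistics, which are collected with `ANALYZE TABLE ... COMPUTE STATISTICS`.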

Structured Sessionization

  1. KeyValueGroupedDataset.mapGroupsWithState;
  2. KeyValueGroupedDataset.flatMapGroupsWithState;
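Both operations call a user function once per key for each trigger, passing the new values for that key and a state object that survives between triggers. `mapGroupsWithState` returns exactly one record per call, while `flatMapGroupsWithState` may return zero or more. In Spark 2.4 they belong to the Scala and Java `KeyValueGroupedDataset` API, so the following is a plain-Python model of the per-key update logic only, not Spark code. The `Session` type, the 30-second gap, and the `update_sessions` name are assumptions made for the example.

```python
from dataclasses import dataclass

SESSION_GAP = 30  # seconds of inactivity that close a session (assumed value)


@dataclass
class Session:
    start: int   # timestamp of the first event in the session
    end: int     # timestamp of the latest event in the session
    events: int  # number of events seen so far


def update_sessions(state, device_id, timestamps):
    """Model of a flatMapGroupsWithState-style update for one key and one batch.

    `state` is a dict that plays the role of the per-key state store.
    Returns the sessions closed by this batch (zero or more).
    """
    closed = []
    current = state.get(device_id)
    for ts in sorted(timestamps):
        # A gap longer than SESSION_GAP ends the open session.
        if current is not None and ts - current.end > SESSION_GAP:
            closed.append(current)
            current = None
        if current is None:
            current = Session(start=ts, end=ts, events=1)
        else:
            current.end = ts
            current.events += 1
    if current is not None:
        # Keep the still-open session for the next batch.
        state[device_id] = current
    return closed
```

Calling `update_sessions` with successive batches for the same device extends the open session until a gap larger than `SESSION_GAP` appears. A real Structured Streaming job would typically also configure a state timeout so that sessions close even when a device stops sending data.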

Examples from the following notebooks were used:

  1. Complex and Nested Data;

Reference used for the test data:

Nest Developers

The test data was generated with the JSON data generator by EverWatch Corporation:

Json Data Generator
