This is the code repository for Mastering Apache Flink, published by Packt. It contains all the supporting project files necessary to work through the book from start to finish.
## About the book

With the advent of massive computer systems, organizations in different domains generate large amounts of data in real time. The latest entrant to big data processing, Apache Flink, is designed to process continuous streams of data at a lightning-fast pace.
This book will be your definitive guide to batch and stream data processing with Apache Flink. The book begins by introducing the Apache Flink ecosystem, setting it up, and using the DataSet and DataStream APIs to process batch and streaming datasets. Bringing the power of SQL to Flink, the book then explores the Table API for querying and manipulating data. In the latter half of the book, readers learn the rest of the Apache Flink ecosystem to accomplish complex tasks such as event processing, machine learning, and graph processing. The final part of the book covers topics such as scaling Flink solutions, performance optimization, and integrating Flink with other tools such as Elasticsearch.
All of the code is organized into folders. Each folder starts with a number followed by the application name. The commands and instructions will look like the following:
```java
@Override
public boolean equals(Object obj) {
    // Same reference: trivially equal
    if (this == obj)
        return true;
    // Null or a different class: never equal
    if (obj == null)
        return false;
    if (getClass() != obj.getClass())
        return false;
    Alert other = (Alert) obj;
    // Null-safe comparison of the message field
    if (message == null) {
        if (other.message != null)
            return false;
    } else if (!message.equals(other.message))
        return false;
    return true;
}
```
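The `equals` method above is a fragment of a larger class. As a minimal, self-contained sketch of how it might sit in context, the following assumes a hypothetical `Alert` type with a single `message` field (the constructor, the `hashCode` method, and the `AlertDemo` class are illustrative additions, not taken from the book's code). It uses `java.util.Objects` to shorten the null-safe comparison and pairs `equals` with a matching `hashCode`, which Java's contract requires for objects used in hash-based collections.

```java
import java.util.Objects;

// Hypothetical minimal Alert type; only the `message` field
// is taken from the snippet above, the rest is assumed.
final class Alert {
    private final String message;

    Alert(String message) {
        this.message = message;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        Alert other = (Alert) obj;
        // Objects.equals covers the cases where either message is null
        return Objects.equals(message, other.message);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals: equal messages give equal hashes
        return Objects.hashCode(message);
    }
}

// Illustrative driver, not part of the book's code
class AlertDemo {
    public static void main(String[] args) {
        Alert a = new Alert("Temperature too high");
        Alert b = new Alert("Temperature too high");
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(new Alert(null)));    // false
    }
}
```

Behaviour matches the longer hand-written version: two alerts are equal exactly when they are the same class and their messages are equal, including the case where both messages are `null`.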
Chapters 01, 08, and 09 do not contain code.
## Related products

- Apache Spark Machine Learning Blueprints
- Instant Apache Stanbol
- Apache Spark for Data Science Cookbook