conclusion.tex

\chapter{Conclusion}
\label{chap:conclusion}

\section{Results}
\label{sec:conc-results}
\subsection{Technology Research}
The first result of the thesis is the documentation of the technology research,
which results in gathered knowledge that is further required for this
thesis. It gives an insight into messaging fundamentals and takes a closer look
at Apache Kafka and related topics.

\subsection{Protocol Implementation}

A fundamental result of the thesis is the implementation of the Apache Kafka
protocol in Haskell. The design decision of separating protocol related code
from the broker implementation lead to an isolated product which can now be used
as a library for different projects. The code is provided through an \fnurl{open sourced
repository}{https://github.com/hmb-ba/protocol} for further development.

A complete implementation of the Apache Kafka Protocol would go beyond the scope
of the thesis, as the focus lies not only in implementing the protocol but also
to provide broker functionality. Thus, most important is the ability to produce
and fetch messages. In order to provide compatibility for Apache Kafka clients,
the Metadata API is also required because it is the first part of the
process in producing or consuming messages where it will request information
about the broker and provided topics. The following list gives an overview of
which part of the protocol is implemented and what remains open:

\begin{itemize}
    \tick Metadata API
    %\begin{itemize}
    %    \tick Topic Metadata Request
    %    \tick Metadata Response
    %\end{itemize}
    \tick Produce API
    %\begin{itemize}
    %    \tick Produce Request
    %    \tick Produce Response
    %\end{itemize}
    \tick Fetch API
    %\begin{itemize}
    %    \tick Fetch Request
    %    \tick Fetch Response
    %\end{itemize}
    \fail Offset API
    %\begin{itemize}
    %    \fail Offset Request
    %    \fail Offset Response
    %\end{itemize}
    \fail Offset Commit/Fetch API
    %\begin{itemize}
    %    \fail Consumer Metadata Request
    %    \fail Consumer Metadata %Response
    %        \fail Offset Commit Request
    %    \fail Offset Commit Response
    %    \fail Offset Fetch Request
    %    \fail Offset Fetch Response
    %\end{itemize}
\end{itemize}
\newpage
\subsection{Haskell Message Broker}

The main approach of this work is the implementation of a message broker in
Haskell. The resulting application provides a server with basic functionality in
handling requests via network and persisting messages. The broker is fully Kafka
compatible since it is based on the protocol implementation.  In comparison to the
original Apache Kafka, there are, of course, a lot of features missing in order to offer the
same functionality. Furthermore, as the scope of the thesis lies on a single
broker system, there is not yet support for broker replication. Nevertheless, the
provided Haskell message broker is a prototype which shows feasibility of
developing a similar system in a fully-fledged functional program language like
Haskell. The following list gives an overview of what features the broker system
already provides and what remains open in the particular area:

Produce API:
\begin{itemize}
        \tick Publishing messages to a specific topic and partition
        \tick Publishing to multiple topics and partitions per request
        \tick Supporting batched messages
        \fail Configurable Acknowledgements and Timeout
\end{itemize}

Fetch API:
\begin{itemize}
        \tick Consuming messages of a specific topic and partition depending on given offset
        \fail Supporting consuming from multiple topics and partitions per request
        \fail Supporting configurable min and max bytes for fetched data
        \fail Supporting consumer groups
        data is available
\end{itemize}

Metadata API:
\begin{itemize}
        \tick Fetching available topic names from broker
        \tick Include broker information
        \fail Include partition status
        \fail Include replication information
\end{itemize}

Persistence (Log):
\begin{itemize}
        \tick Persist messages in topic-partition specific log structure
        \tick Read messages from log, optimized with index file
        \tick Provide LogManager for handling files and directories structure
        \fail Write Index to disk and rebuild the memory after the broker is restarted
        \fail Data retention, where old data is discarded after a fixed period of
            time or Kafka's log compaction is applied. 
\end{itemize}

\newpage
\section{Evaluation}

After highlighting the implemented features above, the question about
performance still remains. Is the provided prototype potentially faster than the
original Apache Kafka? Does the prototype near the same throughput? Or are we still far
away from any approximation to the impressive performance of the reference
system? This chapter describes the environment and tools in which the systems
were tested. It also gives insight in early stress tests which were very useful
in finding network related bugs. Additional benchmarks are defined to
demonstrate the performance of the final version of the Haskell message broker.
For comparison, the original Apache Kafka system is involved in the benchmarks
by running equivalent tests on the same hardware.

\subsection{Test Setup}

All of the tests and benchmarks in this chapter are performed with following setup:

\begin{verbatim}
Broker Server: 
  Ubuntu Server 14.04.2 LTS (64 bit)
  Intel(R) Xeon(R) CPU E3-1245 v3 @ 3.40GHz
  One 7200 RPM SATA drive
  16GB of RAM
  1Gb Ethernet
\end{verbatim}

\begin{verbatim}
Client:
  Ubuntu 14.04.2 LTS (64 bit)
  Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz
  One SSD SATA drive
  16GB of RAM
  1Gb Ethernet
\end{verbatim}

\subsection{HMB Performance Producer}
\label{conc-eval-hmbperformanceprod}

The HMB performance producer is a simple console application used for stress
tests in the broker system. It can be used to generate messages of a given size
and batch them into a single request. It then repeatedly calls the client
library function \lstinline{sendRequest} to pack, encode, and send the request. The following
code shows the simplified main function:

\begin{lstlisting}
import qualified System.Entropy as E

main = do 
    -- Socket setup 
    -- ....
    -- Get numberOfBytes and batchSize as variable input 
    -- ....

    randBytes <- E.getEntropy numberOfBytes 
    let batch = [randBytes | x <- [1..batchSize]]

    replicateM_ 1000000 (sendRequest sock req) 
    putStrLn "done produce"
    return ()

\end{lstlisting}

\subsection{Kafka Performance Producer}
\label{conc-eval-kafkaperformanceprod}

The \fnurl{Kafka Performance Producer}
{https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/tools/ProducerPerformance.java}
is a official product of Apache Kafka that systematically tests their broker
system. It is a wrapper around the original Kafka producer client for producing
as many messages as possible. It provides many \fnurl{configuration options}
{http://kafka.apache.org/documentation.html\#newproducerconfigs} for specific
benchmarks. 

The following listing shows the setup for the benchmarks of the thesis. Values
in brackets [ ] are placeholders for the relevant and variable parameters for the
benchmarks. The used version of Kafka is 2.10-0.8.2.0. 

\begin{verbatim}
Start Zookeeper
>bin/zookeeper-server-start.sh config/zookeeper.properties

Start Kafka Node 
> bin/kafka-server-start.sh config/server.properties

Create test topic [TopicName] on broker server with [IP], 
one partition, no replication: 
>bin/kafka-topics.sh --zookeeper [IP]:2181 
    --create --topic [TopicName] --partitions 1 --replication-factor 1

Start producing benchmark: 
> bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance 
    [TopiName] [NumberOfMessages] [MessageSize] -1 acks=1 
    bootstrap.servers=[IP]:[Port] buffer.memory=67108864 batch.size=[BatchSize]
\end{verbatim}


%\subsection{Test tools}
%\subsubsection{Wireshark}
%Wireshark is used to track tcp streams of ongoing network communication. For the
%To test network throughput and analyze transmitted tcp packaged the tool
%Wireshark is used. 

%\begin{figure}[H]
%    \centering
%    \includegraphics[width=0.85\textwidth]{images/benchmark/bench-1000-1-eth.png}
%    \caption{Example getting tcp throughput graph for specific benchmark test}
%\end{figure}

%\subsubsection{Iptraf-ng}

\subsection{Exposed Issues}

An early version of the broker provided a console client which produced messages
per console input. This setup demonstrated that encoding/decoding of a request
worked. Also, the different broker layers have done their job well.  Later in
the construction phase of the broker, tests were introduced that repeatedly sent
a large amount of requests in a short period of time. This kind of stress test
is very useful to uncover network related bugs and performance lacks. We kept
doing so until the end of the implementation phase.

When dealing with thousands of requests per second, a function with high complexity
can cause terrible bottlenecks. Another method to analyze Haskell code for
potentially slow functions is time and allocation profiling provided by GHC.

The following table lists some of the major issues that were exposed:

\begin{table}[H]
\begin{tabular}{|p{3cm}|p{4.5cm}|p{6cm}|}
\hline
{\bf Exposed Issue}                                                                                  & {\bf Cause}                                                                                                                                                                                                                 & {\bf Solution}                                                                                                                                                                                                                                                                                    \\ \hline
Dramatically slowdown after short period of time. \textbf{[Solved]}                                                        & TCP Zero Window - Socket receive buffer of broker full                                                                                                                                                                      & Broker application was slow in receiving, parsing, and handling incoming requests. Threading concept needs to be optimized (see \ref{sec:impl-broker-threading}). The process of parsing an incoming request needs to be swapped out to another thread, so that the network receive thread can reduce the socket buffer fast enough. \\ \hline
Parsing from socket error "not enough bytes" when testing over network
interface instead of loopback. \textbf{[Solved]} & The \lstinline{len} argument to the
\lstinline{recv} system call is merely the upper bound on the received length
(e.g. the size of the target buffer); the call may produce less data than asked
for. & Introduced function which checks if the exact amount of bytes is really
read. If not, the \lstinline{recv} call is repeated until all requested bytes are present (see details in \ref{sec:impl-broker-socket-receive}).                                                                                                                                              \\ \hline
Throughput drop when producing a larger request with nested list. \textbf{[Solved]}
& Issue in encoding a request called \lstinline{runPut} on each sub element and
appended to a whole.
& No appending of parsed sub elements. Restructure encode module by running
\lstinline{runPut} at latest point in time. This resulted in factor 2 of performance in building requests (see \ref{sec:impl-prot-encoding}).                                                                                                                                      \\ \hline
Slow performance in persisting messages to the log (file system). \textbf{[Solved]}                                & No state and batching at broker side. Each incoming message ended in multiple file accesses which slowed down the whole process.                                                                                                                                & Messages need to be summed up to larger chunks of bytes. After a defined period of time or amount of bytes, multiple messages need to be flushed to disk at once (see details of different approaches in \ref{sec:impl-broker-log}).                                                                                                                                      \\ \hline
Slow index search. \textbf{[Open]} & Base offset has to be determined for each lookup. As for now
this causes disk I/O which is obviously an overhead. & The directory and
its content should be read from disk just once and from then on work with states
(MVar) only. The state has to be passed forth and back between the API and Log
layer for each request but omits I/O by having the base offset in memory all the time. \\ \hline
Slow index insertion causes slowdown of the broker application. \textbf{[Open]} & As for now, the
file size of the log which needs to be stored within an index entry causes I/O.
& A more intuitive and significantly faster solution would be to maintain the log file
size within the \lstinline{LogState} (in-memory). Thus, the file size would have to be
increased by hand when appending a new log entry. \\ \hline
\end{tabular}
\end{table}
\captionof{table}{Uncovered issues in early stress tests}

\newpage
\subsection{Benchmark: Effect of Message Batch Size}
\label{sec:conc-benchmark-1}

The goal of this benchmark is to analyze the network throughput between producer
client and broker system.  The focus lies on the effect of changing the batch
size at producer client to the resulting transmission of bytes per second. The
batch size determines the amount of bytes in which messages are packed
together and sent to the broker as a single request for efficiency. Increasing the
batch size requires more memory. The size of a single message is fixed to 100
bytes because it is generally the more interesting case for a messaging system. It is much easier to get good throughput in MB/sec if the messages are
large (as shown in second benchmark, see \ref{sec:conc-benchmark-2}) but
harder to get good throughput when the messages are small, as the overhead of
processing each message dominates.

Calculation batch size in bytes:
\begin{verbatim}
Messages Batched * (Message Size + Message Overhead) + Request Overhead 
\end{verbatim}

\subsubsection{Conditions}
\begin{table}[H]
\begin{tabular}{|l| p{11.5cm}|} \hline
{\bf Message Size}   & 100 Bytes \\ \hline
{\bf Batch Size [B]} & 195 | 1387 | 8192 | 16384 (Kafka default) | 56769 Bytes \\ \hline
{\bf Tested Clients} &
    \begin{itemize}
        \item Kafka Performance Producer 2.10-0.8.2.0, single threaded
        \item HMB Producer, single threaded
    \end{itemize}\\ \hline
{\bf Tested Brokers} &
    \begin{itemize}
        \item Kafka Broker 2.10-0.8.2.0, no replication, one partition
        \item HMB Broker, no replication, one partition
    \end{itemize}\\ \hline
{\bf Measurement} & Resulting TCP throughput over a period of 10 seconds analyzed with
    Wireshark. Throughput includes message + request overhead\\ \hline
{\bf Scenarios} & Producing as much messages as possible from client (left) to broker (right).
    The size of messages batched together varies in defined steps.
    \begin{enumerate}
        \item Kafka Performance Producer $\rightarrow$ Kafka Broker
        \item Kafka Performance Producer $\rightarrow$ HMB Broker
        \item HMB Producer $\rightarrow$ Kafka Broker
        \item HMB Producer $\rightarrow$ HMB Broker
    \end{enumerate} \\ \hline
\end{tabular}
\end{table}
\captionof{table}{Benchmark conditions "Effect of Producer Batch size"}
\newpage
\subsubsection{Results}
The results are given below for each specific scenario. The graph shows the
TCP throughput in bytes in period of 10 to 20 seconds.

Scenario 1, Kafka $\rightarrow$ Kafka:

\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|} \hline
195 (1 Msg) & 1387 ($\sim$10 Msgs)& 8192 ($\sim$65 Msgs)\\ \hline
\includegraphics[width=0.25\textwidth]{images/benchmark-1/benchmark-1-s1-bs195.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-1/benchmark-1-s1-bs1387.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s1-bs8192.png} \\ \hline
\end{tabular}
\end{table}

\begin{table}[H]
\centering
\begin{tabular}{|l|l|} \hline
16384 ($\sim$130 Msgs)& 56769 ($\sim$450 Msgs)\\ \hline
\includegraphics[width=0.25\textwidth]{images/benchmark-1/benchmark-1-s1-bs16384.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-1/benchmark-1-s1-bs56769.png} \\ \hline
\end{tabular}
\end{table}

Scenario 2, Kafka $\rightarrow$ HMB:

\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|} \hline
195 (1 Msg) & 1387 ($\sim$10 Msgs)& 8192 ($\sim$65 Msgs)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s2-bs195.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-1/benchmark-1-s2-bs1387.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s2-bs8192.png} \\ \hline
\end{tabular}
\end{table}

\begin{table}[H]
\centering
\begin{tabular}{|l|l|} \hline
16384 ($\sim$130 Msgs)& 56769 ($\sim$450 Msgs)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s2-bs16384.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s2-bs56769.png} \\ \hline
\end{tabular}
\end{table}

\newpage
Scenario 3, HMB $\rightarrow$ Kafka:

\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|} \hline
195 (1 Msg) & 1387 ($\sim$10 Msgs)& 8192 ($\sim$65 Msgs)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s3-bs200.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-1/benchmark-1-s3-bs1400.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s3-bs8192.png} \\ \hline
\end{tabular}
\end{table}

\begin{table}[H]
\centering
\begin{tabular}{|l|l|} \hline
16384 ($\sim$130 Msgs)& 56769 ($\sim$450 Msgs)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s3-bs16384.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s3-bs56769.png} \\ \hline
\end{tabular}
\end{table}

Scenario 4, HMB $\rightarrow$ HMB:

\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|} \hline
195 (1 Msg) & 1387 ($\sim$10 Msgs)& 8192 ($\sim$65 Msgs)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s4-bs195.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-1/benchmark-1-s4-bs1387.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s4-bs8192.png} \\ \hline
\end{tabular}
\end{table}

\begin{table}[H]
\centering
\begin{tabular}{|l|l|} \hline
16384 ($\sim$130 Msgs)& 56769 ($\sim$450 Msgs)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s4-bs16384.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-1/benchmark-1-s4-bs56769.png} \\ \hline
\end{tabular}
\end{table}

\newpage
\subsubsection{Conclusion}
\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|l|l|}
\hline
{\bf Benchmark Variable} & \multicolumn{4}{c|}{{\bf Avg. Throughput {[}MB/s{]}}} \\ \hline \hline
Batch Size [Byte]        & Scenario 1       & Scenario 2       & Scenario 3   & Scenario 4   \\ \hline
195  (1 Msg)             & 9                & 8                & 10           & 9            \\ \hline
1387 ($\sim$10 Msgs)     & 24               & 12               & 11           & 16           \\ \hline
8192 ($\sim$65 Msgs)     & 68               & 54               & 13           & 15           \\ \hline
16384 ($\sim$130 Msgs)   & 80               & 32               & 14           & 16           \\ \hline
56769 ($\sim$450 Msgs)   & 98               & 48               & 16           & 17           \\ \hline
\end{tabular}
\end{table}
\captionof{table}{Results of benchmark "Effect of Producer Batch Size"}

The result of the benchmark demonstrates the positive effect to throughput
by increasing the producing batch size. The original Kafka setup (scenario 1)
shows a significant rise by increasing the batch size. However, the
HMB setup (scenario 4) cannot use batching to full capacity. Obviously, this
benchmark exposes the HMB producer to a major bottleneck. Using the Kafka
producer to the HMB broker (scenario 2) gives better results instead of
using the HMB producer. Scenario 3 confirms this suspicion. The handling
of requests on the HMB broker is quite efficient, but there is still a demand
for further optimizations to get the same performance of Kafka.

\subsubsection{Consequences}
During the benchmark, it was conspicuous that the HMB broker is
not using the full capacity of the CPU. While Kafka obviously uses
multi-threading, most of the load weighs on a single thread for HMB. This is why
there is only one API handler thread running at the time, which is definitely
a point to optimize in the future.

Another point in demand with optimization is the producer which manifests a
bottleneck in messages of 100 bytes (when the message size gets bigger, the
result is much better, see \ref{sec:conc-benchmark-1}). Because the provided
HMB producer is quite simple (actually just built to test), another task for
the future is to work off details for optimized producer clients which would
definitely lead to better results. 

\newpage
\subsection{Benchmark: Effect of Message size}
\label{sec:conc-benchmark-2}
The goal of this benchmark is to analyze the network throughput between
producer client and a broker system. The focus lies on the effect of changing the
size of a single message to the resulting transmission of bytes per
second. Persisting performance on the broker is not considered in
this benchmark.

\subsubsection{Conditions}
\begin{table}[H]
\begin{tabular}{|l| p{11.5cm}|} \hline
{\bf Message Size}   & 10 | 100 | 1000 | 10000 | 100000 Bytes \\ \hline
{\bf Batch Size}     & 12800 Bytes \\ \hline
{\bf Tested Clients} &
    \begin{itemize}
        \item Kafka Performance Producer 2.10-0.8.2.0, single threaded
        \item HMB Producer, single threaded
    \end{itemize}\\ \hline
{\bf Tested Brokers} &
    \begin{itemize}
        \item Kafka Broker 2.10-0.8.2.0, no replication, one partition
        \item HMB Broker, no replication, one partition
    \end{itemize}\\ \hline
{\bf Measurement} & Resulting TCP throughput over a period of 10 seconds analyzed with
    Wireshark. Throughput includes message + request overhead.\\ \hline
{\bf Scenarios} & Producing as much messages as possible from client (left) to broker (right).
    The size of a messages varies in defined steps. The batch size thereby is a fixed value. 
  \begin{enumerate}
        \item Kafka Performance Producer $\rightarrow$ Kafka Broker
        \item Kafka Performance Producer $\rightarrow$ HMB Broker
        \item HMB Produer $\rightarrow$ Kafka Broker
        \item HMB Produder $\rightarrow$ HMB Broker
    \end{enumerate} \\ \hline
\end{tabular}
\end{table}
\captionof{table}{Benchmark conditions "Effect of Message size"}

\newpage
\subsubsection{Results}
Scenario 1, Kafka $\rightarrow$ Kafka:

\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|} \hline
10 Bytes & 100 Byte & 1000 Byte (1 KB)\\ \hline
\includegraphics[width=0.28\textwidth]{images/benchmark-2/benchmark-2-s1-ms10.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s1-ms100.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-2/benchmark-2-s1-ms1000.png} \\ \hline
\end{tabular}
\end{table}

\begin{table}[H]
\centering
\begin{tabular}{|l|l|} \hline
10000 Byte (10 KB) & 100000 Byte (100 KB)\\ \hline
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s1-ms10000.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s1-ms100000.png} \\ \hline
\end{tabular}
\end{table}

Scenario 2, Kafka $\rightarrow$ HMB:

\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|} \hline
10 Bytes & 100 Byte & 1000 Byte (1 KB)\\ \hline
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s2-ms10.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s2-ms100.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s2-ms1000.png} \\ \hline
\end{tabular}
\end{table}

\begin{table}[H]
\centering
\begin{tabular}{|l|l|} \hline
10000 Byte (10 KB) & 100000 Byte (100 KB)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-2/benchmark-2-s2-ms10000.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s2-ms100000.png} \\ \hline
\end{tabular}
\end{table}

\newpage
Scenario 3, HMB $\rightarrow$ Kafka:

\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|} \hline
10 Bytes & 100 Byte & 1000 Byte (1 KB)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-2/benchmark-2-s3-ms10.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s3-ms100.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s3-ms1000.png} \\ \hline
\end{tabular}
\end{table}

\begin{table}[H]
\centering
\begin{tabular}{|l|l|} \hline
10000 Byte (10 KB) & 100000 Byte (100 KB)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-2/benchmark-2-s3-ms10000.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-2/benchmark-2-s3-ms100000.png} \\ \hline
\end{tabular}
\end{table}

Scenario 4, HMB $\rightarrow$ HMB:

\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|} \hline
10 Bytes & 100 Byte & 1000 Byte (1 KB)\\ \hline
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s4-ms10.png}
&
\includegraphics[width=0.25\textwidth]{images/benchmark-2/benchmark-2-s4-ms100.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-2/benchmark-2-s4-ms1000.png} \\ \hline
\end{tabular}
\end{table}

\begin{table}[H]
\centering
\begin{tabular}{|l|l|} \hline
10000 Byte (10 KB) & 100000 Byte (100 KB)\\ \hline
\includegraphics[width=0.2\textwidth]{images/benchmark-2/benchmark-2-s4-ms10000.png}
&
\includegraphics[width=0.2\textwidth]{images/benchmark-2/benchmark-2-s4-ms100000.png} \\ \hline
\end{tabular}
\end{table}

\newpage
\subsubsection{Conclusion}
\begin{table}[H]
\centering
\begin{tabular}{|l|l|l|l|l|}
\hline
{\bf Benchmark Variable} & \multicolumn{4}{c|}{{\bf Avg. Throughput {[}MB/s{]}}} \\ \hline
Message Size             & Scenario 1       & Scenario 2       & Scenario 3     & Scenario 4 \\ \hline
10 B                     & 85               & 50               & 48             & 48         \\ \hline
100 B                    & 70               & 35               & 15             & 15         \\ \hline
1000 B (1 KB)            & 80               & 75               & 50             & 55         \\ \hline
10000 B (10 KB)          & 95               & 75               & 100            & 100         \\ \hline
100000 B (100 KB)        & 105              & 100              & 105            & 100         \\ \hline
\end{tabular}
\end{table}

\captionof{table}{Results of benchmark "Effect of Message size"}

The throughput increases as the message size gets bigger. First of
all, why are tests with 10 bytes a message significantly faster than those with
100 bytes? It needs to be considered that the TCP throughput measurement takes
bytes of the messages as well as the message and request overhead which is
produced for transmitting the request. Each message has an overhead of 28 bytes
which, in this case, is 280 percent of the actual message size. Therefore, the
tests with 10 bytes are not really meaningful at the moment. As the message
size gets larger, the overhead can be disregarded.

The benchmark shows that the HMB broker and producer can deal with large
message sizes quite well! The resulting throughput of 10 and 100 KB messages
is nearly the same as the Kafka setup. As already detected in the first benchmark
(see \ref{sec:conc-benchmark-1}), the HMB producer has its troubles with messages
around 100 bytes.

\subsubsection{Consequences}
This benchmark has uncovered the responder thread as a bottleneck with larger
message sizes because the socket buffer overflows. The test results are
made without sending responses from the broker. At this point, an optimization
in the network layer is needed so that the socket buffers never get too large.

Furthermore, the consequences defined in first benchmark also apply in
this case.

%\subsection{Benchmark "Encode/Decode"}

%\subsection{Network Throughput}
%The socket based communication must not be a bottleneck in a high performance
%broker. A producer should be able to send with near the maximal throughput of
%its local network card, whereas the broker needs to handle the incoming data
%streams as fast as possible. If the limits of a single network connection is
%reached, the next step to improve performance is to balance the data
%stream on multiple replicated brokers.

%To test the network throughput for this thesis, a benchmark producer
%which identify the limits is provided.


%The variance seems to be due to Linux's I/O management facilities that batch
%data and then flush it periodically.
%-> Durchgeführte Tests 

%-> Resultat Network Throuput 

%-> Resultat Writing to the log 
%-> Vergleich Kafka (ref to Blog) 

%\begin{table}[h]
%\begin{tabular}{|c|c|c|c|}
%\hline
%{\bf Request Size (with ReqHeader)} & {\bf Message Size {[}Byte{]}} & {\bf Batch Size} & {\bf \begin{tabular}[c]{@{}c@{}}Avg. Throughput\\ {[}MB/s{]}\end{tabular}} \\ \hline
%150 B                               & 100                           & 1                & 27                                                                         \\ \hline
%1050 B                              & 100                           & 10               & 22                                                                         \\ \hline
%10.050 KB                           & 100                           & 100              & 30                                                                         \\ \hline
%100.050 KB                          & 100                           & 1000             & 32                                                                         \\ \hline
%1 MB                                & 100                           & 10000            & 40                                                                         \\ \hline
%550 B                               & 500                           & 1                & 66                                                                         \\ \hline
%1050 B                              & 500                           & 2                & 81                                                                         \\ \hline
%10.050 B                            & 500                           & 20               & 80                                                                         \\ \hline
%100.050 KB                          & 500                           & 200              & 96                                                                         \\ \hline
%1 MB                                & 500                           & 2000             & 68                                                                         \\ \hline
%1050 B                              & 1000                          & 1                & 104                                                                        \\ \hline
%10.050 B                            & 1000                          & 10               & 110                                                                        \\ \hline
%100.050 KB                          & 1000                          & 100              & 112                                                                        \\ \hline
%1 MB                                & 1000                          & 1000             & 87                                                                         \\ \hline
%10.050 B                            & 10000                         & 1                & 117                                                                        \\ \hline
%100.050 KB                          & 10000                         & 10               & *                                                                          \\ \hline
%1 MB                                & 10000                         & 100              & *                                                                          \\ \hline
%\end{tabular}
%\end{table}

\newpage
\section{Experiences with Haskell}
In this section we want to give a statement about the suitability of Haskell
as programming language regarding to the requirements of this work as well as
our own experiences during the thesis. 

\subsection{Assets}
{\bf Haskell type system}. Regarding to a protocol implementation one can take
advantage of the Haskell type system. The given grammar in the case of the
Apache Kafka protocol can be mapped with types very well. Instead of having a
big mesh of interfaces and classes as it would be in object oriented languages,
the mapping in Haskell will get to a compact and good readable code where
changes can be made easily. The power of Haskell types also manifests when
compiling a given code. The type checker subsequently tell any mismatches at
exactly point in the code. We felt very comfortable with this strong and
efficient ability of the GHC and refactoring was getting much easier. When we
started to implement the first prototype for the thesis we intuitively started
with the type base. Retrospectively it was the right decision because the
advantages mentioned above led to working code quite fast.

{\bf Compactness of code}. We had come to love with the functional programming
concepts Haskell provides, especially \textit{pattern matching}, \textit{list
comprehensions}, \textit{guards}, \textit{maps}, \textit{filters} or
\textit{lambdas}. Those functionalities allow programmers to write proper,
elegant and compact code which we think is definitely a benefit for every work.
Concepts like \textit{lamdba} are already present in many other languages (not
only functional ones). We are certain that we can use the learned abilities to
write proper code in any futur project.

{\bf Sources}. Simply searching answers to a specific Haskell problem in the
internet rarely leads to a specific answer. Although we think that the provided
information based from the Haskell community are still very good. One does just
have to know how use them. First of all,
\fnurl{Hoogle}{https://www.haskell.org/hoogle/}, as Haskell API search engine
was very useful to search the standard libraries by any functions. A very
helpful feature of Hoogle is searching by type signature (e.g.
\lstinline{(a->b) -> [a] -> [b]}) which allows to find functions without any
given name. Another tool is \fnurl{Hackage}{https://hackage.haskell.org/ } which
provides access to any open sourced cabal package from the Haskell community.
Each package includes detailed description about provided modules with their
functions and types and how to use it. When we came at the point where we needed
more information about a specific library or a problem, we acquired the
community directly via the \fnurl{Haskell
Subreddit}{https://www.reddit.com/r/haskell/} where we made very good
experiences.

\newpage
\subsection{Difficulties} {\bf Cabal Hell}. To properly separate the components
of protocol and broker
implementation  we used \fnurl{Haskell Cabal}{https://www.haskell.org/cabal/}.
It provides a common system for building a packaging Haskell libraries and
programs. Defining our components as single cabal packages leads to an
independent protocol implementation package which could be used in any other
project. So far so good. Unfortunately the well-known expression of
\textit{Cabal Hell} manifested its right. The fact that reinstalling a package
with \lstinline{cabal install} can break existing packages on a system was
very relevant to our project because we obviously worked on two dependent
packages simultaneously. We needed to reinstall packages all the time.
Fortunately cabal provides the \lstinline{sandbox} functionality, which detects
modifications on local depended packages and reinstalls them automatically.
Although this worked quite well, we still had a lot of cases in which our
packages broke anyway. Then we had to reinitialize the sandbox setup and
reinstall and compile all packages many times. Problems getting really serious
when we tried to reinstall any cabal package with profiling features. Best
experience we made by installing the \textit{cabal} command line tool according
to the official \fnurl{Github readme}
{https://github.com/haskell/cabal/tree/master/cabal-install}.

{\bf Lazy Evaluation}. Programming a server application with lots of I/O parts
often led us to think in sequential kind of manner, whereas we expected that
every line of code gets called exactly at the point where it appears in code.
That is the false thinking.  Actually due to Haskell's lazy evaluation ability,
a function is not called until the value is really needed (e.g. for print out
to console). As beginners in the functional paradigm this was a kind of a
difficulty to deal with. But thinking in terms of streams and laziness allows
programmers to write elegant and compact code which would be awkward in eager
language.


\newpage
\section{Outlook}

What are the next steps? As described in section \ref{sec:conc-results}, the
resulting broker implementation is not a finalized product--it is a prototype.
It shows that it is feasible to develop an Apache Kafka-like broker system
written in Haskell. An extendible and scalable implementation base and
architecture is provided. The provided benchmarks (\ref{sec:conc-benchmark-1}
and \ref{sec:conc-benchmark-2}) demonstrate good performance but also uncover
some major bottlenecks. We are certain that, with further work, one could build
the current version to an extraordinary broker system.

Summarized, the next tasks to do with the prototype involve:
\begin{enumerate}
    \item Facing uncovered bottlenecks (see benchmarks)
    \item Implementation and handling of remaining APIs
    \item Extending the log subsystem
    \item Work off more details for Client API
    \item Further optimizations
\end{enumerate}

After bringing the broker server to a stable and nearly feature complete
application, the next step is introducing broker replication. This task
is probably predestined for another thesis with a goal in analyzing
distributed replication in detail and working out ZooKeeper
integration.

The Haskell community is very vital and active. We experienced that at the
\fnurl{ZuriHac 2015}{https://wiki.haskell.org/ZuriHac2015} (Google Zurich,
29.05.15) and in the \fnurl{HaskellerZ Meetup}{http://www.meetup.com/de/HaskellerZ/}
(ETH Zurich, 30.04.2015) we participated in during the thesis. The results
of this work will be published open source through the \fnurl{GitHub
repositories}{https://github.com/hmb-ba}. The goal is to find contributors for
further development. For this reason, we will also present this work in a
later Meetup in summer 2015. As for now, the highlight remains the
protocol implementation that already has been praised by the Haskell community, where
contributors helped uncovering minor issues.