User Guide 1.0.0 beta

Frank Rosner edited this page Mar 10, 2015 · 4 revisions

User Guide for DDS 1.0.0-beta

Importing DDS

To use DDS in your Spark shell, add it to the classpath and import the DDS core functions. Select the jar that matches the Scala version your Spark was compiled with. Assuming your Spark is built with Scala 2.11, start the Spark shell with the following parameter and then import the core functionality inside the shell:

spark-shell --driver-class-path spawncamping-dds-1.0.0-beta_2.11.2.jar
import de.frosner.dds.core.DDS._

The Web UI

DDS comes with a lightweight web server that serves the results and charts to your browser. It pushes JSON objects to the JavaScript front-end that will then display them using HTML, CSS and SVG. The server needs to be started once after the Spark shell has loaded. It can be used for the entire session. However, you can stop and restart it as often as you like.

Starting the server

The server can be started by calling the start() function in the Spark shell. You can also specify the interface and port the server should listen on. To start the server listening on 192.168.0.5:8081, execute the following command:

start("192.168.0.5", 8081)

Stopping the server

The server can be stopped by calling the stop() function in the Spark shell.
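
A typical session might look like the sketch below. It uses only the start() and stop() calls described above. The address is the example one from this guide and would need to be replaced with an interface that exists on your machine.

```scala
// Run inside a Spark shell that was started with the DDS jar on the driver classpath.
import de.frosner.dds.core.DDS._

start()                      // start the server with its default interface and port
// ... call plotting functions and inspect the results in the browser ...
stop()                       // shut the server down

start("192.168.0.5", 8081)   // restart it later on an explicit interface and port
stop()
```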

Available Functions

To get a list of all available functions, use the help() function. Besides the functions based on RDDs, DDS offers a set of generic plotting functions that are not related to Spark. They can be used to plot charts or display tables in the browser from any data source you have.

Generic Plotting Functions

line

line[N](values: Seq[N])(implicit num: Numeric[N])

Prints a line chart with the given values and a default label.

lines

lines[N](labels: Seq[String], values: Seq[Seq[N]])(implicit num: Numeric[N])

Prints a line chart with multiple lines. The label labels(x) corresponds to the value sequence values(x).
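
As a minimal sketch of how line and lines might be called, assuming the DDS jar is on the classpath and the server has already been started with start(). The sample data and variable names are made up for illustration.

```scala
import de.frosner.dds.core.DDS._

// Made-up sample data: two measurement series of equal length.
val temperatures = List(18.5, 19.0, 21.5, 20.0)
val humidity = List(60.0, 58.0, 55.0, 57.0)

// Single series with a default label.
line(temperatures)

// Two series: labels(0) names values(0), labels(1) names values(1).
val labels = List("temperature", "humidity")
val values = List(temperatures, humidity)
lines(labels, values)
```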

bar

bar[N](values: Seq[N])(implicit num: Numeric[N])

Prints a bar chart with the given values and a default label.

bars

bars[N](labels: Seq[String], values: Seq[Seq[N]])(implicit num: Numeric[N])

Prints a bar chart with multiple bars. The label labels(x) corresponds to the value sequence values(x).
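
The bar functions take the same argument shapes as their line counterparts. A short sketch with integer values, again assuming a running server and using made-up data:

```scala
import de.frosner.dds.core.DDS._

// Made-up sample data: page views per weekday for two sites.
val siteA = List(120, 95, 130, 110, 150)
val siteB = List(80, 105, 90, 115, 100)

// One value sequence with a default label.
bar(siteA)

// One labelled value sequence per inner list.
val siteLabels = List("site A", "site B")
val siteValues = List(siteA, siteB)
bars(siteLabels, siteValues)
```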

pie

pie[K, V](keyValuePairs: Iterable[(K, V)])(implicit num: Numeric[V])

Prints a pie chart with one segment for each key-value pair in the input collection. The keys are assumed to be unique (e.g. values already reduced).
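
Because pie expects unique keys, raw data with repeated keys has to be reduced first. The sketch below does this on a local collection with plain Scala before calling pie. The data is made up, and a running server is assumed.

```scala
import de.frosner.dds.core.DDS._

// Made-up raw data in which the key "books" occurs twice.
val orders = List(("books", 12), ("music", 5), ("books", 8), ("games", 7))

// Reduce locally so that every key occurs exactly once.
val totals = orders
  .groupBy { case (category, _) => category }
  .map { case (category, group) => category -> group.map(_._2).sum }

// totals is a Map[String, Int], which is an Iterable[(String, Int)].
pie(totals)
```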

table

table(head: Seq[String], rows: Seq[Seq[Any]])

Prints a table with the given row-wise data. head(x) is the header of column x, i.e. the column that holds rows(0)(x), rows(1)(x), and so on.
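
A small sketch of a table call, assuming a running server. The names and values are invented, and every row has one entry per header column.

```scala
import de.frosner.dds.core.DDS._

val head = List("name", "city", "age")

// Row-wise data; the explicit type makes the mixed String/Int rows a Seq[Seq[Any]].
val rows: Seq[Seq[Any]] = List(
  List("Alice", "Berlin", 34),
  List("Bob", "Hamburg", 27)
)

table(head, rows)
```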

RDD Functions

show

show[V](rdd: RDD[V], sampleSize: Int = 20)(implicit tag: TypeTag[V])

Prints the first rows of the given RDD in a table. The optional second argument, sampleSize, determines the number of rows to show (20 by default). DDS can show RDDs of simple values (e.g. strings, numbers) or composite ones (collections, case classes). The values of the RDD need to have a type tag, i.e. custom classes need to be defined at the top level.
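
For example, an RDD of strings could be shown as sketched below. The sketch assumes a running server and the SparkContext sc that the Spark shell provides. The data is made up.

```scala
import de.frosner.dds.core.DDS._

// sc is the SparkContext provided by the Spark shell.
val sampleWords = List("spark", "scala", "dds", "chart", "table")
val words = sc.parallelize(sampleWords)

show(words)      // uses the default sampleSize of 20
show(words, 3)   // explicit sampleSize of 3
```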

summarize

summarize[N](numbers: RDD[N])(implicit num: Numeric[N])

Shows some basic summary statistics for the given numerical RDD.

groupAndSummarize

groupAndSummarize[K, N](readyToGroup: RDD[(K, N)])(implicit num: Numeric[N])

Shows some basic summary statistics for each of the groups defined by the given key. Each input row is assumed to be a key-value pair, where the key can be used for grouping.

summarizeGroups

summarizeGroups[K, N](grouped: RDD[(K, Iterable[N])])(implicit num: Numeric[N])

Shows some basic summary statistics for each of the given groups. It is assumed that there is one input row per group; such input is usually the result of a group-by operation on an RDD.
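
The three summary functions differ only in the shape of their input. The sketch below feeds the same made-up data to each of them, assuming a running server and the Spark shell's sc. Going by the descriptions above, the last two calls should describe the same groups.

```scala
import de.frosner.dds.core.DDS._

// Made-up sample data: (city, temperature) pairs.
val sampleReadings = List(
  ("Berlin", 21.5), ("Berlin", 19.0), ("Hamburg", 17.5), ("Hamburg", 18.0)
)
val readings = sc.parallelize(sampleReadings)

// Statistics over all values, ignoring the keys.
summarize(readings.map { case (_, temperature) => temperature })

// Ungrouped key-value pairs: DDS groups by key before summarizing.
groupAndSummarize(readings)

// Already grouped input: one row per key, as produced by Spark's groupByKey.
summarizeGroups(readings.groupByKey())
```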

groupAndPie

groupAndPie[K, N](readyToGroup: RDD[(K, N)])(reduceFunction: (N, N) => N)(implicit num: Numeric[N])

Computes a pie chart visualizing the numeric values per group. Each input row is assumed to be a key-value pair, where the key can be used for grouping. DDS will apply the given reduce function to the values of each group before plotting the reduced value in a segment.
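
To make the reduce step concrete, the following sketch mimics it on a plain Scala collection, without Spark or DDS. The helper reducePerGroup and the data are hypothetical and only illustrate the description above; the actual DDS implementation may differ.

```scala
// Hypothetical helper: reduces the values of each key on a local collection.
def reducePerGroup[K, N](pairs: Seq[(K, N)])(reduceFunction: (N, N) => N): Map[K, N] =
  pairs
    .groupBy { case (key, _) => key }
    .map { case (key, group) => key -> group.map(_._2).reduce(reduceFunction) }

// Made-up sample data with a repeated key.
val localSales = Seq(("apples", 3), ("pears", 2), ("apples", 4))

// One reduced value per key, i.e. one pie segment per key.
val segments = reducePerGroup(localSales)(_ + _)
```

With addition as the reduce function, the two "apples" rows collapse into a single value of 7, while "pears" keeps its value of 2.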

pieGroups

pieGroups[K, N](grouped: RDD[(K, Iterable[N])])(reduceFunction: (N, N) => N)(implicit num: Numeric[N])

Computes a pie chart visualizing the numeric values per group. It is assumed that there is one input row per group; such input is usually the result of a group-by operation on an RDD. DDS will apply the given reduce function to the values of each group before plotting the reduced value in a segment.
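
As a sketch of how the two pie functions relate, the calls below pass the same made-up data once ungrouped and once grouped with Spark's groupByKey. A running server and the Spark shell's sc are assumed.

```scala
import de.frosner.dds.core.DDS._

// Made-up sample data: (product, quantity) pairs.
val sampleSales = List(("apples", 3), ("pears", 2), ("apples", 4))
val sales = sc.parallelize(sampleSales)

// Ungrouped pairs: DDS groups by key and sums the quantities per product.
groupAndPie(sales)(_ + _)

// The same data grouped beforehand: one row per product.
pieGroups(sales.groupByKey())(_ + _)

// Any (N, N) => N function can be used, e.g. the largest quantity per product.
groupAndPie(sales)((a, b) => math.max(a, b))
```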