Basics
The following sections describe the concepts that make up StormCV. A basic understanding of the underlying platform Storm is required in order to fully understand it all. Please read Storm’s basic tutorial and concepts if you are not yet familiar with Storm Spouts, Bolts and Groupings.
StormCV differs from Storm in that its primary building blocks are Fetchers and Operations rather than Spouts / Bolts. StormCV topologies are created by defining and linking a number of Spouts / Bolts (as with normal Storm topologies) and specifying which Fetchers / Operations they should execute.
- The Model contains a number of Java objects which are sent between Spouts and Bolts. The model includes Frame, Feature and Descriptor classes which are (de)serialized to Storm Tuples automatically. Hence developers work with these objects rather than with Tuples.
- Fetchers are responsible for reading imaging data and are executed within Spouts.
- Operations take one or more objects from the model and produce zero or more output objects. They can for example receive a Frame and generate multiple Feature objects.
- Groupings are part of Storm and control the way data is routed through the platform.
The actual work is done by Fetchers and Operations rather than Spouts and Bolts. As a result StormCV itself contains only a small number of Spouts and Bolts and a far greater number of Fetchers and Operations.
A data model containing common computer vision (CV) primitives forms the core of StormCV; this model is used to hold information and pass it between spouts and bolts. The overview below is slightly simplified, see the javadoc for a full description of these elements.
- GroupOfFrames: wraps one or more frames in a single object and enables the platform to route a set of frames to the same destination.
- Frame: represents a single frame which contains zero or more Features and optionally an image.
- Feature: represents a named feature ("SIFT", "SURF" etc.), a duration in case it is a temporal feature, and either a dense descriptor or a set of (sparse) descriptors.
- Descriptor: represents a single descriptor as part of a Feature and contains a location indicating the spatial region the descriptor applies to and a float array with the actual values. For SIFT the descriptor typically holds the location of the keypoint and the 128 values of the SIFT descriptor.
Each class in the model has its own Serializer which enables the platform to convert them to Storm Tuples and back. Unless you want to extend the model (see the Advanced page) you do not need to worry about the serialization. Using this model it is possible to analyze a frame and add different features to it. A Frame, for example, can have two Features (with names like 'Face' and 'SIFT'), each containing multiple descriptors describing the faces and keypoints respectively. All operations in the platform take one or more of these objects as input and generate zero or more of these objects as output. The output is 'emitted' to the next step in the process (i.e. the next bolt in the Storm topology). The code below shows how frames, features and descriptors can be created and used.
// create a new frame with a resolution of 800x640 but without the actual image
Frame frame = new Frame("streamId", 0L, Frame.NO_IMAGE, new byte[]{}, 0L, new Rectangle(0, 0, 800, 640));

// add a Feature to the frame (in this case with name SIFT and no descriptors yet)
frame.getFeatures().add(new Feature("streamId", 0L, "SIFT", 40, null, null));

// add a SIFT descriptor to the feature for point (100, 100);
// the value array typically holds 128 floats for SIFT (shortened here)
frame.getFeatures().get(0).getSparseDescriptors().add(
    new Descriptor("streamId", 0L, new Rectangle(100, 100, 0, 0), 40, new float[]{0.12f, 0.45f, 0.0f /* ... */})
);

// add a Face Feature to the frame
frame.getFeatures().add(new Feature("streamId", 0L, "Faces", 40, null, null));

// add a detected face to the Face feature; faces have no descriptor values, only a bounding box
frame.getFeatures().get(1).getSparseDescriptors().add(
    new Descriptor("streamId", 0L, new Rectangle(300, 86, 123, 68), 40, new float[]{})
);
Note that each of these objects always contains two fields which are used for routing and sorting purposes:
- streamId: a String indicating which stream the object belongs to. Each unique video file or stream typically has its own streamId, which is how the platform knows which frames/features belong to the same video.
- sequenceNr: a Long value indicating the position of an object within the stream. Within StormCV this is usually the frame number.
These two pieces of information are always required and are used throughout the platform for grouping and sorting purposes. In addition, each object contains a Map&lt;String, Object&gt; which can be used to attach arbitrary metadata to it, for example the original URI a frame was taken from.
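The snippet below sketches how these fields are typically used. The accessor names (getStreamId, getSequenceNr, getMetadata) are assumptions based on the model description above; check the javadoc for the exact signatures.
// create a frame as before (no image payload)
Frame frame = new Frame("streamId", 0L, Frame.NO_IMAGE, new byte[]{}, 0L, new Rectangle(0, 0, 800, 640));
// the routing/sorting fields are available on every model object
String stream = frame.getStreamId();     // which video the object belongs to
long sequenceNr = frame.getSequenceNr(); // usually the frame number
// the metadata map carries arbitrary extra information, e.g. the source URI (hypothetical value)
frame.getMetadata().put("uri", "rtsp://example.org/stream1");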
The actual processing of video, consuming and producing objects from the model, is done by Fetchers and Operations, which are Java interfaces within StormCV. As the name suggests, Fetchers are responsible for reading imaging data and are executed within the Spout. StormCV is shipped with a number of Fetcher implementations:
- StreamFrameFetcher: reads video streams, extracts frames at certain intervals (defined by the frameSkip parameter) and emits them as Frame objects.
- FileFrameFetcher: reads video files from a remote source, extracts frames at certain intervals and emits them as Frame objects. FileConnector implementations are used to access remote sources. Currently the platform has connectors for the local filesystem (primarily used for testing) and for Amazon S3 buckets. It is possible to develop your own implementations and let Fetchers use them.
- ImageFetcher: reads image files from some source and emits them as Frame objects. This Fetcher uses FileConnectors as well.
- RefreshingImageFetcher: reads images from sources that publish continuously refreshing images (as some webcams do).
- FetchAndOperateFetcher: enables the use of an Operation within the spout. For example a StreamFrameFetcher can be combined with a ScaleImageOperation which will then also be executed within the spout. This minimizes data transfer from Spout to Bolt.
Frames are encoded as jpg images by default but the encoding can be specified in the StormCV Configuration. Each of these Fetcher implementations adds the URI a frame originates from to the metadata map within the frame (key = 'uri').
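As an illustration, the sketch below wires a StreamFrameFetcher into a topology. The CVParticleSpout wrapper and the frameSkip builder method are taken from the StormCV example topologies; treat the exact constructors as assumptions and consult the examples and javadoc.
// list of streams to read (hypothetical URL)
List<String> urls = new ArrayList<String>();
urls.add("rtsp://example.org/stream1");
TopologyBuilder builder = new TopologyBuilder();
// the Fetcher does the actual frame grabbing; the Spout merely wraps and executes it
builder.setSpout("fetcher", new CVParticleSpout(new StreamFrameFetcher(urls).frameSkip(25)), 1);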
Operations perform the actual analysis of data and are executed within Storm Bolts. An Operation takes one or more input objects from the model and produces zero or more output objects. An example is the GrayscaleOperation which takes a Frame object, converts its image to grayscale and emits the Frame to the next bolt. There are two different types of Operations which differ in the number of input elements they require:
- SingleInputOperations: do their work on one input (like the GrayscaleOperation).
- BatchOperations: require two or more input elements before they can perform their operation. An example is the OpticalFlowOperation which requires two subsequent frames.
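As a sketch of how a SingleInputOperation ends up in a topology (the SingleInputBolt wrapper name comes from the StormCV examples; verify the exact constructor in the javadoc), a GrayscaleOperation can be wrapped in a bolt and linked to the fetcher spout sketched earlier:
// one Operation executed within a Bolt, here with two parallel instances
builder.setBolt("grayscale", new SingleInputBolt(new GrayscaleOperation()), 2)
    .shuffleGrouping("fetcher");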
StormCV contains a number of common Computer Vision operations. The list below shows the most important ones. Please consult the javadoc for a full list and more documentation.
- ColorHistogramOperation: calculates the color histogram for a frame
- DrawFeaturesOperation: draws features into frames, used for testing
- FeatureExtractorOperation: extracts and describes features such as SIFT, SURF, DENSE SIFT etc.
- GrayscaleOperation: converts a frame to grayscale
- OpticalFlowOperation: calculates dense optical flow using two subsequent frames
- ScaleImageOperation: scales frames
- SequentialFrameOperation: a utility operation that can execute multiple Operations that work on a Frame object in sequence
- TilingOperation: splits frames into a configurable number of tiles; each tile gets a new streamId
BatchOperations are always preceded by a Batcher implementation which is responsible for holding on to the objects it has received until it can create the appropriate input for the subsequent BatchOperation. StormCV contains the following Batchers:
- DiscreteWindowBatcher: creates batches of input objects of a specified length. Once a batch is created it is completely removed from memory. This batcher typically creates batches of frames with frame numbers like [0,25], [50,75], [100,125].
- SlidingWindowBatcher: creates batches of input objects of a specified length. Only the first element of a created batch is removed from memory. This batcher typically creates batches of frames with frame numbers like [0,25], [25,50], [50,75].
- SequenceNrBatcher: creates batches of input objects with the same sequence number and is typically used to combine multiple features calculated on the same frame.
Batchers group the objects they get on some field within the object. Most of the time this grouping is done on the streamId to ensure information of the same stream is grouped and batched. Bolts containing a Batcher usually require a fieldsGrouping on the same field to ensure they receive all the data they need to create batches.
The OpticalFlowOperation for example is typically preceded by a SlidingWindowBatcher which creates batches of two subsequent frames (of the same stream). A schematic of this process is shown in the following figures (SingleInputOperation on the left and Batcher + BatchOperation on the right).
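In topology code this combination looks roughly as follows. The BatchInputBolt wrapper, the SlidingWindowBatcher constructor (window size and sequence delta) and the STREAMID field constant are assumptions based on the StormCV examples; this page's own grouping example uses FeatureSerializer.NAME in the same way.
// batches of 2 subsequent frames whose sequence numbers are 25 apart (matching the frameSkip);
// the fieldsGrouping on the stream id ensures one bolt instance sees all frames of a stream
builder.setBolt("optical_flow",
        new BatchInputBolt(new SlidingWindowBatcher(2, 25), new OpticalFlowOperation()), 2)
    .fieldsGrouping("grayscale", new Fields(FrameSerializer.STREAMID));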
It is possible to group Frames into a single object called GroupOfFrames and send this to bolts. When a GroupOfFrames is used it is possible to use a BatchOperation without a preceding Batcher. This method can be used to make the topology more scalable and robust since it enables the use of shuffleGroupings instead of fieldsGroupings. However, the use of GroupOfFrames typically costs more bandwidth because frames are sent multiple times.
The use of Operations makes the platform very flexible. A single bolt can contain one Operation but may also contain multiple Operations which are executed sequentially, each working on the output of the preceding one. See example 4 for how this can be done. This freedom, combined with the ability to perform batch operations, should provide developers the tools to build the topologies they need.
Groupings are part of Storm and control the way data is routed through the platform (also see the Storm documentation and/or Hortonworks' description). When a bolt has multiple instances the platform must choose which instance to send data to. Storm itself contains a number of groupings like shuffleGrouping and fieldsGrouping which can be used in StormCV as well. Fields used within the model can be requested from the Serializers, so a fieldsGrouping on the name field of a Feature can be specified like this: fieldsGrouping("spout", new Fields(FeatureSerializer.NAME));
StormCV contains a small number of custom groupings, see the javadoc for details:
- FeatureGrouping: can be used to receive only Feature objects with a specific name. If, for example, a Bolt produces different features but the next Bolt only wants 'SIFT', it can use this grouping. In essence the FeatureGrouping filters all Features it gets and only passes on the types it is configured for.
- ModGrouping: can be used to only receive objects with specific sequenceNrs. If a ModGrouping configured to mod on 10 is used on a stream with sequenceNrs {0,1,10,11,20,21} it will only allow the numbers for which number % 10 == 0, which are 0, 10 and 20. This is again a way to filter.
- SupervisedGrouping: creates a mapping from source tasks to target tasks when it is prepared and uses this mapping to route all tuples it gets. Hence this grouping is very rigid and can be used when the order of objects being pushed through a topology is very important.
- TileGrouping: a special grouping that can only be used when a Frame has been split into tiles (which get a new streamId) and the tiles must be recombined again.
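A hypothetical wiring sketch for one of these custom groupings is shown below; customGrouping is standard Storm API, but the bolt names and the FeatureGrouping constructor argument (the feature name) are assumptions, so check the javadoc before relying on them.
// only Feature objects named 'SIFT' emitted by the 'features' bolt reach this bolt
builder.setBolt("sift_drawer", new SingleInputBolt(new DrawFeaturesOperation()), 1)
    .customGrouping("features", new FeatureGrouping("SIFT"));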