Skip to content

Fundamentals

Charles d'Avernas edited this page Jul 31, 2023 · 5 revisions

Cloud Events

A CloudEvent is a standardized format for representing events in cloud environments. It serves as a common data structure that encapsulates information about an event, such as its type, source, data, and other metadata. The adoption of CloudEvents fosters interoperability among various cloud services and systems, enabling seamless event exchange and integration.

CloudStreams ingests and dispatches CloudEvents using the Structured Content Mode over the HTTP Protocol Binding.

To publish CloudEvents to CloudStreams, perform a valid CloudEvent HTTP POST request to a CloudStreams Gateway's ingestion endpoint, typically located at /api/gateway/v1/cloud-events/pub.

Sample: Sending a CloudEvent to CloudStreams
curl -XPOST -H "Content-type: application/cloudevents+json" -d '{
    "id": "123",
    "source": "https://cloud-streams.io",
    "type": "io.cloud-streams",
    "specversion": "1.1",
    "datacontenttype": "application/json",
    "subject": "foobar",
    "data":{
        "foo": "bar"
    }
}' '/api/gateway/v1/cloud-events/pub'

Streams

A stream is an ordered, append-only serie of consumed CloudEvents.

Streams can be read forward or backward, starting from the start of the stream (0), from a specific offset or from the end of the stream (-1).

CloudStreams stores CloudEvents into a single stream, ensuring they are securely captured and recorded, while providing comprehensive auditing capabilities. Additionally, the use of partitions within the single stream enables efficient event reading and querying based on different attributes such as source, type, subject, correlationId, or causationId (see: Partitions). This design empowers users to access specific sets of events with ease, facilitating streamlined event processing while maintaining full visibility into the complete event flow for auditing purposes.

Partitions

Upon ingestion, CloudStreams store CloudEvents in a global stream, ensuring reliable event capture, before automatically partitioning them based on attributes like source, type, subject, and optionally, correlation ID and causation ID.

Each partition represents a stream of references containing CloudEvents with shared attribute values, streamlining handling and enabling both scalable and responsive event-driven architectures.

CloudStreams partitions ingested cloud events into up to 5 different partition types:

  • by-source: a dedicated stream that allows consumers to read all events with a particular source context attribute.
  • by-type: a dedicated stream that allows consumers to read all events with a particular type context attribute.
  • by-subject: a dedicated stream that allows consumers to read all events with a particular subject context attribute.
  • by-correlation-id: a dedicated stream that allows consumers to read all events with a particular $correlationId metadata value.
  • by-causation-id: a dedicated stream that allows consumers to read all events with a particular $causationId metadata value.

Partitions are identified by a type (ex: by-source) and by an id, which is the value of the attribute or metadata the CloudEvents have been seggregated by (ex: https://my-fake-event-source.com).

Example

Given the following hypothetical CloudEvents:

id: '1'
source: https://cloud-streams.io
type: io.cloud-streams
specversion: '1.1'
datacontenttype: application/json
subject: foo
data:
  foo: bar
---
id: '2'
source: https://test.cloud-streams.io
type: io.cloud-streams
specversion: '1.1'
datacontenttype: application/json
subject: foo
data:
  foo: bar
---
id: '3'
source: https://cloud-streams.io
type: io.cloud-streams.test
specversion: '1.1'
datacontenttype: application/json
subject: foo
data:
  foo: bar
---
id: '4'
source: https://cloud-streams.io
type: io.cloud-streams
specversion: '1.1'
datacontenttype: application/json
subject: bar
data:
  foo: bar

And given they were published to a running Gateway, CloudStreams would store them in a single CloudEvents stream and would then create the following new partitions:

  • A partition of type by-source with id https://cloud-streams.io, referencing events 1, 3 and 4

  • A partition of type by-source with id https://test.cloud-streams.io, referencing event 2

  • A partition of type by-type with id io.cloud-streams, referencing events 1, 2 and 4

  • A partition of type by-type with id io.cloud-streams.test, referencing event 3

  • A partition of type by-subject with id foo, referencing events 1, 2 and 3

  • A partition of type by-subject with id bar, referencing event 4

Subscriptions

A CloudStreams Subscription is a powerful resource that empowers users to configure the dispatch of Cloud Events to consumers in a flexible and efficient manner. With a CloudStreams subscription, users can define precisely how they want to handle and process incoming events, tailoring the consumption experience to their specific requirements.

Partitioning

One of the key features of a CloudStreams subscription is the ability to partition events. Users can choose to consume only the events belonging to a specific partition. This partition-based consumption ensures that consumers receive relevant events based on specific attributes like source, type, subject, correlation ID, or causation ID.

By filtering events based on partitions, users can optimize their event-driven applications to focus on the data that matters most.

Sample: Subscribing to a specific partition
apiVersion: cloud-streams.io/v1
kind: Subscription
metadata:
  name: by-cloudstreams-source
spec:
  partition:
    type: by-source
    id: https://cloud-streams.io
  subscriber:
    uri: https://my-subscriber.com

The above subscription is configured to dispatch only CloudEvents that have their source context attribute set to https://cloud-streams.io

Note that a subscriber that does not define a partition will be dispatched all ingested cloud CloudEvents, thus possible its hindering performances.

Filtering

A CloudStreams subscription also allows users to apply filters to the events within a given partition. These filters act as criteria for selecting specific events that meet defined conditions.

By fine-tuning the event filtering process, users can ensure that only relevant events are dispatched to the consumers, reducing unnecessary processing overhead and maximizing the efficiency of their event processing pipeline.

Sample: Subscribing to all events then filtering them based on complex rules
apiVersion: cloud-streams.io/v1
kind: Subscription
metadata:
  name: by-cloudstreams-source
spec:
  filter:
    type: attribute
    attributes:
      foo: '(.*)bar' #filter supports regular expression, as opposed to partition
  subscriber:
    uri: https://my-subscriber.com

The above subscription is configured to consume all CloudEvents that have a foo context attribute with a value that contains bar.

Mutation

CloudStreams provides the flexibility of mutating the consumed events. Users can modify event data or metadata according to their business needs before dispatching them to consumers.

Event mutation allows for data enrichment, normalization, or any other custom transformations, enabling consumers to work with events in a format that best suits their downstream applications.

Sample: Subscribing all events and mutating them thanks to a runtime expression
apiVersion: cloud-streams.io/v1
kind: Subscription
metadata:
  name: by-cloudstreams-source
spec:
  mutation:
    type: expression
    expression: '${ . + { "bar": "baz" } }'
  subscriber:
    uri: https://my-subscriber.com

The above subscription is configured to consume all CloudEvents and to mutate them by appending a new bar context attribute before dispatch.

Streaming

The streaming capability of CloudStreams subscriptions enables the replaying of events from a given offset in the event stream or partition. This functionality is invaluable for scenarios where consumers need to reprocess events or replay events for debugging, auditing, or recovery purposes. The ability to stream events from a specific point in time ensures resilience and the ability to revisit past events as needed.

Sample: Subscribing all events and mutating them thanks to a runtime expression
apiVersion: cloud-streams.io/v1
kind: Subscription
metadata:
  name: by-cloudstreams-source
spec:
  stream:
    offset: 0
  subscriber:
    uri: https://my-subscriber.com

The above subscription is configured to consume all CloudEvents starting with the first ever received.

Note that a subscriber that does not define a stream will be dispatched ingested cloud CloudEvents as they arrive on CloudStreams.

Clone this wiki locally