-
Notifications
You must be signed in to change notification settings - Fork 0
Fundamentals
CloudStreams resources are resource-oriented architecture (ROA) data structures that describe the main components of the solution.
Similar to Kubernetes Custom Resources, CloudStreams resources enable users to introduce custom configurations and behaviors into the CloudStreams ecosystem without modifying the core system.
These custom resources serve as blueprints for different aspects of the CloudStreams application, allowing users to specify how events should be handled, processed, and routed within the system. Each CloudStreams resource represents a specific concept or functionality that can be tailored to fit the exact requirements of the user's event-driven architecture.
CloudStreams resources are designed to be declarative, meaning users describe the desired state of their event processing infrastructure, and CloudStreams handles the underlying implementation details to achieve that state. This abstraction allows users to focus on defining the higher-level functionalities they need, freeing them from managing the low-level complexities.
CloudStreams leverages the robust capabilities of Hylo to persist and manage resources effectively. Hylo provides essential abstractions based on the principles of Resource-Oriented Architecture (ROA), enabling seamless interactions with the underlying database, which can be easily interchangeable based on the user's preferences. This integration with Hylo ensures a flexible and efficient approach to resource storage and management within the CloudStreams ecosystem.
A CloudEvent is a standardized format for representing events in cloud environments. It serves as a common data structure that encapsulates information about an event, such as its type, source, data, and other metadata. The adoption of CloudEvents fosters interoperability among various cloud services and systems, enabling seamless event exchange and integration.
CloudStreams ingests and dispatches CloudEvents using the Structured Content Mode over the HTTP Protocol Binding.
To publish CloudEvents to CloudStreams, perform a valid CloudEvent HTTP
POST
request to a CloudStreams Gateway's ingestion endpoint, typically located at /api/gateway/v1/cloud-events/pub
.
Sample: Sending a CloudEvent to CloudStreams
curl -XPOST -H "Content-type: application/cloudevents+json" -d '{
"id": "123",
"source": "https://cloud-streams.io",
"type": "io.cloud-streams",
"specversion": "1.1",
"datacontenttype": "application/json",
"subject": "foobar",
"data":{
"foo": "bar"
}
}' '/api/gateway/v1/cloud-events/pub'
A stream is an ordered, append-only serie of consumed CloudEvents.
Streams can be read forward or backward, starting from the start of the stream (0
), from a specific offset or from the end of the stream (-1
).
CloudStreams stores CloudEvents into a single stream, ensuring they are securely captured and recorded, while providing comprehensive auditing capabilities.
Additionally, the use of partitions within the single stream enables efficient event reading and querying based on different attributes such as source
, type
, subject
, correlationId
, or causationId
(see: Partitions). This design empowers users to access specific sets of events with ease, facilitating streamlined event processing while maintaining full visibility into the complete event flow for auditing purposes.
Upon ingestion, CloudStreams store CloudEvents in a global stream, ensuring reliable event capture, before automatically partitioning them based on attributes like source, type, subject, and optionally, correlation ID and causation ID.
Each partition represents a stream of references containing CloudEvents with shared attribute values, streamlining handling and enabling both scalable and responsive event-driven architectures.
CloudStreams partitions ingested cloud events into up to 5 different partition types:
-
by-source
: a dedicated stream that allows consumers to read all events with a particularsource
context attribute. -
by-type
: a dedicated stream that allows consumers to read all events with a particulartype
context attribute. -
by-subject
: a dedicated stream that allows consumers to read all events with a particularsubject
context attribute. -
by-correlation-id
: a dedicated stream that allows consumers to read all events with a particular$correlationId
metadata value. -
by-causation-id
: a dedicated stream that allows consumers to read all events with a particular$causationId
metadata value.
Partitions are identified by a type
(ex: by-source
) and by an id
, which is the value of the attribute or metadata the CloudEvents have been seggregated by (ex: https://my-fake-event-source.com
).
Given the following hypothetical CloudEvents:
id: '1'
source: https://cloud-streams.io
type: io.cloud-streams
specversion: '1.1'
datacontenttype: application/json
subject: foo
data:
foo: bar
---
id: '2'
source: https://test.cloud-streams.io
type: io.cloud-streams
specversion: '1.1'
datacontenttype: application/json
subject: foo
data:
foo: bar
---
id: '3'
source: https://cloud-streams.io
type: io.cloud-streams.test
specversion: '1.1'
datacontenttype: application/json
subject: foo
data:
foo: bar
---
id: '4'
source: https://cloud-streams.io
type: io.cloud-streams
specversion: '1.1'
datacontenttype: application/json
subject: bar
data:
foo: bar
And given they were published to a running Gateway, CloudStreams would store them in a single CloudEvents stream and would then create the following new partitions:
-
A partition of type
by-source
with idhttps://cloud-streams.io
, referencing events1
,3
and4
-
A partition of type
by-source
with idhttps://test.cloud-streams.io
, referencing event2
-
A partition of type
by-type
with idio.cloud-streams
, referencing events1
,2
and4
-
A partition of type
by-type
with idio.cloud-streams.test
, referencing event3
-
A partition of type
by-subject
with idfoo
, referencing events1
,2
and3
-
A partition of type
by-subject
with idbar
, referencing event4
A CloudStreams Subscription
is a powerful resource that empowers users to configure the dispatch of Cloud Events to consumers in a flexible and efficient manner. With a CloudStreams subscription, users can define precisely how they want to handle and process incoming events, tailoring the consumption experience to their specific requirements.
One of the key features of a CloudStreams subscription is the ability to partition events. Users can choose to consume only the events belonging to a specific partition. This partition-based consumption ensures that consumers receive relevant events based on specific attributes like source, type, subject, correlation ID, or causation ID.
By filtering events based on partitions, users can optimize their event-driven applications to focus on the data that matters most.
Sample: Subscribing to a specific partition
apiVersion: cloud-streams.io/v1
kind: Subscription
metadata:
name: by-cloudstreams-source
spec:
partition:
type: by-source
id: https://cloud-streams.io
subscriber:
uri: https://my-subscriber.com
The above subscription is configured to dispatch only CloudEvents that have their source
context attribute set to https://cloud-streams.io
Note that a subscriber that does not define a partition will be dispatched all ingested cloud CloudEvents, thus possible its hindering performances.
A CloudStreams subscription also allows users to apply filters to the events within a given partition. These filters act as criteria for selecting specific events that meet defined conditions.
By fine-tuning the event filtering process, users can ensure that only relevant events are dispatched to the consumers, reducing unnecessary processing overhead and maximizing the efficiency of their event processing pipeline.
Sample: Subscribing to all events then filtering them based on complex rules
apiVersion: cloud-streams.io/v1
kind: Subscription
metadata:
name: by-cloudstreams-source
spec:
filter:
type: attribute
attributes:
foo: '(.*)bar' #filter supports regular expression, as opposed to partition
subscriber:
uri: https://my-subscriber.com
The above subscription is configured to consume all CloudEvents that have a foo
context attribute with a value that contains bar
.
CloudStreams provides the flexibility of mutating the consumed events. Users can modify event data or metadata according to their business needs before dispatching them to consumers.
Event mutation allows for data enrichment, normalization, or any other custom transformations, enabling consumers to work with events in a format that best suits their downstream applications.
Sample: Subscribing all events and mutating them thanks to a runtime expression
apiVersion: cloud-streams.io/v1
kind: Subscription
metadata:
name: by-cloudstreams-source
spec:
mutation:
type: expression
expression: '${ . + { "bar": "baz" } }'
subscriber:
uri: https://my-subscriber.com
The above subscription is configured to consume all CloudEvents and to mutate them by appending a new bar
context attribute before dispatch.
The streaming capability of CloudStreams subscriptions enables the replaying of events from a given offset in the event stream or partition. This functionality is invaluable for scenarios where consumers need to reprocess events or replay events for debugging, auditing, or recovery purposes. The ability to stream events from a specific point in time ensures resilience and the ability to revisit past events as needed.
Sample: Subscribing all events and mutating them thanks to a runtime expression
apiVersion: cloud-streams.io/v1
kind: Subscription
metadata:
name: by-cloudstreams-source
spec:
stream:
offset: 0
subscriber:
uri: https://my-subscriber.com
The above subscription is configured to consume all CloudEvents starting with the first ever received.
Note that a subscriber that does not define a stream will be dispatched ingested cloud CloudEvents as they arrive on CloudStreams.