Domain Kappa Services: patterns for stateful streaming services

In the previous post, we gave an introduction to stateful streaming microservices. We noted that some computations by their very nature require no state because there is no need to consider what happened before. For instance, if I want to know the current temperature, the service reads the thermometer sensor and returns the value. Other computations are stateful by their very nature. For instance, if I want to know how much the temperature changed in the past 10 minutes, the compute must know the previous temperature. That previous temperature is state.
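To make the distinction concrete, here is a minimal sketch in Java; readSensor() is a hypothetical stand-in for real sensor I/O, not part of any actual API.

```java
// Minimal sketch: stateless vs stateful computation.
class TemperatureService {

    // Stateless: the answer depends only on the current reading.
    double currentTemperature() {
        return readSensor();
    }

    // Stateful: the answer depends on a reading taken earlier,
    // so the service must remember it between calls. That memory is state.
    private double previousReading = Double.NaN;

    double changeSinceLastReading() {
        double current = readSensor();
        double delta = Double.isNaN(previousReading) ? 0.0 : current - previousReading;
        previousReading = current;
        return delta;
    }

    private double readSensor() {
        return 21.5; // placeholder for a real sensor call
    }
}
```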

A stateful service maintains the state within the service, that is, within the service container (e.g. a Docker instance) or within the service process (e.g. a JVM). In this post we describe patterns for implementing the state of streaming services. In particular, we introduce a pattern we call domain kappa services.

Domain Kappa Service

A domain service is a particular type of stateful service. We use domain here in the sense of Domain-Driven Design; we will dive deeper into domain services in another post. For our purposes, a domain is defined by a natural transaction boundary. Transactions are impractical to implement in a distributed system, so we choose to avoid them. A domain service can make all changes to an entity without forming a transaction with any other service. That is, a change to the entity can occur entirely within the service, without requesting dependent changes from other services.

A kappa service is a particular type of streaming service. By kappa we refer to the Kappa Architecture. A kappa service writes every change of state to an append-only, immutable log, that is, a stream. Since the service persists its state changes to a stream, it is possible to create any number of persisted views of the data in any number of read-optimized formats and assure their consistency.
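As a rough sketch, this is what appending every state change to a stream can look like with a plain Kafka producer. The class name, broker address, topic, and JSON payload are assumptions for illustration, not a prescribed implementation.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Sketch: every state change the compute makes is appended to a topic
// that acts as the service's immutable log.
public class StateChangeLog implements AutoCloseable {

    private final KafkaProducer<String, String> producer;
    private final String topic;

    public StateChangeLog(String bootstrapServers, String topic) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        this.producer = new KafkaProducer<>(props);
        this.topic = topic;
    }

    // Key by entity id so all changes to one entity stay in order on one partition.
    public void append(String entityId, String stateChangeJson) {
        producer.send(new ProducerRecord<>(topic, entityId, stateChangeJson));
    }

    @Override
    public void close() {
        producer.close();
    }
}
```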

A domain kappa service is the combination of these two patterns. All changes to a domain occur within a single service. Every change to the domain is written to a stream.

Note that by single service we do not mean a single instance. Distributed services must be able to run as multiple cooperating instances in order to be scalable.

Three components

When implemented as a domain kappa service, a stateful streaming service has three basic components: the data, the compute and the stream (or messaging).

  1. Data is the current state
  2. Compute changes the state
  3. Stream publishes the state changes

Only the compute of a domain kappa service can change the state of the domain. Every change in state in the domain is written to the stream.
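The sketch below shows one way the three components can fit together inside a single service class, reusing the hypothetical StateChangeLog from the earlier sketch. The order domain and every name in it are purely illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative skeleton of the three components of a domain kappa service.
public class OrderDomainService {

    // 1. Data: the current state, held by the service.
    private final Map<String, Order> orders = new HashMap<>();

    // 3. Stream: where every state change is published.
    private final StateChangeLog changeLog;

    public OrderDomainService(StateChangeLog changeLog) {
        this.changeLog = changeLog;
    }

    // 2. Compute: the only place where the state is allowed to change.
    public void handle(OrderCommand command) {
        Order updated = apply(orders.get(command.orderId()), command);
        orders.put(command.orderId(), updated);                  // change the data
        changeLog.append(command.orderId(), updated.toJson());   // publish the change
    }

    private Order apply(Order current, OrderCommand command) {
        return new Order(command.orderId()); // placeholder for real domain logic
    }

    record Order(String id) { String toJson() { return "{\"id\":\"" + id + "\"}"; } }
    record OrderCommand(String orderId) {}
}
```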

Three types of data

From the perspective of compute, there are three types of data:

  1. Observed data: The state of other services that the service consumes or reacts to. This includes events and reference data
  2. Working data: The state used by the compute to know or calculate its own state
  3. Observable data: The state of the service that other services observe

It is easy to confuse or conflate these three types of data. It is important to keep them separate in your thinking and design.

Data does not flow to the stream, nor does the stream flow back into the data, without passing through the compute.

Note that the observable state of one service is the observed state of another service. Streaming services are thus connected into a directed acyclic graph (DAG).

Three options for observed state

Depending on the compute latency requirements of the service, the developer can choose one of three different approaches to handling the observed state of other services:

  1. stateful
  2. stateless
  3. semi-stateless

In the stateful approach, the service maintains an in-memory copy of the observed state of the other service by subscribing to the stream of the other service. The service can read in the entire history from the stream. Alternatively, the copy can be initialized by first reading the observable state of the other service. This approach enables the lowest possible latency but requires the most resources.
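A minimal sketch of the stateful option, assuming the other service publishes its observable state to a Kafka topic named prices (topic, group id, and key/value types are all invented). The consumer replays the topic and keeps the latest value per key in memory.

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Stateful option: maintain a local, in-memory copy of another service's
// observable state by subscribing to its stream.
public class ObservedPrices implements Runnable {

    private final Map<String, String> pricesByProduct = new ConcurrentHashMap<>();
    private final KafkaConsumer<String, String> consumer;

    public ObservedPrices(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", "pricing-observer");
        props.put("auto.offset.reset", "earliest"); // read the entire history on first start
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        this.consumer = new KafkaConsumer<>(props);
        consumer.subscribe(List.of("prices"));
    }

    @Override
    public void run() {
        while (true) {
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                pricesByProduct.put(record.key(), record.value()); // latest value wins
            }
        }
    }

    // The compute reads the observed state from memory: lowest latency, most resources.
    public String priceOf(String productId) {
        return pricesByProduct.get(productId);
    }
}
```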

In the stateless approach, the service reads the state from the other service by request-response. This approach has higher latency but requires the least resources. In this approach, there is no duplicate copy of the observable state.
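A minimal sketch of the stateless option using the JDK HTTP client; the base URL and resource path are assumptions about the other service's API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Stateless option: fetch the other service's observable state on demand.
public class RemotePrices {

    private final HttpClient client = HttpClient.newHttpClient();
    private final String baseUrl; // e.g. "http://prices-service:8080"

    public RemotePrices(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    public String priceOf(String productId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create(baseUrl + "/prices/" + productId)).GET().build();
        // Higher latency per call, but no duplicate copy of the data is kept here.
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```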

The semi-stateless approach is a hybrid of the other two. The service subscribes to the stream of the other service and keeps a cache of the observable state. The cache is limited in size through timeouts. If a state is missing from the local cache, the service reads the observable state from the other service and caches it. The subscription to the stream ensures that the local cache is always current.
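A sketch of the semi-stateless option, reusing the hypothetical RemotePrices client from the previous sketch. The ten-minute TTL is arbitrary, and in practice the cache would also be bounded in size.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Semi-stateless option: a time-bounded cache kept current by the stream
// subscription and filled lazily from request-response on a miss.
public class CachedPrices {

    private record Entry(String price, Instant cachedAt) {}

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Duration ttl = Duration.ofMinutes(10);
    private final RemotePrices remote;

    public CachedPrices(RemotePrices remote) {
        this.remote = remote;
    }

    // Called by the stream consumer for every published change, so cached
    // entries never go stale between expirations.
    public void onStreamUpdate(String productId, String price) {
        cache.put(productId, new Entry(price, Instant.now()));
    }

    public String priceOf(String productId) throws Exception {
        Entry entry = cache.get(productId);
        if (entry == null || entry.cachedAt().plus(ttl).isBefore(Instant.now())) {
            String price = remote.priceOf(productId); // cache miss: read the observable state
            cache.put(productId, new Entry(price, Instant.now()));
            return price;
        }
        return entry.price();
    }
}
```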

The following table shows the trade-offs available to the developer for the three options.

option           elasticity                      data duplication   latency   resources
stateful         requires migration or restart   full               X         XXX
stateless        full                            none               XXX       X
semi-stateless   full                            transient          XX        XX

Three options for working state

Depending on the compute latency requirements of the service, the developer can choose one of three different approaches to handling its working state:

  1. local
  2. remote
  3. container affinity

With the local approach, the service keeps its state in compute memory. This approach has the lowest latency. Partitioning is critical to this approach, and the developer must account for failure recovery. Some compute frameworks, such as Kafka Streams, provide this recovery.
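As a sketch of the local option with Kafka Streams, the topology below keeps the latest reading per sensor in a local state store. Kafka Streams backs the store with a changelog topic, which is what enables the recovery mentioned above. The topic, store, and application names are invented.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

// Local option: the working state lives in a local Kafka Streams state store.
public class LatestReadingTopology {

    public static KafkaStreams build(String bootstrapServers) {
        StreamsBuilder builder = new StreamsBuilder();

        KTable<String, Double> latestBySensor = builder
            .stream("temperature-readings", Consumed.with(Serdes.String(), Serdes.Double()))
            .groupByKey()
            .reduce((previous, current) -> current,               // keep the latest reading
                Materialized.<String, Double, KeyValueStore<Bytes, byte[]>>as("latest-reading-store")
                    .withKeySerde(Serdes.String())
                    .withValueSerde(Serdes.Double()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "temperature-service");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        return new KafkaStreams(builder.build(), props);
    }
}
```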

In the remote approach, the service keeps its state in a remote database. This approach is the most elastic but has the highest latency. Note that service instances still need partitioning to assure the sequencing of events. However, there is no delay on failure because the compute does not have to recreate the state.
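A sketch of the remote option using plain JDBC; the table, columns, and the PostgreSQL-style upsert are assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Remote option: the working state is read from and written to a remote
// database on every event.
public class RemoteWorkingState {

    private final String jdbcUrl; // e.g. "jdbc:postgresql://state-db:5432/service"

    public RemoteWorkingState(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    // Read-modify-write per event: elastic (any instance can pick up any key),
    // but every event pays a network round trip to the database.
    public double applyReading(String sensorId, double reading) throws Exception {
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            double previous = Double.NaN;
            try (PreparedStatement select = conn.prepareStatement(
                    "SELECT reading FROM latest_reading WHERE sensor_id = ?")) {
                select.setString(1, sensorId);
                try (ResultSet rs = select.executeQuery()) {
                    if (rs.next()) previous = rs.getDouble(1);
                }
            }
            try (PreparedStatement upsert = conn.prepareStatement(
                    "INSERT INTO latest_reading (sensor_id, reading) VALUES (?, ?) "
                    + "ON CONFLICT (sensor_id) DO UPDATE SET reading = EXCLUDED.reading")) {
                upsert.setString(1, sensorId);
                upsert.setDouble(2, reading);
                upsert.executeUpdate();
            }
            return Double.isNaN(previous) ? 0.0 : reading - previous;
        }
    }
}
```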

In the container affinity approach, the service keeps its state in a database running in a container colocated with the compute's container. This approach is a hybrid of the other two. It has lower latency than a remote database because there are no network hops, and recovery is easier when the compute fails but the physical server does not. CrateIO is well suited to this hybrid approach.

Three options for observable state

The working state and the observable state are conceptually the same thing. Kafka Streams popularized this idea by allowing its KTables to be queried externally. Making the working state directly observable sounds attractive in its simplicity; however, it creates an inherent contention for resources between compute and query. Thus, we do not believe that in-memory observable state is a viable option.
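For reference, this is roughly what querying the working state directly looks like with Kafka Streams interactive queries, reusing the hypothetical store from the earlier topology. Every external query here lands on the same instance doing the compute, which is exactly the contention described above.

```java
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

// Querying the in-memory working state directly (the option we argue against).
public class DirectStateQuery {

    private final KafkaStreams streams;

    public DirectStateQuery(KafkaStreams streams) {
        this.streams = streams;
    }

    public Double latestReading(String sensorId) {
        ReadOnlyKeyValueStore<String, Double> store = streams.store(
            StoreQueryParameters.fromNameAndType(
                "latest-reading-store", QueryableStoreTypes.keyValueStore()));
        return store.get(sensorId); // external queries compete with the compute for this instance
    }
}
```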

The two other options, remote and container-affinity, are viable options for highly reliable enterprise streaming services. You may notice that these options follow the Command Query Responsibility Segregation (CQRS) pattern.

What is the third option? Simply that the working state and the observable state can be the same or different. For instance, a service may keep its working state in a remote database and use that same remote database as its observable state. Or, a service might keep its working state in memory and use a remote database for its observable state. A service might use the same container-affinity database for both its working and observable state, or it might have two container-affinity databases. The combinations are many.

What makes this flexibility of implementation possible is the Kappa architecture. The observable state and the working state are both instances of persisted views. Since the service persists its state changes to a stream, it is possible to create any number of persisted views of the data and assure their consistency.
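A sketch of one such persisted view: a projector that replays the service's change stream into a read-optimized table. The topic, group id, table, and column names are assumptions; any number of projectors like this can build different views from the same stream.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Persisted view: project the change stream into a read-optimized table (CQRS read side).
public class OrderViewProjector {

    public static void project(String bootstrapServers, String jdbcUrl) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", "order-view-projector");
        props.put("auto.offset.reset", "earliest"); // replay from the beginning to (re)build the view
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             Connection conn = DriverManager.getConnection(jdbcUrl)) {
            consumer.subscribe(List.of("order-changes"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                    try (PreparedStatement upsert = conn.prepareStatement(
                            "INSERT INTO order_view (order_id, body) VALUES (?, ?) "
                            + "ON CONFLICT (order_id) DO UPDATE SET body = EXCLUDED.body")) {
                        upsert.setString(1, record.key());
                        upsert.setString(2, record.value());
                        upsert.executeUpdate();
                    }
                }
            }
        }
    }
}
```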

In the next post, we will discuss how to manage the race conditions of distributed streaming services.

