Batch-Capable Producer Bindings #2969

Closed
@carolmorneau

(Current Batching Pattern) Batch-Accumulator Producer Bindings

Currently, the common batching pattern for producer bindings is to accumulate messages at the producer binding, based on batchSize and batchTimeout configuration, and then bulk-send them to the target system. In a system where the consumer binding also operates in batch mode, this could look like the following:

image

The main challenges with the above are:

  • batch coordination between consumer and producer bindings is complicated. Both bindings need to operate on the same batch (aka work-unit) for acknowledgments to work. This gets complex on failure paths.
  • configuration for batchSize and batchTimeout exists at both bindings and needs to be consistent. It is hard to reason about the two timeouts, which are independent of each other.
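
To make the accumulator pattern concrete, here is a minimal sketch in plain Java. It is not taken from any binder: the class AccumulatingProducer, its methods, and the bulkSender callback are hypothetical names, and the timer that would call tick() is left out. Only batchSize and batchTimeout come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of a batch-accumulator producer; not a real binder API. */
public class AccumulatingProducer {
    private final int batchSize;                      // flush once this many messages are buffered
    private final long batchTimeoutMillis;            // flush once the oldest buffered message is this old
    private final Consumer<List<String>> bulkSender;  // stands in for the target system's bulk send
    private final List<String> buffer = new ArrayList<>();
    private long firstMessageAt = -1;

    public AccumulatingProducer(int batchSize, long batchTimeoutMillis,
                                Consumer<List<String>> bulkSender) {
        this.batchSize = batchSize;
        this.batchTimeoutMillis = batchTimeoutMillis;
        this.bulkSender = bulkSender;
    }

    /** Called once per message. The caller passes the current time to keep the sketch deterministic. */
    public synchronized void send(String payload, long nowMillis) {
        if (buffer.isEmpty()) {
            firstMessageAt = nowMillis;
        }
        buffer.add(payload);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Assumed to be called periodically by a timer (omitted here). */
    public synchronized void tick(long nowMillis) {
        if (!buffer.isEmpty() && nowMillis - firstMessageAt >= batchTimeoutMillis) {
            flush();
        }
    }

    private void flush() {
        bulkSender.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

In this sketch, send() can return before the message has reached the target system, so the producer's batch boundaries need not line up with the consumer's. That mismatch is where the acknowledgment coordination described above comes from.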

(Proposed Enhancement) Batch-Capable Producer Bindings

What if a batch could be delivered to the producer binding as a whole and be handled all at once?

image

In the above:

  1. Consumer binding accumulates a work-unit (a batch) based on its batchSize and batchTimeout configuration. It could be a partial batch.
  2. The batch travels through the functions as a whole.
  3. The batch is served to the producer binding as a whole.
  4. The producer binding detects that the message is a batch and processes it as a whole. Once the whole batch is successfully processed, the producer binding returns successfully. If anything goes wrong, an exception is thrown.
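
The steps above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: BatchFlowSketch, BulkTarget, produce and dispatch are hypothetical names and not Spring Cloud Stream APIs. Detecting a batch by checking for a List payload is likewise an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of the proposed flow; none of these types are Spring Cloud Stream APIs. */
public class BatchFlowSketch {

    /** Stand-in for the target system's bulk API. */
    public interface BulkTarget {
        void sendAll(List<String> payloads) throws Exception;
    }

    /** Step 4: handle the whole batch in one invocation; any failure surfaces as an exception. */
    public static void produce(Object payload, BulkTarget target) {
        List<String> batch = new ArrayList<>();
        if (payload instanceof List) {
            // Assumption for this sketch: a List payload means "this message is a batch".
            for (Object item : (List<?>) payload) {
                batch.add(String.valueOf(item));
            }
        } else {
            batch.add(String.valueOf(payload)); // a single message is handled as a batch of one
        }
        try {
            target.sendAll(batch);
        } catch (Exception e) {
            throw new IllegalStateException("Batch of " + batch.size() + " message(s) failed", e);
        }
    }

    /** Steps 2 and 3: the function maps the work-unit as a whole and hands it to the producer. */
    public static void dispatch(List<String> workUnit,
                                Function<List<String>, List<String>> function,
                                BulkTarget target) {
        produce(function.apply(workUnit), target);
    }
}
```

Because produce() either returns normally or throws, the caller can acknowledge or reject the entire work-unit based on that single outcome, with no batchSize or batchTimeout on the producer side.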

Benefits of this design:

  • no coordination between consumer and producer bindings required. The batch (or partial batch) that arrives at the producer binding is simply what needs to be processed for that invocation.
  • batch-related configuration only exists at the consumer binding. The consumer is the component responsible for creating a batch (work-unit).
  • supporting batching at the producer binding is somewhat simpler with this design:
    • full batch knowledge at every invocation
    • no batchSize and no batchTimeout to implement

It is worth noting that this design does not couple the producer binding to the consumer binding. The producer binding benefits from a batch already accumulated by the consumer binding, but it remains completely independent of the consumer binding.

Spring Defined Header For Batched Messages

Batched messages already exist in Spring Cloud Stream. Here is a sample batch message which is produced by the Solace consumer binding:

image

  • The payload is of type ArrayList (in this sample, the batch is of size 3)
  • Top level headers are common to the entire batch
  • The header with key solace_scst_batchedHeaders holds actual message headers for every message within the batch.

Batch messages would benefit from a Spring-defined header that is not binder-specific. Instead of solace_scst_batchedHeaders, a batched message could use scst_batchedHeaders, which would be defined by Spring. This header would standardize the batch message format and would tell producer bindings that the message is a batch and should be processed as such.
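
As a rough illustration of how a producer binding might use such a header, here is a plain-Java sketch that models headers as a Map. The key scst_batchedHeaders is the one proposed in this issue and is not defined by Spring today. The class BatchMessageSketch and its methods are hypothetical, and requiring exactly one header map per payload entry is an assumption based on the sample message described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the proposed batch message format; not a Spring API. */
public final class BatchMessageSketch {

    /** Binder-neutral header key proposed in this issue (not defined by Spring today). */
    public static final String BATCHED_HEADERS = "scst_batchedHeaders";

    private BatchMessageSketch() {
    }

    /** Builds top-level headers: the headers common to the batch plus one header map per message. */
    public static Map<String, Object> batchHeaders(Map<String, Object> common,
                                                   List<Map<String, Object>> perMessage) {
        Map<String, Object> headers = new HashMap<>(common);
        headers.put(BATCHED_HEADERS, new ArrayList<>(perMessage));
        return headers;
    }

    /**
     * Treats a message as a batch when the payload is a List and the batched-headers
     * entry holds one element per payload element (an assumption of this sketch).
     */
    public static boolean isBatch(Object payload, Map<String, Object> headers) {
        Object batched = headers.get(BATCHED_HEADERS);
        return payload instanceof List
                && batched instanceof List
                && ((List<?>) payload).size() == ((List<?>) batched).size();
    }

    /** Returns the headers of the message at the given position within the batch. */
    @SuppressWarnings("unchecked")
    public static Map<String, Object> headersAt(Map<String, Object> headers, int index) {
        return ((List<Map<String, Object>>) headers.get(BATCHED_HEADERS)).get(index);
    }
}
```

A producer binding could call isBatch() first and fall back to its single-message path when it returns false, so existing non-batch traffic would be unaffected.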

Final Note

Note that the proposed design can be implemented today without issue; however, it is not a design currently endorsed by Spring. The goal of this issue is to bless this design and provide standards so that it can be implemented in any binder in a consistent manner.
