Synchronize partitions by blp · Pull Request #5629 · feldera/feldera

blp · 2026-02-14T00:03:50Z

Describe Manual Test Plan

I didn't test this manually but I did write and run a new unit test for the feature.

Checklist

Unit tests added/updated
Integration tests added/updated
Documentation updated
Changelog updated


ryzhyk

This looks clean. One thing I couldn't figure out is how we make sure that some of the partition receivers don't buffer unbounded messages while waiting for stragglers.

ryzhyk · 2026-02-15T00:14:27Z

docs.feldera.com/docs/connectors/sources/kafka.md

+
+- If one or a few partitions have timestamps far behind the others, only
+  those partitions will be processed until all the old events are
+  processed.  (This is the flip side of the previous pitfall.)


Flink has a feature called source idleness detection to deal with this problem. We might need something similar.
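A minimal sketch of what idleness detection could look like here, assuming that synchronization releases records up to the smallest "latest timestamp seen" across partitions. `PartitionState`, `Frontier`, and `sync_frontier` are hypothetical names for illustration, not part of the connector. A partition that has delivered nothing for `idle_timeout` stops holding the others back.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-partition bookkeeping (not taken from the connector).
#[derive(Debug, Clone, Copy)]
pub struct PartitionState {
    /// Largest event timestamp seen on this partition, if any.
    pub max_timestamp: Option<i64>,
    /// When this partition last delivered a record.
    pub last_activity: Instant,
}

/// How far the synchronized input may advance.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Frontier {
    /// An active partition has not delivered any record yet.
    Blocked,
    /// Records with timestamps up to this value may be processed.
    UpTo(i64),
    /// Every partition is idle, so nothing constrains the others.
    Unbounded,
}

/// Computes the frontier over all partitions that are not idle.
pub fn sync_frontier(
    partitions: &[PartitionState],
    now: Instant,
    idle_timeout: Duration,
) -> Frontier {
    let mut frontier = Frontier::Unbounded;
    for p in partitions {
        // Idle partitions are skipped so that they cannot stall the rest.
        if now.saturating_duration_since(p.last_activity) >= idle_timeout {
            continue;
        }
        match (p.max_timestamp, frontier) {
            (None, _) => return Frontier::Blocked,
            (Some(ts), Frontier::UpTo(current)) => frontier = Frontier::UpTo(current.min(ts)),
            (Some(ts), _) => frontier = Frontier::UpTo(ts),
        }
    }
    frontier
}
```

The trade-off is that a partition declared idle may later deliver old events, which would then arrive behind the frontier.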

blp · 2026-02-15T16:32:14Z

> This looks clean. One thing I couldn't figure out is how we make sure that some of the partition receivers don't buffer unbounded messages while waiting for stragglers.

Hmm, I thought I had that figured out, but now that I look again, I was wrong.

I'll add a per-partition queuing limit.
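A minimal sketch of a per-partition queuing limit, assuming the limit is counted in records and that the reader can be asked to pause fetching for a single partition. `PartitionQueue` is a hypothetical type, not the one in this PR. The limit is soft: a record that has already arrived is always queued, and the return value tells the caller when to pause.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded queue for one partition.
pub struct PartitionQueue<T> {
    records: VecDeque<T>,
    limit: usize,
}

impl<T> PartitionQueue<T> {
    /// Creates a queue that asks for a pause once `limit` records are queued.
    pub fn new(limit: usize) -> Self {
        assert!(limit > 0, "a partition must be able to hold at least one record");
        Self { records: VecDeque::new(), limit }
    }

    /// Queues a record that was already received. Returns `true` when the
    /// caller should pause fetching for this partition.
    pub fn push(&mut self, record: T) -> bool {
        self.records.push_back(record);
        self.should_pause()
    }

    /// Removes the oldest record, if any.
    pub fn pop(&mut self) -> Option<T> {
        self.records.pop_front()
    }

    /// True while the queue is at or above its limit.
    pub fn should_pause(&self) -> bool {
        self.records.len() >= self.limit
    }
}
```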

ryzhyk · 2026-02-15T17:11:11Z

> > This looks clean. One thing I couldn't figure out is how we make sure that some of the partition receivers don't buffer unbounded messages while waiting for stragglers.
>
> Hmm, I thought I had that figured out, but now that I look again, I was wrong.
>
> I'll add a per-partition queuing limit.

Can this affect the connector even without this new feature, in the sense that there's nothing preventing partition readers from accumulating unbounded data?

blp · 2026-02-15T17:28:38Z

> > > This looks clean. One thing I couldn't figure out is how we make sure that some of the partition receivers don't buffer unbounded messages while waiting for stragglers.
> >
> > Hmm, I thought I had that figured out, but now that I look again, I was wrong.
> > I'll add a per-partition queuing limit.
>
> Can this affect the connector even without this new feature, in the sense that there's nothing preventing partition readers from accumulating unbounded data?

I took another look.

We account data as buffered as soon as it comes in on the partition reader thread, so the total amount buffered is limited by the buffer limit plus however long it takes the controller to tell us to pause. So this will limit the amount buffered in either case.

However, that means there's another complication: we need to make sure that every partition can buffer at least one record, even if the overall buffers are filled, because otherwise we can be full of buffered records that can't be input to the circuit.
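The "at least one record per partition" rule can be sketched as an admission check. This is an illustration only, assuming buffering is counted in records; `may_buffer` and its parameters are hypothetical names.

```rust
/// Decides whether a partition may buffer one more record.
///
/// An empty partition is always admitted, even when the overall budget is
/// exhausted, so that no partition is ever starved of a record.
pub fn may_buffer(partition_len: usize, total_buffered: usize, global_limit: usize) -> bool {
    partition_len == 0 || total_buffered < global_limit
}
```

Under this rule the total can exceed `global_limit` by at most one record per partition, because a partition is admitted past the limit only while its own queue is empty.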

Also, I think that the idea of records that are buffered but can't be input to the circuit will cause livelock in the controller, which will continuously try to start a step since we've told it that there is data buffered.

So, there might need to be further distinction between the modes:

- In synchronized mode, we need to do per-partition backpressure and we need to (somehow) account records as buffered only when they are available to be input to the circuit.
- In non-synchronized mode, we can use the existing strategies.
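One way to account only for records that are available to the circuit is sketched below. It assumes that timestamps never decrease within a partition and that a partition with an empty queue blocks everything. Both are simplifications, and `ready_count` is a hypothetical helper rather than code from this PR.

```rust
use std::collections::VecDeque;

/// Counts queued records that could be input to the circuit right now.
///
/// Each queue holds the timestamps of one partition's queued records, oldest
/// first. A record is ready when its timestamp is no later than the newest
/// queued timestamp of every partition.
pub fn ready_count(queues: &[VecDeque<i64>]) -> usize {
    let mut frontier = i64::MAX;
    for q in queues {
        match q.back() {
            Some(&newest) => frontier = frontier.min(newest),
            // An empty partition might still deliver older records.
            None => return 0,
        }
    }
    queues
        .iter()
        .map(|q| q.iter().take_while(|&&ts| ts <= frontier).count())
        .sum()
}
```

Reporting this count to the controller, instead of the total queued, would keep it from starting a step when nothing can be input.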

I'll continue to think about this today. Thanks for asking probing questions.

blp added 5 commits February 13, 2026 10:11

Fix new compiler warnings introduced in Rust 1.93.

6b0d1f0

Signed-off-by: Ben Pfaff <blp@feldera.com>

[pipeline-manager] Remove unused import.

844ddb0

Signed-off-by: Ben Pfaff <blp@feldera.com>

[adapters] Simplify initialization of mostly-default KafkaInputConfig.

0d2076b

This prepares for adding another configuration setting.

Signed-off-by: Ben Pfaff <blp@feldera.com>

[adapters] Refactor staging records in Kafka input connector.

cff5e14

The following commit will add another way to stage records. This commit makes that one easier to understand. This commit should not change any behavior.

Signed-off-by: Ben Pfaff <blp@feldera.com>

[adapters] Implement Kafka input synchronization across partitions.

79f410c

Issue: #5607

Signed-off-by: Ben Pfaff <blp@feldera.com>

blp requested a review from ryzhyk February 14, 2026 00:03

blp self-assigned this Feb 14, 2026

blp added the connectors, rust, and user-reported labels Feb 14, 2026

[ci] apply automatic fixes

2b46bf2

Signed-off-by: feldera-bot <feldera-bot@feldera.com>

ryzhyk reviewed Feb 15, 2026

blp wants to merge 6 commits into main from synchronize-partitions
