What does exactly-once processing mean in streaming pipelines, and how is it achieved?
Exactly-once guarantees that each input record affects the output exactly one time — no duplicates from retries, no gaps from dropped messages. In practice it requires coordination between the messaging system, the processing engine, and the sink: the processor must checkpoint its position and write output atomically, so that a restart replays from the last checkpoint without re-emitting already-written results.
How to think about it
Exactly-once is the hardest correctness guarantee in distributed systems. Most teams settle for at-least-once delivery with idempotent sinks — which is functionally equivalent and far simpler to implement.
The three delivery guarantees
| Guarantee | Behavior on failure |
|---|---|
| At-most-once | Records may be lost; never duplicated. Fire and forget. |
| At-least-once | Records may be duplicated; never lost. Retry on failure. |
| Exactly-once | Records processed exactly one time, even across failures. |
Why at-least-once is the realistic baseline
A consumer reads from Kafka, processes a record, writes to the sink, then commits the offset. If the consumer crashes after writing but before committing the offset, on restart it re-reads the same record and writes again — producing a duplicate. This is the default failure mode.
Achieving exactly-once: the two-phase approach
Kafka Transactions + Idempotent Producer (Kafka-to-Kafka)
Kafka 0.11+ supports transactional producers that atomically write output records and commit input offsets in a single transaction. If the producer dies mid-transaction, the transaction is aborted on recovery.
producer = KafkaProducer(
bootstrap_servers="broker:9092",
transactional_id="pipeline-v1", # enables exactly-once
)
producer.init_transactions()
producer.begin_transaction()
producer.send("output-topic", value=transformed_record)
producer.send_offsets_to_transaction(offsets, group_metadata)
producer.commit_transaction()
Flink Checkpointing + Two-Phase Commit Sink
Flink periodically snapshots operator state and source offsets into durable storage (S3, HDFS). On restart, it rewinds to the last checkpoint. A two-phase commit sink (e.g., Flink’s Kafka sink or JDBC sink with pre-commit / commit hooks) holds output in a pending transaction until the checkpoint is confirmed, then commits — preventing re-emitted records from being visible downstream.
Idempotent sinks as a pragmatic substitute
If the sink supports upserts keyed by a deterministic message ID (e.g., a Snowflake MERGE on event_id), at-least-once delivery with an idempotent sink is equivalent to exactly-once from the consumer’s perspective. Duplicates arrive but are collapsed.
MERGE INTO events AS t USING staging AS s ON t.event_id = s.event_id
WHEN NOT MATCHED THEN INSERT VALUES (s.event_id, s.payload, s.created_at);