datarekha
Data Engineering · Hard · Asked at: Kafka, Flink, Google, Databricks

What does exactly-once processing mean in streaming pipelines, and how is it achieved?

The short answer

Exactly-once guarantees that each input record affects the output exactly one time — no duplicates from retries, no gaps from dropped messages. In practice it requires coordination between the messaging system, the processing engine, and the sink: the processor must checkpoint its position and write output atomically, so that a restart replays from the last checkpoint without re-emitting already-written results.

How to think about it

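Exactly-once is among the hardest guarantees to provide in a distributed pipeline, because the source, the processor, and the sink must all agree on what has been committed. Most teams settle for at-least-once delivery with idempotent sinks, which produces the same observable result as long as every write is keyed deterministically, and is far simpler to implement.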

The three delivery guarantees

Guarantee      Behavior on failure
At-most-once   Records may be lost; never duplicated. Fire and forget.
At-least-once  Records may be duplicated; never lost. Retry on failure.
Exactly-once   Records processed exactly one time, even across failures.

Why at-least-once is the realistic baseline

A consumer reads from Kafka, processes a record, writes to the sink, then commits the offset. If the consumer crashes after writing but before committing the offset, on restart it re-reads the same record and writes again — producing a duplicate. This is the default failure mode.
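
The failure described above can be reproduced with a small in-memory simulation. This is only a sketch: `run`, `commit_first`, and `crash_at` are hypothetical names, the "sink" is a Python list, and no Kafka client is involved. It shows that the order of the write and the offset commit decides whether a crash causes a duplicate or a loss.

```python
def run(records, commit_first, crash_at):
    """Process records with one crash, then restart from the committed offset.

    crash_at is the index of the record during which the consumer dies,
    between its two steps (write and commit, in whichever order).
    """
    sink, committed = [], 0

    def consume(start, crash):
        nonlocal committed
        for i in range(start, len(records)):
            if commit_first:
                committed = i + 1          # step 1: commit the offset
                if crash and i == crash_at:
                    return                 # died before writing
                sink.append(records[i])    # step 2: write to the sink
            else:
                sink.append(records[i])    # step 1: write to the sink
                if crash and i == crash_at:
                    return                 # died before committing
                committed = i + 1          # step 2: commit the offset

    consume(0, crash=True)             # first attempt dies partway through
    consume(committed, crash=False)    # restart resumes from the committed offset
    return sink


at_least_once = run(["a", "b", "c"], commit_first=False, crash_at=1)
at_most_once = run(["a", "b", "c"], commit_first=True, crash_at=1)
print(at_least_once)  # ['a', 'b', 'b', 'c']
print(at_most_once)   # ['a', 'c']
```

Writing before committing duplicates record "b"; committing before writing loses it. Neither ordering alone gives exactly-once, which is why the approaches below make the two steps atomic.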

Achieving exactly-once: the two-phase approach

Kafka Transactions + Idempotent Producer (Kafka-to-Kafka)

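Kafka 0.11+ supports transactional producers that atomically write output records and commit input offsets in a single transaction. If the producer dies mid-transaction, the transaction is aborted when a new instance with the same transactional ID initializes, or when the transaction times out. Downstream consumers must set isolation.level=read_committed; otherwise they also read records from aborted transactions.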

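from kafka import KafkaProducer  # kafka-python style; assumes a release with transaction support

producer = KafkaProducer(
    bootstrap_servers="broker:9092",
    transactional_id="pipeline-v1",  # stable ID: enables transactions and fences stale instances
)
producer.init_transactions()  # call once at startup, not once per batch

# transformed_record, offsets, and group_metadata come from the consume-and-transform loop.
producer.begin_transaction()
try:
    producer.send("output-topic", value=transformed_record)
    # Commit the consumed input offsets in the same transaction as the output.
    producer.send_offsets_to_transaction(offsets, group_metadata)
    producer.commit_transaction()
except Exception:
    producer.abort_transaction()  # output and offsets are discarded together
    raise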
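
To see why committing output and offsets together removes the duplicate, here is a toy model that runs without a broker. `ToyBroker` and `process` are hypothetical and are not part of any Kafka client; the only property being modelled is that staged output and the input offset become visible in one step, or not at all.

```python
class ToyBroker:
    """In-memory stand-in for a broker with transactions (not a Kafka client)."""

    def __init__(self):
        self.output = []        # committed output records
        self.offset = 0         # committed input offset
        self._pending = None    # staged (records, offset) for the open transaction

    def begin(self):
        self._pending = ([], self.offset)

    def send(self, record):
        self._pending[0].append(record)

    def send_offset(self, offset):
        self._pending = (self._pending[0], offset)

    def commit(self):
        records, offset = self._pending
        self.output.extend(records)   # output and offset become visible together
        self.offset = offset
        self._pending = None

    def abort(self):
        self._pending = None          # staged output and offset are both discarded


def process(broker, inputs, crash_at=None):
    """Consume-transform-produce, one transaction per record."""
    for i in range(broker.offset, len(inputs)):
        broker.begin()
        broker.send(inputs[i].upper())
        if i == crash_at:
            broker.abort()            # recovery aborts the unfinished transaction
            return
        broker.send_offset(i + 1)
        broker.commit()


broker = ToyBroker()
inputs = ["a", "b", "c"]
process(broker, inputs, crash_at=1)   # dies after sending "B", before committing
process(broker, inputs)               # restart resumes from the committed offset
print(broker.output)                  # ['A', 'B', 'C']
```

Record "b" is processed twice, but its first result was never committed, so the output contains it once.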

Flink Checkpointing + Two-Phase Commit Sink

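Flink periodically snapshots operator state and source offsets to durable storage (S3, HDFS). On restart, it restores that state and rewinds the sources to the offsets recorded in the last completed checkpoint. A two-phase commit sink (e.g., Flink’s Kafka sink, or a JDBC sink with pre-commit / commit hooks) writes output into a pending transaction, pre-commits it when the checkpoint is taken, and commits only after the checkpoint is confirmed. Output from an attempt that failed before its checkpoint completed is never committed, so replayed records do not become visible twice downstream.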
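
The interaction between checkpoints and a two-phase commit sink can be sketched as a toy model in plain Python. This is not Flink code: `TwoPhaseSink` and `run_job` are hypothetical names, and the sketch simplifies by treating checkpoint confirmation and the sink commit as a single step, whereas a real engine must also handle a crash between the two.

```python
class TwoPhaseSink:
    """Toy sink: records are buffered per checkpoint and published only on commit."""

    def __init__(self):
        self.visible = []          # what downstream readers can see
        self.current = []          # records written since the last checkpoint
        self.pre_committed = {}    # checkpoint_id -> records awaiting commit

    def write(self, record):
        self.current.append(record)

    def pre_commit(self, checkpoint_id):
        # Phase 1: runs when the checkpoint is taken.
        self.pre_committed[checkpoint_id] = self.current
        self.current = []

    def commit(self, checkpoint_id):
        # Phase 2: runs only after the checkpoint is confirmed.
        self.visible.extend(self.pre_committed.pop(checkpoint_id))

    def recover(self):
        # After a failure, output that was never committed is discarded.
        self.current = []
        self.pre_committed.clear()


def run_job(source, sink, checkpoint_every, crash_at=None, start=0):
    """Run from `start`; return the source offset of the last completed checkpoint."""
    checkpointed = start
    for i in range(start, len(source)):
        if i == crash_at:
            sink.recover()
            return checkpointed
        sink.write(source[i] * 10)
        if (i + 1) % checkpoint_every == 0:
            sink.pre_commit(i + 1)
            checkpointed = i + 1      # the source offset is now part of a checkpoint
            sink.commit(i + 1)
    return checkpointed


source = [1, 2, 3, 4, 5, 6]
sink = TwoPhaseSink()
offset = run_job(source, sink, checkpoint_every=2, crash_at=3)   # fails mid-interval
offset = run_job(source, sink, checkpoint_every=2, start=offset) # rewinds to offset 2
print(sink.visible)  # [10, 20, 30, 40, 50, 60]
```

Record 3 is processed in both attempts, but its first result sat in an uncommitted buffer when the job failed, so downstream sees it once.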

Idempotent sinks as a pragmatic substitute

If the sink supports upserts keyed by a deterministic message ID (e.g., a Snowflake MERGE on event_id), at-least-once delivery with an idempotent sink is equivalent to exactly-once from the consumer’s perspective. Duplicates arrive but are collapsed.

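MERGE INTO events AS t
USING (SELECT * FROM staging QUALIFY ROW_NUMBER() OVER (PARTITION BY event_id ORDER BY created_at) = 1) AS s
ON t.event_id = s.event_id  -- the subquery above keeps one row per event_id within the batch
WHEN NOT MATCHED THEN INSERT VALUES (s.event_id, s.payload, s.created_at);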
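
The same idea can be tried locally with Python's built-in sqlite3 module. This sketch swaps Snowflake's MERGE for SQLite's INSERT OR IGNORE, and the `event_id` helper, the `orders` source name, and the table layout are assumptions made for illustration. The key point is that the ID is derived from the input position, so a replayed record always maps to the same key.

```python
import hashlib
import sqlite3


def event_id(source, partition, offset):
    """Deterministic ID: the same input position always maps to the same key."""
    return hashlib.sha256(f"{source}:{partition}:{offset}".encode()).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, payload TEXT)")


def write(records):
    """Idempotent write: a replayed record hits the primary key and is ignored."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO events (event_id, payload) VALUES (?, ?)",
            records,
        )


batch = [(event_id("orders", 0, off), f"order-{off}") for off in range(3)]
write(batch)
write(batch[1:])   # at-least-once replay after a simulated crash
row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(row_count)   # 3
```

If the ID were generated at processing time instead (a random UUID, for example), the replayed rows would get new keys and the duplicates would not collapse.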
