Overview
Event-driven communication allows components to react to facts or messages without requiring the producer and every consumer to be available at the same time. A broker, queue, topic, or event stream carries messages between independently deployed components.
An event describes something that already happened:
OrderPlaced
PaymentAuthorized
InventoryAdjusted
An asynchronous command requests work:
ReserveInventory
GenerateReport
SendWelcomeEmail
Asynchronous request-reply extends one-way messaging when a caller needs an eventual outcome. The caller submits work, receives an acknowledgement and correlation identifier, and obtains the result later through polling, a callback, a reply queue, a webhook, or a real-time channel.
Benefits include:
- Temporal decoupling.
- Load leveling.
- Independent scaling.
- Failure isolation.
- Extensibility through new subscribers.
- Support for long-running operations.
Costs include:
- Eventual consistency.
- Duplicate and out-of-order delivery.
- More difficult debugging.
- Contract evolution.
- Broker operations.
- Explicit retry, dead-letter, timeout, and cancellation behavior.
This topic matters in interviews because distributed systems rarely provide simple exactly-once, globally ordered behavior. Candidates must explain the message contract, ownership, delivery semantics, failure states, and user experience rather than saying only that a queue makes a system scalable.
Core Concepts
Events, Commands, and Messages
A message is the transport envelope.
A command asks one logical owner to perform an action:
public sealed record ReserveInventory(
    Guid CommandId,
    Guid OrderId,
    IReadOnlyList<ReservationLine> Lines);
An event reports a completed fact:
public sealed record InventoryReserved(
    Guid EventId,
    Guid OrderId,
    DateTimeOffset OccurredAt);
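Useful semantic differences:
- A command is named in the imperative (ReserveInventory), is addressed to one logical owner, and requests work that has not happened yet.
- An event is named in the past tense (InventoryReserved), reports a fact that has already happened, and can be consumed by any number of subscribers.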
Do not disguise commands as events to create the appearance of loose coupling.
Queue, Publish-Subscribe, and Event Stream
Queue
- Multiple consumers can compete for work.
- One logical consumer processes each message.
- Suitable for commands, jobs, and load leveling.
Publish-subscribe topic
- Each subscription receives a copy.
- Suitable for notifying independent consumers of events.
- Subscribers can have different retry and filtering policies.
Event stream
- Events are retained in an ordered log.
- Consumers track their own position.
- Replay and late consumers are supported.
- Ordering is normally scoped to a partition.
Choose based on delivery and retention needs, not product naming.
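The difference between queue and topic delivery can be illustrated with a minimal in-memory sketch. The `InMemoryQueue` and `InMemoryTopic` classes are hypothetical stand-ins written for this example; they model only who receives a message, not durability, acknowledgement, or retries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a queue: each message goes to one receiver.
public sealed class InMemoryQueue
{
    private readonly Queue<string> _messages = new();

    public void Send(string message) => _messages.Enqueue(message);

    // Competing consumers call this; a message is handed to exactly one caller.
    public bool TryReceive(out string? message) => _messages.TryDequeue(out message);
}

// Hypothetical in-memory stand-in for a topic: each subscription gets a copy.
public sealed class InMemoryTopic
{
    private readonly Dictionary<string, Queue<string>> _subscriptions = new();

    public void Subscribe(string name) => _subscriptions.TryAdd(name, new Queue<string>());

    // Every existing subscription receives its own copy of the message.
    public void Publish(string message)
    {
        foreach (var queue in _subscriptions.Values)
        {
            queue.Enqueue(message);
        }
    }

    public bool TryReceive(string subscription, out string? message)
    {
        message = null;
        return _subscriptions.TryGetValue(subscription, out var queue)
            && queue.TryDequeue(out message);
    }
}
```

With one message sent to the queue, only the first receiver gets it. With one message published to a topic that has "email" and "search" subscriptions, both subscriptions can receive it independently.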
Temporal and Spatial Decoupling
Synchronous call:
Service A must know Service B's address.
Service B must be available now.
A waits for B's response.
Brokered message:
A publishes to a stable destination.
B processes later.
A and B scale independently.
This reduces temporal coupling but introduces dependency on:
- The broker.
- Message contracts.
- Destination configuration.
- Eventual processing.
- Operational recovery.
The producer and consumer are still semantically coupled through the meaning of the message.
Event Notification Versus Event-Carried State
A small notification can contain identifiers:
{
  "type": "OrderPlaced",
  "orderId": "9af2...",
  "version": 7
}
The consumer fetches required data from the authoritative service.
Benefits:
- Small stable contracts.
- Less duplicated data.
- Current data can be fetched.
Costs:
- Extra synchronous dependency.
- The data may change before retrieval.
- Replay may not reproduce historical state.
An event-carried state transfer includes consumer-relevant values:
{
  "type": "OrderPlaced",
  "orderId": "9af2...",
  "customerId": "c103...",
  "total": 149.50,
  "currency": "USD",
  "version": 7
}
This improves consumer autonomy but increases payload, privacy, duplication, and contract-evolution concerns.
Event Envelope
A consistent envelope can include:
public sealed record IntegrationEvent<T>(
    Guid MessageId,
    string Type,
    int SchemaVersion,
    DateTimeOffset OccurredAt,
    string CorrelationId,
    string? CausationId,
    string Producer,
    T Data);
Common metadata:
- Unique message ID.
- Event or command type.
- Schema version.
- Occurrence time.
- Correlation ID.
- Causation ID.
- Tenant or partition key when appropriate.
- Trace context.
- Producer identity.
Avoid placing secrets or unnecessary personal data in headers or payloads.
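One way to keep correlation and causation metadata consistent is to create follow-up messages from the message that triggered them. The sketch below repeats the envelope record from above so it is self-contained; the `Envelope` class and its `Start` and `Follow` methods are hypothetical helpers, and the fixed schema version of 1 is an assumption made for brevity.

```csharp
using System;

public sealed record IntegrationEvent<T>(
    Guid MessageId,
    string Type,
    int SchemaVersion,
    DateTimeOffset OccurredAt,
    string CorrelationId,
    string? CausationId,
    string Producer,
    T Data);

// Hypothetical helpers; not part of any library.
public static class Envelope
{
    // Starts a new workflow: fresh correlation ID and no causing message.
    public static IntegrationEvent<T> Start<T>(string type, string producer, T data) =>
        new IntegrationEvent<T>(
            Guid.NewGuid(), type, 1, DateTimeOffset.UtcNow,
            CorrelationId: Guid.NewGuid().ToString(),
            CausationId: null,
            Producer: producer,
            Data: data);

    // Continues a workflow: keeps the correlation ID and records the
    // triggering message ID as the causation ID.
    public static IntegrationEvent<TNext> Follow<TPrev, TNext>(
        IntegrationEvent<TPrev> cause, string type, string producer, TNext data) =>
        new IntegrationEvent<TNext>(
            Guid.NewGuid(), type, 1, DateTimeOffset.UtcNow,
            CorrelationId: cause.CorrelationId,
            CausationId: cause.MessageId.ToString(),
            Producer: producer,
            Data: data);
}
```

Every message in a workflow then shares one correlation ID, while the causation IDs form a chain that shows which message triggered which.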
Delivery Semantics
Common delivery models are:
At most once
- A message may be lost.
- It is not redelivered.
At least once
- The broker retries when acknowledgement is uncertain.
- A message can be processed more than once.
- Consumers must be idempotent.
Exactly once
- Usually scoped to a broker operation or tightly defined transactional boundary.
- Does not automatically make external side effects exactly once.
If a consumer commits to a database and crashes before acknowledging the broker, redelivery is expected. Design for it.
Acknowledgement and Settlement
Typical consumer flow:
receive message
validate envelope
perform local transaction
commit durable state
acknowledge message
Acknowledging before durable work risks message loss. Acknowledging after work permits duplicate delivery when the acknowledgement is lost.
Broker settlement options commonly include:
- Complete or acknowledge.
- Abandon or retry.
- Defer.
- Dead-letter.
The handler should classify transient failures, permanent validation failures, and business rejections deliberately.
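The classification can be sketched as a mapping from failure type to settlement. The exception types and the `SettlementPolicy` class are hypothetical names introduced for this example. Completing a business rejection is an assumed policy: it treats the rejection as a valid outcome that the handler has already recorded.

```csharp
using System;

public enum Settlement { Complete, Abandon, DeadLetter }

// Hypothetical exception types used only for this illustration.
public sealed class TransientFailureException : Exception { }
public sealed class InvalidMessageException : Exception { }
public sealed class BusinessRejectedException : Exception { }

public static class SettlementPolicy
{
    // Runs the handler and maps its outcome to a settlement decision.
    // Unclassified exceptions propagate so they are not silently settled.
    public static Settlement Handle(Action handler)
    {
        try
        {
            handler();
            return Settlement.Complete;
        }
        catch (TransientFailureException)
        {
            // A later attempt may succeed, so return the message for retry.
            return Settlement.Abandon;
        }
        catch (InvalidMessageException)
        {
            // Retrying cannot repair a malformed message.
            return Settlement.DeadLetter;
        }
        catch (BusinessRejectedException)
        {
            // Assumed policy: the rejection was recorded as the result,
            // so the message itself is finished.
            return Settlement.Complete;
        }
    }
}
```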
Reliable Publication
A database update followed by publication creates a dual-write problem:
commit succeeds
publish fails
The transactional outbox stores the outgoing message with the business update in one local transaction. A publisher sends outbox records later.
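A minimal sketch of the outbox idea follows. `OrderStore` is a hypothetical in-memory stand-in for one database; in a real system the order row and the outbox row would be written inside a single database transaction, which this example only imitates by adding both in one method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record OutboxRecord(Guid MessageId, string Type, string Payload)
{
    public bool Published { get; set; }
}

// Hypothetical in-memory stand-in for a single database.
public sealed class OrderStore
{
    public List<Guid> Orders { get; } = new();
    public List<OutboxRecord> Outbox { get; } = new();

    // Business change and outgoing message are stored together.
    // Assumption: a real implementation wraps both writes in one transaction.
    public void PlaceOrder(Guid orderId)
    {
        Orders.Add(orderId);
        Outbox.Add(new OutboxRecord(Guid.NewGuid(), "OrderPlaced", orderId.ToString()));
    }

    // Called later by a background publisher. A record is marked only after
    // send returns, so a failure in between leads to a repeat publish, not a loss.
    public int PublishPending(Action<OutboxRecord> send)
    {
        var count = 0;
        foreach (var record in Outbox.Where(r => !r.Published).ToList())
        {
            send(record);
            record.Published = true;
            count++;
        }
        return count;
    }
}
```

If the broker is unavailable, the record stays unpublished and is picked up on the next run. Because a record can be sent twice when marking fails, consumers still need to be idempotent.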
The reverse problem occurs on consumption:
message processed
database commit succeeds
acknowledgement fails
An inbox or processed-message record can make consumer effects idempotent.
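The processed-message record can be sketched as a set of handled message IDs that is checked before any state change. `InventoryProjection` is a hypothetical consumer; the in-memory `HashSet` stands in for a table that, in a real system, would be updated in the same local transaction as the state change.

```csharp
using System;
using System.Collections.Generic;

public sealed class InventoryProjection
{
    // Stand-in for a processed-message table keyed by message ID.
    private readonly HashSet<Guid> _processed = new();

    public int Reserved { get; private set; }

    // Returns true when the message changed state and false when it was a
    // duplicate. Assumption: in production the ID insert and the state
    // change commit together in one local transaction.
    public bool Handle(Guid messageId, int quantity)
    {
        if (!_processed.Add(messageId))
        {
            return false;
        }

        Reserved += quantity;
        return true;
    }
}
```

Redelivering the same message ID leaves `Reserved` unchanged, so a lost acknowledgement does not apply the effect twice.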
Ordering
Global ordering is expensive and often unnecessary.
Most systems guarantee order only:
- Within one queue.
- Within one partition.
- Within a session or key.
- Under restricted concurrency.
Choose a partition key such as OrderId when events for one aggregate must be ordered.
Consumers should use versions:
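// Ignore duplicates and messages older than the projected version.
if (message.OrderVersion <= projection.OrderVersion)
{
    return;
}

// A gap means an earlier message has not been applied yet, so fail
// instead of applying this one out of order.
if (message.OrderVersion != projection.OrderVersion + 1)
{
    throw new MissingMessageException();
}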
Do not assume timestamps from different machines establish reliable total order.
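Routing by partition key can be sketched as a stable hash of the key modulo the partition count. The `Partitioner` class is hypothetical, and FNV-1a is used here only because it is simple and gives the same result in every process; real brokers apply their own partitioning functions.

```csharp
using System;

public static class Partitioner
{
    // Maps a key such as OrderId to a partition index. The same key always
    // maps to the same partition for a fixed partition count, which is what
    // keeps events for one aggregate in order.
    public static int PartitionFor(Guid key, int partitionCount)
    {
        if (partitionCount <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(partitionCount));
        }

        unchecked
        {
            // FNV-1a over the key bytes (illustrative choice of hash).
            uint hash = 2166136261u;
            foreach (var b in key.ToByteArray())
            {
                hash = (hash ^ b) * 16777619u;
            }

            return (int)(hash % (uint)partitionCount);
        }
    }
}
```

Changing the partition count changes the mapping, so ordering guarantees hold only while the count stays fixed.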
Schema Evolution
Messages outlive deployments and may be replayed.
Safe evolution practices:
- Add optional fields with defaults.
- Avoid changing the meaning of existing fields.
- Keep stable type identifiers.
- Version incompatible contracts.
- Run old and new consumers during migration.
- Preserve old schemas for retained messages.
- Test producers and consumers independently.
Do not expose internal entity serialization as an integration contract. Internal refactoring should not break every consumer.
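Adding an optional field with a default can be sketched with System.Text.Json. `OrderPlacedV2` and `OrderPlacedReader` are hypothetical names, and the rule that a missing currency means "USD" is an assumption made for the example. The reader relies on two default behaviors of the serializer: properties absent from the JSON keep their initializer value, and unknown JSON properties are ignored.

```csharp
using System;
using System.Text.Json;

// Hypothetical contract: Currency was added after the first release,
// so older messages do not contain it.
public sealed class OrderPlacedV2
{
    public Guid OrderId { get; init; }
    public decimal Total { get; init; }

    // Assumed default for messages produced before the field existed.
    public string Currency { get; init; } = "USD";
}

public static class OrderPlacedReader
{
    private static readonly JsonSerializerOptions Options = new()
    {
        // Lets camelCase JSON names bind to PascalCase properties.
        PropertyNameCaseInsensitive = true
    };

    public static OrderPlacedV2 Read(string json) =>
        JsonSerializer.Deserialize<OrderPlacedV2>(json, Options)
        ?? throw new JsonException("Message body was null.");
}
```

An old message without `currency` reads as "USD", and a newer message carrying fields this consumer does not know about is still accepted.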
Consumer Independence
An event publisher should not know:
- How many consumers exist.
- Which database they use.
- Whether they send email, update search, or calculate analytics.
- Whether they are currently online.
Each consumer owns:
- Its subscription.
- Retry and dead-letter policy.
- Idempotency.
- Storage transaction.
- Scaling.
- Monitoring.
A slow analytics consumer should not block an order-confirmation consumer.
Backpressure and Load Leveling
A queue buffers bursts:
producer rate temporarily > consumer rate
-> queue depth grows
-> consumers process at controlled rate
This protects downstream systems only if consumer scaling and concurrency are bounded. Unrestricted autoscaling can move the overload from the broker to the database.
Monitor:
- Queue depth.
- Oldest-message age.
- Arrival rate.
- Completion rate.
- Processing duration.
- Retry and dead-letter counts.
- Downstream saturation.
If the long-term producer rate exceeds capacity, the queue only delays failure.
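The arithmetic behind load leveling can be made concrete with a small model. `BacklogModel` is a hypothetical helper that assumes a fixed consumer capacity per interval and ignores retries and processing variance.

```csharp
using System;

public static class BacklogModel
{
    // Returns the queue depth at the end of each interval, given the number
    // of arrivals per interval and a fixed consumer capacity per interval.
    public static int[] Depths(int[] arrivals, int capacityPerInterval)
    {
        var depths = new int[arrivals.Length];
        var depth = 0;

        for (var i = 0; i < arrivals.Length; i++)
        {
            // Backlog grows by the arrivals, shrinks by the capacity,
            // and cannot go below zero.
            depth = Math.Max(0, depth + arrivals[i] - capacityPerInterval);
            depths[i] = depth;
        }

        return depths;
    }
}
```

With a capacity of 100 per interval, a burst of `{ 50, 200, 200, 50, 50, 50 }` arrivals yields depths `0, 100, 200, 150, 100, 50`: the queue absorbs the burst and then drains. A sustained `{ 150, 150, 150, 150 }` yields `50, 100, 150, 200`: the backlog grows without bound.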
Poison Messages and Dead-Letter Queues
A message can fail permanently because it is:
- Malformed.
- Incompatible with the schema.
- Missing required referenced data.
- Violating a business rule.
- Triggering a reproducible software defect.
After bounded attempts, move it to a dead-letter queue instead of retrying forever.
Dead-letter handling needs:
- Alerts.
- Reason and diagnostic metadata.
- Safe inspection tooling.
- Repair or replay procedures.
- Retention policy.
- Authorization and privacy controls.
A dead-letter queue is not a successful terminal state; it is operational debt requiring ownership.
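Bounded attempts followed by dead-lettering can be sketched as follows. `RetryingConsumer` and `DeadLetter` are hypothetical types. The retry loop runs in-process for brevity; this is a simplification, since a broker would normally track the delivery count and apply a delay between attempts.

```csharp
using System;
using System.Collections.Generic;

public sealed record DeadLetter(string Body, int Attempts, string Reason);

public sealed class RetryingConsumer
{
    private readonly int _maxAttempts;

    // Stand-in for a dead-letter queue; keeps the reason for diagnostics.
    public List<DeadLetter> DeadLetters { get; } = new();

    public RetryingConsumer(int maxAttempts)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        _maxAttempts = maxAttempts;
    }

    // Returns true when the handler succeeded. After the last failed attempt
    // the message is dead-lettered with the final error as its reason.
    public bool Process(string body, Action<string> handler)
    {
        var reason = "";

        for (var attempt = 1; attempt <= _maxAttempts; attempt++)
        {
            try
            {
                handler(body);
                return true;
            }
            catch (Exception ex)
            {
                reason = ex.Message;
            }
        }

        DeadLetters.Add(new DeadLetter(body, _maxAttempts, reason));
        return false;
    }
}
```

A handler that always throws is called exactly `maxAttempts` times, after which the message appears in `DeadLetters` with its failure reason.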
Asynchronous HTTP Request-Reply
For long-running work:
POST /reports
Idempotency-Key: 5ad9...
The server validates and accepts the operation:
HTTP/1.1 202 Accepted
Location: /operations/7c42...
Retry-After: 5
The client polls:
GET /operations/7c42...
Pending response:
{
  "status": "Running",
  "createdAt": "2026-06-14T08:00:00Z",
  "lastUpdatedAt": "2026-06-14T08:00:12Z",
  "percentComplete": 60
}
Completed response can:
- Include the result.
- Link to the created resource.
- Return 303 See Other to the result resource.
202 Accepted means accepted for processing, not completed successfully.
Operation Resource Design
An operation resource should define states such as:
Pending
Running
Succeeded
Failed
Canceled
It should include:
- Operation ID.
- Current state.
- Creation and update times.
- Progress when meaningful.
- Result link.
- Structured failure details.
- Expiration or retention.
Security requirements:
- Authorize access to the operation.
- Avoid predictable cross-user identifiers without authorization.
- Do not leak internal exception details.
- Apply polling rate limits.
- Clean up old operation records.
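The operation states listed above can be modeled as a small state machine. The `Operation` class and its transition rules are assumptions made for illustration: Pending may move to Running or Canceled, Running may move to any terminal state, and terminal states never change.

```csharp
using System;

public enum OperationState { Pending, Running, Succeeded, Failed, Canceled }

public sealed class Operation
{
    public Guid Id { get; } = Guid.NewGuid();
    public OperationState State { get; private set; } = OperationState.Pending;
    public DateTimeOffset CreatedAt { get; } = DateTimeOffset.UtcNow;
    public DateTimeOffset LastUpdatedAt { get; private set; } = DateTimeOffset.UtcNow;

    public bool IsTerminal =>
        State is OperationState.Succeeded or OperationState.Failed or OperationState.Canceled;

    // Applies the transition only when the assumed rules allow it.
    // Returns false, leaving the state untouched, otherwise.
    public bool TryTransition(OperationState next)
    {
        var allowed = State switch
        {
            OperationState.Pending =>
                next is OperationState.Running or OperationState.Canceled,
            OperationState.Running =>
                next is OperationState.Succeeded or OperationState.Failed or OperationState.Canceled,
            _ => false // terminal states are final
        };

        if (!allowed)
        {
            return false;
        }

        State = next;
        LastUpdatedAt = DateTimeOffset.UtcNow;
        return true;
    }
}
```

Rejecting transitions out of a terminal state means a late cancellation cannot overwrite a result that the client may already have observed.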
Idempotent Submission
The client may retry because it did not receive the 202 response.
Use an idempotency key:
Client request + stable idempotency key
-> first call creates operation
-> duplicate call returns same operation
Store:
- Caller scope.
- Endpoint or operation type.
- Request fingerprint when useful.
- Operation ID.
- Result or status.
- Expiration.
The same key with a materially different request should be rejected rather than silently reusing the old result.
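A minimal sketch of this lookup is shown below. `IdempotencyStore` and `SubmitOutcome` are hypothetical names, the fingerprint is assumed to be a hash or canonical form of the request computed by the caller, and expiration and thread safety are left out for brevity.

```csharp
using System;
using System.Collections.Generic;

public enum SubmitOutcome { Created, Replayed, Conflict }

public sealed class IdempotencyStore
{
    // Keys are scoped by caller so two callers can reuse the same key text.
    private readonly Dictionary<(string Caller, string Key), (string Fingerprint, Guid OperationId)> _entries = new();

    public (SubmitOutcome Outcome, Guid? OperationId) Submit(
        string caller, string key, string fingerprint)
    {
        if (_entries.TryGetValue((caller, key), out var existing))
        {
            if (existing.Fingerprint == fingerprint)
            {
                // Same request retried: return the original operation.
                return (SubmitOutcome.Replayed, existing.OperationId);
            }

            // Same key, materially different request: reject.
            return (SubmitOutcome.Conflict, null);
        }

        var operationId = Guid.NewGuid();
        _entries[(caller, key)] = (fingerprint, operationId);
        return (SubmitOutcome.Created, operationId);
    }
}
```

An HTTP layer could map `Created` and `Replayed` to the same 202 response with the same operation location, and map `Conflict` to a client error; the exact status code is a design decision this sketch does not make.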
Broker-Based Request-Reply
Service-to-service request-reply can use:
- Request queue.
- Reply queue or topic.
- Correlation ID.
- Reply destination.
- Deadline.
public sealed record PriceQuoteRequest(
    Guid RequestId,
    Guid ProductId,
    int Quantity,
    string ReplyTo);
The requester stores pending correlation state, sends the request, and waits asynchronously for a matching reply.
Risks include:
- Reply arriving after timeout.
- Duplicate replies.
- Requester restart.
- Unbounded pending state.
- Reply queue buildup.
- Coupling that disguises a synchronous dependency.
If the caller cannot make progress without an immediate response, a normal synchronous call may be simpler.
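The pending correlation state on the requester side can be sketched with a dictionary of `TaskCompletionSource` instances keyed by request ID. `PendingReplies` is a hypothetical class. It keeps state in memory only, so it does not address requester restart, and the caller is assumed to invoke `TryExpire` when the deadline passes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class PendingReplies<TReply>
{
    private readonly ConcurrentDictionary<Guid, TaskCompletionSource<TReply>> _pending = new();

    public int Count => _pending.Count;

    // Register before sending the request so a fast reply is not missed.
    public Task<TReply> Register(Guid requestId)
    {
        var source = new TaskCompletionSource<TReply>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        if (!_pending.TryAdd(requestId, source))
        {
            throw new InvalidOperationException("Duplicate request ID.");
        }

        return source.Task;
    }

    // Returns false for late, duplicate, or unknown replies so the caller
    // can log and drop them.
    public bool TryComplete(Guid requestId, TReply reply) =>
        _pending.TryRemove(requestId, out var source) && source.TrySetResult(reply);

    // Removing expired entries keeps the pending state bounded.
    public bool TryExpire(Guid requestId) =>
        _pending.TryRemove(requestId, out var source) && source.TrySetCanceled();
}
```

A reply that arrives after `TryExpire` finds no entry and is reported as unmatched, which covers the late-reply and duplicate-reply risks listed above.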
Callbacks, Webhooks, and Real-Time Channels
Alternatives to polling:
Webhook
- Server calls a client-provided endpoint.
- Requires endpoint verification, authentication, retries, and SSRF protection.
WebSocket or SignalR
- Server pushes progress over an established channel.
- Requires connection lifecycle and authorization design.
Reply queue
- Works well for broker-connected services.
- Requires correlation and expiry.
Polling remains practical for browser clients because it is simple and firewall-friendly.
Cancellation and Timeouts
Cancellation is a business operation, not merely cancellation of one thread.
Expose:
DELETE /operations/7c42...
Define:
- Whether cancellation is best effort.
- Which steps are still reversible.
- Whether compensation is required.
- What happens if completion races with cancellation.
- The final observable state.
Every request message should carry or imply a deadline. A consumer should avoid starting obsolete work after the caller's business deadline has passed.
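A deadline check before starting work can be as small as the sketch below. `ReportRequest` and `DeadlineGuard` are hypothetical names, and the estimated duration is an assumed input that the consumer supplies from its own measurements.

```csharp
using System;

// Hypothetical request that carries its business deadline explicitly.
public sealed record ReportRequest(Guid RequestId, DateTimeOffset Deadline);

public static class DeadlineGuard
{
    // The current time is passed in so the check is deterministic in tests.
    // Work is started only if it is expected to finish by the deadline.
    public static bool ShouldStart(
        ReportRequest request, DateTimeOffset now, TimeSpan estimatedDuration) =>
        now + estimatedDuration <= request.Deadline;
}
```

When the check fails, the consumer can settle the message and record the operation as expired instead of spending capacity on a result nobody is waiting for.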
Observability
Propagate:
- Correlation ID across the workflow.
- Causation ID from triggering message to emitted messages.
- Distributed trace context.
- Business operation ID.
Record:
- Publish and receive time.
- Processing duration.
- Attempts.
- Settlement.
- Operation state transitions.
- Projection or consumer lag.
Logs alone are not enough. Metrics and traces must expose stuck workflows, growing backlog, repeated retries, and dead-lettered messages.
Security
Messaging security includes:
- Producer and consumer workload identity.
- Destination-level authorization.
- Encryption in transit and at rest.
- Tenant isolation.
- Payload minimization.
- Schema validation.
- Replay and duplicate controls.
- Secret-free telemetry.
- Restricted dead-letter access.
Treat messages as untrusted input even when they arrive from an internal broker.
When Event-Driven Communication Makes Sense
Use it when:
- Multiple independent consumers react to a fact.
- Producers and consumers need independent availability and scaling.
- Work is naturally asynchronous.
- Bursts require buffering.
- Replay or event-stream processing is valuable.
- Eventual consistency is acceptable.
Avoid it when:
- The workflow needs immediate strong consistency.
- One simple synchronous call is clearer.
- The team lacks operational support for brokers and asynchronous failures.
- The "event" is really a tightly coupled remote procedure call with extra latency.
Common Mistakes
Common failures include:
- Using events as hidden commands.
- Assuming exactly-once end-to-end processing.
- Publishing database changes without an outbox.
- Acknowledging before durable work.
- Retrying permanent failures forever.
- Depending on global ordering.
- Sending entire internal entities as contracts.
- Omitting idempotency and correlation IDs.
- Returning 202 without a usable status resource.
- Polling too aggressively.
- Ignoring cancellation, expiry, and cleanup.
- Autoscaling consumers until the database fails.
- Leaving dead-letter queues unmonitored.
Best-Practice Decision Process
- Decide whether the message is a command, event, or reply.
- Select queue, pub/sub, or stream semantics deliberately.
- Define ownership, schema, identity, and versioning.
- Choose an explicit delivery and ordering model.
- Use outbox publication and idempotent consumers.
- Bound retries and own dead-letter recovery.
- Define consistency, timeout, cancellation, and user experience.
- For HTTP, use 202, Location, Retry-After, and an authorized operation resource.
- Correlate every step and monitor backlog age and business completion.
- Load-test downstream limits, not only broker throughput.