
Azure Cache for Redis for low-latency reads and session/state scenarios

Overview

Redis is an in-memory data platform designed for very low-latency access to data organized by keys. It supports strings, hashes, lists, sets, sorted sets, streams, and other specialized structures. Applications commonly use Redis to reduce repeated work, share short-lived state across stateless application instances, coordinate distributed operations, and implement high-throughput counters or leaderboards.

Azure currently has two relevant managed Redis product names:

  • Azure Cache for Redis: The older managed service. All tiers have announced retirement dates. Enterprise and Enterprise Flash retire on March 31, 2027. Basic, Standard, and Premium retire on September 30, 2028.
  • Azure Managed Redis: The current managed Redis offering and the recommended destination for new solutions and migrations.

Interview discussions and existing codebases still frequently use the name Azure Cache for Redis. A current answer should acknowledge the retirement and explain the architecture using Azure Managed Redis unless the question explicitly concerns a legacy deployment.

Common scenarios include:

  • Cache-aside for frequently read database records.
  • Shared ASP.NET Core session state.
  • Short-lived shopping-cart or workflow state.
  • Response and rendered-content caching.
  • Distributed counters and rate-limit state.
  • Leaderboards with sorted sets.
  • Deduplication and idempotency records.
  • Distributed locks with careful correctness constraints.
  • Streams, queues, or pub/sub for suitable transient messaging.

Redis is not automatically a durable system of record. Memory pressure can evict keys, expiration can remove data, failover can lose recent writes that had not yet been replicated, and operators can flush or replace an instance. Data persistence and active geo-replication improve resilience but do not turn every Redis design into a relational database.

For interviews, candidates should be able to:

  • Explain cache-aside and cache invalidation.
  • Select expiration and eviction strategies.
  • Design distributed session state without sticky sessions.
  • Handle cache failures without causing a database outage.
  • Distinguish high availability, persistence, and geo-replication.
  • Size memory, throughput, network bandwidth, and client connections.
  • Explain clustering and Redis hash-slot limitations.
  • Secure the service with Microsoft Entra ID, private networking, TLS, and least privilege.
  • Discuss the Azure Cache for Redis retirement and migration implications.

Core Concepts

Product Direction and Retirement

Azure Cache for Redis remains operational during its retirement period, but it should not be treated as the strategic default.

The announced retirement dates are:

  • Azure Cache for Redis Enterprise and Enterprise Flash: March 31, 2027.
  • Azure Cache for Redis Basic, Standard, and Premium: September 30, 2028.

Existing customers should plan migration before those deadlines. New designs should normally evaluate Azure Managed Redis first.

Migration considerations include:

  • New hostname and authentication configuration.
  • Client compatibility with clustering.
  • Multi-key commands and hash slots.
  • Memory and performance sizing.
  • Persistence and availability settings.
  • Network and private endpoint changes.
  • Cutover and rollback strategy.
  • Whether cached data can be rebuilt or must be migrated.

A disposable cache can often be migrated by creating the new instance, warming it, switching the application configuration to it, and letting the entries in the old cache expire before decommissioning it. Session state, queues, idempotency records, and other stateful uses require a more deliberate continuity plan.

Azure Managed Redis

Azure Managed Redis is based on Redis Enterprise and supports Redis-compatible clients and data structures. The service manages:

  • Provisioning and patching.
  • Replication and failover.
  • Clustering.
  • Scaling.
  • Private networking.
  • Microsoft Entra ID authentication.
  • Persistence.
  • Active geo-replication in supported tiers.
  • Redis modules for JSON, search, probabilistic structures, and time series.

Its tiers are organized by memory and compute characteristics:

  • Memory Optimized: Higher memory-to-vCPU ratio for memory-heavy workloads that do not need maximum throughput.
  • Balanced: General-purpose balance between memory and compute.
  • Compute Optimized: More compute per unit of memory for throughput-intensive workloads.
  • Flash Optimized: Uses RAM and NVMe storage to reduce cost for very large datasets at the expense of some latency and throughput.

Tier and size selection should use workload tests. Dataset size alone is insufficient because bandwidth, commands per second, value size, CPU cost, shard distribution, and connection count can become bottlenecks first.
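
As a rough illustration of why dataset size alone is not enough, the sketch below estimates payload bandwidth from request rate and average value size. The class name, method name, and sample numbers are assumptions for illustration, not measured figures or published service limits.

```csharp
using System;

public static class CacheSizing
{
    // Estimates payload bandwidth in megabits per second.
    // Ignores protocol overhead, TLS framing, and replication traffic,
    // so real bandwidth use will be somewhat higher.
    public static double EstimatePayloadMbps(double operationsPerSecond, double averageValueBytes)
    {
        if (operationsPerSecond < 0) throw new ArgumentOutOfRangeException(nameof(operationsPerSecond));
        if (averageValueBytes < 0) throw new ArgumentOutOfRangeException(nameof(averageValueBytes));

        // bytes per second * 8 bits, converted to megabits.
        return operationsPerSecond * averageValueBytes * 8 / 1_000_000;
    }
}
```

For example, 50,000 operations per second at 4,096 bytes each is about 1,638 Mbit/s of payload. A dataset of that shape might fit comfortably in memory and still be limited by network bandwidth, which is why a load test should confirm the estimate.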

Redis Data Structures

Choosing the correct Redis data structure affects both clarity and performance:

  • String: Serialized objects, flags, counters, tokens, or simple values.
  • Hash: Multiple fields associated with one logical object.
  • List: Ordered values and simple queue-like operations.
  • Set: Unique unordered members and membership tests.
  • Sorted set: Unique members ordered by score, useful for leaderboards and scheduling.
  • Stream: Append-only event records with consumer groups.
  • Bitmap and probabilistic structures: Compact flags, cardinality estimates, and deduplication.

Avoid storing one large serialized object when the application frequently updates only a small field. Conversely, splitting a small object into too many keys increases round trips and key-management complexity.

Cache-Aside Pattern

Cache-aside is the most common caching pattern:

  1. The application reads the cache.
  2. On a cache hit, it returns the cached value.
  3. On a cache miss, it reads the authoritative data store.
  4. It writes the value to Redis with an expiration.
  5. It returns the value.

When updating data:

  1. Update the authoritative store.
  2. Invalidate the corresponding cache key.

The order matters. Deleting the cache key before committing the database update creates a window in which another request can miss the cache, read the old database value, and repopulate stale data.

A simplified .NET example:

Code
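// Requires: using System.Text.Json; using StackExchange.Redis;
// Assumes the enclosing class has a 'repository' field for the authoritative store.
public async Task<Product?> GetProductAsync(
    int id,
    IDatabase cache,
    CancellationToken cancellationToken)
{
    // Versioned key: bump "v3" when the serialized Product contract changes.
    var key = $"product:v3:{id}";
    var cached = await cache.StringGetAsync(key);

    if (cached.HasValue)
    {
        // Cache hit: deserialize the stored JSON.
        return JsonSerializer.Deserialize<Product>(cached.ToString());
    }

    // Cache miss: read the authoritative store.
    var product = await repository.GetProductAsync(id, cancellationToken);
    if (product is null)
    {
        return null;
    }

    // Populate the cache with an expiration so stale entries age out.
    await cache.StringSetAsync(
        key,
        JsonSerializer.Serialize(product),
        TimeSpan.FromMinutes(10));

    return product;
}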

Production code also needs timeout handling, metrics, stampede protection, serializer compatibility, and a decision about how to behave when Redis is unavailable.
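
The update path described earlier (commit to the authoritative store, then invalidate) can be sketched in the same spirit. This is a minimal sketch, not a definitive implementation: the class name and both delegates are hypothetical, and the delete delegate stands in for a wrapper around a call such as IDatabase.KeyDeleteAsync.

```csharp
using System;
using System.Threading.Tasks;

public static class CacheAsideWrites
{
    // commitToStore: persists the change in the authoritative store (hypothetical delegate).
    // deleteCacheKey: removes the cached copy (hypothetical delegate around a Redis key delete).
    // Returns false when the commit succeeded but the invalidation did not.
    public static async Task<bool> UpdateThenInvalidateAsync(
        Func<Task> commitToStore,
        Func<Task<bool>> deleteCacheKey)
    {
        // 1. Commit first. If this throws, the cache is untouched and still matches the store.
        await commitToStore().ConfigureAwait(false);

        try
        {
            // 2. Invalidate only after the commit has succeeded.
            await deleteCacheKey().ConfigureAwait(false);
            return true;
        }
        catch (Exception)
        {
            // The update is committed, but readers may see the old cached value
            // until its TTL expires. The caller should log and possibly retry.
            return false;
        }
    }
}
```

Returning false rather than throwing keeps a cache problem from failing a write that has already been committed. The TTL on the cached entry then bounds how long the stale value can survive.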

Cache Keys

Keys should be predictable, namespaced, and versioned:

Code
environment:service:entity:version:id
prod:catalog:product:v3:4182
prod:identity:session:v2:8f31...

Good key design supports:

  • Environment separation.
  • Service ownership.
  • Schema changes.
  • Targeted invalidation.
  • Human diagnosis.
  • Cluster hash-slot control where needed.

Avoid:

  • Raw personally identifiable information in keys.
  • Unbounded user-provided key components.
  • Global scans such as KEYS * in production.
  • Ambiguous keys shared by unrelated services.
  • Reusing a key after changing the serialized value contract.

Key versioning is often simpler and safer than deleting every old key during deployment. Old versions expire naturally.
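
A small helper can enforce this key shape in one place. The sketch below is illustrative only: the class, the 128-character limit, and the truncated digest length are assumptions, not an established convention.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class CacheKeys
{
    // Builds "environment:service:entity:v{version}:id" and rejects parts that
    // would break the structure or allow unbounded user-provided input.
    public static string Build(string environment, string service, string entity, int version, string id)
    {
        foreach (var part in new[] { environment, service, entity, id })
        {
            if (string.IsNullOrWhiteSpace(part) || part.Contains(':') || part.Length > 128)
            {
                throw new ArgumentException(
                    "Key parts must be non-empty, at most 128 characters, and must not contain ':'.");
            }
        }

        return $"{environment}:{service}:{entity}:v{version}:{id}";
    }

    // Replaces a sensitive identifier (for example an email address) with a
    // truncated SHA-256 digest so the raw value never appears in a key.
    public static string HashIdentifier(string sensitiveValue)
    {
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(sensitiveValue));
        return Convert.ToHexString(digest, 0, 16).ToLowerInvariant();
    }
}
```

Calling CacheKeys.Build("prod", "catalog", "product", 3, "4182") yields prod:catalog:product:v3:4182. An unsalted digest is only pseudonymous, so a keyed hash may be more appropriate where identifiers are guessable.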

Expiration and Time to Live

Expiration limits staleness and memory use. Common approaches include:

  • Absolute expiration: Key expires after a fixed duration.
  • Sliding expiration: Lifetime extends when accessed. Redis does not extend a TTL on reads by itself, so the client or caching library must reset it.
  • No expiration: Key remains until explicitly deleted or evicted.

TTL should reflect:

  • How quickly the source data changes.
  • How much staleness is acceptable.
  • Cost of rebuilding the value.
  • Consequences of a cache miss.
  • Available memory.

A short TTL improves freshness but increases source-store load. A long TTL improves hit rate but can return stale data longer.

Add random jitter to TTL values for large sets of similar keys:

Code
var ttl = TimeSpan.FromMinutes(10)
          + TimeSpan.FromSeconds(Random.Shared.Next(0, 90));

Jitter prevents thousands of keys from expiring at the same instant.

Eviction

Redis has finite memory. When the configured memory limit is reached, the eviction policy determines what happens.

Typical policy families include:

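  • volatile-lru: Evict least-recently-used keys among those that have an expiration set.
  • allkeys-lru: Evict least-recently-used keys from all keys.
  • volatile-lfu and allkeys-lfu: Evict least-frequently-used keys.
  • volatile-ttl: Evict keys with the nearest expiration.
  • volatile-random and allkeys-random: Remove random keys.
  • noeviction: Reject writes instead of evicting.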

The right policy depends on whether all keys are disposable:

  • A pure read cache can often use an all-keys eviction policy.
  • A mixed cache containing sessions or coordination records needs more care.
  • A state store that cannot tolerate eviction should not share an instance with disposable cache entries under an eviction policy.

Separate workloads into different Redis instances when their durability, eviction, security, or scaling requirements conflict.

Cache Hits, Misses, and Effectiveness

The cache hit ratio is:

Code
hits / (hits + misses)

A low hit ratio can mean:

  • TTL is too short.
  • Memory is too small.
  • The workload has low reuse.
  • Keys are inconsistent.
  • Data is being evicted.
  • Deployments are changing key versions frequently.
  • Requests are spread across high-cardinality values.
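
The ratio defined above can be computed from hit and miss counters, for example the keyspace_hits and keyspace_misses fields reported by the Redis INFO command. The helper below is a minimal sketch with hypothetical names; returning null for an idle cache is a design assumption so that "no traffic" is not reported as a 0% hit ratio.

```csharp
using System;

public static class CacheMetrics
{
    // Returns hits / (hits + misses), or null when there has been no traffic.
    public static double? HitRatio(long hits, long misses)
    {
        if (hits < 0) throw new ArgumentOutOfRangeException(nameof(hits));
        if (misses < 0) throw new ArgumentOutOfRangeException(nameof(misses));

        long total = hits + misses;
        return total == 0 ? (double?)null : (double)hits / total;
    }
}
```

Server counters are cumulative, so computing the ratio from the difference between two samples usually says more about current behavior than the lifetime totals do.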

A high hit ratio is not sufficient by itself. Measure:

  • P50, P95, and P99 latency.
  • Database load reduction.
  • Redis CPU and server load.
  • Network bandwidth.
  • Memory fragmentation.
  • Evictions and expirations.
  • Timeouts and reconnects.
  • Per-shard distribution.

Do not keep a cache that adds cost and complexity without improving an end-to-end service objective.

Cache Stampede

A cache stampede occurs when many requests miss the same popular key and all query the source simultaneously.

Mitigation options include:

  • Per-key distributed locking.
  • Single-flight request coalescing inside each application instance.
  • Serving a stale value while one worker refreshes.
  • Proactive refresh before expiration.
  • TTL jitter.
  • Request rate limits.
  • Prewarming high-value keys.
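
Single-flight coalescing inside one application instance can be sketched without any Redis dependency. The class below is a hypothetical illustration: concurrent callers asking for the same key share one in-flight load, and the entry is removed when that load finishes. It does not coordinate across instances.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class SingleFlight<TValue>
{
    private readonly ConcurrentDictionary<string, Lazy<Task<TValue>>> _inFlight = new();

    // Runs 'load' at most once per key at a time; concurrent callers await the same task.
    public async Task<TValue> RunAsync(string key, Func<Task<TValue>> load)
    {
        // Lazy ensures the loader starts only once even if GetOrAdd races.
        var lazy = _inFlight.GetOrAdd(key, _ => new Lazy<Task<TValue>>(load));
        try
        {
            return await lazy.Value.ConfigureAwait(false);
        }
        finally
        {
            // Remove only this exact entry so a newer load for the same key is not dropped.
            _inFlight.TryRemove(new KeyValuePair<string, Lazy<Task<TValue>>>(key, lazy));
        }
    }
}
```

With N application instances this still allows up to N simultaneous source queries for one key, which is often acceptable and avoids the correctness burden of a distributed lock.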

A simple distributed lock needs:

  • A unique owner token.
  • Atomic acquire with expiration.
  • Safe release only by the owner.
  • A bounded wait.
  • Handling for work that outlives the lock lease.

Distributed locks are not a substitute for database constraints or idempotency when correctness is critical.
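
The lock requirements listed above can be modelled with a small in-memory stand-in. This is a sketch of the semantics only, not a Redis client: the class and its injectable clock are hypothetical, and a real implementation would perform both operations atomically on the server.

```csharp
using System;
using System.Collections.Generic;

// In-memory model of the semantics a Redis-backed lock needs.
public sealed class InMemoryLeaseStore
{
    private readonly Dictionary<string, (string Owner, DateTimeOffset ExpiresAt)> _leases = new();
    private readonly Func<DateTimeOffset> _now;
    private readonly object _gate = new();

    public InMemoryLeaseStore(Func<DateTimeOffset>? now = null) =>
        _now = now ?? (() => DateTimeOffset.UtcNow);

    // Atomic "set if absent, with expiration" (SET key token NX PX in Redis terms).
    public bool TryAcquire(string key, string ownerToken, TimeSpan lease)
    {
        lock (_gate)
        {
            if (_leases.TryGetValue(key, out var current) && current.ExpiresAt > _now())
            {
                return false; // Someone else holds an unexpired lease.
            }

            _leases[key] = (ownerToken, _now() + lease);
            return true;
        }
    }

    // Compare-and-delete: only the current, unexpired owner may release.
    public bool ReleaseIfOwner(string key, string ownerToken)
    {
        lock (_gate)
        {
            if (_leases.TryGetValue(key, out var current)
                && current.ExpiresAt > _now()
                && current.Owner == ownerToken)
            {
                return _leases.Remove(key);
            }

            return false;
        }
    }
}
```

With StackExchange.Redis, IDatabase.LockTakeAsync and IDatabase.LockReleaseAsync provide comparable acquire-with-expiry and owner-checked release operations. The protected work should still be idempotent, because the lease can expire while the work is running and a second owner can then start.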

Negative Caching

Negative caching stores the fact that an item was not found. It reduces repeated source queries for missing keys.

Use a short TTL because:

  • The item might be created soon.
  • Authorization results can change.
  • Long negative caching can hide newly available data.

Use a distinct marker rather than serializing null ambiguously.
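
One way to keep "key absent" and "item known to be missing" apart is a sentinel value with its own shorter TTL. The sketch below is a hypothetical illustration: the marker string and both TTL values are assumptions. The marker is deliberately not valid JSON so it cannot collide with a JSON payload.

```csharp
using System;

public enum CacheLookup { Miss, KnownMissing, Hit }

public static class NegativeCaching
{
    // Hypothetical sentinel; it must never equal a real serialized payload.
    public const string MissingMarker = "__missing__";

    // Assumed TTLs: negative entries live much shorter than real values.
    public static readonly TimeSpan MissingTtl = TimeSpan.FromSeconds(30);
    public static readonly TimeSpan ValueTtl = TimeSpan.FromMinutes(10);

    // Interprets the raw string read from the cache (null means the key was absent).
    public static CacheLookup Classify(string? raw) => raw switch
    {
        null => CacheLookup.Miss,
        MissingMarker => CacheLookup.KnownMissing,
        _ => CacheLookup.Hit,
    };

    // Chooses what to store after consulting the source (null means "not found").
    public static (string Payload, TimeSpan Ttl) ToCacheEntry(string? serializedValue) =>
        serializedValue is null
            ? (MissingMarker, MissingTtl)
            : (serializedValue, ValueTtl);
}
```

A caller queries the source only on Miss, returns "not found" immediately on KnownMissing, and deserializes on Hit.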

Sessions in Distributed Applications

In-memory session state inside one web server is lost when:

  • The process restarts.
  • A deployment replaces the instance.
  • Autoscaling routes the next request to another instance.

A distributed Redis session store allows any application instance to load the same session by an opaque session identifier held in a secure cookie.

Benefits include:

  • Stateless application instances.
  • Horizontal scaling without sticky sessions.
  • Rolling deployments.
  • Centralized session expiration.

Risks include:

  • Redis becomes a request-path dependency.
  • Cache latency affects every session-enabled request.
  • Eviction can log users out or lose workflow state.
  • Large sessions increase memory and network cost.
  • Active-active multi-region session consistency is difficult.

Keep sessions small and avoid storing the complete user profile, authorization policy, or large shopping carts when an authoritative database is more appropriate.

ASP.NET Core Distributed Session

ASP.NET Core session state can use a distributed cache provider. A typical design registers Redis and session middleware:

Code
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = redisConfiguration;
    options.InstanceName = "checkout:";
});

builder.Services.AddSession(options =>
{
    options.Cookie.Name = "__Host-checkout-session";
    options.Cookie.HttpOnly = true;
    options.Cookie.SecurePolicy = CookieSecurePolicy.Always;
    options.Cookie.SameSite = SameSiteMode.Lax;
    options.IdleTimeout = TimeSpan.FromMinutes(20);
});

var app = builder.Build();

app.UseSession();

Production authentication should prefer Microsoft Entra ID and managed identity where supported rather than embedding long-lived access keys. The exact client integration depends on the Redis client and Azure identity package versions.

The session cookie should contain only an opaque identifier. Configure secure cookie attributes and regenerate authentication-related identifiers after privilege changes.

Session State Versus Authentication

Authentication should not rely solely on a Redis session record unless the complete failure and recovery model is understood.

Common patterns are:

  • A self-contained signed authentication cookie or token proves identity.
  • Redis stores optional application session state.
  • Server-side authorization still checks current permissions for sensitive actions.

If Redis fails:

  • A cache-only feature might degrade gracefully.
  • A lost application session might force reauthentication.
  • A security-critical server-side session might require fail-closed behavior.

Define the expected behavior explicitly. "Redis unavailable" should not accidentally bypass security checks.

Shopping Carts and Workflow State

Redis can provide fast temporary shopping-cart or wizard state, but ask whether losing it is acceptable.

If cart loss creates significant business impact:

  • Persist the cart in a durable store.
  • Use Redis as a fast projection or write-through layer.
  • Reconcile Redis with the durable record.

For short anonymous carts, Redis with TTL may be acceptable. For submitted orders, payments, reservations, or legal records, use an authoritative transactional store.

Redis Is Usually Not the Source of Truth

For cache-aside, the source of truth is a database or durable object store. Redis contains a disposable copy.

This design permits:

  • Eviction.
  • Cache flush.
  • Instance replacement.
  • Rebuilding after outage.
  • Short periods of staleness.

Some Redis workloads can use persistence and Redis as a primary data platform, but that choice requires explicit analysis of:

  • Durability guarantees.
  • Replication lag.
  • Restore behavior.
  • Transaction semantics.
  • Query and indexing requirements.
  • Backup retention.
  • Regional recovery.
  • Team operational maturity.

Do not gradually turn a cache into the only copy of business-critical data without revisiting the architecture.

High Availability

With high availability enabled, Azure Managed Redis distributes primary and replica shards across at least two nodes. Supported regions distribute nodes across availability zones by default.

High availability improves endpoint availability during:

  • Node failure.
  • Maintenance.
  • Some scaling operations.
  • Service-managed failover.

It does not guarantee zero data loss. Replication can be asynchronous, and recent writes can be absent after failover.

Non-HA mode lowers cost but lacks the availability SLA and can cause downtime and data loss. It is suitable only for development and test workloads.

Data Persistence

Persistence stores a disk copy that can help recover data after an unexpected outage.

Persistence is relevant when rebuilding Redis data is difficult or time-sensitive. It does not remove the need to understand:

  • Snapshot or append behavior.
  • Recovery point.
  • Recovery time.
  • Storage and performance cost.
  • Replication lag.
  • Application behavior while recovery occurs.

For a disposable cache, persistence can add unnecessary cost. For session state, queues, or expensive derived state, it may reduce impact but still does not guarantee that every acknowledged write survives every failure.

Flash Optimized Tier

Flash Optimized stores keys in RAM while values can reside in RAM or NVMe flash. It targets large datasets with a hot subset.

It is a good fit for:

  • Read-heavy workloads.
  • Values much larger than keys.
  • Access concentrated on a subset of the dataset.
  • Cost-sensitive large caches.

It is a poor fit for:

  • Write-heavy workloads.
  • Uniform random access across the whole dataset.
  • Long keys with small values.
  • Workloads requiring the lowest consistent latency.

Test with a nearly full production-sized dataset. A lightly loaded Flash instance can appear faster because most values still fit in RAM.

Clustering and Sharding

Azure Managed Redis uses an internally clustered architecture. The clustering policy determines client behavior.

The OSS clustering policy generally provides high throughput and low latency by allowing cluster-aware clients to connect directly to individual shards. Keys are assigned to hash slots.

Multi-key commands can fail with CROSSSLOT when keys belong to different slots. Hash tags force related keys into the same slot:

Code
cart:{customer-42}:items
cart:{customer-42}:totals

The value inside braces determines the slot. Use hash tags only for keys that genuinely need atomic multi-key operations because concentrating too much traffic in one slot creates a hot shard.
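
To check which keys share a slot, the slot calculation from the open-source Redis Cluster specification (CRC16 of the key, or of the hash tag when present, modulo 16384) can be reproduced locally. The class below is an illustrative sketch with hypothetical names. It assumes the OSS cluster hashing rules and is not a statement about how a particular managed service places slots on shards.

```csharp
using System;
using System.Text;

public static class RedisHashSlot
{
    // CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0.
    public static ushort Crc16(ReadOnlySpan<byte> data)
    {
        ushort crc = 0;
        foreach (byte b in data)
        {
            crc ^= (ushort)(b << 8);
            for (int bit = 0; bit < 8; bit++)
            {
                crc = (crc & 0x8000) != 0
                    ? unchecked((ushort)((crc << 1) ^ 0x1021))
                    : unchecked((ushort)(crc << 1));
            }
        }

        return crc;
    }

    // Hashes only the text between the first '{' and the next '}' when that
    // text is non-empty; otherwise hashes the whole key.
    public static int SlotFor(string key)
    {
        string hashed = key;
        int open = key.IndexOf('{');
        if (open >= 0)
        {
            int close = key.IndexOf('}', open + 1);
            if (close > open + 1)
            {
                hashed = key.Substring(open + 1, close - open - 1);
            }
        }

        return Crc16(Encoding.UTF8.GetBytes(hashed)) % 16384;
    }
}
```

Under these rules, cart:{customer-42}:items and cart:{customer-42}:totals resolve to the same slot because only customer-42 is hashed.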

Enterprise and nonclustered policies can improve compatibility for some workloads but have performance, module, command, or size trade-offs. Test the actual client and command set before migration.

Scaling

Scaling can increase:

  • Memory.
  • vCPUs.
  • Network bandwidth.
  • Client connections.
  • Number of shards.

Choose a scale action based on the bottleneck:

  • High memory and evictions: increase memory or improve TTL and value size.
  • High server load: use more compute or shards.
  • Network saturation: select a larger or more performance-focused tier.
  • Hot shard: redesign key distribution.
  • Too many connections: reuse clients and inspect connection pools before scaling.

Scaling down currently has limitations in Azure Managed Redis. Avoid treating rapid scale-down as a guaranteed cost-control mechanism.

Client Connection Management

Redis clients should reuse long-lived connections. In .NET, ConnectionMultiplexer is designed to be shared and reused rather than created per request.

Creating a connection per operation causes:

  • TLS and authentication overhead.
  • Connection storms.
  • Port exhaustion.
  • Increased latency.
  • Pressure on client limits.

Client configuration should include:

  • Appropriate connection and operation timeouts.
  • Reconnect behavior.
  • TLS.
  • Cluster support.
  • Bounded retries.
  • Logging that does not expose secrets.

Do not use unbounded retries. When Redis is overloaded, aggressive retries increase load and extend the outage.
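
A bounded retry with exponential backoff and jitter might look like the sketch below. The class name and parameters are hypothetical, and it retries on any exception for brevity. Production code would normally retry only errors known to be transient and would rely on the client library's own reconnect logic for connection recovery.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedRetry
{
    // Runs 'operation' up to maxAttempts times, then lets the last exception propagate.
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts,
        TimeSpan baseDelay,
        CancellationToken cancellationToken = default)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation().ConfigureAwait(false);
            }
            catch (Exception) when (attempt < maxAttempts && !cancellationToken.IsCancellationRequested)
            {
                // Full jitter: wait a random time in [0, baseDelay * 2^(attempt - 1)).
                double ceilingMs = baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
                var delay = TimeSpan.FromMilliseconds(Random.Shared.NextDouble() * ceilingMs);
                await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
            }
        }
    }
}
```

Jitter spreads retries from many clients over time instead of synchronizing them, and the attempt cap ensures a struggling server eventually sees less load rather than more.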

Timeouts and Large Values

Redis is fast, but it is not immune to timeouts. Causes include:

  • Network saturation.
  • CPU-heavy commands.
  • Large keys or values.
  • Too many client connections.
  • Thread-pool starvation in the application.
  • Hot shards.
  • Failover or scaling.
  • Blocking commands.

Keep values reasonably small. Large payloads increase:

  • Serialization cost.
  • Memory use.
  • Network time.
  • Failover and replication work.
  • Tail latency.

Measure serialized size, not only in-memory object size.
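
Serialized size is easy to measure at the point where a value is written. The helper below is a minimal sketch that assumes values are stored as UTF-8 JSON via System.Text.Json. The class name and the 100 KiB warning threshold are illustrative assumptions, not a Redis or Azure limit.

```csharp
using System.Text.Json;

public static class PayloadSize
{
    // Hypothetical budget for a single cached value.
    public const int WarnThresholdBytes = 100 * 1024;

    // Size of the value as it would be sent to Redis when stored as UTF-8 JSON.
    public static int MeasureUtf8JsonBytes<T>(T value) =>
        JsonSerializer.SerializeToUtf8Bytes(value).Length;

    // True when the serialized value is larger than the assumed budget.
    public static bool ExceedsBudget<T>(T value) =>
        MeasureUtf8JsonBytes(value) > WarnThresholdBytes;
}
```

Recording this size as a metric per key prefix makes it easier to spot the few entity types that dominate bandwidth and tail latency.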

Graceful Degradation

A cache should not automatically become a single point of failure for the application that depends on it.

For a read cache:

  1. Attempt Redis with a short timeout.
  2. On failure, read from the authoritative store.
  3. Avoid writing back if Redis is unhealthy.
  4. Apply circuit breaking or temporary cache bypass.
  5. Protect the source with rate limits, request coalescing, and load shedding.

Fallback creates a new risk: a Redis outage can send the full traffic load to the database. Capacity plans and chaos tests must cover that scenario.
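
The first three steps of the read path above can be sketched with plain delegates. The class name, delegates, and timeout parameter are all hypothetical. Circuit breaking and source protection (steps 4 and 5) are intentionally left out and would wrap this method.

```csharp
using System;
using System.Threading.Tasks;

public static class ReadThroughFallback
{
    // readCache returns null on a miss; writeCache stores the value with a TTL.
    public static async Task<string?> GetAsync(
        Func<Task<string?>> readCache,
        Func<Task<string?>> readSource,
        Func<string, Task> writeCache,
        TimeSpan cacheTimeout)
    {
        bool cacheHealthy = true;
        try
        {
            // 1. Attempt the cache with a short timeout.
            string? cached = await readCache().WaitAsync(cacheTimeout).ConfigureAwait(false);
            if (cached is not null) return cached;
        }
        catch (Exception)
        {
            // Timeout or connection failure: treat it as a miss and remember it.
            cacheHealthy = false;
        }

        // 2. Read from the authoritative store.
        string? value = await readSource().ConfigureAwait(false);

        // 3. Write back only when the cache looked healthy; failures here are non-fatal.
        if (cacheHealthy && value is not null)
        {
            try
            {
                await writeCache(value).WaitAsync(cacheTimeout).ConfigureAwait(false);
            }
            catch (Exception)
            {
                // Best effort: the caller still gets the value from the source.
            }
        }

        return value;
    }
}
```

WaitAsync stops waiting after the timeout but does not cancel the underlying cache call, so a real implementation would also configure timeouts on the Redis client itself.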

For session state, fallback may not be possible. Decide whether to:

  • Fail the request.
  • Force reauthentication.
  • Serve only stateless pages.
  • Use a durable session fallback.

Security

A secure design should normally use:

  • Microsoft Entra ID authentication and managed identity where supported.
  • TLS.
  • Private endpoints.
  • Disabled or restricted public network access.
  • Least-privilege role assignments.
  • Secret rotation for any temporary access keys.
  • Separate instances for different trust boundaries.
  • Connection auditing and Azure Monitor diagnostics.

Do not store:

  • Passwords.
  • Long-lived tokens.
  • Raw payment data.
  • Sensitive personal information without a clear encryption, retention, and access design.

Redis keys, values, logs, and diagnostic tooling all require data-classification review.

Multi-Region Design

Active geo-replication can replicate Redis data across regions for supported tiers and configurations. It is useful for globally distributed reads and regional continuity.

Trade-offs include:

  • Eventual convergence.
  • Conflict behavior.
  • Command restrictions.
  • Cross-region data transfer.
  • More complex session semantics.
  • Need for region-aware application routing.

A global shopping cart or session can receive concurrent updates in two regions. The business must define acceptable merge and conflict behavior. For strict order or payment consistency, use a durable transactional system of record.

Monitoring

Monitor:

  • Cache hits and misses.
  • Hit ratio.
  • Memory usage.
  • Evictions and expirations.
  • Server load.
  • Operations per second.
  • Network bandwidth.
  • Connected clients.
  • Timeouts and errors.
  • Per-shard metrics.
  • Replication and failover.
  • Persistence health.
  • Connection audit logs.

Alerts should focus on customer impact and early saturation:

  • Rising P99 latency.
  • Sustained memory pressure.
  • Unexpected evictions.
  • High server load.
  • Network bandwidth near limits.
  • Connection count growth.
  • Hit ratio decline.
  • Database load increase during cache failures.

Common Mistakes

Common mistakes include:

  • Starting a new deployment on a retiring Azure Cache for Redis tier.
  • Treating Redis as a guaranteed durable system of record.
  • Creating one Redis connection per request.
  • Storing very large values.
  • Using no TTL for disposable cache entries.
  • Giving sessions and disposable cache entries the same eviction behavior.
  • Using sticky sessions instead of distributed state.
  • Invalidating before committing the database update.
  • Allowing a cache outage to overwhelm the source database.
  • Running expensive key scans in production.
  • Ignoring CROSSSLOT behavior during migration.
  • Using broad public access and long-lived keys instead of private networking and identity.

Best-Practice Design Checklist

A production design should normally:

  • Use Azure Managed Redis for new Azure deployments.
  • Classify each key as disposable cache, session, coordination, or durable state.
  • Keep an authoritative data source for business-critical records.
  • Namespace and version keys.
  • Set TTLs with jitter.
  • Select an eviction policy that matches the workload.
  • Separate workloads with conflicting eviction or security needs.
  • Reuse cluster-aware client connections.
  • Protect against cache stampedes.
  • Design bounded fallback and source-store protection.
  • Enable HA for production.
  • Evaluate persistence only when recovery needs justify it.
  • Use managed identity, private endpoints, and TLS.
  • Load-test memory, throughput, bandwidth, failover, and shard distribution.
  • Monitor both cache health and downstream database impact.

Interview Practice
