
In-memory, distributed, hybrid, and output caching

Overview

Caching is the practice of storing data or generated output temporarily so future requests can be served faster and with less work. In .NET applications, caching is commonly used to reduce repeated database queries, expensive calculations, API calls, file reads, template rendering, and repeated HTTP response generation.

For interviews, caching is important because it tests whether a developer understands performance, scalability, correctness, invalidation, consistency, and production trade-offs. A strong answer is not simply "use Redis" or "use memory cache." A strong answer explains what is being cached, who can see it, how long it is valid, how it is invalidated, whether the application runs on one server or many servers, and what happens when data changes.

In .NET and ASP.NET Core, the main caching choices include:

  • In-memory caching using IMemoryCache, where cached values live inside the application process.
  • Distributed caching using IDistributedCache, where cached values live in an external shared store such as Redis, SQL Server, Postgres, Cosmos DB, or another provider.
  • Hybrid caching using HybridCache, which combines fast local memory caching with optional distributed caching and built-in stampede protection.
  • Output caching in ASP.NET Core, where complete HTTP responses are cached based on server-side policies.

These choices solve different problems. In-memory caching is simple and fast for single-instance apps. Distributed caching is better for scaled-out applications. Hybrid caching is useful when you want both local speed and shared distributed behavior. Output caching is useful when the same HTTP response can be reused safely for many requests.

Core Concepts

What caching is solving

Caching is usually introduced to reduce one or more of the following costs:

  • Latency, such as waiting for a database, external API, or expensive computation.
  • CPU usage, such as repeatedly rendering the same data or calculating the same result.
  • Database load, such as repeatedly querying mostly unchanged reference data.
  • Network traffic, such as repeatedly calling downstream services.
  • Application throughput limits, where repeated work prevents the system from handling more requests.

A cache is useful when the cost of recomputing or refetching data is higher than the cost and risk of storing a temporary copy.

A cache is risky when the data changes frequently, is security-sensitive, is user-specific, or must always be strongly consistent.

Important caching terminology

Term | Meaning
Cache key | The unique identifier used to store and retrieve a cached value.
Cache value | The object, bytes, DTO, or response body stored in the cache.
Cache hit | The requested key exists in the cache and can be returned.
Cache miss | The requested key does not exist or has expired, so the source must be queried.
Expiration | A rule that determines when cached data becomes invalid.
Absolute expiration | The item expires at a fixed time or after a fixed duration.
Sliding expiration | The item expires if it has not been accessed for a duration.
Eviction | Removing an item from the cache due to expiration, memory pressure, manual removal, or policy.
Invalidation | Intentionally removing or marking cached data as stale after underlying data changes.
Cache stampede | Many requests miss the same cache key at the same time and all try to rebuild it.
Stale data | Cached data that no longer matches the source of truth.
Vary | Creating different cached responses based on route, query string, header, user, culture, tenant, or another value.
TTL | Time to live; how long an item is allowed to remain cached.
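
To make the difference between absolute and sliding expiration concrete, here is a small illustrative sketch. ExpirationMath is a hypothetical helper, not part of any caching library; it assumes both rules are set on the same entry and that the earlier deadline wins, which is how the combination is described later in this section.

```csharp
using System;

public static class ExpirationMath
{
    // Returns the moment an entry stops being valid when both an absolute
    // and a sliding rule apply: whichever deadline comes first wins.
    public static DateTimeOffset EffectiveExpiry(
        DateTimeOffset createdAt,
        DateTimeOffset lastAccessedAt,
        TimeSpan absoluteTtl,
        TimeSpan slidingTtl)
    {
        // Absolute deadline is fixed at creation time.
        var absoluteDeadline = createdAt + absoluteTtl;

        // Sliding deadline moves forward every time the entry is read.
        var slidingDeadline = lastAccessedAt + slidingTtl;

        return absoluteDeadline < slidingDeadline ? absoluteDeadline : slidingDeadline;
    }
}
```

With a 30-minute absolute TTL and a 10-minute sliding TTL, an entry last read 25 minutes after creation still expires at the 30-minute mark, while an entry last read 5 minutes after creation expires at the 15-minute mark.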

In-memory caching with IMemoryCache

IMemoryCache stores cache entries inside the memory of the current application process. It is usually the simplest caching option in a .NET application.

It is a good choice when:

  • The application runs as a single instance.
  • The cached data is small or moderately sized.
  • It is acceptable for cached data to disappear when the process restarts.
  • Each server can safely have its own copy of the cache.
  • The data does not need to be shared across multiple application instances.

Typical examples include:

  • Lookup lists such as countries, currencies, product categories, and feature metadata.
  • Expensive calculations that are reused frequently.
  • Configuration-like data that changes rarely.
  • Small reference data loaded from a database.

Basic setup:

Code
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMemoryCache();
builder.Services.AddScoped<ProductLookupService>();

Example service:

Code
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Caching.Memory;

public sealed class ProductLookupService
{
    private readonly IMemoryCache _cache;
    private readonly AppDbContext _dbContext;

    public ProductLookupService(IMemoryCache cache, AppDbContext dbContext)
    {
        _cache = cache;
        _dbContext = dbContext;
    }

    public async Task<IReadOnlyList<ProductCategoryDto>> GetCategoriesAsync(
        CancellationToken cancellationToken)
    {
        return await _cache.GetOrCreateAsync(
            "product-categories:v1",
            async entry =>
            {
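                // Hard upper bound: the entry is dropped 30 minutes after creation,
                // no matter how often it is read.
                entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30);
                // Idle timeout: the entry is dropped earlier if it is not read for 10 minutes.
                entry.SlidingExpiration = TimeSpan.FromMinutes(10);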

                return await _dbContext.ProductCategories
                    .OrderBy(x => x.Name)
                    .Select(x => new ProductCategoryDto(x.Id, x.Name))
                    .ToListAsync(cancellationToken);
            }) ?? [];
    }
}

public sealed record ProductCategoryDto(int Id, string Name);

Important habits:

  • Use stable, descriptive cache keys.
  • Cache DTOs or immutable values instead of EF Core tracked entities.
  • Set expiration rules instead of caching forever.
  • Avoid caching large objects without considering memory pressure.
  • Avoid storing per-user sensitive data unless the key includes the user identity and the security model is clear.
  • Remember that each application instance has its own separate memory cache.
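
One way to act on the memory-pressure habit above is to give the cache a size budget. The sketch below assumes the Microsoft.Extensions.Caching.Memory package; BoundedCacheExample, the limit of 1024 units, and the choice to count one unit per array element are illustrative assumptions. The units are whatever the application decides they are; the cache does not measure object size itself, and once SizeLimit is set every entry has to declare a Size.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

public static class BoundedCacheExample
{
    // Creates a cache with an application-defined size budget.
    // The number is a unit count chosen by the app, not a byte count.
    public static MemoryCache CreateBoundedCache() =>
        new MemoryCache(new MemoryCacheOptions { SizeLimit = 1024 });

    // Stores a lookup list and charges one unit per element (at least one).
    public static void CacheLookup(IMemoryCache cache, string key, string[] values)
    {
        var options = new MemoryCacheEntryOptions()
            .SetSize(Math.Max(1, values.Length))
            .SetAbsoluteExpiration(TimeSpan.FromMinutes(30));

        cache.Set(key, values, options);
    }
}
```

In an ASP.NET Core app the same limit could be configured through AddMemoryCache(options => options.SizeLimit = 1024) instead of constructing the cache by hand.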

Expiration, eviction, and invalidation

Caching is not complete without an expiration and invalidation strategy.

Expiration is time-based. For example, cache the product list for 10 minutes.

Invalidation is event-based. For example, remove the product list cache when an admin updates a product.

Example manual invalidation:

Code
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Caching.Memory;

public sealed class ProductCommandService
{
    private readonly AppDbContext _dbContext;
    private readonly IMemoryCache _cache;

    public ProductCommandService(AppDbContext dbContext, IMemoryCache cache)
    {
        _dbContext = dbContext;
        _cache = cache;
    }

    public async Task RenameCategoryAsync(
        int categoryId,
        string newName,
        CancellationToken cancellationToken)
    {
        var category = await _dbContext.ProductCategories
            .SingleAsync(x => x.Id == categoryId, cancellationToken);

        category.Name = newName;
        await _dbContext.SaveChangesAsync(cancellationToken);

        _cache.Remove("product-categories:v1");
    }
}

Common mistakes:

  • Using only sliding expiration for important data. An item that is read more often than the sliding window never expires, so stale data can be served indefinitely.
  • Forgetting to invalidate cached lookup data after writes.
  • Using a cache key that is too broad and returns the wrong data to different users or tenants.
  • Using a cache key that is too specific and creates too many entries.
  • Assuming expiration is exact. Cache cleanup is usually opportunistic and implementation-dependent.

A practical pattern is to combine absolute expiration with event-based invalidation.

Distributed caching with IDistributedCache

IDistributedCache stores cache entries in an external shared cache provider. The application talks to the cache through a common interface, but the storage can be Redis, SQL Server, Postgres, Cosmos DB, NCache, or another implementation.

It is a good choice when:

  • The application runs on multiple servers or containers.
  • Cached data should be shared between app instances.
  • Cache data should survive application restarts or deployments.
  • Local memory usage must be reduced.
  • Session-like or shared cached data must be available across the server farm.

Common setup with Redis:

Code
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = builder.Configuration.GetConnectionString("Redis");
    options.InstanceName = "myapp:";
});

IDistributedCache stores and retrieves byte arrays, so applications usually serialize data before storing it.

Example helper:

Code
using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;

public static class DistributedCacheJsonExtensions
{
    private static readonly JsonSerializerOptions JsonOptions = new(JsonSerializerDefaults.Web);

    public static async Task<T?> GetJsonAsync<T>(
        this IDistributedCache cache,
        string key,
        CancellationToken cancellationToken = default)
    {
        var bytes = await cache.GetAsync(key, cancellationToken);
        return bytes is null
            ? default
            : JsonSerializer.Deserialize<T>(bytes, JsonOptions);
    }

    public static async Task SetJsonAsync<T>(
        this IDistributedCache cache,
        string key,
        T value,
        DistributedCacheEntryOptions options,
        CancellationToken cancellationToken = default)
    {
        var bytes = JsonSerializer.SerializeToUtf8Bytes(value, JsonOptions);
        await cache.SetAsync(key, bytes, options, cancellationToken);
    }
}

Example service:

Code
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Caching.Distributed;

public sealed class CustomerSummaryService
{
    private readonly IDistributedCache _cache;
    private readonly AppDbContext _dbContext;

    public CustomerSummaryService(IDistributedCache cache, AppDbContext dbContext)
    {
        _cache = cache;
        _dbContext = dbContext;
    }

    public async Task<CustomerSummaryDto?> GetCustomerSummaryAsync(
        int customerId,
        CancellationToken cancellationToken)
    {
        var key = $"customer-summary:{customerId}";

        var cached = await _cache.GetJsonAsync<CustomerSummaryDto>(key, cancellationToken);
        if (cached is not null)
        {
            return cached;
        }

        var summary = await _dbContext.Customers
            .Where(x => x.Id == customerId)
            .Select(x => new CustomerSummaryDto(
                x.Id,
                x.Name,
                x.Orders.Count))
            .SingleOrDefaultAsync(cancellationToken);

        if (summary is null)
        {
            return null;
        }

        await _cache.SetJsonAsync(
            key,
            summary,
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(15),
                SlidingExpiration = TimeSpan.FromMinutes(5)
            },
            cancellationToken);

        return summary;
    }
}

public sealed record CustomerSummaryDto(int Id, string Name, int OrderCount);

Trade-offs of distributed caching:

Advantage | Trade-off
Shared across app instances | Adds network latency compared with memory cache
Can survive app restarts | Requires external infrastructure
Reduces local memory usage | Serialization and deserialization are required
Useful for scaled-out systems | Cache provider outages must be handled
Centralized invalidation is easier than local memory | Data consistency still requires careful design
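
The "cache provider outages must be handled" trade-off can be sketched as a fail-open read: if the cache call throws, the caller treats it as a miss and falls back to the source of truth. ResilientCacheReads and the onFailure callback are hypothetical names introduced for illustration; whether failing open is acceptable depends on how much load the source can absorb without the cache.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;

public static class ResilientCacheReads
{
    // Reads from the distributed cache, but treats provider failures as a miss
    // so that a cache outage does not take the whole request down.
    public static async Task<byte[]?> TryGetAsync(
        IDistributedCache cache,
        string key,
        Action<Exception>? onFailure = null,
        CancellationToken cancellationToken = default)
    {
        try
        {
            return await cache.GetAsync(key, cancellationToken);
        }
        catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
        {
            // The caller asked to stop; do not hide that behind a "miss".
            throw;
        }
        catch (Exception ex)
        {
            // Report the failure (log, metric) and fall back to the source.
            onFailure?.Invoke(ex);
            return null;
        }
    }
}
```

A caller would handle a null result exactly like an ordinary miss: query the database, then attempt to write the value back with the same kind of guarded call.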

Hybrid caching with HybridCache

HybridCache is a newer .NET caching abstraction, shipped in the Microsoft.Extensions.Caching.Hybrid package and introduced alongside .NET 9, that combines local in-memory caching with optional distributed caching. It is designed to simplify common cache-aside code and reduce problems such as cache stampedes.

Conceptually, it provides:

  • L1 cache: fast in-process memory cache.
  • L2 cache: optional distributed cache through an IDistributedCache provider.
  • Cache stampede protection: when many requests ask for the same missing key, only one request should run the expensive factory while others wait for the result.
  • Simplified API: GetOrCreateAsync handles lookup, factory execution, and storage.
  • Serialization support: useful when a distributed cache is configured.
  • Tag-based invalidation: useful for invalidating related entries together.

Basic setup:

Code
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = builder.Configuration.GetConnectionString("Redis");
});

builder.Services.AddHybridCache();

Example usage:

Code
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Caching.Hybrid;

public sealed class ProductReadService
{
    private readonly HybridCache _cache;
    private readonly AppDbContext _dbContext;

    public ProductReadService(HybridCache cache, AppDbContext dbContext)
    {
        _cache = cache;
        _dbContext = dbContext;
    }

    public async Task<ProductDetailsDto?> GetProductAsync(
        int productId,
        CancellationToken cancellationToken)
    {
        return await _cache.GetOrCreateAsync(
            $"product:{productId}",
            async token =>
            {
                return await _dbContext.Products
                    .Where(x => x.Id == productId)
                    .Select(x => new ProductDetailsDto(
                        x.Id,
                        x.Name,
                        x.Price))
                    .SingleOrDefaultAsync(token);
            },
            cancellationToken: cancellationToken);
    }
}

public sealed record ProductDetailsDto(int Id, string Name, decimal Price);

Example with options and tags:

Code
public async Task<ProductDetailsDto?> GetProductWithTagsAsync(
    int productId,
    CancellationToken cancellationToken)
{
    var options = new HybridCacheEntryOptions
    {
        Expiration = TimeSpan.FromMinutes(30),
        LocalCacheExpiration = TimeSpan.FromMinutes(5)
    };

    return await _cache.GetOrCreateAsync(
        $"product:{productId}",
        async token => await LoadProductAsync(productId, token),
        options,
        tags: [$"product:{productId}", "products"],
        cancellationToken: cancellationToken);
}
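
Entries stored with tags, as in the example above, can later be invalidated individually by key or as a group by tag. The sketch below uses the RemoveAsync and RemoveByTagAsync methods of HybridCache; ProductCacheInvalidator and its method names are hypothetical, and the key and tag strings are assumed to match the ones used when the entries were written.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Hybrid;

public sealed class ProductCacheInvalidator
{
    private readonly HybridCache _cache;

    public ProductCacheInvalidator(HybridCache cache) => _cache = cache;

    // Call after a single product is updated: removes just that entry.
    public async Task OnProductChangedAsync(int productId, CancellationToken cancellationToken)
    {
        await _cache.RemoveAsync($"product:{productId}", cancellationToken);
    }

    // Call after a bulk change: invalidates every entry tagged "products".
    public async Task OnCatalogRebuiltAsync(CancellationToken cancellationToken)
    {
        await _cache.RemoveByTagAsync("products", cancellationToken);
    }
}
```

A write path would typically call one of these right after SaveChangesAsync succeeds, so that the next read repopulates the entry from the database.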

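HybridCache is often a strong default for new applications that need both local speed and a shared distributed cache. However, it does not remove the need to design good keys, expiration, invalidation, and serialization behavior. In particular, each application instance keeps its own local (L1) copy, so an entry may remain stale on other instances until its local expiration elapses.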

Output caching in ASP.NET Core

Output caching stores complete HTTP responses and serves them again without re-executing the endpoint logic. It is different from caching a data object inside a service.

Output caching is useful when:

  • The same endpoint returns the same response for many callers.
  • The response is expensive to generate.
  • The response is safe to reuse.
  • The endpoint is usually public or anonymous.
  • The response can vary by route, query string, header, culture, tenant, or another known value.

Basic setup:

Code
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOutputCache(options =>
{
    options.AddBasePolicy(policy => policy.Expire(TimeSpan.FromSeconds(30)));
    options.AddPolicy("Products", policy =>
        policy.Expire(TimeSpan.FromMinutes(2))
              .SetVaryByQuery("category", "page")
              .Tag("products"));
});

var app = builder.Build();

app.UseRouting();
app.UseCors();
app.UseOutputCache();
app.UseAuthorization();

// GetProducts is the endpoint handler, assumed to be defined elsewhere.
app.MapGet("/products", GetProducts)
   .CacheOutput("Products");

app.Run();

Controller example:

Code
using Microsoft.AspNetCore.OutputCaching;

[ApiController]
[Route("api/products")]
public sealed class ProductsController : ControllerBase
{
    [HttpGet]
    [OutputCache(PolicyName = "Products")]
    public async Task<IReadOnlyList<ProductDto>> GetProducts(
        [FromServices] ProductQueryService service,
        [FromQuery] string? category,
        CancellationToken cancellationToken)
    {
        return await service.GetProductsAsync(category, cancellationToken);
    }
}

Evicting output cache entries by tag:

Code
app.MapPost("/admin/cache/purge/products", async (
    IOutputCacheStore outputCacheStore,
    CancellationToken cancellationToken) =>
{
    await outputCacheStore.EvictByTagAsync("products", cancellationToken);
    return Results.NoContent();
});

Important output caching rules:

  • Register services with AddOutputCache.
  • Add middleware with UseOutputCache.
  • Adding the service and middleware does not automatically cache everything; endpoints must be configured with policies or attributes.
  • Default output caching is conservative: by default, only HTTP 200 responses to GET or HEAD requests are cached.
  • By default, responses that set cookies and responses to authenticated requests are not cached.
  • Use SetVaryByQuery, SetVaryByHeader, SetVaryByRouteValue, or VaryByValue on the policy builder when the response depends on request inputs.
  • Use tags for group invalidation.
  • Be careful with middleware order. In apps using CORS, output caching should be placed after CORS. In apps using routing/controllers, it should be placed after routing.
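
The variation rules above can be combined in one policy. The following is a sketch only: the policy name "CatalogByTenant", the /catalog endpoint, and the X-Tenant-Id header are hypothetical, and in a real system the tenant would normally come from a trusted source rather than a client-supplied header, since an untrusted vary value could let one caller influence another caller's cache entry.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOutputCache(options =>
{
    options.AddPolicy("CatalogByTenant", policy =>
        policy.Expire(TimeSpan.FromMinutes(1))
              // Separate entries per category/page combination.
              .SetVaryByQuery("category", "page")
              // Separate entries per requested language.
              .SetVaryByHeader("Accept-Language")
              // Separate entries per tenant (hypothetical header, see note above).
              .VaryByValue(context => new KeyValuePair<string, string>(
                  "tenant",
                  context.Request.Headers["X-Tenant-Id"].ToString()))
              // Tag so the whole group can be evicted together.
              .Tag("products"));
});

var app = builder.Build();

app.UseOutputCache();

app.MapGet("/catalog", (string? category, int? page) =>
        Results.Ok(new { category, page = page ?? 1 }))
   .CacheOutput("CatalogByTenant");

app.Run();
```

Every vary dimension multiplies the number of cached entries, so it is worth adding only the inputs that actually change the response.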

Output caching vs response caching

Output caching and response caching are often confused.

Feature | Output caching | Response caching
Main purpose | Server-controlled reuse of generated HTTP responses | HTTP caching behavior based on HTTP cache headers
Controlled by | Server-side policies and endpoint metadata | HTTP cache headers such as Cache-Control, Vary, and Expires
Client request headers can bypass it | Not in the same way; server policy controls behavior | Yes, client cache headers can force revalidation or bypass behavior
Invalidation | Supports policy-based and tag-based eviction | Mostly header-driven; fewer programmatic invalidation options
Useful for | Reducing server work for expensive endpoints | Standards-based HTTP caching for clients/proxies
ASP.NET Core availability | Modern ASP.NET Core feature for server-side response caching | Older HTTP-standard caching middleware

In interviews, a good answer explains that response caching follows HTTP caching semantics, while output caching is designed to let the server control when responses are cached to reduce backend work.
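
The contrast can be sketched with two endpoints in one minimal API app. The routes are hypothetical and the 30-second durations are arbitrary. The first endpoint only emits a Cache-Control header and leaves the caching decision to HTTP semantics (browser, proxy, or the response caching middleware); the second is cached by server policy regardless of what headers the response carries.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Net.Http.Headers;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddResponseCaching();
builder.Services.AddOutputCache();

var app = builder.Build();

app.UseResponseCaching();
app.UseOutputCache();

// Response caching: the endpoint declares cacheability through HTTP headers.
// Clients can influence reuse with their own request cache headers.
app.MapGet("/time/http-cached", (HttpContext context) =>
{
    context.Response.GetTypedHeaders().CacheControl = new CacheControlHeaderValue
    {
        Public = true,
        MaxAge = TimeSpan.FromSeconds(30)
    };
    return DateTimeOffset.UtcNow.ToString("O");
});

// Output caching: the server policy decides, independent of response headers.
app.MapGet("/time/output-cached", () => DateTimeOffset.UtcNow.ToString("O"))
   .CacheOutput(policy => policy.Expire(TimeSpan.FromSeconds(30)));

app.Run();
```

Most applications would pick one mechanism per endpoint; both are registered here only to show the two styles side by side.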

Choosing the right cache type

Scenario | Recommended choice | Reason
Single-server app with small lookup data | IMemoryCache | Fastest and simplest option
Multi-server app needing shared cache | IDistributedCache | Shared external cache across app instances
Multi-server app needing both local speed and shared cache | HybridCache | Combines local memory and distributed cache
Public endpoint returning same response for many callers | Output caching | Avoids rerunning endpoint logic
User-specific dashboard | Usually data caching with user-specific keys, or no caching | Output caching can leak data if not varied correctly
Frequently changing transactional data | Usually avoid caching or use very short TTL | Stale data risk is high
Expensive external API response | HybridCache, IDistributedCache, or IMemoryCache | Reduces downstream dependency calls
Static assets | CDN/browser caching | Better handled outside application code

A useful decision flow:

  1. Is the cached thing a full HTTP response?
    • Use output caching when it is safe and reusable.
  2. Is the cached thing application data?
    • Use IMemoryCache for simple single-instance scenarios.
    • Use IDistributedCache for shared multi-instance scenarios.
    • Use HybridCache when you want local speed, distributed support, and stampede protection.
  3. Is the data user-specific or security-sensitive?
    • Avoid caching unless keys and authorization boundaries are carefully designed.
  4. Does the data change often?
    • Use short TTLs, active invalidation, or avoid caching.

Cache-aside pattern

The cache-aside pattern is one of the most common application caching patterns.

The flow is:

  1. Try to get data from cache.
  2. If found, return it.
  3. If not found, query the source of truth.
  4. Store the result in cache.
  5. Return the result.

Example:

Code
public async Task<ProductDto?> GetProductAsync(
    int id,
    CancellationToken cancellationToken)
{
    var key = $"product:{id}";

    if (_memoryCache.TryGetValue(key, out ProductDto? cached))
    {
        return cached;
    }

    var product = await _dbContext.Products
        .Where(x => x.Id == id)
        .Select(x => new ProductDto(x.Id, x.Name, x.Price))
        .SingleOrDefaultAsync(cancellationToken);

    if (product is not null)
    {
        _memoryCache.Set(
            key,
            product,
            new MemoryCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10)
            });
    }

    return product;
}

The cache-aside pattern is easy to understand but can create cache stampedes if many requests miss the same key at once. HybridCache and output caching resource locking can help with this problem.

Cache stampede and thundering herd

A cache stampede happens when a popular cache item expires and many requests rebuild it at the same time. This can overload the database or downstream service.

Common mitigations:

  • Use HybridCache, which includes stampede protection for GetOrCreateAsync.
  • Use output caching resource locking for HTTP responses.
  • Use randomized TTL jitter so many entries do not expire at the exact same moment.
  • Refresh cache entries in the background before they expire.
  • Use a lock or single-flight mechanism for expensive keys.
  • Use stale-while-revalidate behavior where appropriate.

Example TTL jitter:

Code
private static TimeSpan AddJitter(TimeSpan baseTtl)
{
    // Random.Shared.Next(0, 30) returns 0..29; the upper bound is exclusive.
    var jitterSeconds = Random.Shared.Next(0, 30);
    return baseTtl + TimeSpan.FromSeconds(jitterSeconds);
}
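
The "lock or single-flight mechanism" mitigation from the list above can be sketched with standard library types only. SingleFlight is a hypothetical helper, not a library class. It only deduplicates concurrent loads for the same key; storing the result in a cache afterwards is still the caller's job.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class SingleFlight<TValue>
{
    // One shared, lazily started load per key that is currently in progress.
    private readonly ConcurrentDictionary<string, Lazy<Task<TValue>>> _inFlight = new();

    public async Task<TValue> RunAsync(string key, Func<Task<TValue>> factory)
    {
        // Lazy<T> guarantees the factory is invoked at most once, even if
        // several callers race to add an entry for the same key.
        var lazy = _inFlight.GetOrAdd(key, _ => new Lazy<Task<TValue>>(factory));

        try
        {
            // Every concurrent caller awaits the same underlying task.
            return await lazy.Value;
        }
        finally
        {
            // Remove only this exact in-flight entry so that a later miss
            // (or a retry after a failure) starts a fresh load.
            _inFlight.TryRemove(new KeyValuePair<string, Lazy<Task<TValue>>>(key, lazy));
        }
    }
}
```

In a cache-aside method this would wrap the "query the source and store the result" step, keyed by the cache key, so that a burst of misses for a hot key results in one database query instead of many.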

Security and correctness concerns

Caching can create serious production bugs if security and correctness are ignored.

Common risks:

  • Data leakage: caching a response for one user and returning it to another user.
  • Tenant leakage: forgetting to include tenant ID in the cache key.
  • Authorization bypass: serving cached data without checking whether the current user can access it.
  • Stale permissions: caching authorization or role data for too long.
  • Sensitive data persistence: storing tokens, personal data, or secrets in a distributed cache without proper security controls.
  • Cache poisoning: allowing untrusted input to control cache keys or cached output incorrectly.

Safer practices:

  • Include tenant, user, culture, query, and authorization-relevant values in the cache key when needed.
  • Avoid output caching authenticated responses unless you fully understand and configure variation rules.
  • Do not cache secrets in normal application caches.
  • Use short TTLs for permission-related data.
  • Treat distributed caches as production infrastructure that needs authentication, encryption, monitoring, and access control.
  • Prefer caching DTOs designed for safe reuse.

Example tenant-aware key:

Code
var key = $"tenant:{tenantId}:product:{productId}";
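
Building such keys in one place makes it harder to forget the tenant segment. CacheKeys below is a hypothetical helper; rejecting the ':' separator inside the tenant ID is an assumption of this sketch, intended to stop one tenant's identifier from being crafted to collide with another key.

```csharp
using System;
using System.Globalization;

public static class CacheKeys
{
    // Builds "tenant:{tenantId}:{entity}:{entityId}" and refuses inputs
    // that would make the key ambiguous.
    public static string ForTenant(string tenantId, string entity, int entityId)
    {
        if (string.IsNullOrWhiteSpace(tenantId))
        {
            throw new ArgumentException("Tenant ID is required.", nameof(tenantId));
        }

        if (tenantId.Contains(':'))
        {
            throw new ArgumentException("Tenant ID must not contain ':'.", nameof(tenantId));
        }

        // Invariant culture keeps the numeric part identical on every server.
        var id = entityId.ToString(CultureInfo.InvariantCulture);
        return $"tenant:{tenantId}:{entity}:{id}";
    }
}
```

For example, CacheKeys.ForTenant("acme", "product", 42) produces the key tenant:acme:product:42.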

Serialization and versioning

Distributed and hybrid caches often require serialization because data leaves the process. This creates versioning concerns.

Common issues:

  • The application deploys a new DTO shape while old cached data still uses the old shape.
  • Different services serialize the same logical object differently.
  • Type names or private implementation details leak into serialized payloads.
  • Large serialized values increase network latency and memory usage.

Best practices:

  • Cache stable DTOs, not EF Core entities or domain aggregates with complex behavior.
  • Include a version in the key when the shape changes.
  • Keep cached payloads small.
  • Use explicit serialization options.
  • Consider compression only for large payloads after measuring cost.

Example versioned key:

Code
var key = $"product-details:v2:{productId}";
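
A complementary option, not described in this section and shown here only as an illustration, is to store a schema version inside the payload so that a reader can reject data written by an older deployment even when the key did not change. CacheEnvelope and VersionedCacheSerializer are hypothetical names; the sketch uses System.Text.Json with explicit web defaults.

```csharp
using System.Text.Json;

// Wraps the cached payload together with the version of its shape.
public sealed record CacheEnvelope<T>(int SchemaVersion, T Payload);

public static class VersionedCacheSerializer
{
    private static readonly JsonSerializerOptions JsonOptions = new(JsonSerializerDefaults.Web);

    public static byte[] Serialize<T>(T payload, int schemaVersion) =>
        JsonSerializer.SerializeToUtf8Bytes(
            new CacheEnvelope<T>(schemaVersion, payload),
            JsonOptions);

    // Returns false (treat as a cache miss) when the bytes are unreadable
    // or were written with a different schema version.
    public static bool TryDeserialize<T>(byte[] bytes, int expectedVersion, out T? payload)
    {
        payload = default;
        try
        {
            var envelope = JsonSerializer.Deserialize<CacheEnvelope<T>>(bytes, JsonOptions);
            if (envelope is null || envelope.SchemaVersion != expectedVersion)
            {
                return false;
            }

            payload = envelope.Payload;
            return true;
        }
        catch (JsonException)
        {
            return false;
        }
    }
}
```

Versioning the key is simpler and leaves old entries to expire on their own; the envelope approach costs a few extra bytes per entry but also protects readers that share a key across deployments.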

Caching with EF Core

Caching can work well with EF Core read paths, but it should be used carefully.

Good candidates:

  • Lookup/reference data.
  • Read models and DTO projections.
  • Aggregated summaries.
  • Expensive reports.
  • Data from read-only tables.

Risky candidates:

  • Tracked EF Core entities.
  • Data that changes often.
  • Data that must reflect writes immediately.
  • Per-user data without user-specific cache keys.

Recommended pattern:

Code
var product = await _dbContext.Products
    .AsNoTracking()
    .Where(x => x.Id == productId)
    .Select(x => new ProductDetailsDto(x.Id, x.Name, x.Price))
    .SingleOrDefaultAsync(cancellationToken);

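Project to DTOs when caching query results, and use AsNoTracking to make the read-only intent explicit (EF Core does not track a projection to a non-entity type in any case). This avoids storing objects that are tied to a specific DbContext or change tracker.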

Common production best practices

Use caching intentionally:

  • Cache data that is expensive to compute or fetch.
  • Do not cache everything by default.
  • Measure before and after caching.
  • Track cache hit rate, miss rate, eviction rate, and latency.
  • Use consistent naming for keys.
  • Set expiration and invalidation policies.
  • Avoid caching failed responses unless that is explicitly desired.
  • Use short TTLs for data that changes often.
  • Avoid storing very large payloads.
  • Protect distributed cache infrastructure.
  • Design behavior for cache provider outages.
  • Do not make the cache the only source of truth unless the system is explicitly designed that way.

A practical rule: the database or original service should remain the source of truth, and the cache should be an optimization unless the architecture explicitly says otherwise.
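
The advice above to track hit and miss rates can start as simply as two thread-safe counters. CacheStats is a hypothetical helper for illustration; a production system would more likely publish these numbers through its existing metrics pipeline.

```csharp
using System.Threading;

public sealed class CacheStats
{
    private long _hits;
    private long _misses;

    // Call from the cache-hit branch of a read path.
    public void RecordHit() => Interlocked.Increment(ref _hits);

    // Call from the cache-miss branch, before querying the source.
    public void RecordMiss() => Interlocked.Increment(ref _misses);

    public long Hits => Interlocked.Read(ref _hits);

    public long Misses => Interlocked.Read(ref _misses);

    // Fraction of lookups served from the cache; 0 when nothing was recorded.
    public double HitRate
    {
        get
        {
            var hits = Hits;
            var total = hits + Misses;
            return total == 0 ? 0 : (double)hits / total;
        }
    }
}
```

A consistently low hit rate usually means the keys are too granular or the TTL is too short for the access pattern, which is a signal to revisit the design rather than add more caching.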

Common mistakes

Common interview-worthy mistakes include:

  • Using IMemoryCache in a load-balanced application and expecting all instances to share data.
  • Caching authenticated HTTP responses without varying by user or authorization state.
  • Forgetting to invalidate data after writes.
  • Caching EF Core tracked entities.
  • Not setting expiration.
  • Using broad keys such as "products" when the result depends on query parameters.
  • Using overly granular keys that create unbounded cache growth.
  • Ignoring cache stampede problems.
  • Not handling distributed cache outages.
  • Caching data before confirming it is a performance bottleneck.
  • Treating response caching and output caching as the same thing.

