API gateway and BFF decisions for web clients and microservices

Overview

An API gateway is an edge component that provides one entry point to backend APIs and applies shared routing, security, traffic, and protocol policies. A Backend for Frontend (BFF) is a backend tailored to the needs of one frontend or client experience, such as a browser application, mobile app, or partner portal.

They solve related but different problems:

  • A gateway centralizes common edge concerns across APIs.
  • A BFF adapts backend capabilities to a particular user experience.

A gateway can:

  • Route requests to services.
  • Terminate TLS.
  • Validate tokens.
  • Apply rate limits and quotas.
  • Transform protocols or headers.
  • Aggregate selected backend calls.
  • Cache safe responses.
  • Record edge telemetry.

A BFF can:

  • Shape responses for one client.
  • Orchestrate client-specific workflows.
  • Reduce frontend round trips.
  • Hide internal service topology.
  • Handle browser-specific authentication sessions.
  • Optimize payloads for device or interface constraints.

Each adds a network hop, a deployable component, security surface, operational ownership, and a potential failure point. Introduce them because the client or service architecture requires them, not because microservice diagrams usually contain them.

This topic matters in interviews because candidates must reason about placement of responsibility, failure behavior, identity propagation, latency, ownership, and frontend needs. A gateway should not become a central monolith, and a BFF should not duplicate core business logic owned by domain services.

Core Concepts

Gateway, Reverse Proxy, Load Balancer, and Ingress

These terms overlap but emphasize different responsibilities.

Reverse proxy

  • Receives a request and forwards it to an upstream server.
  • Can terminate TLS and rewrite headers or paths.

Load balancer

  • Distributes traffic across instances.
  • Primarily addresses availability and capacity.

Ingress controller

  • Implements external routing into a container platform such as Kubernetes.
  • Translates ingress configuration into proxy behavior.

API gateway

  • Adds API-aware policy such as authentication, quotas, transformation, version routing, developer subscriptions, and analytics.

A product can perform several roles. Evaluate required capabilities rather than selecting by label.

Gateway Versus Service Mesh

An API gateway manages primarily north-south traffic between external clients and the system.

A service mesh manages primarily east-west traffic between services and may provide:

  • Mutual TLS.
  • Service identity.
  • Retries and timeouts.
  • Traffic shaping.
  • Service-to-service telemetry.

They can coexist:

Code
Client
  -> API gateway
      -> service-mesh ingress
          -> internal services

Do not route every internal call back through the external gateway. That creates unnecessary latency and central coupling.

Gateway Routing

The gateway maps external routes to internal services:

Code
/api/orders/*   -> Ordering service
/api/catalog/*  -> Catalog service
/api/users/*    -> Identity profile service

Benefits:

  • Clients do not know service locations.
  • Internal topology can change.
  • Version or canary routing can be centralized.
  • Services can remain private.

Keep route ownership explicit. Conflicting rewrites and hidden routing rules make incidents difficult to diagnose.

Gateway Offloading

Common shared edge policies include:

  • TLS termination.
  • Authentication token validation.
  • Rate limiting.
  • Request-size limits.
  • IP restrictions.
  • Compression.
  • CORS policy.
  • Request correlation.
  • Basic caching.
  • Web application firewall rules.

Offload only concerns that are truly common and safe at the edge.

Backend services must still enforce:

  • Resource-level authorization.
  • Business invariants.
  • Tenant isolation.
  • Input semantics.
  • Data access rules.

The gateway is not the sole security boundary.

Gateway Aggregation

A gateway can call several services and combine their responses:

Code
GET /dashboard
  -> Customer service
  -> Orders service
  -> Recommendations service
  -> one dashboard response

Benefits:

  • Fewer client round trips.
  • Less client knowledge of service topology.
  • Centralized timeout and fallback policy.

Costs:

  • Increased gateway CPU and memory.
  • Fan-out latency.
  • Partial failures.
  • Coupling to response schemas.
  • Harder scaling and ownership.

Use limited aggregation for stable, shared edge scenarios. Client-specific aggregation often belongs in a BFF.

Tail Latency and Fan-Out

An aggregate response is only as fast as its slowest required dependency. Assuming the calls run in parallel (sequential calls sum their latencies instead):

Code
Total latency ~= gateway overhead + slowest required dependency

With many dependencies, the probability that at least one is slow or unavailable rises.
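A rough way to quantify this, assuming every dependency independently meets its latency target with the same probability (real dependencies are often correlated, so treat the result as an approximation). The `FanOut` class and its method name are illustrative only.

```csharp
using System;

// Ten dependencies, each meeting its latency target 99% of the time.
double anySlow = FanOut.ProbabilityAnySlow(pFast: 0.99, dependencyCount: 10);
Console.WriteLine($"Chance that at least one dependency is slow: {anySlow:P1}");

static class FanOut
{
    // Probability that at least one of n independent calls misses its target,
    // when each call meets the target with probability pFast: 1 - pFast^n.
    public static double ProbabilityAnySlow(double pFast, int dependencyCount)
    {
        if (pFast is < 0 or > 1)
            throw new ArgumentOutOfRangeException(nameof(pFast));
        if (dependencyCount < 0)
            throw new ArgumentOutOfRangeException(nameof(dependencyCount));

        return 1 - Math.Pow(pFast, dependencyCount);
    }
}
```

Under these assumptions the figure is about 9.6% for 10 dependencies and about 63% for 100, which is why fan-out width matters more than any single dependency's average latency.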

Design aggregation with:

  • Parallel calls where independent.
  • Per-dependency deadlines.
  • Overall request deadline.
  • Bounded concurrency.
  • Partial-response policy.
  • Fallbacks only when semantically safe.
  • Cancellation propagation.
  • Distributed tracing.

Do not retry every failed downstream call automatically. Retries can amplify load and duplicate unsafe operations.
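The deadline and cancellation items in the list above can be sketched with linked cancellation tokens. This is a minimal sketch, not a complete aggregator: `Deadlines.RunAsync` and `FakeCallAsync` are hypothetical helpers, the durations are arbitrary, and cancellation is cooperative, so a downstream client that ignores the token is not actually bounded.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Overall deadline for the aggregate request (all durations are illustrative).
using var request = new CancellationTokenSource(TimeSpan.FromSeconds(2));

// Independent calls start in parallel, each under its own tighter deadline.
Task<string> customer = Deadlines.RunAsync(
    ct => FakeCallAsync("customer", TimeSpan.FromMilliseconds(20), ct),
    TimeSpan.FromMilliseconds(500), request.Token);
Task<string> orders = Deadlines.RunAsync(
    ct => FakeCallAsync("orders", TimeSpan.FromMilliseconds(20), ct),
    TimeSpan.FromMilliseconds(500), request.Token);
Console.WriteLine(string.Join(", ", await Task.WhenAll(customer, orders)));

try
{
    // This dependency needs longer than its deadline allows, so it is cancelled.
    await Deadlines.RunAsync(
        ct => FakeCallAsync("slow", TimeSpan.FromSeconds(5), ct),
        TimeSpan.FromMilliseconds(50), request.Token);
}
catch (OperationCanceledException)
{
    Console.WriteLine("slow dependency exceeded its deadline");
}

// Stand-in for a downstream HTTP call; a real client would pass ct to HttpClient.
static async Task<string> FakeCallAsync(string name, TimeSpan latency, CancellationToken ct)
{
    await Task.Delay(latency, ct);
    return name;
}

static class Deadlines
{
    // Runs one call under a per-dependency deadline. The token is linked to the
    // request token, so the call is also cancelled when the overall deadline
    // passes or the client disconnects.
    public static async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> call,
        TimeSpan perCallDeadline,
        CancellationToken requestToken)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(requestToken);
        cts.CancelAfter(perCallDeadline);
        return await call(cts.Token);
    }
}
```

Linking the tokens means the per-dependency deadline can shorten the time a call is allowed, but never extend it past the overall request deadline.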

Partial Failure

Define whether each dependency is:

  • Required.
  • Optional.
  • Replaceable by stale data.
  • Omittable with a warning.

Example response:

Code
{
  "orders": [],
  "recommendations": null,
  "warnings": [
    {
      "code": "recommendations-unavailable"
    }
  ]
}

Returning partial data is appropriate only when consumers can understand it and business correctness is preserved.
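One way to produce a response of that shape is to treat each dependency according to its classification. In the sketch below, orders are assumed to be required and recommendations optional; the `Dashboard`, `Warning`, and `PartialResponse` names are hypothetical, and the broad `catch` is a simplification that real code would narrow to the failures it actually considers safe to hide.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var dashboard = await PartialResponse.BuildAsync(
    getOrders: () => Task.FromResult<IReadOnlyList<string>>(new[] { "order-1" }),
    getRecommendations: () =>
        Task.FromException<IReadOnlyList<string>>(new TimeoutException()));

Console.WriteLine(dashboard.Warnings[0].Code); // prints "recommendations-unavailable"

record Warning(string Code);

record Dashboard(
    IReadOnlyList<string> Orders,
    IReadOnlyList<string>? Recommendations,
    IReadOnlyList<Warning> Warnings);

static class PartialResponse
{
    public static async Task<Dashboard> BuildAsync(
        Func<Task<IReadOnlyList<string>>> getOrders,
        Func<Task<IReadOnlyList<string>>> getRecommendations)
    {
        // Start both calls before awaiting either one.
        var ordersTask = getOrders();
        var recommendationsTask = getRecommendations();

        var warnings = new List<Warning>();
        IReadOnlyList<string>? recommendations = null;
        try
        {
            recommendations = await recommendationsTask;
        }
        catch (Exception)
        {
            // Optional dependency: omit the data and tell the consumer why.
            warnings.Add(new Warning("recommendations-unavailable"));
        }

        // Required dependency: a failure here propagates and fails the request.
        var orders = await ordersTask;
        return new Dashboard(orders, recommendations, warnings);
    }
}
```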

Single Point of Failure

A gateway is on many critical paths. Protect it with:

  • Multiple instances across failure domains.
  • Health probes.
  • Autoscaling.
  • Load testing.
  • Configuration validation.
  • Safe rollout and rollback.
  • Minimal local state.
  • Capacity headroom.
  • Dependency isolation.

Avoid synchronous external control-plane dependencies in the request path.

Gateway as a Choke Point

A gateway can become a bottleneck when it accumulates:

  • Core business workflows.
  • Large transformations.
  • Long-running requests.
  • Many custom plugins.
  • Shared release coordination.
  • Per-client conditional logic.

Keep the gateway focused on edge policy and simple routing or aggregation. Move domain behavior to services and client-specific behavior to BFFs.

What Is a BFF?

A BFF is a backend owned and shaped around one frontend experience.

Code
Web app    -> Web BFF    -> services
Mobile app -> Mobile BFF -> services
Partner UI -> Partner BFF -> services

The BFF exposes operations and representations optimized for that client rather than forcing every client through one general-purpose aggregation API.

BFF Responsibilities

Appropriate responsibilities:

  • Client-specific aggregation.
  • Response shaping.
  • UI-oriented workflow orchestration.
  • Server-side session handling.
  • Token exchange and secure token storage.
  • Client-specific caching.
  • Reducing mobile payloads.
  • Hiding unstable service topology.

Inappropriate responsibilities:

  • Core pricing rules.
  • Order invariants.
  • Authoritative data ownership.
  • Reusable domain policy.
  • Direct access to every service database.

Business rules should remain in the service or module that owns them.

When a BFF Is Valuable

Use a BFF when:

  • Web and mobile clients need materially different APIs.
  • A general backend contains many client-specific conditions.
  • Clients make many calls to render one screen.
  • Mobile bandwidth or latency needs special optimization.
  • Frontend teams require independent backend evolution.
  • Browser token security benefits from a server-side component.
  • Client release cadence differs from service release cadence.

A BFF can reduce frontend coupling and network chattiness.

When Not to Use a BFF

A BFF may be unnecessary when:

  • Only one simple client exists.
  • Clients use nearly identical operations.
  • One gateway aggregation layer is sufficient.
  • The backend is already a cohesive monolith.
  • GraphQL or another composition layer already provides client-specific selection.
  • The team cannot own another production service.

Creating a separate BFF for every screen or minor client variation adds operational cost without useful autonomy.

One BFF Per Experience, Not Per Device Automatically

Do not create separate BFFs merely because clients are named web, iOS, and Android.

Ask:

  • Do they need different workflows?
  • Do they evolve independently?
  • Do they have different payload and latency constraints?
  • Are they owned by different teams?
  • Do they have different authentication models?

iOS and Android often share one mobile BFF. Administrative and customer web applications may need separate BFFs despite both running in browsers.

Browser BFF and Token Handling

A browser BFF can use an OAuth/OpenID Connect flow on the server and keep access tokens out of browser JavaScript.

Typical flow:

Code
Browser
  -> secure session cookie
      -> BFF
          -> access token
              -> downstream API

Benefits:

  • Access and refresh tokens remain server-side.
  • Smaller exposure to token theft through browser script.
  • Centralized token refresh and logout.

The session cookie should use:

  • Secure.
  • HttpOnly.
  • Appropriate SameSite.
  • Short or controlled lifetime.

Cookie authentication introduces CSRF risk for state-changing operations. Use anti-forgery defenses and strict origin policy.
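In ASP.NET Core, the cookie attributes listed above map onto the cookie authentication options. The following is a configuration sketch only: the cookie name and the 30-minute lifetime are assumptions, the OAuth/OpenID Connect handler and anti-forgery validation are deliberately omitted, and the right `SameSite` value depends on how the login redirect flow works.

```csharp
using System;
using Microsoft.AspNetCore.Authentication.Cookies;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services
    .AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)
    .AddCookie(options =>
    {
        options.Cookie.Name = "bff-session";                      // hypothetical name
        options.Cookie.HttpOnly = true;                           // not readable from script
        options.Cookie.SecurePolicy = CookieSecurePolicy.Always;  // HTTPS only
        options.Cookie.SameSite = SameSiteMode.Lax;               // assumption; Strict may also fit
        options.ExpireTimeSpan = TimeSpan.FromMinutes(30);        // assumed session lifetime
        options.SlidingExpiration = false;                        // fixed, controlled lifetime
    });

var app = builder.Build();

app.UseAuthentication();

app.Run();
```

These settings limit how the cookie can be read and sent; they do not replace anti-forgery validation on state-changing endpoints.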

Authentication Versus Authorization

The gateway or BFF can validate the caller's identity, but downstream services must enforce authorization for owned resources.

Identity propagation options include:

  • Forward the original access token.
  • Exchange for a downstream audience token.
  • Use workload identity plus signed user context.
  • Use delegated authorization.

Do not trust arbitrary client-supplied identity headers. Strip or overwrite trusted headers at the edge.
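Stripping and overwriting can be sketched as a pure function over a header collection. The header names `X-User-Id` and `X-Tenant-Id` are hypothetical, and the user and tenant values are assumed to come from a token the edge has already validated.

```csharp
using System;
using System.Collections.Generic;

var incoming = new Dictionary<string, string>
{
    ["Accept"] = "application/json",
    ["x-user-id"] = "admin", // spoofed by the client
};

var outgoing = TrustedHeaders.Sanitize(incoming, userId: "user-42", tenantId: "tenant-a");
Console.WriteLine(outgoing["X-User-Id"]); // prints "user-42"

static class TrustedHeaders
{
    // Headers that only the edge is allowed to set (names are illustrative).
    private static readonly HashSet<string> IdentityHeaders =
        new(StringComparer.OrdinalIgnoreCase) { "X-User-Id", "X-Tenant-Id" };

    public static Dictionary<string, string> Sanitize(
        IReadOnlyDictionary<string, string> incoming,
        string userId,
        string tenantId)
    {
        // HTTP header names are case-insensitive, so compare them that way.
        var outgoing = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        foreach (var (name, value) in incoming)
        {
            // Drop whatever the client sent under a trusted header name.
            if (!IdentityHeaders.Contains(name))
                outgoing[name] = value;
        }

        // Set the trusted values from the validated identity.
        outgoing["X-User-Id"] = userId;
        outgoing["X-Tenant-Id"] = tenantId;
        return outgoing;
    }
}
```

Downstream services can rely on these headers only if they are reachable exclusively through the component that sets them.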

Token Audience and Scope

A token intended for a gateway is not automatically valid for every service.

Validate:

  • Issuer.
  • Audience.
  • Signature.
  • Expiration.
  • Scopes or roles.
  • Tenant.

Token exchange or on-behalf-of flows can produce service-appropriate tokens. Avoid issuing one broad token that grants every internal capability.
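The claim checks in the list above can be illustrated with plain data types. This sketch assumes the signature has already been verified by a JWT library and does not replace one; `TokenClaims`, `TokenPolicy`, and `TokenChecks` are hypothetical names.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// A token issued for the gateway, presented to the orders service.
var token = new TokenClaims(
    Issuer: "https://idp.example",
    Audience: "gateway",
    ExpiresAt: DateTimeOffset.UtcNow.AddMinutes(5),
    Scopes: new HashSet<string> { "orders.read" },
    TenantId: "tenant-a");

var ordersPolicy = new TokenPolicy(
    Issuer: "https://idp.example",
    Audience: "orders-api",
    RequiredScope: "orders.read");

var failure = TokenChecks.FirstFailure(token, ordersPolicy, "tenant-a", DateTimeOffset.UtcNow);
Console.WriteLine(failure ?? "accepted"); // prints "audience"

record TokenClaims(
    string Issuer,
    string Audience,
    DateTimeOffset ExpiresAt,
    IReadOnlySet<string> Scopes,
    string TenantId);

record TokenPolicy(string Issuer, string Audience, string RequiredScope);

static class TokenChecks
{
    // Returns null when every check passes, otherwise the name of the first
    // failed check. Signature validation is assumed to have happened already.
    public static string? FirstFailure(
        TokenClaims token,
        TokenPolicy policy,
        string requestTenantId,
        DateTimeOffset now)
    {
        if (token.Issuer != policy.Issuer) return "issuer";
        if (token.Audience != policy.Audience) return "audience";
        if (token.ExpiresAt <= now) return "expired";
        if (!token.Scopes.Contains(policy.RequiredScope)) return "scope";
        if (token.TenantId != requestTenantId) return "tenant";
        return null;
    }
}
```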

BFF API Design

A BFF API can be task- or screen-oriented:

Code
GET /home
GET /checkout-summary
POST /checkout
GET /account-settings

This is acceptable because the BFF is a client adapter, not a reusable enterprise domain API.

Keep commands explicit and avoid exposing one generic proxy route:

Code
/proxy/{service}/{path}

An unrestricted proxy defeats surface reduction and can create security vulnerabilities.

Orchestration and Transactions

A BFF can coordinate calls, but it should not pretend several service operations form one database transaction.

For multi-service workflows:

  • Prefer one domain service to own the business process.
  • Use asynchronous orchestration for long-running work.
  • Define idempotency.
  • Handle compensation explicitly.
  • Return operation status where completion is deferred.

The BFF should not become the durable source of truth for a business saga unless that workflow is genuinely client-specific.

Gateway Versus BFF

Concern          | API gateway                      | BFF
Primary scope    | Shared edge                      | One client experience
Routing          | Core responsibility              | Usually limited
Rate limiting    | Common policy                    | Client-specific refinement
Authentication   | Token validation and edge policy | Session and client flow
Aggregation      | Shared, limited                  | Client-specific
Response shaping | Light transformation             | Tailored composition
Business rules   | Avoid                            | Avoid core domain rules
Ownership        | Platform or API team             | Frontend-aligned team

A common architecture uses both:

Code
Client
  -> shared API gateway
      -> client BFF
          -> domain services

The extra hop must be justified. Some systems expose BFFs directly through an ingress or gateway product that combines these roles.

Gateway Versus GraphQL

GraphQL can let clients select fields and combine a graph of data. It can reduce the need for multiple response-shaped BFF endpoints.

GraphQL does not automatically solve:

  • Authentication and authorization.
  • Rate limiting and abuse control.
  • N+1 backend calls.
  • Workflow commands.
  • Token handling in browsers.
  • Service ownership.
  • Partial failure policy.

A GraphQL server can itself be a BFF or composition gateway. Avoid stacking layers that duplicate aggregation.

Gateway Versus Direct Client-to-Service Calls

Direct calls may be acceptable when:

  • Few services are public.
  • Clients can safely discover and authenticate to each API.
  • Cross-origin and certificate management are controlled.
  • No common edge policy is needed.

Problems include:

  • Exposed internal topology.
  • Many public endpoints and certificates.
  • Inconsistent security and quotas.
  • Client coupling to service decomposition.
  • More frontend round trips.

A gateway gives clients one stable entry point, at the cost of central infrastructure that must be operated.

Rate Limiting and Quotas

Apply limits by appropriate identity:

  • IP address.
  • User.
  • Tenant.
  • API key.
  • Client application.
  • Operation cost.

Return clear feedback:

Code
HTTP/1.1 429 Too Many Requests
Retry-After: 30

One global limit can allow a noisy tenant to affect everyone. Use fair partitions and protect expensive aggregation endpoints separately.
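Partitioning can be sketched with a hand-written fixed-window counter keyed by tenant. This is an illustration of the idea under simplifying assumptions, not a production limiter: `TenantRateLimiter` is a hypothetical class, state is kept in memory on one instance, and old tenant entries are never evicted.

```csharp
using System;
using System.Collections.Generic;

var limiter = new TenantRateLimiter(limitPerWindow: 2, window: TimeSpan.FromSeconds(30));
var now = DateTimeOffset.UtcNow;

limiter.TryAcquire("tenant-a", now, out _);
limiter.TryAcquire("tenant-a", now, out _);

// The third request in the same window is rejected for tenant-a only.
if (!limiter.TryAcquire("tenant-a", now, out var retryAfter))
    Console.WriteLine($"429 for tenant-a, Retry-After: {retryAfter.TotalSeconds:F0}");

Console.WriteLine(limiter.TryAcquire("tenant-b", now, out _)
    ? "tenant-b allowed"
    : "tenant-b limited");

sealed class TenantRateLimiter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly object _gate = new();
    private readonly Dictionary<string, (DateTimeOffset Start, int Count)> _windows = new();

    public TenantRateLimiter(int limitPerWindow, TimeSpan window)
    {
        _limit = limitPerWindow;
        _window = window;
    }

    // Counts one request for the tenant. When the tenant is over its limit,
    // returns false and reports how long until its window resets.
    public bool TryAcquire(string tenantId, DateTimeOffset now, out TimeSpan retryAfter)
    {
        lock (_gate)
        {
            if (!_windows.TryGetValue(tenantId, out var w) || now - w.Start >= _window)
                w = (now, 0); // start a fresh window for this tenant

            if (w.Count >= _limit)
            {
                retryAfter = w.Start + _window - now;
                return false;
            }

            _windows[tenantId] = (w.Start, w.Count + 1);
            retryAfter = TimeSpan.Zero;
            return true;
        }
    }
}
```

Passing the current time in as a parameter keeps the limiter deterministic in tests. The same partitioning idea applies to any other key from the list above, such as API key or client application.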

Caching

Gateway or BFF caching can reduce latency and backend load for safe responses.

Cache keys must consider:

  • Path and query.
  • User or tenant identity.
  • Authorization scope.
  • Language.
  • Accepted representation.
  • API version.

Never place private user data in a shared cache without correct partitioning. Prefer HTTP validators and explicit cache-control policy over ad hoc caching.
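A cache key that covers the dimensions listed above might be composed as follows. This is a sketch: `CacheKeyParts` and `CacheKeys` are hypothetical names, and which dimensions are actually needed depends on what can vary in the cached response.

```csharp
#nullable enable
using System;

var key = CacheKeys.Build(new CacheKeyParts(
    Path: "/api/catalog/items",
    Query: "page=2",
    TenantId: "tenant-a",
    UserId: null, // response is shared by all users of the tenant
    Scope: "catalog.read",
    Language: "en",
    Accept: "application/json",
    ApiVersion: "v1"));
Console.WriteLine(key);

record CacheKeyParts(
    string Path,
    string Query,
    string TenantId,
    string? UserId,
    string Scope,
    string Language,
    string Accept,
    string ApiVersion);

static class CacheKeys
{
    // Each part is escaped so a value containing the '|' separator cannot
    // collide with a different combination of parts.
    public static string Build(CacheKeyParts p) =>
        string.Join('|',
            Uri.EscapeDataString(p.ApiVersion),
            Uri.EscapeDataString(p.Path),
            Uri.EscapeDataString(p.Query),
            "tenant:" + Uri.EscapeDataString(p.TenantId),
            p.UserId is null ? "shared" : "user:" + Uri.EscapeDataString(p.UserId),
            Uri.EscapeDataString(p.Scope),
            Uri.EscapeDataString(p.Language),
            Uri.EscapeDataString(p.Accept));
}
```

Making tenant and user explicit parts of the key means a missing identity produces a visibly different key rather than silently sharing an entry.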

Retries, Timeouts, and Circuit Breakers

Every outbound call needs a deadline.

Retry only when:

  • The failure is transient.
  • The remaining deadline permits it.
  • The operation is idempotent or protected by an idempotency key.
  • Retry volume is bounded.

Use jittered backoff so that retries do not arrive in synchronized bursts, and tune circuit breakers carefully. Layered retries at client, gateway, BFF, and service can multiply traffic.

Define one retry budget across the call chain.
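The local retry conditions can be sketched as a small helper. This is an illustrative sketch, not a library API: `Retry.ExecuteAsync` is hypothetical, the caller decides what counts as transient, and the deadline is represented by the cancellation token. It does not implement a retry budget shared across the call chain.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var attempts = 0;
var result = await Retry.ExecuteAsync(
    operation: _ =>
    {
        attempts++;
        // Fail twice with a transient error, then succeed.
        return attempts < 3
            ? Task.FromException<string>(new TimeoutException())
            : Task.FromResult("ok");
    },
    isIdempotent: true,
    maxAttempts: 3,
    baseDelay: TimeSpan.FromMilliseconds(10),
    isTransient: ex => ex is TimeoutException,
    deadlineToken: CancellationToken.None);

Console.WriteLine($"{result} after {attempts} attempts"); // prints "ok after 3 attempts"

static class Retry
{
    public static async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        bool isIdempotent,
        int maxAttempts,
        TimeSpan baseDelay,
        Func<Exception, bool> isTransient,
        CancellationToken deadlineToken)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation(deadlineToken);
            }
            catch (Exception ex) when (
                isIdempotent                                // safe to repeat
                && attempt < maxAttempts                    // bounded retry volume
                && isTransient(ex)                          // failure may clear up
                && !deadlineToken.IsCancellationRequested)  // deadline not yet passed
            {
                // Full jitter: wait a random time up to baseDelay * 2^(attempt - 1).
                var cap = baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
                var delay = TimeSpan.FromMilliseconds(Random.Shared.NextDouble() * cap);

                // Throws if the deadline passes while waiting, which ends the retries.
                await Task.Delay(delay, deadlineToken);
            }
        }
    }
}
```

When any condition in the filter is false, the original exception propagates unchanged, so a non-idempotent operation is attempted exactly once.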

Request and Response Transformation

Transformations can:

  • Rename an external path.
  • Add trusted correlation metadata.
  • Translate a legacy media type.
  • Remove internal fields.

Heavy transformations create a second implementation of the contract and complicate debugging. Prefer explicit BFF code for substantial client adaptation.

Observability

Capture:

  • Request ID and trace context.
  • Authenticated client and tenant.
  • Route and operation.
  • Status and latency.
  • Downstream dependency timing.
  • Retry and circuit-breaker events.
  • Rate-limit decisions.
  • Cache hits.
  • Partial failures.

Use distributed tracing across gateway, BFF, and services. Do not log tokens, cookies, personal data, or sensitive request bodies.

Correlation and Causation

Preserve standard trace headers and add business correlation IDs when needed. The gateway can create a trace when absent, but should not replace a valid trusted trace context without policy.

Return a safe correlation identifier in errors so support can locate the request.

Deployment and Ownership

Define:

  • Who owns gateway configuration.
  • Who can add routes and policies.
  • Who deploys each BFF.
  • Who responds to incidents.
  • How changes are reviewed.
  • How configurations are tested.

Gateway configuration is production code. Store it in version control, validate it, and deploy through controlled pipelines.

BFF ownership should align with the frontend team when that team can operate backend services. Otherwise the pattern may create organizational handoffs rather than autonomy.

Scaling

Scale gateways and BFFs independently when their workloads differ.

Gateway load depends on:

  • Total requests.
  • TLS and authentication cost.
  • Transformations.
  • Payload sizes.

BFF load depends on:

  • Client traffic.
  • Fan-out count.
  • Aggregation CPU and memory.
  • Session storage.

Keep services stateless where possible. If sessions are required, use resilient shared storage or encrypted self-contained cookies with careful size and revocation decisions.

Resilience and Isolation

Avoid allowing one failing dependency to exhaust all gateway or BFF resources.

Use:

  • Connection-pool limits.
  • Per-upstream concurrency limits.
  • Bulkheads.
  • Short queues.
  • Timeouts.
  • Load shedding.
  • Separate deployment pools for critical clients where justified.

Return a controlled 503 or partial response instead of allowing unbounded work to collapse the edge.
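A per-upstream concurrency limit with immediate load shedding can be sketched with a semaphore. The `Bulkhead` class is hypothetical and the slot count is arbitrary; the caller is assumed to translate a rejected call into a 503 or a partial response.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

var bulkhead = new Bulkhead(maxConcurrent: 1);
var gate = new TaskCompletionSource<string>();

// The first call occupies the only slot until the gate is released.
var first = bulkhead.TryRunAsync(() => gate.Task);

// The second call is shed immediately instead of queueing behind the first.
var second = await bulkhead.TryRunAsync(() => Task.FromResult("ok"));
Console.WriteLine(second.Accepted); // prints "False"

gate.SetResult("done");
Console.WriteLine((await first).Result); // prints "done"

sealed class Bulkhead
{
    private readonly SemaphoreSlim _slots;

    public Bulkhead(int maxConcurrent) =>
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);

    // Runs the work only if a slot is free right now. A zero timeout means
    // the caller never waits, so a slow upstream cannot pile up requests.
    public async Task<(bool Accepted, T? Result)> TryRunAsync<T>(Func<Task<T>> work)
    {
        if (!_slots.Wait(0))
            return (false, default);

        try
        {
            return (true, await work());
        }
        finally
        {
            _slots.Release();
        }
    }
}
```

Using one bulkhead per upstream keeps a single slow dependency from consuming the capacity that other routes need.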

API Discovery and Documentation

The gateway can expose one developer portal, but ownership of API contracts should remain with backend teams.

Gateway documentation should not drift from service behavior. Aggregate or publish versioned OpenAPI documents through a controlled process, and ensure external routes, security, and server URLs match what consumers actually call.

.NET Reverse Proxy Example

YARP can implement a programmable reverse proxy in ASP.NET Core:

Code
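// Program.cs: a minimal YARP host. Requires the Yarp.ReverseProxy NuGet package.
var builder = WebApplication.CreateBuilder(args);

// Register the proxy and load routes and clusters from the "ReverseProxy"
// configuration section (for example, in appsettings.json).
builder.Services
    .AddReverseProxy()
    .LoadFromConfig(
        builder.Configuration.GetSection("ReverseProxy"));

var app = builder.Build();

// Forward requests that match a configured route to its cluster.
app.MapReverseProxy();

app.Run();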

Configuration:

Code
{
  "ReverseProxy": {
    "Routes": {
      "orders": {
        "ClusterId": "ordering",
        "Match": {
          "Path": "/api/orders/{**catch-all}"
        }
      }
    },
    "Clusters": {
      "ordering": {
        "Destinations": {
          "primary": {
            "Address": "https://ordering.internal/"
          }
        }
      }
    }
  }
}

A managed gateway product may provide stronger policy, portal, quota, analytics, and lifecycle capabilities. A code-based proxy provides flexibility but transfers more operational responsibility to the team.

Minimal BFF Endpoint Example

Code
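// IOrdersClient, IRecommendationsClient, HomeResponse, and
// GetRequiredCustomerId are application-defined, not part of ASP.NET Core.
app.MapGet(
    "/home",
    async (
        IOrdersClient orders,
        IRecommendationsClient recommendations,
        ClaimsPrincipal user,
        CancellationToken cancellationToken) =>
    {
        var customerId = user.GetRequiredCustomerId();

        // Start both calls before awaiting so they run in parallel.
        var ordersTask = orders.GetRecentAsync(
            customerId,
            cancellationToken);

        var recommendationsTask = recommendations.GetAsync(
            customerId,
            cancellationToken);

        // Throws if either call fails; there is no partial-failure policy here.
        await Task.WhenAll(ordersTask, recommendationsTask);

        // Both tasks have completed, so these awaits only unwrap the results.
        return TypedResults.Ok(new HomeResponse(
            await ordersTask,
            await recommendationsTask));
    });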

Production code needs explicit timeouts, failure policy, tracing, and authorization.

Decision Framework

Ask:

  1. Which shared edge concerns need one enforcement point?
  2. Do clients need materially different backend behavior?
  3. How many extra network hops are acceptable?
  4. Who owns and operates the new component?
  5. Which logic belongs to domain services rather than the edge?
  6. How are identity and authorization propagated?
  7. What happens when one downstream service is slow or unavailable?
  8. Can existing gateway aggregation, GraphQL, or a modular backend solve the need?
  9. How will contracts, telemetry, and incidents be managed?

Choose the fewest layers that satisfy those requirements.

Common Mistakes

  • Adding a gateway because every microservice diagram has one.
  • Putting core business logic in gateway policies.
  • Treating gateway authentication as complete authorization.
  • Forwarding untrusted identity headers.
  • Routing internal service calls through the external gateway.
  • Aggregating too many dependencies into one request.
  • Retrying unsafe operations without idempotency.
  • Stacking retries at every layer.
  • Returning partial data without a documented contract.
  • Building one general BFF full of conditions for every client.
  • Creating a BFF when all clients have identical needs.
  • Exposing an unrestricted proxy endpoint.
  • Storing browser access tokens in JavaScript despite a server-side BFF design.
  • Using cookies without CSRF protection.
  • Treating gateway configuration as manual operations work.
  • Publishing internal topology and endpoints.
  • Ignoring the additional latency and ownership cost.

Best Practices

  • Keep gateway responsibilities focused on shared edge policy.
  • Keep BFF responsibilities focused on one client experience.
  • Leave domain rules and authoritative data in owning services.
  • Validate identity at the edge and authorization in services.
  • Use audience-appropriate tokens and trusted identity propagation.
  • Define deadlines, retry budgets, and partial-failure policy.
  • Protect unsafe retries with idempotency.
  • Make gateway and BFF highly available and horizontally scalable.
  • Use distributed tracing and dependency-level metrics.
  • Version-control and test routing and policy configuration.
  • Align BFF ownership with frontend ownership where practical.
  • Avoid duplicate aggregation layers.
  • Reassess whether the extra hop still provides value.
