Overview
An API gateway is an edge component that provides one entry point to backend APIs and applies shared routing, security, traffic, and protocol policies. A Backend for Frontend (BFF) is a backend tailored to the needs of one frontend or client experience, such as a browser application, mobile app, or partner portal.
They solve related but different problems:
- A gateway centralizes common edge concerns across APIs.
- A BFF adapts backend capabilities to a particular user experience.
A gateway can:
- Route requests to services.
- Terminate TLS.
- Validate tokens.
- Apply rate limits and quotas.
- Transform protocols or headers.
- Aggregate selected backend calls.
- Cache safe responses.
- Record edge telemetry.
A BFF can:
- Shape responses for one client.
- Orchestrate client-specific workflows.
- Reduce frontend round trips.
- Hide internal service topology.
- Handle browser-specific authentication sessions.
- Optimize payloads for device or interface constraints.
Both add a network hop, a deployable unit, a security surface, an ownership burden, and a potential failure point. Introduce them because the client or service architecture requires them, not because microservice diagrams usually contain them.
This topic matters in interviews because candidates must reason about placement of responsibility, failure behavior, identity propagation, latency, ownership, and frontend needs. A gateway should not become a central monolith, and a BFF should not duplicate core business logic owned by domain services.
Core Concepts
Gateway, Reverse Proxy, Load Balancer, and Ingress
These terms overlap but emphasize different responsibilities.
Reverse proxy
- Receives a request and forwards it to an upstream server.
- Can terminate TLS and rewrite headers or paths.
Load balancer
- Distributes traffic across instances.
- Primarily addresses availability and capacity.
Ingress controller
- Implements external routing into a container platform such as Kubernetes.
- Translates ingress configuration into proxy behavior.
API gateway
- Adds API-aware policy such as authentication, quotas, transformation, version routing, developer subscriptions, and analytics.
A product can perform several roles. Evaluate required capabilities rather than selecting by label.
Gateway Versus Service Mesh
An API gateway primarily manages north-south traffic between external clients and the system.
A service mesh primarily manages east-west traffic between services and may provide:
- Mutual TLS.
- Service identity.
- Retries and timeouts.
- Traffic shaping.
- Service-to-service telemetry.
They can coexist:
Client
-> API gateway
-> service-mesh ingress
-> internal services
Do not route every internal call back through the external gateway. That creates unnecessary latency and central coupling.
Gateway Routing
The gateway maps external routes to internal services:
/api/orders/* -> Ordering service
/api/catalog/* -> Catalog service
/api/users/* -> Identity profile service
Benefits:
- Clients do not know service locations.
- Internal topology can change.
- Version or canary routing can be centralized.
- Services can remain private.
Keep route ownership explicit. Conflicting rewrites and hidden routing rules make incidents difficult to diagnose.
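The mapping above can be sketched as a longest-prefix lookup. This is a minimal illustration, not how any particular gateway product resolves routes; the cluster names and the ResolveCluster helper are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical route table: external path prefix -> internal cluster name.
var routes = new (string Prefix, string Cluster)[]
{
    ("/api/orders/", "ordering"),
    ("/api/catalog/", "catalog"),
    ("/api/users/", "identity-profile"),
};

// Returns the cluster for the longest matching prefix, or null when no route matches.
string ResolveCluster(string path) =>
    routes
        .Where(r => path.StartsWith(r.Prefix, StringComparison.OrdinalIgnoreCase))
        .OrderByDescending(r => r.Prefix.Length)
        .Select(r => r.Cluster)
        .FirstOrDefault();

Console.WriteLine(ResolveCluster("/api/orders/42"));            // ordering
Console.WriteLine(ResolveCluster("/internal/admin") ?? "404");  // 404
```

Unmatched paths resolve to nothing, so services without an explicit route stay private by default.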
Gateway Offloading
Common shared edge policies include:
- TLS termination.
- Authentication token validation.
- Rate limiting.
- Request-size limits.
- IP restrictions.
- Compression.
- CORS policy.
- Request correlation.
- Basic caching.
- Web application firewall rules.
Offload only concerns that are truly common and safe at the edge.
Backend services must still enforce:
- Resource-level authorization.
- Business invariants.
- Tenant isolation.
- Input semantics.
- Data access rules.
The gateway is not the sole security boundary.
Gateway Aggregation
A gateway can call several services and combine their responses:
GET /dashboard
-> Customer service
-> Orders service
-> Recommendations service
-> one dashboard response
Benefits:
- Fewer client round trips.
- Less client knowledge of service topology.
- Centralized timeout and fallback policy.
Costs:
- Increased gateway CPU and memory.
- Fan-out latency.
- Partial failures.
- Coupling to response schemas.
- Harder scaling and ownership.
Use limited aggregation for stable, shared edge scenarios. Client-specific aggregation often belongs in a BFF.
Tail Latency and Fan-Out
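An aggregate response is constrained by its slowest required dependency. When independent calls run in parallel:
Total latency ~= gateway overhead + slowest required dependency
Sequential calls add their latencies instead.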
With many dependencies, the probability that at least one is slow or unavailable rises.
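As a rough illustration, assume each of n required dependencies independently exceeds its latency target with probability p. Independence is an assumption made here for simplicity; real failures are often correlated.

```latex
P(\text{at least one slow}) = 1 - (1 - p)^{n}
```

For p = 0.01, this gives about 0.096 for n = 10 and about 0.395 for n = 50, so a dependency that is slow 1% of the time can affect a large share of wide fan-out requests.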
Design aggregation with:
- Parallel calls where independent.
- Per-dependency deadlines.
- Overall request deadline.
- Bounded concurrency.
- Partial-response policy.
- Fallbacks only when semantically safe.
- Cancellation propagation.
- Distributed tracing.
Do not retry every failed downstream call automatically. Retries can amplify load and duplicate unsafe operations.
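Parallel calls with per-dependency and overall deadlines might look like the following sketch. The CallWithDeadline helper, the simulated downstream calls, and every timeout value are assumptions chosen for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Runs one dependency call under its own deadline, linked to the overall request token.
// Returns (true, value) on success and (false, default) on timeout or cancellation.
async Task<(bool Ok, T Value)> CallWithDeadline<T>(
    Func<CancellationToken, Task<T>> call, TimeSpan deadline, CancellationToken requestToken)
{
    using var cts = CancellationTokenSource.CreateLinkedTokenSource(requestToken);
    cts.CancelAfter(deadline);
    try
    {
        return (true, await call(cts.Token));
    }
    catch (OperationCanceledException)
    {
        return (false, default(T));
    }
}

// Hypothetical downstream calls, simulated with delays.
async Task<string> GetOrders(CancellationToken ct) { await Task.Delay(20, ct); return "orders"; }
async Task<string> GetRecommendations(CancellationToken ct) { await Task.Delay(5000, ct); return "recs"; }

// Overall request deadline; cancelling it also cancels every linked per-dependency token.
using var request = new CancellationTokenSource(TimeSpan.FromSeconds(2));

// Start both calls before awaiting so they run in parallel.
var ordersTask = CallWithDeadline(ct => GetOrders(ct), TimeSpan.FromMilliseconds(500), request.Token);
var recsTask = CallWithDeadline(ct => GetRecommendations(ct), TimeSpan.FromMilliseconds(100), request.Token);
await Task.WhenAll(ordersTask, recsTask);

var orders = await ordersTask;
var recs = await recsTask;
Console.WriteLine($"orders ok={orders.Ok}, recommendations ok={recs.Ok}");
```

Here the simulated recommendations call is cancelled once its 100 ms deadline passes, while the orders call completes. Whether a failed dependency fails the whole request is a separate policy decision.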
Partial Failure
Define whether each dependency is:
- Required.
- Optional.
- Replaceable by stale data.
- Omittable with a warning.
Example response:
{
  "orders": [],
  "recommendations": null,
  "warnings": [
    {
      "code": "recommendations-unavailable"
    }
  ]
}
Returning partial data is appropriate only when consumers can understand it and business correctness is preserved.
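One way to produce such a response is to wrap each optional dependency so that a failure becomes a warning instead of an error. The OptionalOrWarn helper and the simulated outage below are hypothetical, and production code would normally catch only the failures it expects rather than every exception.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

var warnings = new List<object>();

// Returns the fetched value, or null plus a warning entry when an optional dependency fails.
T OptionalOrWarn<T>(Func<T> fetch, string warningCode) where T : class
{
    try
    {
        return fetch();
    }
    catch (Exception)
    {
        warnings.Add(new { code = warningCode });
        return null;
    }
}

// Required dependency: not wrapped, so a failure here would fail the whole request.
var orders = new[] { "order-1" };

// Optional dependency: simulated as unavailable.
var recommendations = OptionalOrWarn<string[]>(
    () => throw new TimeoutException("simulated outage"),
    "recommendations-unavailable");

var json = JsonSerializer.Serialize(new { orders, recommendations, warnings });
Console.WriteLine(json);
// {"orders":["order-1"],"recommendations":null,"warnings":[{"code":"recommendations-unavailable"}]}
```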
Single Point of Failure
A gateway is on many critical paths. Protect it with:
- Multiple instances across failure domains.
- Health probes.
- Autoscaling.
- Load testing.
- Configuration validation.
- Safe rollout and rollback.
- Minimal local state.
- Capacity headroom.
- Dependency isolation.
Avoid synchronous external control-plane dependencies in the request path.
Gateway as a Choke Point
A gateway can become a bottleneck when it accumulates:
- Core business workflows.
- Large transformations.
- Long-running requests.
- Many custom plugins.
- Shared release coordination.
- Per-client conditional logic.
Keep the gateway focused on edge policy and simple routing or aggregation. Move domain behavior to services and client-specific behavior to BFFs.
What Is a BFF?
A BFF is a backend owned and shaped around one frontend experience.
Web app -> Web BFF -> services
Mobile app -> Mobile BFF -> services
Partner UI -> Partner BFF -> services
The BFF exposes operations and representations optimized for that client rather than forcing every client through one general-purpose aggregation API.
BFF Responsibilities
Appropriate responsibilities:
- Client-specific aggregation.
- Response shaping.
- UI-oriented workflow orchestration.
- Server-side session handling.
- Token exchange and secure token storage.
- Client-specific caching.
- Reducing mobile payloads.
- Hiding unstable service topology.
Inappropriate responsibilities:
- Core pricing rules.
- Order invariants.
- Authoritative data ownership.
- Reusable domain policy.
- Direct access to every service database.
Business rules should remain in the service or module that owns them.
When a BFF Is Valuable
Use a BFF when:
- Web and mobile clients need materially different APIs.
- A general backend contains many client-specific conditions.
- Clients make many calls to render one screen.
- Mobile bandwidth or latency needs special optimization.
- Frontend teams require independent backend evolution.
- Browser token security benefits from a server-side component.
- Client release cadence differs from service release cadence.
A BFF can reduce frontend coupling and network chattiness.
When Not to Use a BFF
A BFF may be unnecessary when:
- Only one simple client exists.
- Clients use nearly identical operations.
- One gateway aggregation layer is sufficient.
- The backend is already a cohesive monolith.
- GraphQL or another composition layer already provides client-specific selection.
- The team cannot own another production service.
Creating one BFF per screen or minor client variation adds operational cost without useful autonomy.
One BFF Per Experience, Not Per Device Automatically
Do not create separate BFFs merely because clients are named web, iOS, and Android.
Ask:
- Do they need different workflows?
- Do they evolve independently?
- Do they have different payload and latency constraints?
- Are they owned by different teams?
- Do they have different authentication models?
iOS and Android often share one mobile BFF. Administrative and customer web applications may need separate BFFs despite both running in browsers.
Browser BFF and Token Handling
A browser BFF can use an OAuth/OpenID Connect flow on the server and keep access tokens out of browser JavaScript.
Typical flow:
Browser
-> secure session cookie
-> BFF
-> access token
-> downstream API
Benefits:
- Access and refresh tokens remain server-side.
- Smaller exposure to token theft through browser script.
- Centralized token refresh and logout.
The session cookie should use:
- Secure.
- HttpOnly.
- Appropriate SameSite.
- Short or controlled lifetime.
Cookie authentication introduces CSRF risk for state-changing operations. Use anti-forgery defenses and strict origin policy.
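The cookie attributes can be illustrated by building the Set-Cookie header value by hand. The cookie name, the SameSite=Lax choice, and the 30-minute lifetime are assumptions for this sketch, not recommendations.

```csharp
using System;

// Builds a Set-Cookie header value for an opaque session identifier.
// The "__Host-" name prefix asks browsers to require Secure, Path=/, and no Domain attribute.
string BuildSessionCookie(string sessionId, TimeSpan lifetime) =>
    $"__Host-session={Uri.EscapeDataString(sessionId)}; Path=/; " +
    $"Max-Age={(int)lifetime.TotalSeconds}; Secure; HttpOnly; SameSite=Lax";

Console.WriteLine(BuildSessionCookie("abc123", TimeSpan.FromMinutes(30)));
// __Host-session=abc123; Path=/; Max-Age=1800; Secure; HttpOnly; SameSite=Lax
```

In an ASP.NET Core BFF these attributes would normally be set through the framework's cookie options rather than a hand-built string. The cookie carries only a session reference, and the tokens stay on the server.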
Authentication Versus Authorization
The gateway or BFF can validate the caller's identity, but downstream services must enforce authorization for owned resources.
Identity propagation options include:
- Forward the original access token.
- Exchange for a downstream audience token.
- Use workload identity plus signed user context.
- Use delegated authorization.
Do not trust arbitrary client-supplied identity headers. At the edge, strip any incoming copy of a header that downstream services trust, then set it from the validated identity.
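A minimal sketch of that header handling follows. The header names X-User-Id and X-Tenant-Id and the SanitizeHeaders helper are hypothetical; the validated values are assumed to come from a token the edge has already verified.

```csharp
using System;
using System.Collections.Generic;

// Header names that downstream services trust (hypothetical names).
var trustedHeaders = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "X-User-Id", "X-Tenant-Id",
};

// Drops any client-supplied copy of a trusted header, then sets it from the validated identity.
Dictionary<string, string> SanitizeHeaders(
    IDictionary<string, string> incoming, string validatedUserId, string validatedTenantId)
{
    var outgoing = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
    foreach (var header in incoming)
    {
        if (!trustedHeaders.Contains(header.Key)) outgoing[header.Key] = header.Value;
    }
    outgoing["X-User-Id"] = validatedUserId;
    outgoing["X-Tenant-Id"] = validatedTenantId;
    return outgoing;
}

// The client tries to impersonate another user by sending its own identity header.
var fromClient = new Dictionary<string, string>
{
    ["Accept"] = "application/json",
    ["x-user-id"] = "admin",
};
var forwarded = SanitizeHeaders(fromClient, "user-42", "tenant-7");
Console.WriteLine(forwarded["X-User-Id"]);  // user-42
```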
Token Audience and Scope
A token intended for a gateway is not automatically valid for every service.
Validate:
- Issuer.
- Audience.
- Signature.
- Expiration.
- Scopes or roles.
- Tenant.
Token exchange or on-behalf-of flows can produce service-appropriate tokens. Avoid issuing one broad token that grants every internal capability.
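The checks listed above might be sketched as follows for a token whose signature a JWT library has already verified. The claim values are hypothetical, and the sketch simplifies real tokens: it treats aud as a single string, reads scopes from a space-separated scope claim, and omits the tenant check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Validates claims from an already signature-verified token. Returns "ok" or a failure reason.
string ValidateClaims(
    IReadOnlyDictionary<string, string> claims, string expectedIssuer,
    string expectedAudience, string requiredScope, DateTimeOffset now)
{
    if (!claims.TryGetValue("iss", out var iss) || iss != expectedIssuer) return "invalid issuer";
    if (!claims.TryGetValue("aud", out var aud) || aud != expectedAudience) return "invalid audience";
    if (!claims.TryGetValue("exp", out var exp) || !long.TryParse(exp, out var expSeconds)
        || DateTimeOffset.FromUnixTimeSeconds(expSeconds) <= now) return "expired";

    var scopes = claims.TryGetValue("scope", out var scope) ? scope.Split(' ') : Array.Empty<string>();
    if (!scopes.Contains(requiredScope)) return "missing scope";
    return "ok";
}

var claims = new Dictionary<string, string>
{
    ["iss"] = "https://login.example.test",
    ["aud"] = "orders-api",
    ["exp"] = "2000000000",
    ["scope"] = "orders.read profile",
};
var now = DateTimeOffset.FromUnixTimeSeconds(1_900_000_000);

Console.WriteLine(ValidateClaims(claims, "https://login.example.test", "orders-api", "orders.read", now));
// ok
Console.WriteLine(ValidateClaims(claims, "https://login.example.test", "catalog-api", "orders.read", now));
// invalid audience
```

The second call shows why audience matters: a token issued for one service is rejected by another even though the issuer, expiry, and scope all check out.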
BFF API Design
A BFF API can be task- or screen-oriented:
GET /home
GET /checkout-summary
POST /checkout
GET /account-settings
This is acceptable because the BFF is a client adapter, not a reusable enterprise domain API.
Keep commands explicit and avoid exposing one generic proxy route:
/proxy/{service}/{path}
An unrestricted proxy defeats surface reduction and can create security vulnerabilities.
Orchestration and Transactions
A BFF can coordinate calls, but it should not pretend several service operations form one database transaction.
For multi-service workflows:
- Prefer one domain service to own the business process.
- Use asynchronous orchestration for long-running work.
- Define idempotency.
- Handle compensation explicitly.
- Return operation status where completion is deferred.
The BFF should not become the durable source of truth for a business saga unless that workflow is genuinely client-specific.
Gateway Versus BFF
A common architecture uses both:
Client
-> shared API gateway
-> client BFF
-> domain services
The extra hop must be justified. Some systems expose BFFs directly through an ingress or gateway product that combines these roles.
Gateway Versus GraphQL
GraphQL can let clients select fields and combine a graph of data. It can reduce the need for multiple response-shaped BFF endpoints.
GraphQL does not automatically solve:
- Authentication and authorization.
- Rate limiting and abuse control.
- N+1 backend calls.
- Workflow commands.
- Token handling in browsers.
- Service ownership.
- Partial failure policy.
A GraphQL server can itself be a BFF or composition gateway. Avoid stacking layers that duplicate aggregation.
Gateway Versus Direct Client-to-Service Calls
Direct calls may be acceptable when:
- Few services are public.
- Clients can safely discover and authenticate to each API.
- Cross-origin and certificate management are controlled.
- No common edge policy is needed.
Problems include:
- Exposed internal topology.
- Many public endpoints and certificates.
- Inconsistent security and quotas.
- Client coupling to service decomposition.
- More frontend round trips.
The gateway provides stability but adds central infrastructure.
Rate Limiting and Quotas
Apply limits by appropriate identity:
- IP address.
- User.
- Tenant.
- API key.
- Client application.
- Operation cost.
Return clear feedback:
HTTP/1.1 429 Too Many Requests
Retry-After: 30
One global limit can allow a noisy tenant to affect everyone. Use fair partitions and protect expensive aggregation endpoints separately.
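Fair partitioning can be sketched with one token bucket per partition key, such as a tenant ID. The capacity, refill rate, and TryAcquire helper are illustrative assumptions, and this in-memory version is neither thread-safe nor shared across gateway instances.

```csharp
using System;
using System.Collections.Generic;

const double Capacity = 5;         // burst size per partition
const double RefillPerSecond = 1;  // sustained rate per partition
var buckets = new Dictionary<string, (double Tokens, DateTimeOffset LastRefill)>();

// Returns true when the request is allowed; otherwise sets retryAfterSeconds for a 429 response.
bool TryAcquire(string partitionKey, DateTimeOffset now, out int retryAfterSeconds)
{
    if (!buckets.TryGetValue(partitionKey, out var bucket)) bucket = (Capacity, now);

    // Refill according to elapsed time, capped at the bucket capacity.
    var elapsed = (now - bucket.LastRefill).TotalSeconds;
    var tokens = Math.Min(Capacity, bucket.Tokens + elapsed * RefillPerSecond);

    if (tokens >= 1)
    {
        buckets[partitionKey] = (tokens - 1, now);
        retryAfterSeconds = 0;
        return true;
    }

    buckets[partitionKey] = (tokens, now);
    retryAfterSeconds = (int)Math.Ceiling((1 - tokens) / RefillPerSecond);
    return false;
}

var t0 = DateTimeOffset.UnixEpoch;
for (var i = 0; i < 5; i++) TryAcquire("tenant-a", t0, out _);  // tenant-a spends its burst

Console.WriteLine(TryAcquire("tenant-a", t0, out var retry));   // False
Console.WriteLine(retry);                                       // 1
Console.WriteLine(TryAcquire("tenant-b", t0, out _));           // True
```

Because each tenant has its own bucket, the noisy tenant is throttled while the other tenant is unaffected. The retry value would feed the Retry-After header.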
Caching
Gateway or BFF caching can reduce latency and backend load for safe responses.
Cache keys must consider:
- Path and query.
- User or tenant identity.
- Authorization scope.
- Language.
- Accepted representation.
- API version.
Never place private user data in a shared cache without correct partitioning. Prefer HTTP validators and explicit cache-control policy over ad hoc caching.
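A cache key that covers those dimensions might be composed as in the sketch below. The BuildCacheKey helper and its parameter names are hypothetical; a real key would include only the dimensions the response actually varies by, and would escape or hash values that could contain the delimiter.

```csharp
using System;
using System.Linq;

// Builds a cache key from every dimension that can change the response.
string BuildCacheKey(
    string apiVersion, string pathAndQuery, string tenantId, string userId,
    string[] scopes, string language, string mediaType)
{
    // Sort scopes so that the same set always produces the same key.
    var scopePart = string.Join(",", scopes.OrderBy(s => s, StringComparer.Ordinal));
    return string.Join("|", apiVersion, pathAndQuery, tenantId, userId, scopePart, language, mediaType);
}

var key = BuildCacheKey(
    "v1", "/home?tab=orders", "tenant-7", "user-42",
    new[] { "profile", "orders.read" }, "en-GB", "application/json");
Console.WriteLine(key);
// v1|/home?tab=orders|tenant-7|user-42|orders.read,profile|en-GB|application/json
```

Two users requesting the same path get different keys, so one user's private response is never served to another.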
Retries, Timeouts, and Circuit Breakers
Every outbound call needs a deadline.
Retry only when:
- The failure is transient.
- The remaining deadline permits it.
- The operation is idempotent or protected by an idempotency key.
- Retry volume is bounded.
Use jittered backoff and circuit breakers carefully. Layered retries at client, gateway, BFF, and service can multiply traffic.
Define one retry budget across the call chain.
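The retry conditions can be combined in one helper, sketched below with exponential backoff and full jitter. RetryAsync, its parameters, and the delays are assumptions for illustration; Random.Shared requires .NET 6 or later, and the caller remains responsible for passing only idempotent operations.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Retries a call only while the failure is transient, attempts remain, and the deadline permits.
async Task<T> RetryAsync<T>(
    Func<CancellationToken, Task<T>> call, Func<Exception, bool> isTransient,
    int maxAttempts, TimeSpan baseDelay, DateTimeOffset deadline, CancellationToken ct)
{
    for (var attempt = 1; ; attempt++)
    {
        try
        {
            return await call(ct);
        }
        catch (Exception ex) when (isTransient(ex) && attempt < maxAttempts)
        {
            // Full jitter: a random delay between 0 and baseDelay * 2^(attempt - 1).
            var cap = baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
            var delay = TimeSpan.FromMilliseconds(Random.Shared.NextDouble() * cap);

            // Give up if waiting would run past the caller's deadline.
            if (DateTimeOffset.UtcNow + delay >= deadline) throw;
            await Task.Delay(delay, ct);
        }
    }
}

// Simulated dependency that times out twice and then succeeds.
var calls = 0;
var result = await RetryAsync(
    _ => ++calls < 3 ? Task.FromException<string>(new TimeoutException()) : Task.FromResult("ok"),
    ex => ex is TimeoutException,
    maxAttempts: 3, baseDelay: TimeSpan.FromMilliseconds(10),
    deadline: DateTimeOffset.UtcNow.AddSeconds(5), ct: CancellationToken.None);

Console.WriteLine($"{result} after {calls} calls");  // ok after 3 calls
```

If every layer in the call chain applied this policy independently, three attempts at each of four layers could produce up to 81 downstream calls for one client request, which is why the budget should be defined once.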
Request and Response Transformation
Transformations can:
- Rename an external path.
- Add trusted correlation metadata.
- Translate a legacy media type.
- Remove internal fields.
Heavy transformations create a second implementation of the contract and complicate debugging. Prefer explicit BFF code for substantial client adaptation.
Observability
Capture:
- Request ID and trace context.
- Authenticated client and tenant.
- Route and operation.
- Status and latency.
- Downstream dependency timing.
- Retry and circuit-breaker events.
- Rate-limit decisions.
- Cache hits.
- Partial failures.
Use distributed tracing across gateway, BFF, and services. Do not log tokens, cookies, personal data, or sensitive request bodies.
Correlation and Causation
Preserve standard trace headers and add business correlation IDs when needed. The gateway can create a trace when absent, but should not replace a valid trusted trace context without policy.
Return a safe correlation identifier in errors so support can locate the request.
Deployment and Ownership
Define:
- Who owns gateway configuration.
- Who can add routes and policies.
- Who deploys each BFF.
- Who responds to incidents.
- How changes are reviewed.
- How configurations are tested.
Gateway configuration is production code. Store it in version control, validate it, and deploy through controlled pipelines.
BFF ownership should align with the frontend team when that team can operate backend services. Otherwise the pattern may create organizational handoffs rather than autonomy.
Scaling
Scale gateways and BFFs independently when their workloads differ.
Gateway load depends on:
- Total requests.
- TLS and authentication cost.
- Transformations.
- Payload sizes.
BFF load depends on:
- Client traffic.
- Fan-out count.
- Aggregation CPU and memory.
- Session storage.
Keep gateways and BFFs stateless where possible. If sessions are required, use resilient shared storage or encrypted self-contained cookies with careful size and revocation decisions.
Resilience and Isolation
Avoid allowing one failing dependency to exhaust all gateway or BFF resources.
Use:
- Connection-pool limits.
- Per-upstream concurrency limits.
- Bulkheads.
- Short queues.
- Timeouts.
- Load shedding.
- Separate deployment pools for critical clients where justified.
Return a controlled 503 or partial response instead of allowing unbounded work to collapse the edge.
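A per-upstream bulkhead with no queue can be sketched with a semaphore. The limit of two concurrent calls and the CallWithBulkhead helper are illustrative assumptions; resilience libraries offer richer versions of the same idea.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// One bulkhead per upstream: at most two calls in flight, and no queue.
var recommendationsBulkhead = new SemaphoreSlim(initialCount: 2, maxCount: 2);

// Returns 503 immediately when the bulkhead is full instead of queueing more work.
async Task<(int Status, T Body)> CallWithBulkhead<T>(SemaphoreSlim bulkhead, Func<Task<T>> call)
{
    if (!await bulkhead.WaitAsync(TimeSpan.Zero)) return (503, default(T));
    try
    {
        return (200, await call());
    }
    finally
    {
        bulkhead.Release();
    }
}

// Simulate two slow in-flight calls that hold both slots.
var gate = new TaskCompletionSource<string>();
var first = CallWithBulkhead(recommendationsBulkhead, () => gate.Task);
var second = CallWithBulkhead(recommendationsBulkhead, () => gate.Task);

// A third call is shed rather than waiting behind the slow upstream.
var third = await CallWithBulkhead(recommendationsBulkhead, () => Task.FromResult("not reached"));
Console.WriteLine(third.Status);            // 503

gate.SetResult("recs");
Console.WriteLine((await first).Status);    // 200
```

A slow upstream can then occupy at most its own slots, leaving threads, memory, and connections available for other routes.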
API Discovery and Documentation
The gateway can expose one developer portal, but ownership of API contracts should remain with backend teams.
Gateway documentation should not drift from service behavior. Aggregate or publish versioned OpenAPI documents through a controlled process, and ensure external routes, security, and server URLs match what consumers actually call.
.NET Reverse Proxy Example
YARP can implement a programmable reverse proxy in ASP.NET Core:
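// Requires the Yarp.ReverseProxy NuGet package.
var builder = WebApplication.CreateBuilder(args);

// Register YARP and load routes and clusters from the "ReverseProxy" configuration section.
builder.Services
    .AddReverseProxy()
    .LoadFromConfig(builder.Configuration.GetSection("ReverseProxy"));

var app = builder.Build();

// Forward matching requests to the configured clusters.
app.MapReverseProxy();
app.Run();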
Configuration:
{
  "ReverseProxy": {
    "Routes": {
      "orders": {
        "ClusterId": "ordering",
        "Match": {
          "Path": "/api/orders/{**catch-all}"
        }
      }
    },
    "Clusters": {
      "ordering": {
        "Destinations": {
          "primary": {
            "Address": "https://ordering.internal/"
          }
        }
      }
    }
  }
}
A managed gateway product may provide stronger policy, portal, quota, analytics, and lifecycle capabilities. A code-based proxy provides flexibility but transfers more operational responsibility to the team.
Minimal BFF Endpoint Example
app.MapGet(
    "/home",
    async (
        IOrdersClient orders,
        IRecommendationsClient recommendations,
        ClaimsPrincipal user,
        CancellationToken cancellationToken) =>
    {
        // GetRequiredCustomerId is assumed to be an application-defined extension method.
        var customerId = user.GetRequiredCustomerId();

        // Start both calls before awaiting so they run in parallel.
        var ordersTask = orders.GetRecentAsync(customerId, cancellationToken);
        var recommendationsTask = recommendations.GetAsync(customerId, cancellationToken);

        // WhenAll throws if either call fails, so this version has no partial-response path.
        await Task.WhenAll(ordersTask, recommendationsTask);

        return TypedResults.Ok(new HomeResponse(
            await ordersTask,
            await recommendationsTask));
    });
Production code needs explicit timeouts, failure policy, tracing, and authorization.
Decision Framework
Ask:
- Which shared edge concerns need one enforcement point?
- Do clients need materially different backend behavior?
- How many extra network hops are acceptable?
- Who owns and operates the new component?
- Which logic belongs to domain services rather than the edge?
- How are identity and authorization propagated?
- What happens when one downstream service is slow or unavailable?
- Can existing gateway aggregation, GraphQL, or a modular backend solve the need?
- How will contracts, telemetry, and incidents be managed?
Choose the fewest layers that satisfy those requirements.
Common Mistakes
- Adding a gateway because every microservice diagram has one.
- Putting core business logic in gateway policies.
- Treating gateway authentication as complete authorization.
- Forwarding untrusted identity headers.
- Routing internal service calls through the external gateway.
- Aggregating too many dependencies into one request.
- Retrying unsafe operations without idempotency.
- Stacking retries at every layer.
- Returning partial data without a documented contract.
- Building one general BFF full of conditions for every client.
- Creating a BFF when all clients have identical needs.
- Exposing an unrestricted proxy endpoint.
- Storing browser access tokens in JavaScript despite a server-side BFF design.
- Using cookies without CSRF protection.
- Treating gateway configuration as manual operations work.
- Publishing internal topology and endpoints.
- Ignoring the additional latency and ownership cost.
Best Practices
- Keep gateway responsibilities focused on shared edge policy.
- Keep BFF responsibilities focused on one client experience.
- Leave domain rules and authoritative data in owning services.
- Validate identity at the edge and authorization in services.
- Use audience-appropriate tokens and trusted identity propagation.
- Define deadlines, retry budgets, and partial-failure policy.
- Protect unsafe retries with idempotency.
- Make gateway and BFF highly available and horizontally scalable.
- Use distributed tracing and dependency-level metrics.
- Version-control and test routing and policy configuration.
- Align BFF ownership with frontend ownership where practical.
- Avoid duplicate aggregation layers.
- Reassess whether the extra hop still provides value.