Picture ordering breakfast by phoning the kitchen, then the bakery, then the coffee shop separately — each call ringing for ages before someone picks up. You'd spend more time on hold than eating. Far better to tell one person your whole order and let them coordinate the rest behind the counter.
Gateway Aggregation does this for your app. Instead of a client firing off a separate request to every service a screen needs, it sends one request to the gateway, which gathers everything from the backends and hands back a single combined reply.
The problem
A single screen often needs data from several services at once — a product page might want details from the catalog service, the price from pricing, the stock count from inventory, and reviews from a fourth. If the client calls each one directly, that's four separate round trips over the network before the page can render.
On a fast wired connection that's annoying; on a phone with 150 ms of latency per round trip, it's painful. Each call to a separate service can also re-pay the cost of connection setup, TLS, and authentication. The client becomes a chatty coordinator, juggling partial results and failure handling for every backend, which is work it shouldn't have to do.
- Chatty client: calls every service a screen needs directly, with one slow, high-latency round trip each, re-paying connection setup, TLS, and auth every time.
- Backend service: one of several the page depends on. The client must juggle each one's partial results and failures itself.
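To put a rough number on the chatty-client cost, here is a back-of-the-envelope sketch in Python. It assumes the client issues its calls one after another and counts only round-trip latency; the function name and the service list are illustrative, not part of any real API.

```python
# Rough latency model for a chatty client. All values are illustrative assumptions.
CLIENT_RTT_MS = 150  # per-round-trip latency from the mobile example above
SERVICES = ["catalog", "pricing", "inventory", "reviews"]


def sequential_client_latency_ms(num_calls: int, rtt_ms: int = CLIENT_RTT_MS) -> int:
    """Network wait when the client calls each service one after another.

    Ignores server processing time, connection setup, and TLS, so real
    numbers would be higher.
    """
    return num_calls * rtt_ms


print(sequential_client_latency_ms(len(SERVICES)))  # 600
```

Under these assumptions the product page waits about 600 ms on the network alone before it can render.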
How it works
The gateway absorbs the coordination. The client makes one request — "give me everything for this product page." The gateway then fans out to each backend service, ideally firing the calls in parallel rather than one after another. Because these calls happen inside the data center, the network hops between gateway and services are fast and cheap compared to the client's distant connection.
As the responses come back, the gateway merges them into a single payload shaped the way the client wants, and returns it in one reply. The client made one slow round trip instead of four; the four fast round trips all happened on the gateway's side. The diagram below shows a single request fanning out to three services and the results being combined into one response.
- Aggregating gateway: receives one client request, fans it out to several backends in parallel, then merges the replies into a single response.
- Backend service: one of several services a screen needs; called over fast in-data-center hops, not slow client round trips.
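The fan-out-and-merge step described above could look something like this minimal Python sketch using `asyncio.gather`. The four `fetch_*` functions are hypothetical stand-ins that simulate backend calls with a short sleep; a real gateway would make HTTP or RPC calls there, and the payload shape is an assumption.

```python
import asyncio


# Hypothetical backend calls; the sleeps simulate fast in-data-center hops.
async def fetch_catalog(product_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"name": "Example product"}


async def fetch_pricing(product_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"amount": 19.99, "currency": "USD"}


async def fetch_inventory(product_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"in_stock": 7}


async def fetch_reviews(product_id: str) -> list:
    await asyncio.sleep(0.01)
    return [{"stars": 5, "text": "Great"}]


async def product_page(product_id: str) -> dict:
    """Handle one client request: fan out concurrently, then merge."""
    # gather() runs the four calls concurrently and returns results in order.
    details, price, stock, reviews = await asyncio.gather(
        fetch_catalog(product_id),
        fetch_pricing(product_id),
        fetch_inventory(product_id),
        fetch_reviews(product_id),
    )
    # Merge into a single payload shaped for the client.
    return {
        "id": product_id,
        "details": details,
        "price": price,
        "stock": stock,
        "reviews": reviews,
    }


if __name__ == "__main__":
    print(asyncio.run(product_page("sku-123")))
```

Because the calls run concurrently, the gateway's wait is roughly that of the slowest backend rather than the sum of all four.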
Don't let the slowest backend hold everyone hostage. Aggregate with per-call timeouts and sensible fallbacks so a single sluggish or failing service degrades gracefully — return the rest of the page with a placeholder — instead of stalling the whole merged response. Pairing aggregation with a circuit breaker keeps one sick dependency from dragging the gateway down.
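One way to sketch per-call timeouts with fallbacks in Python is to wrap each backend call in `asyncio.wait_for`. The helper `call_with_fallback`, the three simulated services, and the 0.2 second budget are all assumptions made for illustration. A circuit breaker is not shown here.

```python
import asyncio


async def call_with_fallback(coro, timeout_s: float, fallback):
    """Await one backend call; on timeout or error, return the fallback."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except Exception:
        # Covers timeouts and backend errors alike. A real gateway would
        # likely log the failure and catch narrower exception types.
        return fallback


# Hypothetical backends: one healthy, one too slow, one failing outright.
async def pricing_service() -> dict:
    await asyncio.sleep(0.01)
    return {"amount": 19.99}


async def reviews_service() -> list:
    await asyncio.sleep(1.0)  # slower than the timeout budget below
    return [{"stars": 5}]


async def inventory_service() -> dict:
    raise ConnectionError("backend unavailable")


async def product_page(timeout_s: float = 0.2) -> dict:
    price, reviews, stock = await asyncio.gather(
        call_with_fallback(pricing_service(), timeout_s, fallback=None),
        call_with_fallback(reviews_service(), timeout_s, fallback=[]),
        call_with_fallback(inventory_service(), timeout_s, fallback=None),
    )
    # Placeholders (None or an empty list) mark sections the client can
    # render as "unavailable" instead of failing the whole page.
    return {"price": price, "reviews": reviews, "stock": stock}


if __name__ == "__main__":
    print(asyncio.run(product_page()))
```

In this sketch the slow reviews call is cancelled once its budget runs out, so the merged response returns in about the timeout duration with the price intact and placeholders for the other two sections.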
When to use it
Aggregation pays off whenever clients are making many round trips to assemble one view, especially over high-latency mobile links. It's a core trick of an API gateway and a natural fit for a backends-for-frontends setup, where each client type gets responses pre-shaped for its needs.
It's not always the answer. If the calls genuinely depend on one another and must run in sequence, you lose the parallelism that makes aggregation fast. And don't let the gateway swell into a place where business logic lives — its job is to combine, not to compute. Keep that boundary clean and pair it with gateway routing for sending requests onward and gateway offloading for shared concerns.
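The cost of sequential dependencies can be estimated with a simple model: independent backend calls cost about as much as the slowest one, while a dependent chain costs the sum. The backend latencies below are made-up values, and the model ignores processing time and connection setup.

```python
# Simplified latency model. All numbers are illustrative assumptions.
CLIENT_RTT_MS = 150
BACKEND_MS = [5, 8, 12, 20]  # assumed in-data-center call latencies


def direct_client_ms(client_rtt_ms: int, num_calls: int) -> int:
    """No gateway: the client makes every call itself, one after another."""
    return client_rtt_ms * num_calls


def parallel_gateway_ms(client_rtt_ms: int, backend_ms: list) -> int:
    """Independent calls fanned out at once: wait for the slowest one."""
    return client_rtt_ms + max(backend_ms)


def sequential_gateway_ms(client_rtt_ms: int, backend_ms: list) -> int:
    """Dependent calls run as a chain: the latencies add up."""
    return client_rtt_ms + sum(backend_ms)


print(direct_client_ms(CLIENT_RTT_MS, len(BACKEND_MS)))  # 600
print(parallel_gateway_ms(CLIENT_RTT_MS, BACKEND_MS))    # 170
print(sequential_gateway_ms(CLIENT_RTT_MS, BACKEND_MS))  # 195
```

With these particular numbers the chained version is slower than the parallel one but still well ahead of four client round trips; the gap to the parallel case widens as the chain gets longer or the individual calls get slower.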