A cookbook is organized by chapter — soups, mains, desserts. That's great if you know you want dessert. But if you've got a bag of mushrooms and want every recipe that uses them, you'd have to read the whole book page by page. That's why good cookbooks add an ingredient index in the back: look up "mushrooms" and jump straight to the right pages.
Index Table is exactly that back-of-the-book index, but for a data store that can only find things by its primary key. You build and maintain a second table keyed by the field you actually search on.
The problem
Many data stores — especially key-value and partitioned cloud stores — are blazing fast when you look up a record by its primary key, and miserably slow at everything else. Ask one for "all orders placed by customer 42" when the key is the order ID, and it has no choice but to scan every partition and inspect every row.
In a relational database you'd just add a secondary index and move on. But the simpler stores at cloud scale often don't offer secondary indexes, or limit them severely. Without one, your common-but-non-key queries get slower and pricier as the data grows, until the full scan becomes unworkable.
- Orders storeKeyed by order ID. It can find a record instantly by key, but has no index for the customer field.
- Row checkWith no secondary index, the store must inspect every row in turn to test whether it belongs to the customer.
- Non-key queryFiltering on a field that isn't the key forces a full scan that gets slower and pricier as the data grows.
How it works
You create a second table whose key is the field you want to query by. To find orders by customer, you build an index table keyed on customer ID; each entry points to (or contains) the matching orders. Now "orders for customer 42" is a single direct lookup instead of a scan.
There are two flavors. A lean index table stores just the key and a pointer — the primary keys of the matching records — so you then fetch the full rows from the main table. A fatter one duplicates whole records (or the columns a query needs) into the index so the lookup returns everything in one hop, trading storage and write cost for read speed. Either way, your application is responsible for writing to the index whenever the source data changes. The diagram below shows a query bypassing a full scan by going through a customer-keyed index table to reach the right records.
- Index tableA secondary table keyed by the queried field; turns a scan into a direct lookup that points to the matching records.
- Orders tableThe primary store, keyed by order ID — fast by key, but slow to search by any other field.
Every index table is another write you have to keep honest. Inserting or updating a record now means updating its index entries too, ideally in the same transaction or via a reliable async pipeline. If they ever drift apart, the index lies. This is the classic trade-off against strict normalization: you're duplicating data on purpose to make reads fast, and you owe the discipline to keep the copies in sync.
When to use it
Use an index table when you frequently query a store on a non-key field and the store itself can't index it for you — the typical situation with large key-value or sharded NoSQL stores. It's a great fit alongside CQRS, where the read side is free to maintain whatever purpose-built lookup structures make queries fast.
Skip it when your database already supports the secondary indexes you need — let the engine do the work and keep the consistency guarantees. And weigh the write penalty: if the field is rarely queried but constantly updated, the extra writes and sync risk may cost more than the occasional scan you're trying to avoid.