Priority Queue

Walk into a busy hospital emergency room and you'll notice it isn't first-come-first-served. Someone arriving with chest pains is seen before someone who's been waiting an hour with a sprained ankle. A nurse triages everyone and the most urgent cases move to the front, because order of arrival matters less than how critical the case is.

A priority queue brings that triage logic to your system. Instead of processing requests in the strict order they arrived, it lets more important work be handled first.

The problem

A plain queue is fair in the simplest way: first in, first out. Everything waits its turn. That works until some work matters more than the rest. A flood of low-stakes background jobs (say, batch report generation) can pile up in the queue, and a time-sensitive request, like a paying customer's checkout confirmation, is then stuck behind thousands of unrelated items.

Strict FIFO has no notion that some messages deserve to be handled sooner. During a busy spell, the requests that matter most are exactly the ones most likely to be buried, and your most valuable users feel the slowest service.

[Figure: Before priorities. Mixed requests enter one strict FIFO queue served by a worker, so urgent work waits at the back behind the backlog.]

How it works

You attach a priority to each message and let consumers honor it. There are two common ways to build this. One is a single queue that natively understands priority, dequeuing the highest-priority message available rather than the oldest. The other — often simpler and more portable — is to use separate queues per priority level: a high-priority queue and a low-priority one, with consumers always draining the high queue first and only reaching for the low queue when the high one is empty.
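The single-queue approach can be sketched with a binary heap. This is a minimal in-memory illustration, not any particular broker's API: the `PriorityQueue` class, the convention that lower numbers mean more urgent, and the sample messages are all assumptions made for the example.

```python
import heapq
import itertools


class PriorityQueue:
    """Single queue that dequeues the most urgent message, not the oldest.

    Assumption for this sketch: lower numbers mean higher priority (0 is most urgent).
    """

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties, so equal-priority messages stay FIFO.
        self._counter = itertools.count()

    def put(self, message, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), message))

    def get(self):
        """Remove and return the most urgent message; raises IndexError if empty."""
        _priority, _sequence, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)


queue = PriorityQueue()
queue.put("batch report 1", priority=5)
queue.put("batch report 2", priority=5)
queue.put("checkout confirmation", priority=0)

# The checkout confirmation arrived last but is dequeued first.
order = [queue.get() for _ in range(len(queue))]
print(order)
```

The tie-breaking counter matters: without it, two messages with the same priority would be compared by their payloads, and arrival order within a tier would not be preserved. Python's standard library also provides `queue.PriorityQueue`, a thread-safe queue built on the same heap idea.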

You can also tune capacity per level: put more competing consumers on the urgent queue so it clears fast, and fewer on the routine one. Either way, urgent work no longer waits behind the routine backlog. The diagram below shows a high-priority path being served ahead of a low-priority one.
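Per-level capacity could look roughly like the sketch below, which uses in-process threads as stand-ins for competing consumers. The `CONSUMERS_PER_LEVEL` sizing, the `consume` function, and the `None` shutdown sentinel are hypothetical choices for illustration; a real deployment would size consumer pools against measured load.

```python
import queue
import threading

# Hypothetical sizing: more consumers on the urgent queue than on the routine one.
CONSUMERS_PER_LEVEL = {"high": 3, "low": 1}

queues = {level: queue.Queue() for level in CONSUMERS_PER_LEVEL}
processed = []
processed_lock = threading.Lock()


def consume(level):
    """Handle messages from one priority level until a None sentinel arrives."""
    source = queues[level]
    while True:
        message = source.get()
        if message is None:  # sentinel: shut this consumer down
            return
        with processed_lock:
            processed.append((level, message))


workers = [
    threading.Thread(target=consume, args=(level,))
    for level, count in CONSUMERS_PER_LEVEL.items()
    for _ in range(count)
]
for worker in workers:
    worker.start()

for i in range(6):
    queues["high"].put(f"checkout {i}")
for i in range(2):
    queues["low"].put(f"report {i}")

# One sentinel per consumer, queued behind the real messages.
for level, count in CONSUMERS_PER_LEVEL.items():
    for _ in range(count):
        queues[level].put(None)
for worker in workers:
    worker.join()

print(len(processed), "messages processed")
```

Because each level has its own dedicated consumers here, the low queue always keeps at least one worker, which also limits how badly routine work can be starved.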

[Figure: Priority Queue. Mixed requests split into a high-priority queue and a low-priority queue. Workers drain the high queue first and turn to the low queue only when idle, so time-sensitive work isn't buried behind a routine backlog.]
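The separate-queues variant, with consumers draining the high queue first, can be sketched as follows. The two `deque` objects stand in for two named queues on a real broker, and `enqueue` and `next_message` are hypothetical helper names introduced for this example.

```python
from collections import deque

# Stand-ins for two named broker queues (assumption of this sketch).
high_queue = deque()
low_queue = deque()


def enqueue(message, urgent):
    """Route a message to the high- or low-priority queue."""
    (high_queue if urgent else low_queue).append(message)


def next_message():
    """Return the next message, reaching for the low queue only when high is empty."""
    if high_queue:
        return high_queue.popleft()
    if low_queue:
        return low_queue.popleft()
    return None  # nothing to do


enqueue("report 1", urgent=False)
enqueue("report 2", urgent=False)
enqueue("checkout confirmation", urgent=True)
enqueue("report 3", urgent=False)

processed = []
while (message := next_message()) is not None:
    processed.append(message)
print(processed)
```

Each queue stays strictly FIFO internally, which is part of why this variant is portable: it needs nothing from the queueing system beyond ordinary queues.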

Watch out

Beware starvation. If high-priority work keeps arriving, a naive 'always serve high first' rule means low-priority messages may never run. Reserve some consumer capacity for the low queue, or age messages up in priority the longer they wait, so routine work still drains instead of rotting at the bottom forever.
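One way to age messages is to compute an effective priority that improves with waiting time. The sketch below is illustrative only: the `Message` fields, the `AGING_INTERVAL_SECONDS` value, and the linear aging formula are assumptions, and the linear scan would need replacing for large backlogs.

```python
from dataclasses import dataclass

# Hypothetical tuning knob: a message gains one priority level per 30 s waited.
AGING_INTERVAL_SECONDS = 30.0


@dataclass
class Message:
    body: str
    base_priority: int  # lower = more urgent (assumption of this sketch)
    enqueued_at: float  # seconds, on whatever clock the caller uses


def effective_priority(message, now):
    """Base priority, improved (lowered) the longer the message has waited."""
    waited = now - message.enqueued_at
    return message.base_priority - waited / AGING_INTERVAL_SECONDS


def pop_most_urgent(pending, now):
    """Remove and return the message with the best effective priority.

    Ties go to the older message. Linear scan: fine for a sketch only.
    """
    best = min(pending, key=lambda m: (effective_priority(m, now), m.enqueued_at))
    pending.remove(best)
    return best


# A report that has waited 30 s still loses to a fresh checkout (2 - 1 = 1 vs 0).
recent = [Message("report", 2, 70.0), Message("checkout", 0, 100.0)]
print(pop_most_urgent(recent, now=100.0).body)

# A report that has waited 100 s now outranks it (2 - 3.33 = -1.33 vs 0).
stale = [Message("report", 2, 0.0), Message("checkout", 0, 100.0)]
print(pop_most_urgent(stale, now=100.0).body)
```

Passing `now` in explicitly keeps the function deterministic and easy to test; a production consumer would more likely read a monotonic clock.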

When to use it

A priority queue makes sense whenever your workload has a genuine mix of urgencies: premium versus free tiers, interactive requests versus background batch jobs, or alerts that must be acted on immediately alongside routine processing. It pairs naturally with queue load leveling: load leveling smooths bursts by buffering work, and priorities decide which buffered work to tackle first.

Don't bother if all your messages are equally important — a plain FIFO queue is simpler and has no starvation risk to manage. And keep the number of priority levels small; two or three tiers capture most of the value, while a dozen finely graded levels just add complexity without making the system meaningfully smarter about what to do next.
