Performance and scaling

When lag grows, first identify the bottleneck: claim SQL, handler work, table size, or cross-service transport.

Most Postgres work queues should tune the claim path before changing platforms. Brokers help with transport and fan-out; they do not remove the need for leases, idempotency, and handler capacity.

Find the bottleneck before changing tools

Slow claims are usually a fixable SQL problem: poll frequency, claim scan width, missing bucket filters, or table bloat. If claims are fast but lag grows, add handler capacity. If the work is really cross-service fan-out or replay, then add transport.

Realistic target after tuning: hundreds to low thousands of jobs/sec across a worker pool on a modest Postgres instance, if handlers are fast (<100ms) and you adopt the basics below. Slow handlers (HTTP calls, PDF generation) scale by adding workers until handler CPU/I/O is the limit, not claim latency.

What you see	Likely bottleneck	First move
Claim query p99 is high	SQL scan width, locks, or table size	Filter by bucket in SQL, right-size batches, archive completed rows.
Claims are fast but pending age grows	Handlers are slower than enqueue rate	Add handler concurrency or workers until CPU, I/O, or partner limits appear.
Database is mostly idle but workers poll constantly	Idle-loop load	Use `LISTEN/NOTIFY` plus jittered backoff.
One customer blocks its own stream	Hot partition key	Shard that key if strict global order is not required.
Many services need replay and fan-out	Transport, not local coordination	Use an outbox relay into a broker or event log.

First-pass tuning

Filter ownership in SQL

Precompute partition_bucket on enqueue. Claim with WHERE partition_bucket = ANY($owned). Claiming 100 rows and discarding 95 in application code wastes locks and I/O.

Cache the ring

Rebuild the ring and owned-bucket list every 5–10s, not on every poll loop. Heartbeat TTL is 30s; the worker pool does not change that fast.

Wake, don’t hammer

LISTEN inbox_new on enqueue + jittered backoff when idle (1–2s). Poll fast (50–100ms) only when backlog exists. Cuts idle DB load by 10–50×.

Right-size claim batches

Start at 25. Increase until claim p99 stays <50ms or handlers cannot keep up. Huge batches mean long transactions. Tiny batches mean claim overhead dominates.

Keep the inbox table hot

Archive or delete completed rows on a schedule. A 50M-row inbox makes every index scan slower: even with partial indexes.

Small payloads

Store a pointer in payload (S3 key, row id), not the blob. Target <2 KB JSONB per row. Large payloads bloat pages and WAL.

Workload tuning calculator

Describe your workload, get recommended timings, batch sizes, and pool settings based on the rules of thumb below.

Your cluster

Worker processes 4

Horizontal scale: each runs the claim loop

CPU cores per worker 4

Expected enqueue rate 100/s

Sustained jobs/sec into the inbox

Completed rows in inbox 0.1M

Millions of completed rows still in table (bloat factor)

Your workload

Handler p99 duration 200ms

Wall-clock time per job at p99: drives lease & batch sizing

Workload type

Avg payload size 2 KB

Per-key ordering required LISTEN/NOTIFY on enqueue

Recommended settings

Adjust inputs to see recommendations.

—

Est. capacity/s

—

Load %

—

Per worker/s

Setting	Recommended

generated config

Start with these settings

Tune from here, then measure. Adjust one knob at a time so you can tell whether the improvement came from fewer scans, fewer claims, faster handlers, or lower idle polling.

Setting	Start here	Rule of thumb	Watch for
Heartbeat interval	10s	≈ TTL ÷ 3 (TTL = 30s → heartbeat every 10s)	Too slow → false dead. Too fast → write noise
Worker dead TTL	30s	3× missed heartbeats before marking dead	Long TTL = slow failover. Short = flappy ring
Lease duration	90s	≥ 3× p99 handler time (short jobs). Renew for long jobs	Too short → cleanup churn + duplicates
Lease renewal	Every 30s	Every lease ÷ 3 while handler runs	Miss one renewal window → row reclaimed
Claim batch size	25	Drain in ~1–5s of handler wall time. Cap 50–100	Big batches lock more rows. Small = overhead
Idle poll interval	1–2s + jitter	Backoff when queue empty. Reset on NOTIFY	Fixed 500ms poll × N workers = needless load
Active poll interval	50–100ms	When pending depth > 0 or after NOTIFY	Only run hot when work exists
Ring cache TTL	5–10s	Rebuild on NOTIFY or TTL expiry	Every-loop rebuild wastes CPU
Housekeeping	30s	One winner via advisory lock	More frequent rarely helps
Partition buckets	360–1024	≥ 10× worker count; ~128 vnodes/worker	Too few → skew. Too many → fine
Handler concurrency	4–16 per worker	Match CPU cores × 2 for I/O-bound work	Serial loop underuses the machine
Payload size	`<2 KB`	Store heavy data elsewhere. Keep JSONB lean	Big JSONB → slow claims + vacuum pain
Completed row retention	24h–7d in inbox	Archive then DELETE. Keep pending+processing hot	Millions of completed rows → index bloat
Connection pool	PgBouncer transaction mode	Pool size ≈ workers × (1 claim + concurrency)	One conn per row = pool exhaustion

When the first pass is not enough

Parallel handlers per worker

Claim a batch, process rows with a concurrency limit (p-limit, semaphores). Claim stays serial. Handlers run in parallel. Often 2–4× throughput per machine.

Index for ordering guard

If using per-key FIFO, add partial index on (partition_key) WHERE status = 'processing' so the NOT EXISTS check stays cheap.

Horizontal workers

Add processes until claim latency or Postgres CPU rises. Consistent hash spreads keys. Scale is roughly linear until the DB is hot.

Postgres hygiene

Aggressive autovacuum on inbox, monitor bloat, consider FILLFACTOR 90 on high-churn tables. Tune after archive strategy is in place.

Put the tuned loop together

const BATCH_SIZE = 25;
const HANDLER_CONCURRENCY = 8;
const RING_TTL_MS = 8_000;
const IDLE_POLL_MS = 1_500;
const ACTIVE_POLL_MS = 75;

let ringCache = { ring: null, ownedBuckets: [], expiresAt: 0 };

async function getRing(db) {
  if (Date.now() < ringCache.expiresAt) return ringCache;
  const live = await db.query(`SELECT id FROM workers
    WHERE status = 'alive'
      AND last_seen_at > now() - interval '30 seconds'
    ORDER BY id`);
  const ring = buildRing(live, 128);
  ringCache = {
    ring,
    ownedBuckets: ownedBucketsFor(WORKER_ID, ring, 360),
    expiresAt: Date.now() + RING_TTL_MS,
  };
  return ringCache;
}

// LISTEN inbox_new → wake workers and refresh ownership soon
db.on("notification", () => {
  wakeFlag = true;
  ringCache.expiresAt = 0;
});

while (true) {
  await maybeRunHousekeeping(db);
  const { ownedBuckets } = await getRing(db);
  const batch = await db.claimInboxBatch({
    workerId: WORKER_ID,
    buckets: ownedBuckets,
    limit: BATCH_SIZE,
  });

  if (batch.length === 0) {
    await sleep(wakeFlag ? ACTIVE_POLL_MS : IDLE_POLL_MS + Math.random() * 500);
    wakeFlag = false;
    continue;
  }

  await mapWithConcurrency(batch, HANDLER_CONCURRENCY, async (row) => {
    await processWithLease(row, WORKER_ID);
  });
}

Keep completed rows out of the hot table

The inbox should mostly hold pending and processing. Move history out on a schedule (housekeeping can do this too).

-- Run daily or when completed count > 1M
INSERT INTO inbox_archive SELECT * FROM inbox
WHERE status = 'completed' AND completed_at < now() - interval '7 days';

DELETE FROM inbox
WHERE status = 'completed' AND completed_at < now() - interval '7 days';

-- Batch deletes (10k at a time) if table is huge: avoid one massive lock

Add indexes only for measured pressure

-- Speeds per-key ordering guard (NOT EXISTS on processing siblings)
CREATE INDEX idx_inbox_partition_processing
  ON inbox (partition_key)
  WHERE status = 'processing';

-- Already in schema.sql: claim path depends on these partial indexes:
-- idx_inbox_bucket_claim (partition_bucket, available_at, created_at, id) WHERE status = 'pending'
-- idx_inbox_claim (available_at, created_at, id) WHERE status = 'pending'

Move transport only when transport is the bottleneck

Separate where you coordinate claims from what carries bytes between services. The table uses product names only as familiar anchors.

Signal	Keep tuning Postgres coordination	Add or lean on message transport
Sustained enqueue rate	`< ~1–5k/s` after SQL and worker tuning	High-volume handoff between services. Many subscribers on one stream
Claim p99 after tuning	`< 50ms`, coordination is healthy	Still pegged after tuning. Or consumers cannot use your DB at all
Replay / fan-out	Outbox relay + per-service inbox	Shared event log as platform infrastructure
Cross-service boundary	Outbox in producer DB → transport → inbox in consumer DB	Natural fit: transport is the pipe, not the coordinator

Multi-DC, replicas, and cost signals

Multi-DC and replicas

Enqueue and claim must hit the primary. Read replicas work for dashboards and lag metrics, not for claiming work. Cross-DC inbox requires a single write leader, or you accept split-brain risk.

Cost signals

When claim queries peg CPU and archiving cannot keep pace, tune coordination first. Move high-volume transport to a broker only if that does not solve the bottleneck.

Source

Use the article for explanation, then use these files when you want the complete SQL and TypeScript in one place.