Re: Efficient pagination using multi-column cursors - Mailing list pgsql-performance

From Peter Geoghegan
Subject Re: Efficient pagination using multi-column cursors
Date
Msg-id CAH2-Wzk0XHmt_1yEeDzRB3jWfceK0ZxcnPKFsqt-XEBiZ3d7fA@mail.gmail.com
Whole thread Raw
In response to Efficient pagination using multi-column cursors  (large.goose2829@salomvary.com)
Responses Re: Efficient pagination using multi-column cursors
List pgsql-performance
On Wed, Feb 26, 2025 at 9:29 AM <large.goose2829@salomvary.com> wrote:
> Without being familiar the internals of the query planner, I *think* there *should* be a way to come up with WHERE
conditionsthat results in the "perfect" plan. 

There is a fundamental trade-off involved here. The simple, fast
"WHERE (col_1, col_2, col_3) > (10, 20, 29)" query returns whatever
tuples are stored immediately after "(10, 20, 29)" in the index.
Naturally, they're returned in index order, which is usually the most
useful order (simple ASC order or simple DESC order for all columns).

The B-Tree code can physically traverse your mixed-ASC-and-DESC order
index in almost the same way. But it is much less useful, since the
matching index tuples won't be physically located together as exactly
one contiguous group of tuples. And so (with your "handwritten" row
comparison) you get a filter qual that filters out non-matching tuples
using lower-order index columns. The index scan actually just returns
"Index Cond: (col_1 >= 10)" (which still returns a contiguous group of
index tuples), while a filter condition excludes those tuples returned
by the index scan node that don't satisfy the later/lower-order column
condition.

The book "Relational Database Index Design and the Optimizers"
proposes a vocabulary for the trade-offs in this area -- the 3 star
model. When creating the best possible index for certain queries it is
sometimes inherently necessary to choose between what it calls the
first star (which means avoiding a sort) and the second star (which
means having the thinnest possible "row slice"). Sometimes those
things are in tension, which makes sense when you imagine how the
index must be physically traversed.

--
Peter Geoghegan



pgsql-performance by date:

Previous
From: large.goose2829@salomvary.com
Date:
Subject: Efficient pagination using multi-column cursors
Next
From: large.goose2829@salomvary.com
Date:
Subject: Re: Efficient pagination using multi-column cursors