Thread: Efficient pagination using multi-column cursors

Efficient pagination using multi-column cursors

From
large.goose2829@salomvary.com
Date:
Hi folks,

I am working on optimizing a query that attempts to efficiently paginate through a large table using multi-column "cursors" aka. the "seek method" (as described in detail here: https://use-the-index-luke.com/sql/partial-results/fetch-next-page).

The table (drastically simplified) looks like this:

CREATE TABLE data
(
    col_1   int         NOT NULL,
    col_2   int         NOT NULL,
    col_3   int         NOT NULL,
    content varchar(10) NOT NULL
);

And has an appropriate index:

CREATE INDEX data_index ON data (col_1, col_2, col_3);

The recommended way to paginate through this table is to use the "row values" syntax:

SELECT content
FROM data
WHERE (col_1, col_2, col_3) > (10, 20, 29)
ORDER BY col_1, col_2, col_3
LIMIT 100;

Which results in a perfectly optimized query plan (slightly edited for readability, using test data of 1 million rows, on PostgreSQL 17.2):

Limit  (cost=0.42..5.33 rows=100 width=20)
       (actual time=0.084..0.197 rows=100 loops=1)
  ->  Index Scan using data_index on data  
        (cost=0.42..43617.30 rows=889600 width=20)
        (actual time=0.081..0.176 rows=100 loops=1)
        Index Cond: (ROW(col_1, col_2, col_3) > ROW(10, 20, 29))
Planning Time: 0.344 ms
Execution Time: 0.264 ms
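To fetch subsequent pages, the last row of each page becomes the next cursor. A minimal runnable sketch of that loop, using SQLite via Python's sqlite3 (SQLite >= 3.15 also supports the row-values syntax; table contents are made up):

```python
import sqlite3

# Made-up test data mirroring the simplified schema above.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE data (col_1 INT NOT NULL, col_2 INT NOT NULL,
                                   col_3 INT NOT NULL, content TEXT NOT NULL)""")
conn.execute("CREATE INDEX data_index ON data (col_1, col_2, col_3)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)",
                 [(a, b, c, f"row-{a}-{b}-{c}")
                  for a in range(3) for b in range(3) for c in range(3)])

PAGE_SIZE = 10
cursor = (-1, -1, -1)  # sentinel that sorts before every real row
pages = []
while True:
    page = conn.execute("""
        SELECT col_1, col_2, col_3, content
        FROM data
        WHERE (col_1, col_2, col_3) > (?, ?, ?)
        ORDER BY col_1, col_2, col_3
        LIMIT ?""", (*cursor, PAGE_SIZE)).fetchall()
    if not page:
        break
    pages.append(page)
    cursor = page[-1][:3]  # last row's key drives the next page
print([len(p) for p in pages])  # 27 rows -> pages of 10, 10, 7
```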

However, in reality, my query uses a mix of ascending and descending ordering (with an index matching the order columns), in which case the WHERE (col_1, col_2, col_3) > (10, 20, 29) syntax is not an option (unless I somehow derive "reversed" data from the column, which I would like to avoid).
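For example, under ORDER BY col_1, col_2 DESC, col_3 the cursor condition has to be spelled out per column, with the comparison flipped on the DESC column. A sketch of that predicate, using SQLite via Python's sqlite3 (cursor values mirror the examples above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (col_1 INT, col_2 INT, col_3 INT)")
conn.execute("CREATE INDEX data_index_desc ON data (col_1, col_2 DESC, col_3)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)",
                 [(a, b, c) for a in range(9, 12)
                  for b in range(19, 22) for c in range(28, 31)])

# Cursor (10, 20, 29) under ORDER BY col_1, col_2 DESC, col_3: row values
# cannot express this, so each column is compared separately, flipping the
# comparison (<) on the DESC column.
page = conn.execute("""
    SELECT col_1, col_2, col_3
    FROM data
    WHERE col_1 > 10
       OR (col_1 = 10 AND (col_2 < 20
                           OR (col_2 = 20 AND col_3 > 29)))
    ORDER BY col_1, col_2 DESC, col_3
    LIMIT 5""").fetchall()
print(page)  # the first rows strictly after the cursor in mixed order
```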

Therefore I am using an equivalent query with multiple WHERE conditions, something like this (for simplicity, no mixed ordering is involved in the examples):

SELECT content
FROM data
WHERE
  col_1 >= 10
  AND (
    col_1 > 10
    OR (
      col_2 >= 20
      AND (
        col_2 > 20
        OR col_3 > 29
      )
    )
  )
ORDER BY col_1, col_2, col_3
LIMIT 100;

Which returns the same rows, but the query plan is slightly less efficient:

Limit  (cost=0.42..6.48 rows=100 width=20) 
       (actual time=0.848..0.893 rows=100 loops=1)
  ->  Index Scan using data_index on data  
        (cost=0.42..52984.52 rows=874874 width=20)
        (actual time=0.847..0.884 rows=100 loops=1)
        Index Cond: (col_1 >= 10)
        Filter: ((col_1 > 10) OR ((col_2 >= 20) AND ((col_2 > 20) OR (col_3 > 29))))
        Rows Removed by Filter: 2030
Planning Time: 0.133 ms
Execution Time: 0.916 ms

Instead of exclusively relying on an index access predicate, this plan involves an index filter predicate.
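That the two forms return exactly the same rows can be sanity-checked quickly; a sketch using SQLite via Python's sqlite3 (SQLite also supports the row-values syntax):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (col_1 INT, col_2 INT, col_3 INT)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)",
                 [(a, b, c) for a in range(9, 12)
                  for b in range(19, 22) for c in range(28, 31)])

row_values = conn.execute("""
    SELECT col_1, col_2, col_3 FROM data
    WHERE (col_1, col_2, col_3) > (10, 20, 29)
    ORDER BY col_1, col_2, col_3""").fetchall()

handwritten = conn.execute("""
    SELECT col_1, col_2, col_3 FROM data
    WHERE col_1 >= 10
      AND (col_1 > 10 OR (col_2 >= 20 AND (col_2 > 20 OR col_3 > 29)))
    ORDER BY col_1, col_2, col_3""").fetchall()

assert row_values == handwritten  # same rows, same order
print(len(row_values))
```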

Without being familiar with the internals of the query planner, I *think* there *should* be a way to come up with WHERE conditions that result in the "perfect" plan. There are several ways to phrase the conditions, of which I've tried a few, only to get the same or worse performance. Does anyone have a suggestion for a better query?

I am also aware that I might be chasing an optimization with low returns (yet to see how it performs on Real Life data), but I'm already too deep down the rabbit hole to turn back without knowing The Truth.

I've dumped my experiments into a DB Fiddle: https://www.db-fiddle.com/f/kd8zaibZGKGH1HSStyxNkx/0

Cheers,
Márton


Re: Efficient pagination using multi-column cursors

From
Peter Geoghegan
Date:
On Wed, Feb 26, 2025 at 9:29 AM <large.goose2829@salomvary.com> wrote:
> Without being familiar with the internals of the query planner, I *think* there *should* be a way to come up with
> WHERE conditions that result in the "perfect" plan.

There is a fundamental trade-off involved here. The simple, fast
"WHERE (col_1, col_2, col_3) > (10, 20, 29)" query returns whatever
tuples are stored immediately after "(10, 20, 29)" in the index.
Naturally, they're returned in index order, which is usually the most
useful order (simple ASC order or simple DESC order for all columns).

The B-Tree code can physically traverse your mixed-ASC-and-DESC order
index in almost the same way. But it is much less useful, since the
matching index tuples won't be physically located together as exactly
one contiguous group of tuples. And so (with your "handwritten" row
comparison) you get a filter qual that filters out non-matching tuples
using lower-order index columns. The index scan actually just returns
"Index Cond: (col_1 >= 10)" (which still returns a contiguous group of
index tuples), while a filter condition excludes those tuples returned
by the index scan node that don't satisfy the later/lower-order column
condition.

The book "Relational Database Index Design and the Optimizers"
proposes a vocabulary for the trade-offs in this area -- the 3 star
model. When creating the best possible index for certain queries it is
sometimes inherently necessary to choose between what it calls the
first star (which means avoiding a sort) and the second star (which
means having the thinnest possible "row slice"). Sometimes those
things are in tension, which makes sense when you imagine how the
index must be physically traversed.

--
Peter Geoghegan



Re: Efficient pagination using multi-column cursors

From
large.goose2829@salomvary.com
Date:
Thanks for the insights!

On Wed, Feb 26, 2025, at 16:05, Peter Geoghegan wrote:
> On Wed, Feb 26, 2025 at 9:29 AM <large.goose2829@salomvary.com> wrote:
> > Without being familiar with the internals of the query planner, I *think* there *should* be a way to come up with WHERE conditions that result in the "perfect" plan.
>
> There is a fundamental trade-off involved here. The simple, fast
> "WHERE (col_1, col_2, col_3) > (10, 20, 29)" query returns whatever
> tuples are stored immediately after "(10, 20, 29)" in the index.
> Naturally, they're returned in index order, which is usually the most
> useful order (simple ASC order or simple DESC order for all columns).
>
> The B-Tree code can physically traverse your mixed-ASC-and-DESC order
> index in almost the same way. But it is much less useful, since the
> matching index tuples won't be physically located together as exactly
> one contiguous group of tuples.

I am not sure I understand this.

My understanding is that given this "mixed order" index:
CREATE INDEX data_index_desc ON data (col_1, col_2 DESC, col_3);

The index tuples are physically organized exactly in this way:
ORDER BY col_1, col_2 DESC, col_3

So that I should be able to write a query that reads a continuous range from this index without filtering.


> And so (with your "handwritten" row
> comparison) you get a filter qual that filters out non-matching tuples
> using lower-order index columns. The index scan actually just returns
> "Index Cond: (col_1 >= 10)" (which still returns a contiguous group of
> index tuples), while a filter condition excludes those tuples returned
> by the index scan node that don't satisfy the later/lower-order column
> condition.

Does this mean that it is not possible to come up with a plan that has the same performance as "WHERE (col_1, col_2, col_3) > (10, 20, 29)" using "handwritten" filters, or does that hold only for "mixed order"? And is it a theoretical limitation, or a limitation of the current implementation of the query planner?

I don't know whether it's polite to bring up competitors on this mailing list, but MySQL 8 (which quite ironically has poor performance for the "row values" syntax, doing a full index scan) seems to avoid index filtering using the "OR AND" variant (at least when no mixed ordering is involved):

SELECT content
FROM data
WHERE
  col_1 > 10
  OR (
    col_1 = 10
    AND (
      col_2 > 20
      OR (
        col_2 = 20
        AND col_3 > 29
      )
    )
  )
ORDER BY col_1, col_2, col_3
LIMIT 100;

-> Limit: 100 row(s)  (cost=100710 rows=100) (actual time=0.0322..1.15 rows=100 loops=1)
    -> Index range scan on data using data_index
         over (col_1 = 10 AND col_2 = 20 AND 29 < col_3) OR (col_1 = 10 AND 20 < col_2) OR (10 < col_1),
         with index condition: ((`data`.col_1 > 10) or ((`data`.col_1 = 10) and ((`data`.col_2 > 20) or ((`data`.col_2 = 20) and (`data`.col_3 > 29)))))
         (cost=100710 rows=511937) (actual time=0.0316..1.14 rows=100 loops=1)


Still, on the same machine, a slightly slower actual total time than PostgreSQL (based on a single EXPLAIN ANALYZE sample only).


> The book "Relational Database Index Design and the Optimizers"
> proposes a vocabulary for the trade-offs in this area -- the 3 star
> model. When creating the best possible index for certain queries it is
> sometimes inherently necessary to choose between what it calls the
> first star (which means avoiding a sort) and the second star (which
> means having the thinnest possible "row slice"). Sometimes those
> things are in tension, which makes sense when you imagine how the
> index must be physically traversed.

Aka. "Good, Fast, Cheap — Pick Any Two" ;)

Cheers,
Márton

Re: Efficient pagination using multi-column cursors

From
Peter Geoghegan
Date:
On Wed, Feb 26, 2025 at 10:40 AM <large.goose2829@salomvary.com> wrote:
> My understanding is that given this "mixed order" index:
> CREATE INDEX data_index_desc ON data (col_1, col_2 DESC, col_3);
>
> The index tuples are physically organized exactly in this way:
> ORDER BY col_1, col_2 DESC, col_3
>
> So that I should be able to write a query that reads a continuous range from this index without filtering.

Yes, you can. For example, if you use the predicate "col_1 >= 10", it
can work in this way, even with a mixed-ASC-and-DESC order multicolumn
index -- without any filtering. The same is true if you don't have any
predicate at all -- there are no lower-order columns to filter on
because there are no columns to filter on at all.

The actual predicate that you're interested in isn't like that. It
cannot use an index that both returns rows in the mixed-ASC-and-DESC
order that you want (and so terminate early with a LIMIT and whatnot),
while at the same time accessing all the tuples as exactly one
contiguous group. You have to pick which is more important. It sounds
very much like having the most useful sort order is important.

> Does this mean that it is not possible to come up with a plan that has the same performance as "WHERE (col_1, col_2,
> col_3) > (10, 20, 29)" using "handwritten" filters, or only for "mixed order"? Or not a theoretical limitation but a
> limitation of the current implementation of the query planner?

Perhaps the query planner should be taught to rewrite the query in
such a way as to make it unnecessary for you to do so -- I think that
that's what MySQL is doing for you. That is beside the point.

Again, think about how things are physically laid out in an index
which mixes ASC and DESC order. It is inevitable that the scan has to
traverse over non-matching tuples in order to read all of the matching
tuples (or to read a given number of matching tuples). This has
nothing to do with the query planner.

> Aka. "Good, Fast, Cheap — Pick Any Two" ;)

It's not like that. Often it just isn't necessary to pick any 2 -- you
can have all 3, because the requirements of the query allow it. (Plus
it would never make sense to pick the first and second stars over the
third.)

--
Peter Geoghegan



Re: Efficient pagination using multi-column cursors

From
Laurenz Albe
Date:
On Wed, 2025-02-26 at 15:27 +0100, large.goose2829@salomvary.com wrote:
> I am working on optimizing a query that attempts to efficiently paginate
> through a large table using multi-column "cursors" aka. the "seek method"
> (as described in detail here:
> https://use-the-index-luke.com/sql/partial-results/fetch-next-page).
>
> The table (drastically simplified) looks like this:
>
> CREATE TABLE data
> (
>     col_1   int         NOT NULL,
>     col_2   int         NOT NULL,
>     col_3   int         NOT NULL,
>     content varchar(10) NOT NULL
> );
>
> And has an appropriate index:
>
> CREATE INDEX data_index ON data (col_1, col_2, col_3);
>
> The recommended query to paginate through this table is using the "row values" syntax:
>
> SELECT content
> FROM data
> WHERE (col_1, col_2, col_3) > (10, 20, 29)
> ORDER BY col_1, col_2, col_3
> LIMIT 100;
>
> Which results in a perfectly optimized query plan
>
> However, in reality, my query uses a mix of ascending and descending ordering (with an
> index matching the order columns), in which case the WHERE (col_1, col_2, col_3) > (10, 20, 29)
> syntax is not an option (unless I somehow derive "reversed" data from the column,
> which I would like to avoid).

Here are my ideas for this situation:

https://www.cybertec-postgresql.com/en/keyset-pagination-with-descending-order/
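One workaround in this spirit, for integer columns, is to index and compare the negated column, which turns DESC back into ASC so the row-values syntax applies again. A sketch using SQLite via Python's sqlite3; whether this matches the article's exact approach is an assumption:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (col_1 INT, col_2 INT, col_3 INT)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)",
                 [(a, b, c) for a in range(9, 12)
                  for b in range(19, 22) for c in range(28, 31)])
# Expression index on the negated column: DESC on col_2 becomes ASC on -col_2.
conn.execute("CREATE INDEX data_neg ON data (col_1, -col_2, col_3)")

# ORDER BY col_1, col_2 DESC, col_3 is equivalent to ORDER BY col_1, -col_2,
# col_3, so the cursor (10, 20, 29) works with row values again as (10, -20, 29).
page = conn.execute("""
    SELECT col_1, col_2, col_3
    FROM data
    WHERE (col_1, -col_2, col_3) > (10, -20, 29)
    ORDER BY col_1, -col_2, col_3
    LIMIT 5""").fetchall()
print(page)
```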

Yours,
Laurenz Albe