Home > mailing lists

Re: New access method for b-tree. - Mailing list pgsql-hackers

From	Alexandre Felipe
Subject	Re: New access method for b-tree.
Date	February 5 09:59:25
Msg-id	CAE8JnxNM9Bh=LCGOzayewDgX3-kUNXdTDwNSDwsf+t=wKhPiCQ@mail.gmail.com Whole thread Raw
In response to	Re: New access method for b-tree. (Michał Kłeczek <michal@kleczek.org>)
List	pgsql-hackers

Tree view

Thank you for looking into this.

Now we can execute a, still narrow, family queries!

Maybe it helps to see this as a social network feeds. Imagine a social network, you have a few friends, or follow a few people, and you want to see their updates ordered by date. For each user we have a different combination of users that we have to display. But maybe, even having hundreds of users we will only show the first 10.

There is a low hanging fruit on the skip scan, if we need N rows, and one group already has M rows we could stop there.

If Nx is the number of friends, and M is the number of posts to show.

This runs with complexity (Nx * M) rows, followed by an (Nx * M) sort, instead of (Nx * N) followed by an (Nx * N) sort.

Where M = 10 and N is 1000 this is a significant improvement.

But if M ~ N, the merge scan that runs with M + Nx row accesses, (M + Nx) heap operations.

If everything is on the same page the skip scan would win.

The cost estimation is probably far off.

I am also not considering the filters applied after this operator, and I don't know if the planner infrastructure is able to adjust it by itself.

This is where I would like reviewer's feedback. I think that the planner costs are something to be determined experimentally.

Next I will make it slightly more general handling
* More index columns: Index (a, b, s...) could support WHERE a IN (...) ORDER BY b LIMIT N (ignoring s...)
* Multi-column prefix: WHERE (a, b) IN (...) ORDER BY c
* Non-leading prefix: WHERE b IN (...) AND a = const ORDER BY c on index (a, b, c)

---

Kind Regards,

Alexandre

On Wed, Feb 4, 2026 at 7:13 AM Michał Kłeczek <michal@kleczek.org> wrote:

On 3 Feb 2026, at 22:42, Ants Aasma <ants.aasma@cybertec.at> wrote:

On Mon, 2 Feb 2026 at 01:54, Tomas Vondra <tomas@vondra.me> wrote:
I'm also wondering how common is the targeted query pattern? How common
it is to have an IN condition on the leading column in an index, and
ORDER BY on the second one?

I have seen this pattern multiple times. My nickname for it is the
timeline view. Think of the social media timeline, showing posts from
all followed accounts in timestamp order, returned in reasonably sized
batches. The naive SQL query will have to scan all posts from all
followed accounts and pass them through a top-N sort. When the total
number of posts is much larger than the batch size this is much slower
than what is proposed here (assuming I understand it correctly) -
effectively equivalent to running N index scans through Merge Append.

My workarounds I have proposed users have been either to rewrite the
query as a UNION ALL of a set of single value prefix queries wrapped
in an order by limit. This gives the exact needed merge append plan
shape. But repeating the query N times can get unwieldy when the
number of values grows, so the fallback is:

SELECT * FROM unnest(:friends) id, LATERAL (
   SELECT * FROM posts
   WHERE user_id = id
   ORDER BY tstamp DESC LIMIT 100)
ORDER BY tstamp DESC LIMIT 100;

The downside of this formulation is that we still have to fetch a
batch worth of items from scans where we otherwise would have only had
to look at one index tuple.

GIST can be used to handle this kind of queries as it supports multiple sort orders.
The only problem is that GIST does not support ORDER BY column.
One possible workaround is [1] but as described there it does not play well with partitioning.
I’ve started drafting support for ORDER BY column in GIST - see [2].
I think it would be easier to implement and maintain than a new IAM (but I don’t have enough knowledge and experience to implement it myself)

[1] https://www.postgresql.org/message-id/3FA1E0A9-8393-41F6-88BD-62EEEA1EC21F%40kleczek.org
[2] https://www.postgresql.org/message-id/B2AC13F9-6655-4E27-BFD3-068844E5DC91%40kleczek.org

—
Kind regards,
Michal

Attachment

pgsql-hackers by date:

From: Richard Guo
Date: 05 February, 09:51:20
Subject: Re: Convert NOT IN sublinks to anti-joins when safe

From: shveta malik
Date: 05 February, 10:09:16
Subject: Re: [PATCH] Support automatic sequence replication

Re: New access method for b-tree. - Mailing list pgsql-hackers

Attachment

Previous

Next