Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys) - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys) |
Date | |
Msg-id | 4d910260-f22c-4d95-a40d-c177fcc9419c@enterprisedb.com Whole thread Raw |
In response to | Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys) (vignesh C <vignesh21@gmail.com>) |
Responses |
Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)
|
List | pgsql-hackers |
On 1/14/24 12:18, vignesh C wrote: > On Fri, 14 Jul 2023 at 20:17, Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: >> >> On 7/9/23 23:44, Tomas Vondra wrote: >>> ... >>>>> Yes, my previous message was mostly about backwards compatibility, and >>>>> this may seem a bit like an argument against it. But that message was >>>>> more a question "If we do this, is it actually backwards compatible the >>>>> way we want/need?") >>>>> >>>>> Anyway, I think the BrinDesc scratch space is a neat idea, I'll try >>>>> doing it that way and report back in a couple days. >>>> >>>> Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you >>>> used the preprocess function to pre-calculate the scankey's hash, even >>>> for scalars. You could use the scratch space in BrinDesc for that, >>>> before doing anything with SEARCHARRAYs. >>>> >>> >>> Yeah, that's a good idea. >>> >> >> I started looking at this (the scratch space in BrinDesc), and it's not >> as straightforward. The trouble is BrinDesc is "per attribute" but the >> scratch space is "per scankey" (because we'd like to sort values from >> the scankey array). >> >> With the "new" consistent functions (that get all scan keys at once) >> this probably is not an issue, because we know which scan key we're >> processing and so we can map it to the scratch space. But with the old >> consistent function that's not the case. Maybe we should support this >> only with the "new" consistent function variant? >> >> This would however conflict with the idea to have a separate consistent >> function for arrays, which "splits" the scankeys into multiple groups >> again. There could be multiple SAOP scan keys, and then what? >> >> I wonder if the scratch space should be in the ScanKey instead? > > Are we planning to post an updated patch for this? If the interest has > gone down and if there are no plans to handle this I'm thinking of > returning this commitfest entry in this commitfest and can be opened > when there is more interest. > I still think the patch is a good idea and plan to get back to it, but probably not in this CF. Given that the last update if from July, it's fair to bump it - either RWF or just move to the next CF. Up to you. As for the patch, I wonder if Heikki has some idea what to do about the scratch space? I got stuck on thinking about how to do this with the two types of consistent functions we support/allow. To articulate the issue more clearly - the scratch space is "per index" but we need scratch space "per index key". That's fine - we can simply have pointers to multiple scratch spaces, I think. But we have two ways to do consistent functions - the "old" gets scan keys one attribute at a time, "new" gets all at once. For the "new" it's not a problem, it's simple to identify the right scratch space. But for the "old" one, how would that happen? The consistent function has no idea which index key it's operating on, and how to identify the correct scratch space. I can think of two ways to deal with this: 1) Only allow SK_SEARCHARRAY for indexes supporting new-style consistent functions (but I'm not sure how, considering amsearcharray is set way before we know what the opclass does, or whether it implements the old or new consistent function). 2) Allow SK_SEARCHARRAY even with old consistent function, but do some dance in bringetbitmap() so to set the scratch space accordingly before the call. Now that I read Heikki's messages again, I see he suggested assigning a new procnum to a consistent function supporting SK_SEARCHARRAY, which seems to be very close to (1). Except that we'd now have 3 ways to define a consistent function, and that sounds a bit ... too much? Anyway, thinking about (1), I'm still not sure what to do about existing opclasses - it'd be nice to have some backwards compatible solution, without breaking everything and forcing everyone to implement all the new stuff. Which is kinda why we already have two ways to do consistent functions. Presumably we'd have to implement some "default" handling by translating the SK_SEARCHARRAY key into simple equality keys ... regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: