Re: Hypothetical indexes using BRIN broken since pg10 - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Hypothetical indexes using BRIN broken since pg10 |
Date | |
Msg-id | 23475.1567470067@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Hypothetical indexes using BRIN broken since pg10 (Julien Rouhaud <rjuju123@gmail.com>) |
Responses |
Re: Hypothetical indexes using BRIN broken since pg10
|
List | pgsql-hackers |
Julien Rouhaud <rjuju123@gmail.com> writes: > On Fri, Jul 26, 2019 at 1:34 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> The patch assumes the default pages_per_range setting, but looking at >> the code at https://github.com/HypoPG/hypopg, the extension actually >> takes pages_per_range into account when it estimates the index size. I >> guess there's no easy way to pass the pages_per_range setting down to >> brincostestimate(). I'm not sure what we should do about that, but seems >> that just using BRIN_DEFAULT_PAGES_PER_RANGE here is not very accurate. > Yes, hypopg can use a custom pages_per_range as it intercepts it when > the hypothetical index is created. I didn't find any way to get that > information in brincostestimate(), especially for something that could > backpatched. I don't like it, but I don't see how to do better than > just using BRIN_DEFAULT_PAGES_PER_RANGE :( I can tell you what I think ought to happen, but making it happen might be more work than this patch should take on. The right answer IMO is basically for the brinGetStats call to go away from brincostestimate and instead happen during plancat.c's building of the IndexOptInfo. In the case of a hypothetical index, it'd fall to the get_relation_info_hook to fill in suitable fake data. Sounds simple, but: 1. We really don't want even more AM-specific knowledge in plancat.c. So I think the right way to do this would be something along the line of adding a "void *amdata" field to IndexOptInfo, and adding an AM callback to be called during get_relation_info that's allowed to fill that in with some AM-specific data (which the AM's costestimate routine would know about). The existing btree-specific hacks in get_relation_info should migrate into btree's version of this callback, and IndexOptInfo.tree_height should probably go away in favor of keeping that in btree's version of the amdata struct. 2. This approach puts a premium on the get_relation_info callback being cheap, because there's no certainty that the data it fills into IndexOptInfo.amdata will ever get used. For btree, the _bt_getrootheight call is cheap enough to not be a problem, because it just looks at the metapage data that btree keeps cached in the index's relcache entry. The same cannot be said for brinGetStats as it stands: it goes off to read the index metapage. There are at least two ways to fix that: 2a. Teach brin to keep the metapage cached like btree does. This seems like it could be a performance win across the board, but you'd need to work out invalidation behavior, and it'd be a bit invasive. 2b. Define IndexOptInfo.amdata as being filled lazily, that is brincostestimate will invoke brinGetStats and fill in the data if the pointer is currently NULL. Then a hypothetical-index plugin could override that by pre-filling the field with the desired fake data. I don't have a problem with allowing brincostestimate to fill in defaults based on BRIN_DEFAULT_PAGES_PER_RANGE if it sees that amdata is null and the index is hypothetical. But there should be a way for the get_relation_info_hook to do better. BTW, the current patch doesn't apply according to the cfbot, but I think it just needs a trivial rebase over 9c703c169 (ie, assume the index is already locked). I didn't bother to do that since what I recommend above would require a lot more change in that area. regards, tom lane
pgsql-hackers by date: