Re: WIP: BRIN multi-range indexes - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: WIP: BRIN multi-range indexes |
Date | |
Msg-id | 20200909202600.fup27xfr2m34jqg6@development Whole thread Raw |
In response to | Re: WIP: BRIN multi-range indexes (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Responses |
Re: WIP: BRIN multi-range indexes
|
List | pgsql-hackers |
On Wed, Sep 09, 2020 at 04:53:30PM -0300, Alvaro Herrera wrote: >On 2020-Sep-09, Tomas Vondra wrote: > >> There are some minor optimizations possible - for example I noticed we >> call minmax_multi_get_strategy_procinfo often because it happens in a >> loop, and we could easily do it just once. But that saves only about 10% >> or so, it's not a ground-breaking optimization. > >Well, I guess this kind of thing should be fixed regardless while we >still know it's there, just to avoid an obvious inefficiency. > Sure. I was just suggesting it's not something that'd make this very close to plain minmax opclass. >> The main reason for the slowness is that we pass the values one by one >> to brin_minmax_multi_add_value - and on each call we need to deserialize >> (and then sometimes also serialize) the summary, which may be quite >> expensive. The regular minmax does not have this issue, it just swaps >> the Datum value and that's it. > >Ah, right, that's more interesting. The original dumb BRIN code >separates BrinMemTuple from BrinTuple so that things can be operated >efficiently in memory. Maybe something similar can be done in this >case, which also sounds like your second suggestion: > >> Another option would be to teach add_value to keep the deserialized >> summary somewhere, and then force serialization at the end of the BRIN >> page range. The end result would be roughly the same, I think. > Well, the patch already has Ranges (memory) and SerializedRanges (disk) but it's not very clear to me where to stash the in-memory data and where to make the conversion. > >Also, I think you could get a few initial patches pushed soon, since >they look like general improvements rather than specific to multi-range. > Yeah, I agree. I plan to review those once again in a couple days and then push them. > >On a differen train of thought, I wonder if we shouldn't drop the idea >of there being two minmax opclasses; just have one (still called >"minmax") and have the multi-range version be the v2 of it. We would >still need to keep code to operate on the old one, but if you ever >REINDEX then your index is upgraded to the new one. I see no reason to >keep the dumb minmax version around, assuming the performance is roughly >similar. > I'm not a huge fan of that. I think it's unlikely we'll ever make this new set of oplasses just as fast as the plain minmax, and moreover it does have some additional requirements (e.g. the distance proc, which may not make sense for some data types). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: