On Wed, Sep 09, 2020 at 03:40:41PM -0300, Alvaro Herrera wrote:
>On 2020-Sep-09, John Naylor wrote:
>
>> create index on t using brin (a);
>> CREATE INDEX
>> Time: 1631.452 ms (00:01.631)
>
>> create index on t using brin (a int8_minmax_multi_ops);
>> CREATE INDEX
>> Time: 6521.026 ms (00:06.521)
>
>It seems strange that the multi-minmax index takes so much longer to
>build. I wonder if there's some obvious part of the algorithm that can
>be improved?
>
There are some minor optimizations possible - for example I noticed we
call minmax_multi_get_strategy_procinfo often because it happens in a
loop, and we could easily do it just once. But that saves only about 10%
or so, it's not a ground-breaking optimization.
The main reason for the slowness is that we pass the values one by one
to brin_minmax_multi_add_value - and on each call we need to deserialize
(and then sometimes also serialize) the summary, which may be quite
expensive. The regular minmax does not have this issue, it just swaps
the Datum value and that's it.
I see two possible optimizations - firstly, adding some sort of batch
variant of the add_value function, which would get a bunch of values
instead of just a single one, amortizing the serialization costs.
Another option would be to teach add_value to keep the deserialized
summary somewhere, and then force serialization at the end of the BRIN
page range. The end result would be roughly the same, I think.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services