Re: WIP: BRIN multi-range indexes - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: WIP: BRIN multi-range indexes
Date
Msg-id 20200909180620.k7dso2cxhyxokxql@development
Whole thread Raw
In response to Re: WIP: BRIN multi-range indexes  (John Naylor <john.naylor@2ndquadrant.com>)
List pgsql-hackers
On Wed, Sep 09, 2020 at 12:04:28PM -0400, John Naylor wrote:
>On Sat, Sep 5, 2020 at 7:21 PM Tomas Vondra
><tomas.vondra@2ndquadrant.com> wrote:
>>
>> OK, here is a rebased version. Most of the breakage was due to changes
>> to the BRIN sgml docs.
>
>Hi Tomas,
>
>I plan on trying some different queries on different data
>distributions to get a sense of when the planner chooses a
>multi-minmax index, and whether the choice is good.
>
>Just to start, I used the artificial example in [1], but scaled down a
>bit to save time. Config is at the default except for:
>shared_buffers = 1GB
>random_page_cost = 1.1;
>effective_cache_size = 4GB;
>
>create table t (a bigint, b int) with (fillfactor=95);
>
>insert into t select i + 1000*random(), i+1000*random()
>  from generate_series(1,10000000) s(i);
>
>update t set a = 1, b = 1 where random() < 0.001;
>update t set a = 10000000, b = 10000000 where random() < 0.001;
>
>analyze t;
>
>create index on t using brin (a);
>CREATE INDEX
>Time: 1631.452 ms (00:01.631)
>
>explain analyze select * from t
>  where a between 1923300::int and 1923600::int;
>
>                                                        QUERY PLAN

>--------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on t  (cost=16.10..43180.43 rows=291 width=12)
>(actual time=217.770..1131.366 rows=288 loops=1)
>   Recheck Cond: ((a >= 1923300) AND (a <= 1923600))
>   Rows Removed by Index Recheck: 9999712
>   Heap Blocks: lossy=56819
>   ->  Bitmap Index Scan on t_a_idx  (cost=0.00..16.03 rows=22595
>width=0) (actual time=3.054..3.055 rows=568320 loops=1)
>         Index Cond: ((a >= 1923300) AND (a <= 1923600))
> Planning Time: 0.328 ms
> Execution Time: 1131.411 ms
>(8 rows)
>
>Now add the multi-minmax:
>
>create index on t using brin (a int8_minmax_multi_ops);
>CREATE INDEX
>Time: 6521.026 ms (00:06.521)
>
>The first interesting thing is, with both BRIN indexes available, the
>planner still chooses the conventional BRIN index. Only when I disable
>it, does it choose the multi-minmax index:
>
>explain analyze select * from t
>  where a between 1923300::int and 1923600::int;
>
>                                                       QUERY PLAN

>-------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on t  (cost=68.10..43160.86 rows=291 width=12)
>(actual time=1.835..4.196 rows=288 loops=1)
>   Recheck Cond: ((a >= 1923300) AND (a <= 1923600))
>   Rows Removed by Index Recheck: 22240
>   Heap Blocks: lossy=128
>   ->  Bitmap Index Scan on t_a_idx1  (cost=0.00..68.03 rows=22523
>width=0) (actual time=0.691..0.691 rows=1280 loops=1)
>         Index Cond: ((a >= 1923300) AND (a <= 1923600))
> Planning Time: 0.250 ms
> Execution Time: 4.244 ms
>(8 rows)
>
>I wonder if this is a clue that something in the costing unfairly
>penalizes a multi-minmax index. Maybe not enough to matter in
>practice, since I wouldn't expect a user to put different kinds of
>index on the same column.
>

I think this is much more an estimation issue than a costing one. Notice
that in the "regular" BRIN minmax index we have this:

    ->  Bitmap Index Scan on t_a_idx  (cost=0.00..16.03 rows=22595
        width=0) (actual time=3.054..3.055 rows=568320 loops=1)

while for the multi-minmax we have this:

    ->  Bitmap Index Scan on t_a_idx1  (cost=0.00..68.03 rows=22523
        width=0) (actual time=0.691..0.691 rows=1280 loops=1)

So yes, the multi-minmax index is costed a bit higher, mostly because
the index is a bit larger. (There's also a tweak to the correlation, but
that does not make much difference because it's just 0.99 vs. 1.0.)

The main difference is that for minmax the bitmap index scan actually
matches ~586k rows (a bit confusing, considering the heap scan has to
process almost 10M rows during recheck). But the multi-minmax only
matches ~1300 rows, with a recheck of 22k.

I'm not sure how to consider this during costing, as we only see these
numbers at execution time. One way would be to also consider "size" of
the ranges (i.e. max-min) vs. range of the whole column. But that's not
something we already have.

I'm not sure how troublesome this issue really is - I don't think people
are very likely to have both minmax and multi-minmax indexes on the same
column.

>The second thing is, with parallel seq scan, the query is faster than
>a BRIN bitmap scan, with this pathological data distribution, but the
>planner won't choose it unless forced to:
>
>set enable_bitmapscan = 'off';
>explain analyze select * from t
>  where a between 1923300::int and 1923600::int;
>                                                      QUERY PLAN

>-----------------------------------------------------------------------------------------------------------------------
> Gather  (cost=1000.00..120348.10 rows=291 width=12) (actual
>time=372.766..380.364 rows=288 loops=1)
>   Workers Planned: 2
>   Workers Launched: 2
>   ->  Parallel Seq Scan on t  (cost=0.00..119319.00 rows=121
>width=12) (actual time=268.326..366.228 rows=96 loops=3)
>         Filter: ((a >= 1923300) AND (a <= 1923600))
>         Rows Removed by Filter: 3333237
> Planning Time: 0.089 ms
> Execution Time: 380.434 ms
>(8 rows)
>

I think this is the same root cause - the planner does not realize how
bad the minmax index actually is in this case, so it uses a bit too
optimistic estimate for costing. And then it has to do essentially
seqscan with extra work for bitmap index/heap scan.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: More aggressive vacuuming of temporary tables
Next
From: Fujii Masao
Date:
Subject: Re: Transactions involving multiple postgres foreign servers, take 2