Re: WIP: BRIN multi-range indexes - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: WIP: BRIN multi-range indexes
Date
Msg-id c970fcb3-b483-1578-b33f-d61ebc56a948@enterprisedb.com
Whole thread Raw
In response to Re: WIP: BRIN multi-range indexes  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: WIP: BRIN multi-range indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On 1/19/21 9:44 PM, John Naylor wrote:
> On Tue, Jan 12, 2021 at 1:42 PM Tomas Vondra 
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>> 
> wrote:
>  > I suspect it'd due to minmax having to decide which "ranges" to merge,
>  > which requires repeated sorting, etc. I certainly don't dare to claim
>  > the current algorithm is perfect. I wouldn't have expected such big
>  > difference, though - so definitely worth investigating.
> 
> It seems that monotonically increasing (or decreasing) values in a table 
> are a worst case scenario for multi-minmax indexes, or basically, unique 
> values within a range. I'm guessing it's because it requires many passes 
> to fit all the values into a limited number of ranges. I tried using 
> smaller pages_per_range numbers, 32 and 8, and that didn't help.
> 
> Now, with a different data distribution, using only 10 values that 
> repeat over and over, the results are muchs more sympathetic to multi-minmax:
> 
> insert into iot (num, create_dt)
> select random(), '2020-01-01 0:00'::timestamptz + (x % 10 || ' 
> seconds')::interval
> from generate_series(1,5*365*24*60*60) x;
> 
> create index cd_single on iot using brin(create_dt);
> 27.2s
> 
> create index cd_multi on iot using brin(create_dt 
> timestamptz_minmax_multi_ops);
> 30.4s
> 
> create index cd_bt on iot using btree(create_dt);
> 61.8s
> 
> Circling back to the monotonic case, I tried running a simple perf 
> record on a backend creating a multi-minmax index on a timestamptz 
> column and these were the highest non-kernel calls:
> +   21.98%    21.91%  postgres         postgres            [.] 
> FunctionCall2Coll
> +    9.31%     9.29%  postgres         postgres            [.] 
> compare_combine_ranges
> +    8.60%     8.58%  postgres         postgres            [.] qsort_arg
> +    5.68%     5.66%  postgres         postgres            [.] 
> brin_minmax_multi_add_value
> +    5.63%     5.60%  postgres         postgres            [.] timestamp_lt
> +    4.73%     4.71%  postgres         postgres            [.] 
> reduce_combine_ranges
> +    3.80%     0.00%  postgres         [unknown]           [.] 
> 0x0320016800040000
> +    3.51%     3.50%  postgres         postgres            [.] timestamp_eq
> 
> There's no one place that's pathological enough to explain the 4x 
> slowness over traditional BRIN and nearly 3x slowness over btree when 
> using a large number of unique values per range, so making progress here 
> would have to involve a more holistic approach.
> 

Yeah. This very much seems like the primary problem is in how we build 
the ranges incrementally - with monotonic sequences, we end up having to 
merge the ranges over and over again. I don't know what was the 
structure of the table, but I guess it was kinda narrow (very few 
columns), which exacerbates the problem further, because the number of 
rows per range will be way higher than in real-world.

I do think the solution to this might be to allow more values during 
batch index creation, and only "compress" to the requested number at the 
very end (when serializing to on-disk format).

There are a couple additional comments about possibly replacing 
sequential scan with a binary search, that could help a bit too.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tatsuro Yamada
Date:
Subject: Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Next
From: David Zhang
Date:
Subject: Re: Add table access method as an option to pgbench