Re: WIP: BRIN multi-range indexes - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: WIP: BRIN multi-range indexes
Date
Msg-id 8a40b7e4-357e-b0c2-e8b7-72001993fc32@enterprisedb.com
Whole thread Raw
In response to Re: WIP: BRIN multi-range indexes  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: WIP: BRIN multi-range indexes  (John Naylor <john.naylor@enterprisedb.com>)
List pgsql-hackers

On 1/12/21 6:28 PM, John Naylor wrote:
> On Sat, Dec 19, 2020 at 8:15 PM Tomas Vondra 
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>> 
> wrote:
>  > [12-20 version]
> 
> Hi Tomas,
> 
> The measurements look good. In case it fell through the cracks, my 
> earlier review comments for Bloom BRIN indexes regarding minor details 
> don't seem to have been addressed in this version. I'll point to earlier 
> discussion for convenience:
> 
> https://www.postgresql.org/message-id/CACPNZCt%3Dx-fOL0CUJbjR3BFXKgcd9HMPaRUVY9cwRe58hmd8Xg%40mail.gmail.com 
> <https://www.postgresql.org/message-id/CACPNZCt%3Dx-fOL0CUJbjR3BFXKgcd9HMPaRUVY9cwRe58hmd8Xg%40mail.gmail.com>
> 
> https://www.postgresql.org/message-id/CACPNZCuqpkCGt8%3DcywAk1kPu0OoV_TjPXeV-J639ABQWyViyug%40mail.gmail.com 
> <https://www.postgresql.org/message-id/CACPNZCuqpkCGt8%3DcywAk1kPu0OoV_TjPXeV-J639ABQWyViyug%40mail.gmail.com>
> 

Whooops :-( I'll go through those again, thanks for reminding me.

>  > The improvements are fairly minor:
>  >
>  > 1) Rejecting bloom filters that are clearly too large (larger than page)
>  > early. This is imperfect, as it works for individual index keys, not the
>  > whole row. But per discussion it seems useful.
> 
> I think this is good enough.
> 
>  > So based on this I'm tempted to just use the version with two hashes, as
>  > implemented in 0005. It's much simpler than the partitioning scheme,
>  > does not need any of the logic to generate primes etc.
> 
> Sounds like the best engineering decision.
> 
> Circling back to multi-minmax build times, I ran a couple quick tests on 
> bigger hardware, and found that not only is multi-minmax slower than 
> minmax, which is to be expected, but also slower than btree. (unlogged 
> table ~12GB in size, maintenance_work_mem = 1GB, median of three runs)
> 
> btree          38.3s
> minmax         26.2s
> multi-minmax  110s
> 
> Since btree indexes are much larger, I imagine something algorithmic is 
> involved. Is it worth digging further to see if some code path is taking 
> more time than we would expect?
> 

I suspect it'd due to minmax having to decide which "ranges" to merge, 
which requires repeated sorting, etc. I certainly don't dare to claim 
the current algorithm is perfect. I wouldn't have expected such big 
difference, though - so definitely worth investigating.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: pg_dump PublicationRelInfo objects need to be marked with owner
Next
From: Bruce Momjian
Date:
Subject: Re: pg_upgrade test for binary compatibility of core data types