Re: Bloom index cost model seems to be wrong - Mailing list pgsql-performance

From	Jeff Janes
Subject	Re: Bloom index cost model seems to be wrong
Date	February 12, 2019 16:58:08
Msg-id	CAMkU=1x9ML8PPmPKM0x7OVv6oUFn4b8s50suj2jcCOT78TLkzQ@mail.gmail.com
In response to	Re: Bloom index cost model seems to be wrong (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Bloom index cost model seems to be wrong
List	pgsql-performance

On Tue, Feb 12, 2019 at 10:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Thomas Kellerer <spam_eater@gmx.net> writes:
> > The bloom index is only used if either Seq Scan is disabled or if the random_page_cost is set to 1 (anything above 1 triggers a Seq Scan on my Windows laptop).
>
> Hm. blcostestimate is using the default cost calculation, except for
>
> /* We have to visit all index tuples anyway */
> costs.numIndexTuples = index->tuples;
>
> which essentially tells genericcostestimate to assume that every index
> tuple will be visited. This obviously is going to increase the cost
> estimate; maybe there's something wrong with that?

I assumed (without investigating yet) that genericcostestimate is applying a cpu_operator_cost (or a few of them) to each index tuple, while the premise of a bloom index is that you do very fast bit-fiddling, not more expensive SQL operators, for each tuple, and then do the recheck only on what survives to the table-tuple stage.

Cheers,

Jeff
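
The per-tuple charge discussed in this message can be put into rough numbers. The following is a minimal sketch, not the PostgreSQL source: it assumes the CPU part of the generic estimate is roughly numIndexTuples * (cpu_index_tuple_cost + cpu_operator_cost * number of index quals), and it uses the documented default values of those two settings. The helper name index_cpu_cost and the constant names are hypothetical, and page-fetch costs are ignored entirely.

```c
#include <assert.h>

/* Documented PostgreSQL defaults for the two planner settings involved. */
static const double CPU_INDEX_TUPLE_COST = 0.005;
static const double CPU_OPERATOR_COST = 0.0025;

/*
 * Hypothetical helper (not a PostgreSQL function): approximate CPU charge
 * for visiting ntuples index tuples when nquals operator evaluations are
 * charged per tuple. I/O cost is deliberately left out.
 */
static double
index_cpu_cost(double ntuples, int nquals)
{
    assert(ntuples >= 0 && nquals >= 0);
    return ntuples * (CPU_INDEX_TUPLE_COST + nquals * CPU_OPERATOR_COST);
}
```

Under these assumptions, index_cpu_cost(1000000, 1) comes to about 7500 cost units when every tuple of a million-row index is visited with one qual, versus about 5000 if no operator cost were charged per tuple. Whether that difference is what tips the planner toward a Seq Scan in the reported case is not established here.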
