Re: Bloom index cost model seems to be wrong - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Bloom index cost model seems to be wrong
Msg-id CAMkU=1x9ML8PPmPKM0x7OVv6oUFn4b8s50suj2jcCOT78TLkzQ@mail.gmail.com
In response to Re: Bloom index cost model seems to be wrong  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance

On Tue, Feb 12, 2019 at 10:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Kellerer <spam_eater@gmx.net> writes:
> > The bloom index is only used if either Seq Scan is disabled or if the random_page_cost is set to 1 (anything above 1 triggers a Seq Scan on my Windows laptop).
>
> Hm.  blcostestimate is using the default cost calculation, except for
>
>         /* We have to visit all index tuples anyway */
>         costs.numIndexTuples = index->tuples;
>
> which essentially tells genericcostestimate to assume that every index
> tuple will be visited.  This obviously is going to increase the cost
> estimate; maybe there's something wrong with that?

I assumed (without investigating yet) that genericcostestimate is applying a cpu_operator_cost (or a few of them) to each index tuple, while the premise of a bloom index is that you do very fast bit-fiddling, not more expensive SQL operators, for each index tuple, and then do the recheck only on the table tuples that survive that filter.
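
To put rough numbers on that assumption, here is a minimal sketch in C. It assumes the per-tuple CPU charge has the form numIndexTuples * (cpu_index_tuple_cost + per-qual cost * number of quals); that form is an assumption for illustration, not something copied from the PostgreSQL source. index_cpu_cost is a hypothetical helper, not a PostgreSQL function. The two constants are the stock defaults for cpu_index_tuple_cost and cpu_operator_cost.

```c
#include <assert.h>

/* Stock planner defaults; everything else in this sketch is illustrative. */
#define CPU_INDEX_TUPLE_COST 0.005
#define CPU_OPERATOR_COST    0.0025

/*
 * Hypothetical helper (not a PostgreSQL function): CPU part of an index
 * scan estimate, ASSUMING each visited index tuple is charged
 * cpu_index_tuple_cost plus per_qual_cost for every index qual.
 * Passing per_qual_cost = 0.0 models "bit-fiddling is nearly free".
 */
double
index_cpu_cost(double num_index_tuples, int nquals, double per_qual_cost)
{
    assert(num_index_tuples >= 0.0);
    assert(nquals >= 0);
    assert(per_qual_cost >= 0.0);

    return num_index_tuples * (CPU_INDEX_TUPLE_COST + per_qual_cost * nquals);
}
```

With a made-up index of one million tuples and three quals, index_cpu_cost(1e6, 3, CPU_OPERATOR_COST) comes to 12500 and index_cpu_cost(1e6, 3, 0.0) comes to 5000, so under these assumptions the operator charge would account for more than half of the CPU part of the estimate when every index tuple is visited.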

Cheers,

Jeff
