Re: Bloom index cost model seems to be wrong - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Bloom index cost model seems to be wrong
Msg-id CAMkU=1xyzYcjHdCdt8pcY4rvoh0xeP1kb8kvhWiF5fyTYFgBPA@mail.gmail.com
In response to Re: Bloom index cost model seems to be wrong  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Bloom index cost model seems to be wrong  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Tue, Feb 12, 2019 at 11:58 AM Jeff Janes <jeff.janes@gmail.com> wrote:

On Tue, Feb 12, 2019 at 10:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm.  blcostestimate is using the default cost calculation, except for

        /* We have to visit all index tuples anyway */
        costs.numIndexTuples = index->tuples;

which essentially tells genericcostestimate to assume that every index
tuple will be visited.  This obviously is going to increase the cost
estimate; maybe there's something wrong with that?

I assumed (without investigating yet) that genericcostestimate is applying a cpu_operator_cost (or a few of them) to each index tuple, whereas the premise of a bloom index is that you do very fast bit-fiddling, not more expensive SQL operators, on each index tuple, and then do the recheck only on the table tuples that survive the filter.
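To put rough numbers on that assumption, here is a minimal sketch of the per-tuple CPU term only. It assumes each visited index tuple is charged cpu_index_tuple_cost plus one cpu_operator_cost per index qual, using the stock default settings (0.005 and 0.0025). The function names and the flat per-tuple charge in the second function are hypothetical, made up for illustration; this is not the actual selfuncs.c code and it ignores all I/O cost.

```c
#include <assert.h>

/* Stock PostgreSQL defaults for the two planner settings involved. */
#define CPU_INDEX_TUPLE_COST 0.005
#define CPU_OPERATOR_COST    0.0025

/*
 * Hypothetical helper: CPU charge if every visited index tuple pays
 * cpu_index_tuple_cost plus one cpu_operator_cost per index qual.
 * This is a simplified model of the assumption above, not real planner code.
 */
double
generic_index_cpu_cost(double num_index_tuples, int num_quals)
{
    assert(num_index_tuples >= 0.0);
    assert(num_quals >= 0);

    return num_index_tuples *
        (CPU_INDEX_TUPLE_COST + CPU_OPERATOR_COST * num_quals);
}

/*
 * Hypothetical alternative: a single flat charge per visited tuple,
 * standing in for a cheap signature bit test. The charge is an input
 * because no such setting exists; any value passed in is an assumption.
 */
double
flat_index_cpu_cost(double num_index_tuples, double per_tuple_charge)
{
    assert(num_index_tuples >= 0.0);
    assert(per_tuple_charge >= 0.0);

    return num_index_tuples * per_tuple_charge;
}
```

Under this model, forcing numIndexTuples to the full index size on a one-million-tuple index with a single qual gives a CPU term of 7500, against 7.5 if only 1000 tuples were counted as visited. A cheaper flat charge per tuple would shrink the first figure proportionally, which is the effect a bloom-specific estimate would be after.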

For bloom (or any other user of CREATE ACCESS METHOD, if there are any) to have a fighting chance of doing better, I think many of the functions that are currently private to selfuncs.c would have to be declared in some header file, perhaps utils/selfuncs.h.  But that would then require a cascade of other inclusions.  Perhaps that is why it was not done.

Cheers,

Jeff
