
Re: Bloom index cost model seems to be wrong - Mailing list pgsql-performance

From	Jeff Janes
Subject	Re: Bloom index cost model seems to be wrong
Date	February 12, 2019 19:56:40
Msg-id	CAMkU=1xyzYcjHdCdt8pcY4rvoh0xeP1kb8kvhWiF5fyTYFgBPA@mail.gmail.com
In response to	Re: Bloom index cost model seems to be wrong (Jeff Janes <jeff.janes@gmail.com>)
Responses	Re: Bloom index cost model seems to be wrong
List	pgsql-performance


On Tue, Feb 12, 2019 at 11:58 AM Jeff Janes <jeff.janes@gmail.com> wrote:

On Tue, Feb 12, 2019 at 10:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm. blcostestimate is using the default cost calculation, except for

/* We have to visit all index tuples anyway */
costs.numIndexTuples = index->tuples;

which essentially tells genericcostestimate to assume that every index
tuple will be visited. This obviously is going to increase the cost
estimate; maybe there's something wrong with that?

I assumed (without investigating yet) that genericcostestimate is applying a cpu_operator_cost (or a few of them) to each index tuple, while the premise of a bloom index is that you do very fast bit-fiddling, not more expensive SQL operators, for each index tuple, and then do the recheck only on the table tuples that survive.
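
To make the per-tuple charge concrete, here is a minimal sketch of the CPU term that genericcostestimate adds for each index tuple it expects to visit. It is a simplification, not the actual selfuncs.c code: the function name index_cpu_cost is hypothetical, the constants are the default values of cpu_index_tuple_cost and cpu_operator_cost, and page-fetch costs, qual argument evaluation costs, and ScalarArrayOp rescans are all left out.

```c
#include <assert.h>

/* Default planner cost constants (PostgreSQL defaults). */
const double CPU_INDEX_TUPLE_COST = 0.005;
const double CPU_OPERATOR_COST = 0.0025;

/*
 * Hypothetical helper, not a PostgreSQL function: approximate CPU cost
 * charged for visiting num_index_tuples index tuples when each tuple is
 * assumed to pay for num_quals operator evaluations.
 */
double
index_cpu_cost(double num_index_tuples, int num_quals)
{
    assert(num_index_tuples >= 0 && num_quals >= 0);

    /* One cpu_operator_cost per index qual, per tuple. */
    double qual_op_cost = CPU_OPERATOR_COST * num_quals;

    /* Every visited tuple pays the tuple cost plus the operator cost. */
    return num_index_tuples * (CPU_INDEX_TUPLE_COST + qual_op_cost);
}
```

With one qual, setting the tuple count to all 1000 tuples of a small index gives a charge of about 7.5, against about 0.075 if only 10 tuples were expected to be visited. That is the inflation that costs.numIndexTuples = index->tuples produces under this model.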

In order for bloom (or any other users of CREATE ACCESS METHOD, if there are any) to have a fighting chance to do better, I think many of selfuncs.c's currently private functions would have to be declared in some header file, perhaps utils/selfuncs.h. But that would then require a cascade of other inclusions. Perhaps that is why it was not done.

Cheers,

Jeff
