Re: BRIN cost estimate - Mailing list pgsql-hackers

From David Rowley
Subject Re: BRIN cost estimate
Date
Msg-id CAKJS1f9n-Wapop5Xz1dtGdpdqmzeGqQK4sV2MK-zZugfC14Xtw@mail.gmail.com
Whole thread Raw
In response to Re: BRIN cost estimate  (Emre Hasegeli <emre@hasegeli.com>)
Responses Re: BRIN cost estimate  (Emre Hasegeli <emre@hasegeli.com>)
List pgsql-hackers
On 3 April 2017 at 03:05, Emre Hasegeli <emre@hasegeli.com> wrote:
> Unfortunately, I am on vacation for two weeks without my computer.  I can
> post another version after 18th.  I know we are under time pressure for
> release.  I wouldn't mind if you or Alvaro would change it anyway you like.

I've made some changes. Actually, I completely changed how the
estimates work. I find this method more self-explanatory.

Basically, we work out the total index ranges, then work out how many
of those we'd touch in a perfectly correlated scenario. We then work
out how many ranges we'll probably visit based on the correlation
estimates from the stats, and assume the selectivity is probableRanges
/ totalIndexRanges.

I've attached a spreadsheet that compares Emre's method to mine. Mine
seems to favour the BRIN index less when the table is small. I think
this is pretty natural since if there is only 1 range, and we narrow
the result to one of them, then we might as well have performed a
seqscan.

My method seems favour BRIN a bit longer when the correlation is
between about 1% and 100%. But penalises BRIN much more when
correlation is less than around 1%. This may be better my way is
certainly smarter than the unpatched version, but still holds on a bit
longer, which may be more favourable if a BRIN index actually exists.
It might be more annoying for a user to have added a BRIN index and
never have the planner choose it.

My method also never suffers from estimating > 100% of the table.

I was a bit worried that Emre's method would penalise BRIN too much
when the correlation is not so high.

Interested to hear comments on this.

Please feel free to play with the spreadsheet by changing rows 1-3 in column B.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: "Tsunakawa, Takayuki"
Date:
Subject: Re: Supporting huge pages on Windows
Next
From: Tom Lane
Date:
Subject: Re: Statement timeout behavior in extended queries