Re: Index Skip Scan - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Index Skip Scan
Date
Msg-id CAEepm=0S1JUUwKCK4jhmtTWm9VUTsCe3+b5tSvddpMpkzas7Hw@mail.gmail.com
In response to Re: Index Skip Scan  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Index Skip Scan
List pgsql-hackers
On Fri, Aug 17, 2018 at 7:48 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Wed, Aug 15, 2018 at 11:22 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> * groups and certain aggregates (MIN() and MAX() of suffix index
>> columns within each group)
>> * index scans where the scan key doesn't include the leading columns
>> (but you expect there to be sufficiently few values)
>> * merge joins (possibly the trickiest and maybe out of range)
>
> FWIW, I suspect that we're going to have the biggest problems in the
> optimizer. It's not as if ndistinct is in any way reliable. That may
> matter more on average than it has with other path types.

Can you give an example of problematic ndistinct underestimation?

I suppose you might be able to defend against that in the executor: if
you find that you've done an unexpectedly high number of skips, you
could fall back to regular next-tuple mode.  Unfortunately that would
require the parent plan node to tolerate non-unique results.
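
To make the fallback idea concrete, here is a rough sketch in Python
rather than executor C.  Everything in it is hypothetical: the sorted
list stands in for a btree on the leading column, each bisect_right
call stands in for one re-descent of the tree, and distinct_scan and
max_skips are names invented for illustration, not anything in the
patch.

```python
from bisect import bisect_right


def distinct_scan(sorted_keys, max_skips):
    """Yield keys from a sorted list, skipping ahead once per distinct value.

    Hypothetical model only: bisect_right stands in for a fresh btree
    descent.  Once max_skips skips have been spent, the scan falls back
    to plain next-tuple iteration, which can emit duplicates.
    """
    pos = 0
    skips = 0
    n = len(sorted_keys)
    while pos < n:
        key = sorted_keys[pos]
        yield key
        if skips < max_skips:
            # Skip mode: jump past every remaining copy of this key.
            pos = bisect_right(sorted_keys, key, pos)
            skips += 1
        else:
            # Fallback mode: just step to the next tuple.
            pos += 1
```

With a generous budget the output is unique; with a budget of one skip,
[1, 1, 2, 2, 3, 3] comes back as [1, 2, 2, 3, 3], so whatever sits
above the scan has to de-duplicate (or at least not mind).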

I noticed that the current patch doesn't care about restrictions on
the range (SELECT DISTINCT a FROM t WHERE a BETWEEN 500 AND 600), but
that causes it to overestimate the number of btree searches, which is
a less serious problem (it might not choose a skip scan when it would
have been better).
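
As a back-of-the-envelope illustration of that overestimate, the
sketch below scales ndistinct by the fraction of the column's value
range that the restriction covers.  This is not how the patch or the
planner computes anything: the function name, its parameters, and the
uniform-distribution assumption are all invented for the example.

```python
def estimated_skip_searches(ndistinct, col_min, col_max, lo=None, hi=None):
    """Rough count of btree searches a skip scan might need.

    Hypothetical model: distinct values are assumed to be spread
    uniformly over [col_min, col_max], so a range restriction keeps a
    proportional share of them.  At least one search is always needed.
    """
    # No restriction on one side means the column bound applies.
    lo = col_min if lo is None else max(lo, col_min)
    hi = col_max if hi is None else min(hi, col_max)
    if hi < lo or col_max == col_min:
        return 1.0
    fraction = (hi - lo) / (col_max - col_min)
    return max(1.0, ndistinct * fraction)
```

For a column with 1000 distinct values between 1 and 1000, ignoring
the restriction gives 1000 searches, while accounting for BETWEEN 500
AND 600 gives roughly 100, which is the size of gap that could make a
skip scan look worse than it is.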

-- 
Thomas Munro
http://www.enterprisedb.com

