On Fri, Aug 17, 2018 at 7:48 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Wed, Aug 15, 2018 at 11:22 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> * groups and certain aggregates (MIN() and MAX() of suffix index
>> columns within each group)
>> * index scans where the scan key doesn't include the leading columns
>> (but you expect there to be sufficiently few values)
>> * merge joins (possibly the trickiest and maybe out of range)
>
> FWIW, I suspect that we're going to have the biggest problems in the
> optimizer. It's not as if ndistinct is in any way reliable. That may
> matter more on average than it has with other path types.
Can you give an example of problematic ndistinct underestimation?
I suppose you might be able to defend against that in the executor: if
you find that you've done an unexpectedly high number of skips, you
could fall back to regular next-tuple mode. Unfortunately that's
require the parent plan node to tolerate non-unique results.
I noticed that the current patch doesn't care about restrictions on
the range (SELECT DISTINCT a FROM t WHERE a BETWEEN 500 and 600), but
that causes it to overestimate the number of btree searches, which is
a less serious problem (it might not chose a skip scan when it would
have been better).
--
Thomas Munro
http://www.enterprisedb.com