Re: [HACKERS] distinct estimate of a hard-coded VALUES list - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: [HACKERS] distinct estimate of a hard-coded VALUES list
Date
Msg-id CAMkU=1w02NNDfXEhzaj28Z8Qf3=YPms2fDxFQX3+TT7Th+YoWA@mail.gmail.com
Whole thread Raw
In response to Re: distinct estimate of a hard-coded VALUES list  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] distinct estimate of a hard-coded VALUES list  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Aug 23, 2016 at 5:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 08/22/2016 07:42 PM, Alvaro Herrera wrote:
>> Also, if we patch it this way and somebody has a slow query because of a
>> lot of duplicate values, it's easy to solve the problem by
>> de-duplicating.  But with the current code, people that have the
>> opposite problem has no way to work around it.

> I certainly agree it's better when a smart user can fix his query plan
> by deduplicating the values than when we end up generating a poor plan
> due to assuming some users are somewhat dumb.

Well, that seems to be the majority opinion, so attached is a patch
to do it.  Do we want to sneak this into 9.6, or wait?

> I wonder how expensive would it be to actually count the number of
> distinct values - there certainly are complex data types where the
> comparisons are fairly expensive, but I would not expect those to be
> used in explicit VALUES lists.

You'd have to sort the values before de-duping, and deal with VALUES
expressions that aren't simple literals.  And you'd have to do it a
lot --- by my count, get_variable_numdistinct is invoked six times
on the Var representing the VALUES output in Jeff's example query.
Maybe you could cache the results somewhere, but that adds even
more complication.  Given the lack of prior complaints about this,
it's hard to believe it's worth the trouble.


This patch still applies, and I think the argument for it is still valid.  So I'm going to make a commit-fest entry for it.  Is there additional evidence we should gather?

Cheers,

Jeff

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [HACKERS] pg_basebackup fails on Windows when using tablespace mapping
Next
From: Michael Paquier
Date:
Subject: Re: [HACKERS] pg_basebackup fails on Windows when using tablespace mapping