Re: distinct estimate of a hard-coded VALUES list - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: distinct estimate of a hard-coded VALUES list
Date	August 18, 2016 21:25:16
Msg-id	25962.1471555512@sss.pgh.pa.us
In response to	distinct estimate of a hard-coded VALUES list (Jeff Janes <jeff.janes@gmail.com>)
Responses	Re: distinct estimate of a hard-coded VALUES list
List	pgsql-hackers

Jeff Janes <jeff.janes@gmail.com> writes:
> So even though it knows that 6952 values have been shoved in the bottom, it
> thinks only 200 are going to come out of the aggregation.  This seems like
> a really lousy estimate.  In more complex queries than the example one
> given it leads to poor planning choices.

> Is the size of the input list not available to the planner at the point
> where it estimates the distinct size of the input list?  I'm assuming that
> if it is available to EXPLAIN then it is available to the planner.  Does it
> know how large the input list is, but just throw up its hands and use 200
> as the distinct size anyway?

It does know it; what it doesn't know is how many duplicates there are.
If we do what I think you're suggesting, which is assume the entries are
all distinct, I'm afraid we'll just move the estimation problems somewhere
else.
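
To make the trade-off concrete, here is a minimal Python sketch. It is not
PostgreSQL code: the function names are hypothetical, the figure 200 is simply
the one quoted above, and capping the default at the row count is an
assumption made for the illustration. It compares a fixed default against the
all-distinct assumption on two lists of 6952 entries, one with no duplicates
and one with heavy duplication.

```python
# Illustrative only: hypothetical names, not the planner's actual logic.
DEFAULT_DISTINCT = 200  # the fallback figure quoted in the thread


def estimate_default(values):
    """Fixed default guess, capped at the row count (cap is an assumption)."""
    return min(DEFAULT_DISTINCT, len(values))


def estimate_all_distinct(values):
    """Alternative guess: treat every entry in the list as unique."""
    return len(values)


def actual_distinct(values):
    """Ground truth, obtained by actually looking at the entries."""
    return len(set(values))


# Two lists with the same row count but very different duplication.
unique_list = list(range(6952))
repetitive_list = [i % 10 for i in range(6952)]

for name, vals in (("unique", unique_list), ("repetitive", repetitive_list)):
    print(
        name,
        "default:", estimate_default(vals),
        "all-distinct:", estimate_all_distinct(vals),
        "actual:", actual_distinct(vals),
    )
```

Under these assumptions the fixed default is far too low for the unique list,
while the all-distinct guess is far too high for the repetitive one, so
neither rule is right without knowing the duplicate count.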

I recall some talk of actually running an ANALYZE-like process on the
elements of a VALUES list, but it seemed like overkill at the time and
still does.
        regards, tom lane
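
For readers wondering what an "ANALYZE-like process" over a VALUES list might
involve, the following Python sketch shows one possible reading. It is an
assumption, not a description of any PostgreSQL implementation: the function
name, the returned field names, and the `mcv_limit` parameter are all
hypothetical. It makes one full pass over the list and records the row count,
the distinct count, and the most frequent entries.

```python
from collections import Counter


def analyze_values(values, mcv_limit=3):
    """Collect simple statistics from a literal list in one pass.

    Hypothetical helper for illustration; the field names are made up.
    """
    counts = Counter(values)  # entry -> number of occurrences
    return {
        "rows": len(values),
        "n_distinct": len(counts),
        "most_common": counts.most_common(mcv_limit),
    }


stats = analyze_values([1, 1, 1, 2, 2, 3, 4])
print(stats["rows"], stats["n_distinct"], stats["most_common"][0])
```

The cost is a full scan and a hash table over every entry each time the query
is planned, which is the sort of extra work the message above calls overkill.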
