Re: incorrect row estimates for primary key join - Mailing list pgsql-performance

From Tom Lane
Subject Re: incorrect row estimates for primary key join
Date
Msg-id 19479.1372203377@sss.pgh.pa.us
Whole thread Raw
In response to Re: incorrect row estimates for primary key join  (Kevin Grittner <kgrittn@ymail.com>)
Responses Re: incorrect row estimates for primary key join  (Ben <midfield@gmail.com>)
List pgsql-performance
Kevin Grittner <kgrittn@ymail.com> writes:
> Ben <midfield@gmail.com> wrote:
>> it seems to me that an equality join between two relations (call them A and B)
>> using columns in relation B with a unique constraint should yield row estimates
>> which are at most equal to the row estimates for relation A.� my questions are
>>
>> 1 - is this correct?
>>
>> 2 - does the postgresql planner implement this when generating row estimates?

> That seems intuitive, but some of the estimates need to be made
> before all such information is available.� Maybe we can do
> something about that some day....
> Maybe someone else will jump in here with more details than I can
> provide (at least without hours digging in the source code).

It does not attempt to match up query WHERE clauses with indexes during
selectivity estimation, so the existence of a multi-column unique
constraint wouldn't help it improve the estimate.

In the case at hand, I doubt that a better result rowcount estimate
would have changed the planner's opinion of how to do the join.  The OP
seems to be imagining that 2 million index probes into a large table
would be cheap, but that's hardly true.  It's quite likely that the
mergejoin actually is the best way to do the query.  If it isn't really
best on his hardware, I would think that indicates a need for some
tuning of the cost parameters.  Another thing that might be helpful for
working with such large tables is increasing work_mem, to make hashes
and sorts run faster.

            regards, tom lane


pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Weird, bad 0.5% selectivity estimate for a column equal to itself
Next
From: Ben
Date:
Subject: Re: incorrect row estimates for primary key join