Re: - Mailing list pgsql-performance

From Tom Lane
Subject Re:
Msg-id 8930.1182820034@sss.pgh.pa.us
In response to Re:  (Ed Tyrrill <tyrrill_ed@emc.com>)
Responses Re:  (Ed Tyrrill <tyrrill_ed@emc.com>)
List pgsql-performance
Ed Tyrrill <tyrrill_ed@emc.com> writes:
> It seems to me that the first plan is the optimal one for this case, but
> when the planner has more information about the table it chooses not to
> use it.  Do you think that if work_mem were higher it might choose the
> first plan again?

It's worth fooling around with work_mem just to see what happens.  The
other thing that would be interesting is to force the other plan (set
enable_mergejoin = off) just to see what the planner is costing it at.
My suspicion is that the estimated costs are pretty close.
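
A minimal sketch of that experiment, for readers who want to try it. The table and column names (big_a, big_b, id, a_id) and the work_mem value are hypothetical placeholders, not taken from the thread; substitute the actual query under investigation.

```sql
-- Hypothetical join; replace with the real query being investigated.
-- Settings changed with SET LOCAL revert at the end of the transaction.
BEGIN;

-- 1. Baseline: the plan the planner picks on its own, with its cost estimate.
EXPLAIN SELECT * FROM big_a a JOIN big_b b ON a.id = b.a_id;

-- 2. Raise work_mem (illustrative value) and see whether the choice changes.
SET LOCAL work_mem = '64MB';
EXPLAIN SELECT * FROM big_a a JOIN big_b b ON a.id = b.a_id;

-- 3. Disable merge joins to force the alternative plan, then compare its
--    estimated cost with the cost shown in step 1.
SET LOCAL enable_mergejoin = off;
EXPLAIN SELECT * FROM big_a a JOIN big_b b ON a.id = b.a_id;

ROLLBACK;
```

Comparing the top-level cost figures from steps 1 and 3 shows how close the planner thinks the two plans are. Using EXPLAIN ANALYZE instead would also run each query and report actual times.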

The ANALYZE stats affect this choice only in second-order ways AFAIR.
The planner penalizes hashes if it thinks there will be a lot of
duplicate values in the inner relation, but IIRC there is also a penalty
for inner duplicates in the mergejoin cost estimate.  So I'm a bit
surprised that there'd be a change.

Can you show us the pg_stats rows for the join columns after analyzing
at target 10 and target 100?
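
One way to collect those rows, sketched with the same hypothetical names (big_a.id, big_b.a_id). It assumes the join columns have no per-column statistics target set, so that default_statistics_target applies to them.

```sql
-- Analyze at target 10 and capture the stats for the join columns.
SET default_statistics_target = 10;
ANALYZE big_a;
ANALYZE big_b;

SELECT tablename, attname, null_frac, n_distinct,
       most_common_vals, most_common_freqs, histogram_bounds, correlation
FROM pg_stats
WHERE (tablename = 'big_a' AND attname = 'id')
   OR (tablename = 'big_b' AND attname = 'a_id');

-- Repeat at target 100 and rerun the same pg_stats query to compare.
SET default_statistics_target = 100;
ANALYZE big_a;
ANALYZE big_b;
```

The n_distinct and most_common_freqs columns are the ones most relevant to the duplicate-value penalties mentioned above.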

            regards, tom lane
