Re: Encouraging multi-table join order - Mailing list pgsql-performance

From Tom Lane
Subject Re: Encouraging multi-table join order
Date
Msg-id 10770.1144789733@sss.pgh.pa.us
Whole thread Raw
In response to Re: Encouraging multi-table join order  (Dan Harris <fbsd@drivefaster.net>)
Responses Re: Encouraging multi-table join order
List pgsql-performance
Dan Harris <fbsd@drivefaster.net> writes:
> Tom Lane wrote:
>> What does the pg_stats entry for eventactivity.incidentid
>> contain?

> {P043190299,P051560740,P052581036,P052830218,P053100679,P053190889,P060370845,P042070391,P042690319,P043290117}
> |
> {0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00133333,0.00133333,0.00133333}

> How high should I set this?  I read the default is 10, but I'm not sure
> if doubling this would make a difference or if I should be doing a much
> larger number. There's approx 45 million rows in the table, if that matters.

What the stats entry is saying is that the most common entries occur
about 75000 times apiece (0.00166667 * 45e6), which is what's scaring
the planner here ;-).  I think those frequencies are artificially high
though.  The default statistics sample size is 3000 rows (300 *
statistics target, actually), so those numbers correspond to 5 or 4
rows in the sample, which is probably just random chance.

Try increasing the stats targets for this table to 100, then re-ANALYZE
and see what you get.  The most_common_freqs entries might drop as much
as a factor of 10.

            regards, tom lane

pgsql-performance by date:

Previous
From: "Merlin Moncure"
Date:
Subject: Re: Stored Procedure Performance
Next
From: Dan Harris
Date:
Subject: Re: Encouraging multi-table join order