Re: Slow query with planner row strange estimation - Mailing list pgsql-performance

From Robert Haas
Subject Re: Slow query with planner row strange estimation
Date
Msg-id AANLkTilpsAwBy2x6Sy8EBYaprOoGp8H0REt3ZiH35cHn@mail.gmail.com
Whole thread Raw
In response to Re: Slow query with planner row strange estimation  (damien hostin <damien.hostin@axege.com>)
Responses Re: Slow query with planner row strange estimation  (Dimitri <dimitrik.fr@gmail.com>)
List pgsql-performance
On Fri, Jul 9, 2010 at 6:13 AM, damien hostin <damien.hostin@axege.com> wrote:
>> Have you tried running ANALYZE on the production server?
>>
>> You might also want to try ALTER TABLE ... SET STATISTICS to a large
>> value on some of the join columns involved in the query.
>
> Hello,
>
> Before comparing the test case on the two machines, I run analyse on the
> whole and look at pg_stats table to see if change occurs for the columns.
> but on the production server the stats never became as good as on the
> desktop computer. I set statistic at 10000 on column used by the join, run
> analyse which take a 3000000 row sample then look at the stats. The stats
> are not as good as on the desktop. Row number is nearly the same but only 1
> or 2 values are found.
>
> The data are not balanced the same way on the two computer :
> - Desktop is 12000 rows with 6000 implicated in the query (50%),
> - "Production" (actually a dev/test server) is 6 million rows with 6000
> implicated in the query (0,1%).
> Columns used in the query are nullable, and in the 5994000 other rows that
> are not implicated in the query these columns are null.
>
> I don't know if the statistic target is a % or a number of value to obtain,

It's a number of values to obtain.

> but event set at max (10000), it didn't managed to collect good stats (for
> this particular query).

I think there's a cutoff where it won't collect values unless they
occur significantly more often than the average frequency.  I wonder
if that might be biting you here: without the actual values in the MCV
table, the join selectivity estimates probably aren't too good.

> As I don't know what more to do, my conclusion is that the data need to be
> better balanced to allow the analyse gather better stats. But if there is a
> way to improve the stats/query with this ugly balanced data, I'm open to it
> !
>
> I hope that in real production, data will never be loaded this way. If this
> appened we will maybe set enable_nestloop to off, but I don't think it's a
> good solution, other query have a chance to get slower.

Yeah, that usually works out poorly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

pgsql-performance by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: Need help in performance tuning.
Next
From: Robert Haas
Date:
Subject: Re: Need help in performance tuning.