[Fwd: Re: Slow query with planner row strange estimation] - Mailing list pgsql-performance

From damien hostin
Subject [Fwd: Re: Slow query with planner row strange estimation]
Date
Msg-id 4C3ADA48.4030902@axege.com
Whole thread Raw
Responses Re: [Fwd: Re: Slow query with planner row strange estimation]  (damien hostin <damien.hostin@axege.com>)
List pgsql-performance
--
HOSTIN Damien - Equipe R&D
Tel:+33(0)4 63 05 95 40
Société Axège
23 rue Saint Simon
63000 Clermont Ferrand
www.axege.com

Robert Haas a écrit :
> On Wed, Jul 7, 2010 at 10:39 AM, damien hostin <damien.hostin@axege.com> wrote:
>
>> Hello again,
>>
>> At last, I check the same query with the same data on my desktop computer.
>> Just after loading the data, the queries were slow, I launch a vaccum
>> analyse which collect good stats on the main table, the query became quick
>> (~200ms). Now 1classic sata disk computer is faster than our little monster
>> server !!
>>
>
> Have you tried running ANALYZE on the production server?
>
> You might also want to try ALTER TABLE ... SET STATISTICS to a large
> value on some of the join columns involved in the query.
>
>
Hello,

Before comparing the test case on the two machines, I run analyse on the
whole and look at pg_stats table to see if change occurs for the
columns. but on the production server the stats never became as good as
on the desktop computer. I set statistic at 10000 on column used by the
join, run analyse which take a 3000000 row sample then look at the
stats. The stats are not as good as on the desktop. Row number is nearly
the same but only 1 or 2 values are found.

The data are not balanced the same way on the two computer :
- Desktop is 12000 rows with 6000 implicated in the query (50%),
- "Production" (actually a dev/test server) is 6 million rows with 6000
implicated in the query (0,1%).
Columns used in the query are nullable, and in the 5994000 other rows
that are not implicated in the query these columns are null.

I don't know if the statistic target is a % or a number of value to
obtain, but event set at max (10000), it didn't managed to collect good
stats (for this particular query).
As I don't know what more to do, my conclusion is that the data need to
be better balanced to allow the analyse gather better stats. But if
there is a way to improve the stats/query with this ugly balanced data,
I'm open to it !

I hope that in real production, data will never be loaded this way. If
this appened we will maybe set enable_nestloop to off, but I don't think
it's a good solution, other query have a chance to get slower.


Thanks for helping

--
HOSTIN Damien - Equipe R&D
Tel:+33(0)4 63 05 95 40
Société Axège
23 rue Saint Simon
63000 Clermont Ferrand
www.axege.com



pgsql-performance by date:

Previous
From: Dimitri
Date:
Subject: Re: Slow query with planner row strange estimation
Next
From: damien hostin
Date:
Subject: Re: [Fwd: Re: Slow query with planner row strange estimation]