Re: Performance problem on 8.2.4, but not 8.2.3 - Mailing list pgsql-performance

From Kristo Kaiv
Subject Re: Performance problem on 8.2.4, but not 8.2.3
Date
Msg-id 6C9F509A-3F0C-4DCB-B6B4-A0AAE55CDEAC@skype.net
In response to Re: Performance problem on 8.2.4, but not 8.2.3  (Dave Pirotte <dpirotte@mediamatters.org>)
List pgsql-performance
These bogus rowcount estimates are a bit strange. If you have 800K rows and select 100K of them, the rowcount estimate should most likely come from the histogram for the column. Can you check what the histograms look like for the
referrer_paths table, in particular for referrer_domain (the 'mediamatters.org' value) and id, on both 8.2.3 and 8.2.4?
I still think this happens because of skewed statistics. AFAIK, more memory should encourage the planner to choose a hash join over a nested loop.
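
To illustrate why the histogram matters for the rowcount estimate, here is a minimal sketch of estimating rows from an equi-depth histogram (every bucket is assumed to hold the same share of rows). It is a simplification for illustration, not PostgreSQL's actual estimator; the function names and the bucket bounds are made up and do not come from the referrer_paths table.

```python
from bisect import bisect_right


def estimate_fraction_le(bounds, value):
    """Estimate the fraction of rows with column <= value.

    bounds: sorted bucket boundaries of an equi-depth histogram
            (len(bounds) - 1 buckets, each assumed to hold an equal
            share of the rows). Hypothetical input, not real stats.
    """
    n_buckets = len(bounds) - 1
    if value < bounds[0]:
        return 0.0
    if value >= bounds[-1]:
        return 1.0
    # Bucket i satisfies bounds[i] <= value < bounds[i + 1].
    i = bisect_right(bounds, value) - 1
    lo, hi = bounds[i], bounds[i + 1]
    # Assume values are spread evenly inside the bucket.
    within = (value - lo) / (hi - lo)
    return (i + within) / n_buckets


def estimate_rows(total_rows, bounds, value):
    """Turn the estimated fraction into an estimated rowcount."""
    return round(total_rows * estimate_fraction_le(bounds, value))


if __name__ == "__main__":
    # Made-up bounds: narrow buckets mean many rows in a small value range.
    bounds = [0, 100, 200, 400, 800]
    print(estimate_rows(800_000, bounds, 300))
```

If the stored bounds no longer match the real distribution of the data, the same arithmetic yields a badly wrong rowcount, which is what I mean by skewed statistics.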

Kristo

On 26.05.2007, at 0:37, Dave Pirotte wrote:

Thanks for the quick responses.  :-)  The data is almost identical between the two servers: 8.2.3 has 882198 records, 8.2.4 has 893121. For background, I pg_dump'ed the data into the 8.2.4 server yesterday, analyzed with a stats target of 250, then reanalyzed with a target of 10. So, the statistics should theoretically be OK. Running a VACUUM FULL ANALYZE on referrer_paths, per Kristo's suggestion, didn't affect the query plan.

We downgraded to 8.2.3 just to rule that out, upped stats target to 100, analyzed, and are still experiencing the same behavior -- it's still coming up with the same bogus rowcount estimates.  Over the weekend I'll lower the memory and see if that does anything, just to rule that out...  Any other thoughts?  Thanks so much for your time and suggestions thus far.

Cheers,
Dave

On May 25, 2007, at 4:33 PM, Tom Lane wrote:

"Steinar H. Gunderson" <sgunderson@bigfoot.com> writes:
It looks like the estimated cost is lower for 8.2.4 -- could it be that the
fact that he's giving it more memory leads to the planner picking a plan that
happens to be worse?

Offhand I don't think so.  More work_mem might make a hash join look
cheaper (or a sort for a mergejoin), but the problem here seems to be
that it's switching away from a hash and to a nestloop.  Which is a
loser because there are many more outer-relation rows than it's
expecting.
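
A toy sketch of that effect, under loudly stated assumptions: the cost formulas, constants, and row counts below are invented for illustration and are not the planner's real cost model or numbers from this query. It only shows how an underestimated outer rowcount can make a nestloop look cheapest when it is not.

```python
def nestloop_cost(outer_rows, probe_cost=4.0):
    """Toy model: one inner index probe per outer row (constant is made up)."""
    return outer_rows * probe_cost


def hashjoin_cost(outer_rows, inner_rows, build_cost=1.0, probe_cost=0.1):
    """Toy model: build a hash on the inner side once, then cheap probes."""
    return inner_rows * build_cost + outer_rows * probe_cost


def cheaper_plan(outer_rows, inner_rows):
    """Return the plan this toy model would pick for the given row counts."""
    if nestloop_cost(outer_rows) < hashjoin_cost(outer_rows, inner_rows):
        return "nestloop"
    return "hash"


if __name__ == "__main__":
    inner_rows = 10_000          # hypothetical inner relation size
    estimated_outer = 100        # what the planner expects (hypothetical)
    actual_outer = 100_000       # what really comes out (hypothetical)

    # The plan is chosen from the estimate...
    print(cheaper_plan(estimated_outer, inner_rows))
    # ...but the plan that would have won with the true rowcount differs.
    print(cheaper_plan(actual_outer, inner_rows))
    # Cost actually paid by the nestloop vs. what the hash join would cost.
    print(nestloop_cost(actual_outer), hashjoin_cost(actual_outer, inner_rows))
```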

regards, tom lane



Dave Pirotte
Director of Technology
Media Matters for America
phone: 202-756-4122


