Re: Very slow left outer join - Mailing list pgsql-performance

From Tyrrill, Ed
Subject Re: Very slow left outer join
Date
Msg-id A23190A408F7094FAF446C1538222F7603F98161@avaexch01.avamar.com
Whole thread Raw
In response to Re: Very slow left outer join  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Very slow left outer join
List pgsql-performance
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Klint Gore <kg@kgb.une.edu.au> writes:
>> On Tue, 29 May 2007 17:16:57 -0700, "Tyrrill, Ed"
<tyrrill_ed@emc.com> wrote:
>>> mdsdb=# explain analyze select backupobjects.record_id from
>>> backupobjects left outer join backup_location using(record_id) where
>>> backup_id = 1071;
>
>> Why are you using left join?
>> The where condition is going to force the row to exist.

This select is a simplified version of what I am really doing that still
exhibits the problem I am having.  I know this small query doesn't
really make sense, but I thought it would be easier to evaluate
something small rather then the entire query.

>
> Which indeed the planner figured out (note the lack of any mention of
> left join in the EXPLAIN result).  Michael put his finger on the
problem
> though: there's something way off about the rowcount estimate here:
>
>>    ->  Bitmap Heap Scan on backup_location  (cost=3831.20..360207.21
>> rows=436915 width=8) (actual time=94.375..97.688 rows=2789 loops=1)
>>          Recheck Cond: (backup_id = 1071)
>>          ->  Bitmap Index Scan on backup_location_bid
>> (cost=0.00..3831.20 rows=436915 width=0) (actual time=84.239..84.239
>> rows=2789 loops=1)
>>                Index Cond: (backup_id = 1071)
>
> With such a simple index condition the planner really ought to be able
> to come close to the right rowcount estimate.  Check for vacuuming
> problems, check for lack of ANALYZE, consider whether you need to bump
> up the statistics target ...
>
>            regards, tom lane

I did a vacuum analyze after inserting all the data.  Is there possibly
a bug in analyze in 8.1.5-6?  I know it says rows=436915, but the last
time the backup_location table has had that little data in it was a
couple months ago, and analyze has been run many times since then.
Currently it has over 160 million rows.

Thanks,
Ed

pgsql-performance by date:

Previous
From: "Tyrrill, Ed"
Date:
Subject: Re: Very slow left outer join
Next
From: Tom Lane
Date:
Subject: Re: Very slow left outer join