Re: Poor performance when joining against inherited tables - Mailing list pgsql-performance

From Lucas Madar
Subject Re: Poor performance when joining against inherited tables
Date
Msg-id sig.81124526f3.4DCAF5F9.10904@samsix.com
Whole thread Raw
In response to Re: Poor performance when joining against inherited tables  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Poor performance when joining against inherited tables  (Maciek Sakrejda <msakrejda@truviso.com>)
Re: Poor performance when joining against inherited tables  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
On 05/11/2011 09:38 AM, Robert Haas wrote:
>> However, if I disable seqscan (set enable_seqscan=false), I get the
>> following plan:
>>
>>   QUERY PLAN
>> ------------
>>   Hash Join  (cost=10001298843.53..290002337961.71 rows=8643757 width=506)
>>    Hash Cond: (f.id = objects.id)
>>    ->    Append  (cost=10000000000.00..290000536334.43 rows=8643757 width=20)
>>          ->    Seq Scan on item f  (cost=10000000000.00..10000000026.30
>> rows=1630 width=20)
>>          ->    Index Scan using xxx_pkey on itemXX f  (cost=0.00..10.60 rows=90
>> width=20)
>>          ->    Index Scan using yyy_pkey on itemYY f  (cost=0.00..25.24
>> rows=266 width=20)
>>          ->    Index Scan using zzz_pkey on itemZZ f  (cost=0.00..9.28 rows=2
>> width=20)
>>          ...
>>    ->    Hash  (cost=999347.17..999347.17 rows=3941949 width=490)
>>          ->    Index Scan using objects_pkey on objects (cost=0.00..999347.17
>> rows=3941949 width=490)
>>
>> This seems like a much more sensible query plan.
> I don't think so.  Scanning the index to extract all the rows in a
> table is typically going to be a lot slower than a sequential scan.
>
> A more interesting question is why you're not getting a plan like this:
>
> Nested Loop
> ->  Seq Scan on objects
> ->  Append
>     ->   Index Scan using xxx_pkey on itemXX
>     ->   Index Scan using yyy_pkey on itemYY
>     ->   Index Scan using zzz_pkey on itemZZ

Compared to the previous query plan (omitted in this e-mail, in which
the planner was scanning all the item tables sequentially), the second
query is much more desirable. It takes about 12 seconds to complete,
versus the other query which I canceled after six hours. However, what
you propose seems to make even more sense.

>> But it seems to think doing
>> a sequential scan on the *empty* item table is excessively expensive in this
>> case.
>>
>> Aside from enable_seqscan=false, is there any way I can make the query
>> planner not balk over doing a seqscan on an empty table?
> Why would you care?  A sequential scan of an empty table is very fast.
>
My issue is that it looks like it's avoiding the sequential scan:

Seq Scan on item f  (cost=10000000000.00..10000000026.30 rows=1630 width=20)

It says the sequential scan has a cost that's way too high, and I'm
presuming that's why it's choosing the extremely slow plan over the much
faster plan. I don't know very much about plans, but I'm assuming the
planner chooses the plan with the lowest cost.

I'd much prefer it *does* the sequential scan of the empty table and
goes with the other parts of the plan.

Thanks,
Lucas Madar

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: since when has pg_stat_user_indexes.idx_scan been counting?
Next
From: raghu ram
Date:
Subject: Re: [ADMIN] since when has pg_stat_user_indexes.idx_scan been counting?