Thread: Strange left outer join performance issue

From:
"Noah M. Daniels"
Date:

Hi,

I have two queries that are very similar, that run on the same table
with slightly different conditions. However, despite a similar number
of rows returned, the query planner is insisting on a different
ordering and different join algorithm, causing a huge performance
hit. I'm not sure why the planner is doing the merge join the way it
is in the slow case, rather than following a similar plan to the fast
case.

Notice that the difference in the query is near the very end, where
it's supplier_alias_id vs. buyer_alias_id and company_type =
'Supplier' vs 'Buyer'.

What I don't get is why, in the slow (supplier) case, the index scan
on customs_records is done first without the index condition of
cr.supplier_alias_id = "outer".id, which means selecting 1.7 million
rows; why wouldn't it do a nested loop left join and have the index
condition use that alias id the way the fast ('buyer') query is done?

I'd appreciate any help -- thanks!

SLOW:

select a.id as alias_id, a.company_type as alias_company_type, a.name
as alias_name, cr.shipper as customs_record_shipper, cr.saddr1 as
customs_record_saddr1, cr.saddr2 as customs_record_saddr2, cr.saddr3
as customs_record_saddr3, cr.consignee as customs_record_consignee,
cr.caddr1 as customs_record_caddr1, cr.caddr2 as
customs_record_caddr2, cr.caddr3 as customs_record_caddr3,
cr.notify_party as customs_record_notify_party, cr.naddr1 as
customs_record_naddr1, cr.naddr2 as customs_record_naddr2, cr.naddr3
as customs_record_naddr3, cr.also_notify_party as
customs_record_also_notify_party, cr.anaddr1 as
customs_record_anaddr1, cr.anaddr2 as customs_record_anaddr2,
cr.anaddr3 as customs_record_addr3, cr.id as customs_record_id,
cr.buyer_field as customs_record_buyer_field from aliases a left
outer join customs_records cr on cr.supplier_alias_id = a.id where
a.company_type = 'Supplier' and a.company_id is NULL


Merge Right Join  (cost=1138.78..460482.84 rows=2993 width=405)
(actual time=1244745.427..1245714.571 rows=39 loops=1)
   Merge Cond: ("outer".supplier_alias_id = "inner".id)
   ->  Index Scan using index_customs_records_on_supplier_alias_id on
customs_records cr  (cost=0.00..6717806.37 rows=1704859 width=363)
(actual time=54.567..1245210.707 rows=117424 loops=1)
   ->  Sort  (cost=1138.78..1139.53 rows=300 width=46) (actual
time=24.093..24.161 rows=39 loops=1)
         Sort Key: a.id
         ->  Index Scan using index_aliases_company_type_company_id
on aliases a  (cost=0.00..1126.44 rows=300 width=46) (actual
time=22.400..23.959 rows=10 loops=1)
               Index Cond: ((company_type)::text = 'Supplier'::text)
               Filter: (company_id IS NULL)
Total runtime: 1245714.752 ms

FAST:

Nested Loop Left Join  (cost=0.00..603052.46 rows=3244 width=405)
(actual time=68.526..3115.407 rows=1355 loops=1)
    ->  Index Scan using index_aliases_company_type_company_id on
aliases a  (cost=0.00..639.56 rows=165 width=46) (actual
time=32.419..132.286 rows=388 loops=1)
          Index Cond: ((company_type)::text = 'Buyer'::text)
          Filter: (company_id IS NULL)
    ->  Index Scan using index_customs_records_on_buyer_alias_id on
customs_records cr  (cost=0.00..3639.55 rows=915 width=363) (actual
time=2.133..7.649 rows=3 loops=388)
          Index Cond: (cr.buyer_alias_id = "outer".id)
Total runtime: 3117.713 ms
(7 rows)

select a.id as alias_id, a.company_type as alias_company_type, a.name
as alias_name, cr.shipper as customs_record_shipper, cr.saddr1 as
customs_record_saddr1, cr.saddr2 as customs_record_saddr2, cr.saddr3
as customs_record_saddr3, cr.consignee as customs_record_consignee,
cr.caddr1 as customs_record_caddr1, cr.caddr2 as
customs_record_caddr2, cr.caddr3 as customs_record_caddr3,
cr.notify_party as customs_record_notify_party, cr.naddr1 as
customs_record_naddr1, cr.naddr2 as customs_record_naddr2, cr.naddr3
as customs_record_naddr3, cr.also_notify_party as
customs_record_also_notify_party, cr.anaddr1 as
customs_record_anaddr1, cr.anaddr2 as customs_record_anaddr2,
cr.anaddr3 as customs_record_addr3, cr.id as customs_record_id,
cr.buyer_field as customs_record_buyer_field from aliases a left
outer join customs_records cr on cr.buyer_alias_id = a.id where
a.company_type = 'Buyer' and a.company_id is NULL


From:
"Daniel Cristian Cruz"
Date:

Run VACUUM ANALYZE and see if the cost estimates became close to the effective rows. This could make it faster.

2007/3/23, Noah M. Daniels <>:
> SLOW:
> Merge Right Join  (cost=1138.78..460482.84 rows=2993 width=405)
> (actual time=1244745.427..1245714.571 rows=39 loops=1)
>    Merge Cond: ("outer".supplier_alias_id = "inner".id)
>    ->  Index Scan using index_customs_records_on_supplier_alias_id on
> customs_records cr  (cost=0.00..6717806.37 rows=1704859 width=363)
> (actual time=54.567..1245210.707 rows=117424 loops=1)
>    ->  Sort  (cost=1138.78..1139.53 rows=300 width=46) (actual
> time=24.093..24.161 rows=39 loops=1)
>          Sort Key: a.id
>          ->  Index Scan using index_aliases_company_type_company_id
> on aliases a  (cost=0.00..1126.44 rows=300 width=46) (actual
> time=22.400..23.959 rows=10 loops=1)
>                Index Cond: ((company_type)::text = 'Supplier'::text)
>                Filter: (company_id IS NULL)
> Total runtime: 1245714.752 ms
>
> FAST:
>
> Nested Loop Left Join  (cost=0.00..603052.46 rows=3244 width=405)
> (actual time=68.526..3115.407 rows=1355 loops=1)
>     ->  Index Scan using index_aliases_company_type_company_id on
> aliases a  (cost= 0.00..639.56 rows=165 width=46) (actual
> time=32.419..132.286 rows=388 loops=1)
>           Index Cond: ((company_type)::text = 'Buyer'::text)
>           Filter: (company_id IS NULL)
>     ->  Index Scan using index_customs_records_on_buyer_alias_id on
> customs_records cr  (cost=0.00..3639.55 rows=915 width=363) (actual
> time=2.133..7.649 rows=3 loops=388)
>           Index Cond: (cr.buyer_alias_id = "outer".id)
> Total runtime: 3117.713 ms
> (7 rows)


--
Daniel Cristian Cruz
Analista de Sistemas
From:
"Noah M. Daniels"
Date:

Not much of a difference, unfortunately... I still wonder why it's
doing the 'supplier' (slow) query using the merge right join.

the 'fast' query:

Nested Loop Left Join  (cost=0.00..423342.71 rows=2481 width=410)
(actual time=100.076..6380.865 rows=1355 loops=1)
    ->  Index Scan using index_aliases_company_type_company_id on
aliases a  (cost=0.00..462.33 rows=118 width=46) (actual
time=24.811..143.690 rows=388 loops=1)
          Index Cond: ((company_type)::text = 'Buyer'::text)
          Filter: (company_id IS NULL)
    ->  Index Scan using index_customs_records_on_buyer_alias_id on
customs_records cr  (cost=0.00..3572.61 rows=890 width=368) (actual
time=5.526..16.042 rows=3 loops=388)
          Index Cond: (cr.buyer_alias_id = "outer".id)
Total runtime: 6382.940 ms
(7 rows)

the 'slow' one:

Merge Right Join  (cost=842.53..479378.17 rows=2281 width=410)
(actual time=554713.506..555584.825 rows=39 loops=1)
    Merge Cond: ("outer".supplier_alias_id = "inner".id)
    ->  Index Scan using index_customs_records_on_supplier_alias_id
on customs_records cr  (cost=0.00..6673133.76 rows=1704859 width=368)
(actual time=42.327..555225.588 rows=117424 loops=1)
    ->  Sort  (cost=842.53..843.07 rows=218 width=46) (actual
time=0.109..0.164 rows=39 loops=1)
          Sort Key: a.id
          ->  Index Scan using index_aliases_company_type_company_id
on aliases a  (cost=0.00..834.06 rows=218 width=46) (actual
time=0.033..0.074 rows=10 loops=1)
                Index Cond: ((company_type)::text = 'Supplier'::text)
                Filter: (company_id IS NULL)
Total runtime: 555584.978 ms
(9 rows)


On Mar 23, 2007, at 4:04 PM, Daniel Cristian Cruz wrote:

> Run VACUUM ANALYZE and see if the cost estimates became close to
> the effective rows. This could make it faster.
>
> 2007/3/23, Noah M. Daniels <>:
> > SLOW:
> > Merge Right Join  (cost=1138.78..460482.84 rows=2993 width=405)
> > (actual time=1244745.427..1245714.571 rows=39 loops=1)
> >    Merge Cond: ("outer".supplier_alias_id = "inner".id)
> >    ->  Index Scan using
> index_customs_records_on_supplier_alias_id on
> > customs_records cr  (cost=0.00..6717806.37 rows=1704859 width=363)
> > (actual time=54.567..1245210.707 rows=117424 loops=1)
> >    ->  Sort  (cost=1138.78..1139.53 rows=300 width=46) (actual
> > time=24.093..24.161 rows=39 loops=1)
> >          Sort Key: a.id
> >          ->  Index Scan using index_aliases_company_type_company_id
> > on aliases a  (cost=0.00..1126.44 rows=300 width=46) (actual
> > time=22.400..23.959 rows=10 loops=1)
> >                Index Cond: ((company_type)::text = 'Supplier'::text)
> >                Filter: (company_id IS NULL)
> > Total runtime: 1245714.752 ms
> >
> > FAST:
> >
> > Nested Loop Left Join  (cost=0.00..603052.46 rows=3244 width=405)
> > (actual time=68.526..3115.407 rows=1355 loops=1)
> >     ->  Index Scan using index_aliases_company_type_company_id on
> > aliases a  (cost= 0.00..639.56 rows=165 width=46) (actual
> > time=32.419..132.286 rows=388 loops=1)
> >           Index Cond: ((company_type)::text = 'Buyer'::text)
> >           Filter: (company_id IS NULL)
> >     ->  Index Scan using index_customs_records_on_buyer_alias_id on
> > customs_records cr  (cost=0.00..3639.55 rows=915 width=363) (actual
> > time=2.133..7.649 rows=3 loops=388)
> >           Index Cond: (cr.buyer_alias_id = "outer".id)
> > Total runtime: 3117.713 ms
> > (7 rows)
>
>
> --
> Daniel Cristian Cruz
> Analista de Sistemas


From:
Tom Lane
Date:

"Noah M. Daniels" <> writes:
> I have two queries that are very similar, that run on the same table
> with slightly different conditions. However, despite a similar number
> of rows returned, the query planner is insisting on a different
> ordering and different join algorithm, causing a huge performance
> hit. I'm not sure why the planner is doing the merge join the way it
> is in the slow case, rather than following a similar plan to the fast
> case.

It likes the merge join because it predicts (apparently correctly) that
only about 1/14th of the table will need to be scanned.  This'd be an
artifact of the relative ranges of supplier ids in the two tables.

What PG version is this?  8.2 understands about repeated indexscans
being cheaper than standalone ones, but I get the impression from the
explain estimates that you may be using something older that's
overestimating the cost of the nestloop way.

            regards, tom lane

From:
"Noah M. Daniels"
Date:

Tom,

You're right; this is postgres 8.0.8. Perhaps upgrading will solve
this issue. Is there any way to get this query to perform better in
postgres 8.0.8?

thanks!

On Mar 23, 2007, at 6:13 PM, Tom Lane wrote:

> "Noah M. Daniels" <> writes:
>> I have two queries that are very similar, that run on the same table
>> with slightly different conditions. However, despite a similar number
>> of rows returned, the query planner is insisting on a different
>> ordering and different join algorithm, causing a huge performance
>> hit. I'm not sure why the planner is doing the merge join the way it
>> is in the slow case, rather than following a similar plan to the fast
>> case.
>
> It likes the merge join because it predicts (apparently correctly)
> that
> only about 1/14th of the table will need to be scanned.  This'd be an
> artifact of the relative ranges of supplier ids in the two tables.
>
> What PG version is this?  8.2 understands about repeated indexscans
> being cheaper than standalone ones, but I get the impression from the
> explain estimates that you may be using something older that's
> overestimating the cost of the nestloop way.
>
>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings


From:
Tom Lane
Date:

"Noah M. Daniels" <> writes:
> You're right; this is postgres 8.0.8. Perhaps upgrading will solve
> this issue. Is there any way to get this query to perform better in
> postgres 8.0.8?

You could try reducing random_page_cost, but I'm not sure that will
help much.

            regards, tom lane