Thread: Strange plan in pg 8.1.0

Strange plan in pg 8.1.0

From
Mattias Kregert
Date:
Look at this insane plan:

lucas=# explain analyse select huvudklass,sum(summa) from kor_tjanster left outer join prislist on prislista=listid and
tjanst=tjanstidwhere kor_id in (select id from kor where lista=10484) group by 1; 
                                                                 QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=212892.07..212892.10 rows=2 width=23) (actual time=4056.165..4056.167 rows=2 loops=1)
   ->  Hash IN Join  (cost=102.84..212889.04 rows=607 width=23) (actual time=4032.931..4056.017 rows=31 loops=1)
         Hash Cond: ("outer".kor_id = "inner".id)
         ->  Hash Left Join  (cost=59.66..206763.11 rows=1215336 width=27) (actual time=4.959..3228.550 rows=1216434
loops=1)
               Hash Cond: (("outer".prislista = ("inner".listid)::text) AND ("outer".tjanst =
("inner".tjanstid)::text))
               ->  Seq Scan on kor_tjanster  (cost=0.00..23802.36 rows=1215336 width=26) (actual time=0.032..1257.241
rows=1216434loops=1) 
               ->  Hash  (cost=51.77..51.77 rows=1577 width=29) (actual time=4.898..4.898 rows=1577 loops=1)
                     ->  Seq Scan on prislist  (cost=0.00..51.77 rows=1577 width=29) (actual time=0.034..2.445
rows=1577loops=1) 
         ->  Hash  (cost=41.79..41.79 rows=557 width=4) (actual time=0.185..0.185 rows=29 loops=1)
               ->  Index Scan using kor_lista on kor  (cost=0.00..41.79 rows=557 width=4) (actual time=0.070..0.150
rows=29loops=1) 
                     Index Cond: (lista = 10484)
 Total runtime: 4056.333 ms

I have an index on kor_tjanster(kor_id), an index on prislist(prislist_id), did ANALYZE and all that stuff... but those
indexesare not used. 

Why does it come up with this strange plan? It does a seqscan of 1.2 million rows and then a join!? Using the index
wouldbe much faster... 

I expected something like this:
  1. Index Scan using kor_lista on kor (use lista_id 10484 to get a list of kor_id's - 29 rows (expected 557 rows))
  2. Index Scan using kor_id on kor_tjanster (use the kor_id's to get a list of kor_tjanster - 31 rows)
  3. Index Scan using prislist_listid on prislist (use the 31 kor_tjanster rows to find the corresponding 'huvudklass'
foreach row) 
29+31+31=91 index lookups... which is MUCH faster than seq-scanning millions of rows...

I need to speed up this query. How can i make it use the correct index? Any hints?

I have pg 8.1.0, default settings.

/* m */

Re: Strange plan in pg 8.1.0

From
"Steinar H. Gunderson"
Date:
On Mon, Oct 30, 2006 at 01:05:07PM +0200, Mattias Kregert wrote:
>          ->  Hash Left Join  (cost=59.66..206763.11 rows=1215336 width=27) (actual time=4.959..3228.550 rows=1216434
loops=1)
>                Hash Cond: (("outer".prislista = ("inner".listid)::text) AND ("outer".tjanst =
("inner".tjanstid)::text))

Note the conversion to text here. Are you sure the types are matching on both
sides of the join?

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: Strange plan in pg 8.1.0

From
"Mattias Kregert"
Date:
>>> From: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
> On Mon, Oct 30, 2006 at 01:05:07PM +0200, Mattias Kregert wrote:
>>          ->  Hash Left Join  (cost=59.66..206763.11 rows=1215336
>> width=27) (actual time=4.959..3228.550 rows=1216434 loops=1)
>>                Hash Cond: (("outer".prislista = ("inner".listid)::text)
>> AND ("outer".tjanst = ("inner".tjanstid)::text))
>
> Note the conversion to text here. Are you sure the types are matching on
> both
> sides of the join?
>
> /* Steinar */

On the left side it is text, and on the right side it is varchar(10).
Casting left side to varchar(10) does not help, in fact it makes things even
worse: The cast to ::text vanishes in a puff of logic, but the plan gets
bigger and even slower (20-25 seconds).

A RIGHT join takes only 20 milliseconds, but i want the left join because
there could be missing rows in the "prislist" table...

/* m */

Re: Strange plan in pg 8.1.0

From
Tom Lane
Date:
Mattias Kregert <mattias@kregert.se> writes:
> Why does it come up with this strange plan?

Because 8.1 can't reorder outer joins.  To devise the plan you want,
the planner has to be able to prove that it's OK to perform the IN join
before the LEFT join, something that isn't always the case.  8.2 can
prove this, but no existing release can.

The only workaround I can think of is to do the IN in a sub-select.

            regards, tom lane

Re: Strange plan in pg 8.1.0

From
"Mattias Kregert"
Date:
From: "Tom Lane" <tgl@sss.pgh.pa.us>
> Mattias Kregert <mattias@kregert.se> writes:
>> Why does it come up with this strange plan?
>
> Because 8.1 can't reorder outer joins.  To devise the plan you want,
> the planner has to be able to prove that it's OK to perform the IN join
> before the LEFT join, something that isn't always the case.  8.2 can
> prove this, but no existing release can.
>
> The only workaround I can think of is to do the IN in a sub-select.
>
> regards, tom lane
>

Thanks!
I'll try some subselect solution for now, and make a note to change it when
8.2 is out.

/* m */

Re: Strange plan in pg 8.1.0

From
"Steinar H. Gunderson"
Date:
On Mon, Oct 30, 2006 at 03:26:09PM +0100, Mattias Kregert wrote:
> On the left side it is text, and on the right side it is varchar(10).
> Casting left side to varchar(10) does not help, in fact it makes things
> even worse: The cast to ::text vanishes in a puff of logic, but the plan
> gets bigger and even slower (20-25 seconds).

Casting definitely won't help it any; it was more a question of having the
types in the _tables_ be the same.

Anyhow, this might be a red herring; others might have something more
intelligent to say in this matter.

By the way, does it use an index scan if you turn off sequential scans
(set enable_seqscan = false)?

/* Steinar */
--
Homepage: http://www.sesse.net/