Big differences in plans between 8.0 and 8.1 - Mailing list pgsql-performance

From Gabriele Turchi
Subject Big differences in plans between 8.0 and 8.1
Date
Msg-id 1152972852.3541.0.camel@apollo5.casa.intranet
Whole thread Raw
Responses Re: Big differences in plans between 8.0 and 8.1  (Michael Fuhr <mike@fuhr.org>)
List pgsql-performance
Hi all. I have a strange (and serious) problem with an application
ported from postgres 8.0 to 8.1.

The old installation is postgresql-8.0.4-2.FC4.1 running on a Fedora 4,
the new one is postgresql-8.1.4-1.FC5.1 running on a fedora 5.

Some query is now _very_ slow. I've found some deep differences between
query plans.

As example. The query is:

select count(*) from orario_ap join registrazioni using(treno, data)
join personale using(personale_id) join ruoli using(ruolo_id) where
data=today_or) where data=today_orario();

orario_ap is a view.

On 8.0 the query runs in            138.146 ms
On 8.1 the query runs in           6761.112 ms
On 8.1 with nested loops disabled:   63.184 ms

This is not the only query affected.

Two notes: please cc answer directly to me, and I'm sorry, my english is
alpha version.


On a 8.0 the plan is:

railcomm04=# explain analyze select count(*) from orario_ap join
registrazioni using(treno, data) join personale using(personale_id) join
ruoli using(ruolo_id) where data=today_or) where data=today_orario();

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1106.77..1106.77 rows=1 width=0) (actual
time=137.786..137.787 rows=1 loops=1)
   ->  Merge Join  (cost=1088.96..1105.66 rows=444 width=0) (actual
time=124.173..137.190 rows=349 loops=1)
         Merge Cond: (("outer".tipo_treno = "inner".tipo_treno) AND
("outer".num_treno = "inner".num_treno) AND ("outer".orario =
"inner".orario))
         ->  Sort  (cost=574.10..575.09 rows=395 width=26) (actual
time=97.647..98.010 rows=349 loops=1)
               Sort Key: o1.tipo_treno, o1.num_treno, o1.orario
               ->  Hash Join  (cost=28.45..557.06 rows=395 width=26)
(actual time=35.326..93.415 rows=349 loops=1)
                     Hash Cond: ("outer".ruolo_id = "inner".ruolo_id)
                     ->  Hash Join  (cost=27.41..550.10 rows=395
width=30) (actual time=12.827..69.411 rows=349 loops=1)
                           Hash Cond: ("outer".personale_id =
"inner".personale_id)
                           ->  Hash Join  (cost=12.85..529.61 rows=395
width=34) (actual time=10.453..65.365 rows=349 loops=1)
                                 Hash Cond: ("outer".treno =
"inner".treno)
                                 ->  Seq Scan on orario o1
(cost=0.00..504.38 rows=843 width=33) (actual time=3.691..57.487
rows=797 loops=1)
                                       Filter: ((seq_fermata = 1) AND
(data = date((now() - '02:00:00'::interval))))
                                 ->  Hash  (cost=11.98..11.98 rows=349
width=19) (actual time=2.665..2.665 rows=0 loops=1)
                                       ->  Seq Scan on registrazioni
(cost=0.00..11.98 rows=349 width=19) (actual time=0.029..2.042 rows=349
loops=1)
                                             Filter: (date((now() -
'02:00:00'::interval)) = data)
                           ->  Hash  (cost=12.85..12.85 rows=685
width=4) (actual time=2.350..2.350 rows=0 loops=1)
                                 ->  Seq Scan on personale
(cost=0.00..12.85 rows=685 width=4) (actual time=0.005..1.350 rows=685
loops=1)
                     ->  Hash  (cost=1.03..1.03 rows=3 width=4) (actual
time=22.479..22.479 rows=0 loops=1)
                           ->  Seq Scan on ruoli  (cost=0.00..1.03
rows=3 width=4) (actual time=22.461..22.468 rows=3 loops=1)
         ->  Sort  (cost=514.86..516.94 rows=831 width=26) (actual
time=26.493..27.490 rows=949 loops=1)
               Sort Key: o2.tipo_treno, o2.num_treno, o2.orario
               ->  Seq Scan on orario o2  (cost=0.00..474.56 rows=831
width=26) (actual time=0.056..17.398 rows=797 loops=1)
                     Filter: ((orario_partenza IS NULL) AND (date((now()
- '02:00:00'::interval)) = data))
 Total runtime: 138.146 ms



On a standard 8.1 is:

railcomm04=# explain analyze select count(*) from orario_ap join
registrazioni using(treno, data) join personale using(personale_id) join
ruoli using(ruolo_id) where data=today_orario();

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=500.45..500.46 rows=1 width=0) (actual
time=6760.876..6760.877 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..500.44 rows=1 width=0) (actual
time=5.915..6759.550 rows=349 loops=1)
         Join Filter: (("outer".orario = "inner".orario) AND
("outer".num_treno = "inner".num_treno) AND ("outer".tipo_treno =
"inner".tipo_treno))
         ->  Nested Loop  (cost=0.00..25.87 rows=1 width=72) (actual
time=0.124..42.617 rows=349 loops=1)
               ->  Nested Loop  (cost=0.00..20.15 rows=1 width=76)
(actual time=0.106..34.330 rows=349 loops=1)
                     ->  Nested Loop  (cost=0.00..14.12 rows=1 width=40)
(actual time=0.045..12.037 rows=349 loops=1)
                           Join Filter: ("outer".ruolo_id =
"inner".ruolo_id)
                           ->  Seq Scan on registrazioni
(cost=0.00..11.98 rows=2 width=44) (actual time=0.025..2.315 rows=349
loops=1)
                                 Filter: (date((now() -
'02:00:00'::interval)) = data)
                           ->  Seq Scan on ruoli  (cost=0.00..1.03
rows=3 width=4) (actual time=0.003..0.009 rows=3 loops=349)
                     ->  Index Scan using orario_pkey on orario o1
(cost=0.00..6.02 rows=1 width=104) (actual time=0.053..0.056 rows=1
loops=349)
                           Index Cond: ((o1.treno = "outer".treno) AND
(o1.seq_fermata = 1))
                           Filter: (data = date((now() -
'02:00:00'::interval)))
               ->  Index Scan using personale_pkey on personale
(cost=0.00..5.71 rows=1 width=4) (actual time=0.013..0.017 rows=1
loops=349)
                     Index Cond: ("outer".personale_id =
personale.personale_id)
         ->  Seq Scan on orario o2  (cost=0.00..474.56 rows=1 width=72)
(actual time=0.030..17.784 rows=797 loops=349)
               Filter: ((orario_partenza IS NULL) AND (date((now() -
'02:00:00'::interval)) = data))
 Total runtime: 6761.112 ms


On a 8.1 with nested loops disabled:

railcomm04=# explain analyze select count(*) from orario_ap join
registrazioni using(treno, data) join personale using(personale_id) join
ruoli using(ruolo_id) where data=today_orario();

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=802.82..802.83 rows=1 width=0) (actual
time=62.309..62.310 rows=1 loops=1)
   ->  Hash Join  (cost=328.23..802.82 rows=1 width=0) (actual
time=44.443..61.867 rows=349 loops=1)
         Hash Cond: (("outer".orario = "inner".orario) AND
("outer".num_treno = "inner".num_treno) AND ("outer".tipo_treno =
"inner".tipo_treno))
         ->  Seq Scan on orario o2  (cost=0.00..474.56 rows=1 width=72)
(actual time=0.068..16.558 rows=797 loops=1)
               Filter: ((orario_partenza IS NULL) AND (date((now() -
'02:00:00'::interval)) = data))
         ->  Hash  (cost=328.22..328.22 rows=1 width=72) (actual
time=38.479..38.479 rows=349 loops=1)
               ->  Hash Join  (cost=29.33..328.22 rows=1 width=72)
(actual time=6.700..37.530 rows=349 loops=1)
                     Hash Cond: ("outer".treno = "inner".treno)
                     ->  Index Scan using orario_pkey on orario o1
(cost=0.00..298.88 rows=1 width=104) (actual time=0.069..29.033 rows=797
loops=1)
                           Index Cond: (seq_fermata = 1)
                           Filter: (data = date((now() -
'02:00:00'::interval)))
                     ->  Hash  (cost=29.32..29.32 rows=1 width=36)
(actual time=6.595..6.595 rows=349 loops=1)
                           ->  Hash Join  (cost=13.04..29.32 rows=1
width=36) (actual time=3.361..5.887 rows=349 loops=1)
                                 Hash Cond: ("outer".personale_id =
"inner".personale_id)
                                 ->  Seq Scan on personale
(cost=0.00..12.85 rows=685 width=4) (actual time=0.013..1.098 rows=685
loops=1)
                                 ->  Hash  (cost=13.04..13.04 rows=1
width=40) (actual time=3.301..3.301 rows=349 loops=1)
                                       ->  Hash Join  (cost=1.04..13.04
rows=1 width=40) (actual time=0.090..2.602 rows=349 loops=1)
                                             Hash Cond:
("outer".ruolo_id = "inner".ruolo_id)
                                             ->  Seq Scan on
registrazioni  (cost=0.00..11.98 rows=2 width=44) (actual
time=0.025..1.465 rows=349 loops=1)
                                                   Filter: (date((now()
- '02:00:00'::interval)) = data)
                                             ->  Hash  (cost=1.03..1.03
rows=3 width=4) (actual time=0.040..0.040 rows=3 loops=1)
                                                   ->  Seq Scan on ruoli
(cost=0.00..1.03 rows=3 width=4) (actual time=0.014..0.025 rows=3
loops=1)
 Total runtime: 63.184 ms


Regards,
Gabriele


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Self-join query and index usage
Next
From: Hassan Adekoya
Date:
Subject: increment Rows in an SQL Result Set postgresql