Re: Worse perfomance on 8.2.0 than on 7.4.14 - Mailing list pgsql-performance

From Rolf Østvik
Subject Re: Worse perfomance on 8.2.0 than on 7.4.14
Date
Msg-id 20070101150410.51809.qmail@web26309.mail.ukl.yahoo.com
Whole thread Raw
In response to Re: Worse perfomance on 8.2.0 than on 7.4.14  (Dennis Bjorklund <db@zigo.dhs.org>)
Responses Re: Worse perfomance on 8.2.0 than on 7.4.14
List pgsql-performance
(I am sorry if my explain analyze outputs gets
garbled)

--- Dennis Bjorklund <db@zigo.dhs.org> skrev:

> Rolf Østvik skrev:
>
> > I have a simple query which uses 32ms on 7.4.14
> and 1015ms on 8.2.0.
> > I guess 7.4.14 creates a better execution plan
> than 8.2.0 for this query but
>
>
> Try to turn off planner options in 8.2 to make it
> generate the same plan
> as 7.4. Then run EXPLAIN ANALYZE on that query that
> generate the same
> plan as in 7.4 and we can compare the costs and
> maybe understand what go
> wrong.
>
> /Dennis
>

I have adjusted some settings in the postgresql.conf
file:
7.4.14
  shared_buffers=64000 #512MB
  sort_mem=32000       #32KB
  effective_cache_size=128000 #1GB
8.2.0
  shared_buffers=512MB
  temp_buffers=32MB
  work_mem=8MB
  effective_cache_size=1GB
  random_page_cost=2.0

And also disabled some planner options in 8.2
  enable_bitscanmap = off
  enable_hashjoin = off

NB: enable_seqscan = on (default value)

First i have some queries to give you a feel of size
of
datasets and plans and times.

Q-A: (Simple query A)
 select sr.id from step_result_subset as sr
  where sr.step_parent = 0;
Q-B: (Simple query B)
 select ur.id from uut_result_subset as ur
  where ur.start_date_time > '2006-12-11';
Q-C: (Simple query C)
 select ur.id from uut_result_subset as ur
  where ur.start_date_time > '2006-12-11';

7.4.14
Explain analyze of Q-A on 7.1.14:
 Index Scan using step_result_subset_parent_key on
step_result_subset sr  (cost=0.00..42793.67
rows=166069 width=4) (actual time=0.091..1201.073
rows=176449 loops=1)
   Index Cond: (step_parent = 0)
 Total runtime: 1263.592 ms
(3 rows)

Explain analyze of Q-B on 7.1.14:
 Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur  (cost=0.00..63.28 rows=18
width=4) (actual time=0.081..0.190 rows=68 loops=1)
   Index Cond: (start_date_time > '2006-12-11
00:00:00'::timestamp without time zone)
 Total runtime: 0.242 ms
(3 rows)

Explain analyze of Q-C on 7.1.14:
 Seq Scan on uut_result_subset ur  (cost=0.00..3161.94
rows=28640 width=4) (actual time=0.059..108.159
rows=29144 loops=1)
   Filter: (start_date_time > '2006-09-11
00:00:00'::timestamp without time zone)
 Total runtime: 117.560 ms
(3 rows)

8.2.0
Explain analyze of Q-A on 8.2.0:
 Index Scan using step_result_subset_parent_key on
step_result_subset sr  (cost=0.00..26759.41
rows=143163 width=4) (actual time=0.099..924.039
rows=176449 loops=1)
   Index Cond: (step_parent = 0)
 Total runtime: 998.757 ms
(3 rows)

Explain analyze of Q-A on 8.2.0:
 Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur  (cost=0.00..196.15 rows=118
width=4) (actual time=0.025..0.081 rows=68 loops=1)
   Index Cond: (start_date_time > '2006-12-11
00:00:00'::timestamp without time zone)
 Total runtime: 0.159 ms
(3 rows)

Explain analyze of Q-C on 8.2.0:
 Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur  (cost=0.00..2382.39 rows=31340
width=4) (actual time=0.035..35.367 rows=29144
loops=1)
   Index Cond: (start_date_time > '2006-09-11
00:00:00'::timestamp without time zone)
 Total runtime: 47.168 ms
(3 rows)

Here is the complex query/view.
create view view_subset as
select
  ur.id as ur_id,
  sr.id as sr_id
from
  uut_result_subset as ur
   inner join step_result_subset as sr
    on ur.id=sr.uut_result
where
  ur.start_date_time > '2006-12-11'
  and sr.step_parent=0
;

Query with start_date_time > '2006-12-11' on 7.4.14
Query PLAN
---------------
 Nested Loop  (cost=0.00..1400.86 rows=17 width=8)
(actual time=0.066..12.754 rows=68 loops=1)
   ->  Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur  (cost=0.00..63.28 rows=18
width=4) (actual time=0.019..0.136 rows=68 loops=1)
         Index Cond: (start_date_time > '2006-12-11
00:00:00'::timestamp without time zone)
   ->  Index Scan using step_result_uut_result_idx on
step_result_subset sr  (cost=0.00..74.28 rows=2
width=8) (actual time=0.085..0.182 rows=1 loops=68)
         Index Cond: ("outer".id = sr.uut_result)
         Filter: (step_parent = 0)
 Total runtime: 12.849 ms

Query with start_date_time > '2006-12-11' on 8.2.0
Query PLAN
---------------
 Nested Loop  (cost=0.00..35860.83 rows=96 width=8)
(actual time=11.891..2339.878 rows=68 loops=1)
   ->  Index Scan using step_result_subset_parent_key
on step_result_subset sr  (cost=0.00..26759.41
rows=143163 width=8) (actual time=0.083..1017.500
rows=176449 loops=1)
         Index Cond: (step_parent = 0)
   ->  Index Scan using uut_result_subset_pkey on
uut_result_subset ur  (cost=0.00..0.05 rows=1 width=4)
(actual time=0.006..0.006 rows=0 loops=176449)
         Index Cond: (ur.id = sr.uut_result)
         Filter: (start_date_time > '2006-12-11
00:00:00'::timestamp without time zone)
 Total runtime: 2339.974 ms

I also wanted to try it with a bigger dataset so i
set the restriction of start_date_time to
"start_date_time> '2006-09-11'"
I also then set "enable_hashjoin = on" to get same
plans on 7.4.14 and 8.2.0.

Query with start_date_time > '2006-09-11' on 7.4.14
Query PLAN
---------------
 Hash Join  (cost=3233.54..47126.96 rows=26940
width=8) (actual time=126.437..1489.584 rows=29139
loops=1)
   Hash Cond: ("outer".uut_result = "inner".id)
   ->  Index Scan using step_result_subset_parent_key
on step_result_subset sr  (cost=0.00..42793.67
rows=166069 width=8) (actual time=0.078..1137.123
rows=176449 loops=1)
         Index Cond: (step_parent = 0)
   ->  Hash  (cost=3161.94..3161.94 rows=28640
width=4) (actual time=126.068..126.068 rows=0 loops=1)
         ->  Seq Scan on uut_result_subset ur
(cost=0.00..3161.94 rows=28640 width=4) (actual
time=0.063..107.041 rows=29144 loops=1)
               Filter: (start_date_time > '2006-09-11
00:00:00'::timestamp without time zone)
 Total runtime: 1504.600 ms
(8 rows)

Query with start_date_time > '2006-09-11' on 8.2.0
Query PLAN
---------------
 Hash Join  (cost=2460.74..32695.45 rows=25413
width=8) (actual time=61.453..1198.048 rows=29139
loops=1)
   Hash Cond: (sr.uut_result = ur.id)
   ->  Index Scan using step_result_subset_parent_key
on step_result_subset sr  (cost=0.00..26759.41
rows=143163 width=8) (actual time=0.089..937.124
rows=176449 loops=1)
         Index Cond: (step_parent = 0)
   ->  Hash  (cost=2382.39..2382.39 rows=31340
width=4) (actual time=55.975..55.975 rows=29144
loops=1)
         ->  Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur  (cost=0.00..2382.39 rows=31340
width=4) (actual time=0.051..35.635 rows=29144
loops=1)
               Index Cond: (start_date_time >
'2006-09-11 00:00:00'::timestamp without time zone)
 Total runtime: 1212.910 ms
(8 rows)

Other comments.
I am _beginning_ to get a feeling of adjusting
parameters and how my dataset behaves. 8.2.0 does (as
expected) work much better on bigger datasets than
7.4.14.
I was still hoping that i could get better response
times since i think the Q-C query
(ur.start_date_time > '2006-09-11') should be the
biggest restrictor of the datasets i want to look at.
From what i can understand that is what happens with
the query plan for 7.4.14 when restriction is
"ur.start_date_time > '2006-12-11'".

Best regards
Rolf Østvik

__________________________________________________
Bruker du Yahoo!?
Lei av spam?  Yahoo! Mail har den beste spambeskyttelsen
http://no.mail.yahoo.com

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Worse perfomance on 8.2.0 than on 7.4.14
Next
From: Rolf Østvik
Date:
Subject: Re: Worse perfomance on 8.2.0 than on 7.4.14