Thread: performance penalty between Postgresql 8.3.8 and 8.4.1

performance penalty between Postgresql 8.3.8 and 8.4.1

From
"Schmitz, David"
Date:
Hello everybody,

we have severe performance penalty between Postgresql 8.3.8 and 8.4.1

Consider the following tables:

CREATE TABLE xdf.xdf_admin_hierarchy
(
  admin_place_id integer NOT NULL,
  admin_order smallint NOT NULL,
  iso_country_code character(3) NOT NULL,
  country_id integer NOT NULL,
  order1_id integer,
  order2_id integer,
  order8_id integer,
  builtup_id integer,
  num_links integer,
  CONSTRAINT pk_xdf_admin_hierarchy PRIMARY KEY (admin_place_id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE xdf.xdf_admin_hierarchy OWNER TO frog;

CREATE TABLE xdf.xdf_link_admin
(
  admin_place_id integer NOT NULL,
  link_id integer NOT NULL,
  side character(1) NOT NULL,
  CONSTRAINT pk_xdf_link_admin PRIMARY KEY (link_id, side)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE xdf.xdf_link_admin OWNER TO frog;

CREATE INDEX nx_xdflinkadmin_adminplaceid
  ON xdf.xdf_link_admin
  USING btree
  (admin_place_id);

CREATE INDEX nx_xdflinkadmin_linkid
  ON xdf.xdf_link_admin
  USING btree
  (link_id);

CREATE TABLE xdf.xdf_road_link
(
  road_link_id integer NOT NULL,
  road_name_id integer,
  left_address_range_id integer NOT NULL,
  right_address_range_id integer NOT NULL,
  address_type smallint NOT NULL,
  is_exit_name character(1) NOT NULL,
  explicatable character(1) NOT NULL,
  is_junction_name character(1) NOT NULL,
  is_name_on_roadsign character(1) NOT NULL,
  is_postal_name character(1) NOT NULL,
  is_stale_name character(1) NOT NULL,
  is_vanity_name character(1) NOT NULL,
  is_scenic_name character(1) NOT NULL,
  link_id integer NOT NULL,
  CONSTRAINT pk_xdf_road_link PRIMARY KEY (road_link_id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE xdf.xdf_road_link OWNER TO frog;

CREATE INDEX nx_xdfroadlink_leftaddressrangeid
  ON xdf.xdf_road_link
  USING btree
  (left_address_range_id);

CREATE INDEX nx_xdfroadlink_linkid
  ON xdf.xdf_road_link
  USING btree
  (link_id);

CREATE INDEX nx_xdfroadlink_rightaddressrangeid
  ON xdf.xdf_road_link
  USING btree
  (right_address_range_id);

CREATE INDEX nx_xdfroadlink_roadnameid
  ON xdf.xdf_road_link
  USING btree
  (road_name_id);

CREATE TABLE xdf.xdf_road_name
(
  road_name_id integer NOT NULL,
  route_type smallint NOT NULL,
  attached_to_base character(1) NOT NULL,
  precedes_base character(1) NOT NULL,
  prefix character varying(10),
  street_type character varying(30),
  suffix character varying(2),
  base_name character varying(60) NOT NULL,
  language_code character(3) NOT NULL,
  is_exonym character(1) NOT NULL,
  name_type character(1) NOT NULL,
  direction_on_sign character(1) NOT NULL,
  street_name character varying(60) NOT NULL,
  CONSTRAINT pk_xdf_road_name PRIMARY KEY (road_name_id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE xdf.xdf_road_name OWNER TO frog;

CREATE INDEX nx_xdfroadname_languagecode
  ON xdf.xdf_road_name
  USING btree
  (language_code);

If one executes a query of the following structure:

SELECT AH.ORDER8_ID, AH.BUILTUP_ID, RL.LINK_ID, LA.SIDE,
RL.ROAD_NAME_ID, RL.LEFT_ADDRESS_RANGE_ID, RL.RIGHT_ADDRESS_RANGE_ID,
RL.IS_EXIT_NAME, RL.EXPLICATABLE, RL.IS_JUNCTION_NAME,
RL.IS_NAME_ON_ROADSIGN, RL.IS_POSTAL_NAME, RL.IS_STALE_NAME,
RL.IS_VANITY_NAME, RL.ROAD_LINK_ID, RN.STREET_NAME,
RN.ROUTE_TYPE
FROM xdf.xdf_ADMIN_HIERARCHY AH, xdf.xdf_LINK_ADMIN LA,
xdf.xdf_ROAD_LINK RL, xdf.xdf_ROAD_NAME RN
WHERE AH.ADMIN_PLACE_ID = LA.ADMIN_PLACE_ID
AND LA.LINK_ID = RL.LINK_ID
AND RL.ROAD_NAME_ID = RN.ROAD_NAME_ID
AND RL.IS_EXIT_NAME = 'N'
AND RL.IS_JUNCTION_NAME = 'N'
AND RN.ROAD_NAME_ID BETWEEN 158348561 AND 158348660
ORDER BY RL.ROAD_NAME_ID, AH.ORDER8_ID, AH.BUILTUP_ID, RL.LINK_ID;

It is carried out with poor performance on postgresql 8.4.1 However postgresql 8.3.8 performs just fine.
If you take a closer look at the query with EXPLAIN, it becomes obvious, that postgresql 8.4 does not
consider the primary key at level 3 and instead generates a hash join:

Postgresql 8.4.1:

Sort  (cost=129346.71..129498.64 rows=60772 width=61)
  Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
  ->  Hash Join  (cost=2603.57..124518.03 rows=60772 width=61)
        Hash Cond: (la.admin_place_id = ah.admin_place_id)
        ->  Nested Loop  (cost=6.82..120781.81 rows=60772 width=57)
              ->  Nested Loop  (cost=6.82..72383.98 rows=21451 width=51)
                    ->  Index Scan using pk_rdf_road_name on rdf_road_name rn  (cost=0.00..11.24 rows=97 width=21)
                          Index Cond: ((road_name_id >= 158348561) AND (road_name_id <= 158348660))
                    ->  Bitmap Heap Scan on rdf_road_link rl  (cost=6.82..743.34 rows=222 width=34)
                          Recheck Cond: (rl.road_name_id = rn.road_name_id)
                          Filter: ((rl.is_exit_name = 'N'::bpchar) AND (rl.is_junction_name = 'N'::bpchar))
                          ->  Bitmap Index Scan on nx_rdfroadlink_roadnameid  (cost=0.00..6.76 rows=222 width=0)
                                Index Cond: (rl.road_name_id = rn.road_name_id)
              ->  Index Scan using nx_rdflinkadmin_linkid on rdf_link_admin la  (cost=0.00..2.22 rows=3 width=10)
                    Index Cond: (la.link_id = rl.link_id)
        ->  Hash  (cost=1544.11..1544.11 rows=84211 width=12)
              ->  Seq Scan on rdf_admin_hierarchy ah  (cost=0.00..1544.11 rows=84211 width=12)

Postgresql 8.3.8:

Sort  (cost=3792.75..3792.95 rows=81 width=61)
  Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
  ->  Nested Loop  (cost=21.00..3790.18 rows=81 width=61)
        ->  Nested Loop  (cost=21.00..3766.73 rows=81 width=57)
              ->  Nested Loop  (cost=21.00..3733.04 rows=14 width=51)
                    ->  Index Scan using pk_rdf_road_name on rdf_road_name rn  (cost=0.00..8.32 rows=1 width=21)
                          Index Cond: ((road_name_id >= 158348561) AND (road_name_id <= 158348660))
                    ->  Bitmap Heap Scan on rdf_road_link rl  (cost=21.00..3711.97 rows=1020 width=34)
                          Recheck Cond: (rl.road_name_id = rn.road_name_id)
                          Filter: ((rl.is_exit_name = 'N'::bpchar) AND (rl.is_junction_name = 'N'::bpchar))
                          ->  Bitmap Index Scan on nx_rdfroadlink_roadnameid  (cost=0.00..20.75 rows=1020 width=0)
                                Index Cond: (rl.road_name_id = rn.road_name_id)
              ->  Index Scan using nx_rdflinkadmin_linkid on rdf_link_admin la  (cost=0.00..2.31 rows=8 width=10)
                    Index Cond: (la.link_id = rl.link_id)
        ->  Index Scan using pk_rdf_admin_hierarchy on rdf_admin_hierarchy ah  (cost=0.00..0.28 rows=1 width=12)
              Index Cond: (ah.admin_place_id = la.admin_place_id)

With our data it is a performance difference from 1h16min (8.3.8) to 2h43min (8.4.1)

I hope someone can help me out with my problem. If you need further information please let me know.

Mit freundlichem Gruß / Best regards

David Schmitz
Dipl.-Ing.(FH)
Software Developer New Map Compiler

HARMAN/BECKER AUTOMOTIVE SYSTEMS
innovative systems GmbH
Hugh-Greene-Weg 2-4 - 22529 Hamburg - Germany
Phone: +49 (0)40-30067-990
Fax:     +49 (0)40-30067-969
Mailto:DaSchmitz@harmanbecker.com

*******************************************
innovative systems GmbH Navigation-Multimedia
Geschaeftsfuehrung: Edwin Summers - Michael Juergen Mauser
Sitz der Gesellschaft: Hamburg - Registergericht: Hamburg HRB 59980

*******************************************
Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat
sindoder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und loeschen Sie diese
Mail.Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. 
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have
receivedthis e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying,
disclosureor distribution of the contents in this e-mail is strictly forbidden. 
*******************************************

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
"Kevin Grittner"
Date:
"Schmitz, David" <david.schmitz@harman.com> wrote:

> It is carried out with poor performance on postgresql 8.4.1
> However postgresql 8.3.8 performs just fine.
> If you take a closer look at the query with EXPLAIN, it becomes
> obvious, that postgresql 8.4 does not consider the primary key at
> level 3 and instead generates a hash join:

> Postgresql 8.4.1:
>
> Sort  (cost=129346.71..129498.64 rows=60772 width=61)

> Postgresql 8.3.8:
>
> Sort  (cost=3792.75..3792.95 rows=81 width=61)

It determines the plan based on available statistics, which in this
case seem to indicate rather different data.  Do the two databases
have identical data?  Have they both been recently analyzed?  What
is the default_statistics_target on each?  Do any columns in these
tables have overrides?

-Kevin

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
"Schmitz, David"
Date:
-----Ursprüngliche Nachricht-----
Von:    Schmitz, David
Gesendet:    Di 08.12.2009 00:14
An:    Kevin Grittner
Cc:
Betreff:    AW: [PERFORM] performance penalty between Postgresql 8.3.8 and 8.4.1




-----Ursprüngliche Nachricht-----
Von:    Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
Gesendet:    Mo 07.12.2009 23:19
An:    Schmitz, David; pgsql-performance@postgresql.org
Cc:
Betreff:    Re: [PERFORM] performance penalty between Postgresql 8.3.8 and 8.4.1

"Schmitz, David" <david.schmitz@harman.com> wrote:

> It is carried out with poor performance on postgresql 8.4.1
> However postgresql 8.3.8 performs just fine.
> If you take a closer look at the query with EXPLAIN, it becomes
> obvious, that postgresql 8.4 does not consider the primary key at
> level 3 and instead generates a hash join:

> Postgresql 8.4.1:
>
> Sort  (cost=129346.71..129498.64 rows=60772 width=61)

> Postgresql 8.3.8:
>
> Sort  (cost=3792.75..3792.95 rows=81 width=61)

It determines the plan based on available statistics, which in this
case seem to indicate rather different data.  Do the two databases
have identical data?  Have they both been recently analyzed?  What
is the default_statistics_target on each?  Do any columns in these
tables have overrides?

-Kevin


Hello Kevin,

both databases have identical / same data and hardware. On postgresql 8.3.8 default statistics target is 10 and at
postgresql8.4.1 it is 100. But i have been experimenting in both directions with postgres 8.4.1 10, 100, 1000 or 10000
doesnot matter perfomance remains bad. Analyze has been run recently on both databases (even an explicit analayze
beforequery makes no difference). Autovaccuum and analyze are set quite aggressive at 0.01 (v) and 0.02 (a) and
postgres8.3.8 still outperforms 8.4.1. 

Regards

dave

*******************************************
innovative systems GmbH Navigation-Multimedia
Geschaeftsfuehrung: Edwin Summers - Michael Juergen Mauser
Sitz der Gesellschaft: Hamburg - Registergericht: Hamburg HRB 59980

*******************************************
Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat
sindoder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und loeschen Sie diese
Mail.Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. 
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have
receivedthis e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying,
disclosureor distribution of the contents in this e-mail is strictly forbidden. 
*******************************************

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
Andres Freund
Date:
Hi David,

On Monday 07 December 2009 23:05:14 Schmitz, David wrote:
> With our data it is a performance difference from 1h16min (8.3.8) to
>  2h43min (8.4.1)
Can you afford a explain analyze run overnight or so for both?

Andres

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
Robert Haas
Date:
On Mon, Dec 7, 2009 at 5:19 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> "Schmitz, David" <david.schmitz@harman.com> wrote:
>
>> It is carried out with poor performance on postgresql 8.4.1
>> However postgresql 8.3.8 performs just fine.
>> If you take a closer look at the query with EXPLAIN, it becomes
>> obvious, that postgresql 8.4 does not consider the primary key at
>> level 3 and instead generates a hash join:
>
>> Postgresql 8.4.1:
>>
>> Sort  (cost=129346.71..129498.64 rows=60772 width=61)
>
>> Postgresql 8.3.8:
>>
>> Sort  (cost=3792.75..3792.95 rows=81 width=61)
>
> It determines the plan based on available statistics, which in this
> case seem to indicate rather different data.  Do the two databases
> have identical data?  Have they both been recently analyzed?  What
> is the default_statistics_target on each?  Do any columns in these
> tables have overrides?

I think Tom made some changes to the join selectivity code which might
be relevant here, though I'm not sure exactly what's going on.  Can we
see, on the 8.4.1 database:

SELECT SUM(1) FROM rdf_admin_hierarchy;
SELECT s.stadistinct, s.stanullfrac, s.stawidth,
array_upper(s.stanumbers1, 1) FROM pg_statistic s WHERE s.starelid =
'rdf_admin_hierarchy'::regclass AND s.staattnum = (SELECT a.attnum
FROM pg_attribute a WHERE a.attname = 'admin_place_id' AND a.attrelid
= 'rdf_admin_hierarchy'::regclass);

...Robert

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
"Schmitz, David"
Date:
Hi Andres,

This query returns for 8.4.1 and for 8.3.8 the same result:

stadistinct = -1
stanullfrac = 0
stawidth = 4
array_upper nothing

Regards

David

>-----Ursprüngliche Nachricht-----
>Von: Robert Haas [mailto:robertmhaas@gmail.com]
>Gesendet: Dienstag, 8. Dezember 2009 05:05
>An: Kevin Grittner
>Cc: Schmitz, David; pgsql-performance@postgresql.org
>Betreff: Re: [PERFORM] performance penalty between Postgresql
>8.3.8 and 8.4.1
>
>On Mon, Dec 7, 2009 at 5:19 PM, Kevin Grittner
><Kevin.Grittner@wicourts.gov> wrote:
>> "Schmitz, David" <david.schmitz@harman.com> wrote:
>>
>>> It is carried out with poor performance on postgresql 8.4.1 However
>>> postgresql 8.3.8 performs just fine.
>>> If you take a closer look at the query with EXPLAIN, it becomes
>>> obvious, that postgresql 8.4 does not consider the primary key at
>>> level 3 and instead generates a hash join:
>>
>>> Postgresql 8.4.1:
>>>
>>> Sort  (cost=129346.71..129498.64 rows=60772 width=61)
>>
>>> Postgresql 8.3.8:
>>>
>>> Sort  (cost=3792.75..3792.95 rows=81 width=61)
>>
>> It determines the plan based on available statistics, which in this
>> case seem to indicate rather different data.  Do the two databases
>> have identical data?  Have they both been recently analyzed?
> What is
>> the default_statistics_target on each?  Do any columns in
>these tables
>> have overrides?
>
>I think Tom made some changes to the join selectivity code
>which might be relevant here, though I'm not sure exactly
>what's going on.  Can we see, on the 8.4.1 database:
>
>SELECT SUM(1) FROM rdf_admin_hierarchy;
>SELECT s.stadistinct, s.stanullfrac, s.stawidth,
>array_upper(s.stanumbers1, 1) FROM pg_statistic s WHERE
>s.starelid = 'rdf_admin_hierarchy'::regclass AND s.staattnum =
>(SELECT a.attnum FROM pg_attribute a WHERE a.attname =
>'admin_place_id' AND a.attrelid = 'rdf_admin_hierarchy'::regclass);
>
>...Robert
>

*******************************************
innovative systems GmbH Navigation-Multimedia
Geschaeftsfuehrung: Edwin Summers - Michael Juergen Mauser
Sitz der Gesellschaft: Hamburg - Registergericht: Hamburg HRB 59980

*******************************************
Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat
sindoder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und loeschen Sie diese
Mail.Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. 
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have
receivedthis e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying,
disclosureor distribution of the contents in this e-mail is strictly forbidden. 
*******************************************

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
"Schmitz, David"
Date:
Hi Andres,

EXPLAIN ANALYZE
select ah.ORDER8_ID, ah.BUILTUP_ID, rl.LINK_ID, la.SIDE,
                    rl.ROAD_NAME_ID, rl.LEFT_ADDRESS_RANGE_ID, rl.RIGHT_ADDRESS_RANGE_ID,
                    rl.IS_EXIT_NAME, rl.EXPLICATABLE, rl.IS_JUNCTION_NAME,
                    rl.IS_NAME_ON_ROADSIGN, rl.IS_POSTAL_NAME, rl.IS_STALE_NAME,
                    rl.IS_VANITY_NAME, rl.ROAD_LINK_ID, rn.STREET_NAME,
                    rn.ROUTE_TYPE
                from rdf.xdf_ADMIN_HIERARCHY ah
                join xdf.xdf_LINK_ADMIN la
                on ah.ADMIN_PLACE_ID = la.ADMIN_PLACE_ID
                join xdf.xdf_ROAD_LINK rl
                on la.LINK_ID = rl.LINK_ID
                join xdf.xdf_ROAD_NAME rn
                on rl.ROAD_NAME_ID = rn.ROAD_NAME_ID
                where rl.IS_EXIT_NAME = 'N'
                    and rl.IS_JUNCTION_NAME = 'N'
                    and rn.ROAD_NAME_ID between 158348561  and 158348660
                order by rl.ROAD_NAME_ID, ah.ORDER8_ID, ah.BUILTUP_ID, rl.LINK_ID;

On Postgresql 8.4.1

Sort  (cost=129346.71..129498.64 rows=60772 width=61) (actual time=100.358..100.496 rows=1444 loops=1)
  Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
  Sort Method:  quicksort  Memory: 252kB
  ->  Hash Join  (cost=2603.57..124518.03 rows=60772 width=61) (actual time=62.359..97.268 rows=1444 loops=1)
        Hash Cond: (la.admin_place_id = ah.admin_place_id)
        ->  Nested Loop  (cost=6.82..120781.81 rows=60772 width=57) (actual time=0.318..33.600 rows=1444 loops=1)
              ->  Nested Loop  (cost=6.82..72383.98 rows=21451 width=51) (actual time=0.232..12.359 rows=722 loops=1)
                    ->  Index Scan using pk_xdf_road_name on xdf_road_name rn  (cost=0.00..11.24 rows=97 width=21)
(actualtime=0.117..0.185 rows=100 loops=1) 
                          Index Cond: ((road_name_id >= 158348561) AND (road_name_id <= 158348660))
                    ->  Bitmap Heap Scan on xdf_road_link rl  (cost=6.82..743.34 rows=222 width=34) (actual
time=0.025..0.115rows=7 loops=100) 
                          Recheck Cond: (rl.road_name_id = rn.road_name_id)
                          Filter: ((rl.is_exit_name = 'N'::bpchar) AND (rl.is_junction_name = 'N'::bpchar))
                          ->  Bitmap Index Scan on nx_xdfroadlink_roadnameid  (cost=0.00..6.76 rows=222 width=0)
(actualtime=0.008..0.008 rows=7 loops=100) 
                                Index Cond: (rl.road_name_id = rn.road_name_id)
              ->  Index Scan using nx_xdflinkadmin_linkid on xdf_link_admin la  (cost=0.00..2.22 rows=3 width=10)
(actualtime=0.023..0.028 rows=2 loops=722) 
                    Index Cond: (la.link_id = rl.link_id)
        ->  Hash  (cost=1544.11..1544.11 rows=84211 width=12) (actual time=61.924..61.924 rows=84211 loops=1)
              ->  Seq Scan on xdf_admin_hierarchy ah  (cost=0.00..1544.11 rows=84211 width=12) (actual
time=0.017..33.442rows=84211 loops=1) 
Total runtime: 101.446 ms


and on Postgresql  8.3.8:

Sort  (cost=3792.75..3792.95 rows=81 width=61) (actual time=28.928..29.074 rows=1444 loops=1)
  Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
  Sort Method:  quicksort  Memory: 252kB
  ->  Nested Loop  (cost=21.00..3790.18 rows=81 width=61) (actual time=0.210..26.098 rows=1444 loops=1)
        ->  Nested Loop  (cost=21.00..3766.73 rows=81 width=57) (actual time=0.172..19.148 rows=1444 loops=1)
              ->  Nested Loop  (cost=21.00..3733.04 rows=14 width=51) (actual time=0.129..6.126 rows=722 loops=1)
                    ->  Index Scan using pk_xdf_road_name on xdf_road_name rn  (cost=0.00..8.32 rows=1 width=21)
(actualtime=0.059..0.117 rows=100 loops=1) 
                          Index Cond: ((road_name_id >= 158348561) AND (road_name_id <= 158348660))
                    ->  Bitmap Heap Scan on xdf_road_link rl  (cost=21.00..3711.97 rows=1020 width=34) (actual
time=0.015..0.055rows=7 loops=100) 
                          Recheck Cond: (rl.road_name_id = rn.road_name_id)
                          Filter: ((rl.is_exit_name = 'N'::bpchar) AND (rl.is_junction_name = 'N'::bpchar))
                          ->  Bitmap Index Scan on nx_xdfroadlink_roadnameid  (cost=0.00..20.75 rows=1020 width=0)
(actualtime=0.007..0.007 rows=7 loops=100) 
                                Index Cond: (rl.road_name_id = rn.road_name_id)
              ->  Index Scan using nx_xdflinkadmin_linkid on xdf_link_admin la  (cost=0.00..2.31 rows=8 width=10)
(actualtime=0.014..0.017 rows=2 loops=722) 
                    Index Cond: (la.link_id = rl.link_id)
        ->  Index Scan using pk_xdf_admin_hierarchy on xdf_admin_hierarchy ah  (cost=0.00..0.28 rows=1 width=12)
(actualtime=0.003..0.004 rows=1 loops=1444) 
              Index Cond: (ah.admin_place_id = la.admin_place_id)
Total runtime: 29.366 ms

Hope this gives any clue. Or did I missunderstand you?

Regards

David


>-----Ursprüngliche Nachricht-----
>Von: Andres Freund [mailto:andres@anarazel.de]
>Gesendet: Dienstag, 8. Dezember 2009 00:25
>An: pgsql-performance@postgresql.org
>Cc: Schmitz, David
>Betreff: Re: [PERFORM] performance penalty between Postgresql
>8.3.8 and 8.4.1
>
>Hi David,
>
>On Monday 07 December 2009 23:05:14 Schmitz, David wrote:
>> With our data it is a performance difference from 1h16min
>(8.3.8) to
>> 2h43min (8.4.1)
>Can you afford a explain analyze run overnight or so for both?
>
>Andres
>

*******************************************
innovative systems GmbH Navigation-Multimedia
Geschaeftsfuehrung: Edwin Summers - Michael Juergen Mauser
Sitz der Gesellschaft: Hamburg - Registergericht: Hamburg HRB 59980

*******************************************
Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat
sindoder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und loeschen Sie diese
Mail.Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. 
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have
receivedthis e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying,
disclosureor distribution of the contents in this e-mail is strictly forbidden. 
*******************************************

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
Thom Brown
Date:
2009/12/8 Schmitz, David <david.schmitz@harman.com>
Hi Andres,

EXPLAIN ANALYZE
select ah.ORDER8_ID, ah.BUILTUP_ID, rl.LINK_ID, la.SIDE,
                   rl.ROAD_NAME_ID, rl.LEFT_ADDRESS_RANGE_ID, rl.RIGHT_ADDRESS_RANGE_ID,
                   rl.IS_EXIT_NAME, rl.EXPLICATABLE, rl.IS_JUNCTION_NAME,
                   rl.IS_NAME_ON_ROADSIGN, rl.IS_POSTAL_NAME, rl.IS_STALE_NAME,
                   rl.IS_VANITY_NAME, rl.ROAD_LINK_ID, rn.STREET_NAME,
                   rn.ROUTE_TYPE
               from rdf.xdf_ADMIN_HIERARCHY ah
               join xdf.xdf_LINK_ADMIN la
               on ah.ADMIN_PLACE_ID = la.ADMIN_PLACE_ID
               join xdf.xdf_ROAD_LINK rl
               on la.LINK_ID = rl.LINK_ID
               join xdf.xdf_ROAD_NAME rn
               on rl.ROAD_NAME_ID = rn.ROAD_NAME_ID
               where rl.IS_EXIT_NAME = 'N'
                   and rl.IS_JUNCTION_NAME = 'N'
                   and rn.ROAD_NAME_ID between 158348561  and 158348660
               order by rl.ROAD_NAME_ID, ah.ORDER8_ID, ah.BUILTUP_ID, rl.LINK_ID;

On Postgresql 8.4.1

Sort  (cost=129346.71..129498.64 rows=60772 width=61) (actual time=100.358..100.496 rows=1444 loops=1)
 Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
 Sort Method:  quicksort  Memory: 252kB
 ->  Hash Join  (cost=2603.57..124518.03 rows=60772 width=61) (actual time=62.359..97.268 rows=1444 loops=1)
       Hash Cond: (la.admin_place_id = ah.admin_place_id)
       ->  Nested Loop  (cost=6.82..120781.81 rows=60772 width=57) (actual time=0.318..33.600 rows=1444 loops=1)
             ->  Nested Loop  (cost=6.82..72383.98 rows=21451 width=51) (actual time=0.232..12.359 rows=722 loops=1)
                   ->  Index Scan using pk_xdf_road_name on xdf_road_name rn  (cost=0.00..11.24 rows=97 width=21) (actual time=0.117..0.185 rows=100 loops=1)
                         Index Cond: ((road_name_id >= 158348561) AND (road_name_id <= 158348660))
                   ->  Bitmap Heap Scan on xdf_road_link rl  (cost=6.82..743.34 rows=222 width=34) (actual time=0.025..0.115 rows=7 loops=100)
                         Recheck Cond: (rl.road_name_id = rn.road_name_id)
                         Filter: ((rl.is_exit_name = 'N'::bpchar) AND (rl.is_junction_name = 'N'::bpchar))
                         ->  Bitmap Index Scan on nx_xdfroadlink_roadnameid  (cost=0.00..6.76 rows=222 width=0) (actual time=0.008..0.008 rows=7 loops=100)
                               Index Cond: (rl.road_name_id = rn.road_name_id)
             ->  Index Scan using nx_xdflinkadmin_linkid on xdf_link_admin la  (cost=0.00..2.22 rows=3 width=10) (actual time=0.023..0.028 rows=2 loops=722)
                   Index Cond: (la.link_id = rl.link_id)
       ->  Hash  (cost=1544.11..1544.11 rows=84211 width=12) (actual time=61.924..61.924 rows=84211 loops=1)
             ->  Seq Scan on xdf_admin_hierarchy ah  (cost=0.00..1544.11 rows=84211 width=12) (actual time=0.017..33.442 rows=84211 loops=1)
Total runtime: 101.446 ms


and on Postgresql  8.3.8:

Sort  (cost=3792.75..3792.95 rows=81 width=61) (actual time=28.928..29.074 rows=1444 loops=1)
 Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
 Sort Method:  quicksort  Memory: 252kB
 ->  Nested Loop  (cost=21.00..3790.18 rows=81 width=61) (actual time=0.210..26.098 rows=1444 loops=1)
       ->  Nested Loop  (cost=21.00..3766.73 rows=81 width=57) (actual time=0.172..19.148 rows=1444 loops=1)
             ->  Nested Loop  (cost=21.00..3733.04 rows=14 width=51) (actual time=0.129..6.126 rows=722 loops=1)
                   ->  Index Scan using pk_xdf_road_name on xdf_road_name rn  (cost=0.00..8.32 rows=1 width=21) (actual time=0.059..0.117 rows=100 loops=1)
                         Index Cond: ((road_name_id >= 158348561) AND (road_name_id <= 158348660))
                   ->  Bitmap Heap Scan on xdf_road_link rl  (cost=21.00..3711.97 rows=1020 width=34) (actual time=0.015..0.055 rows=7 loops=100)
                         Recheck Cond: (rl.road_name_id = rn.road_name_id)
                         Filter: ((rl.is_exit_name = 'N'::bpchar) AND (rl.is_junction_name = 'N'::bpchar))
                         ->  Bitmap Index Scan on nx_xdfroadlink_roadnameid  (cost=0.00..20.75 rows=1020 width=0) (actual time=0.007..0.007 rows=7 loops=100)
                               Index Cond: (rl.road_name_id = rn.road_name_id)
             ->  Index Scan using nx_xdflinkadmin_linkid on xdf_link_admin la  (cost=0.00..2.31 rows=8 width=10) (actual time=0.014..0.017 rows=2 loops=722)
                   Index Cond: (la.link_id = rl.link_id)
       ->  Index Scan using pk_xdf_admin_hierarchy on xdf_admin_hierarchy ah  (cost=0.00..0.28 rows=1 width=12) (actual time=0.003..0.004 rows=1 loops=1444)
             Index Cond: (ah.admin_place_id = la.admin_place_id)
Total runtime: 29.366 ms

Hope this gives any clue. Or did I missunderstand you?

Regards

David


>-----Ursprüngliche Nachricht-----
>Von: Andres Freund [mailto:andres@anarazel.de]
>Gesendet: Dienstag, 8. Dezember 2009 00:25
>An: pgsql-performance@postgresql.org
>Cc: Schmitz, David
>Betreff: Re: [PERFORM] performance penalty between Postgresql
>8.3.8 and 8.4.1
>
>Hi David,
>
>On Monday 07 December 2009 23:05:14 Schmitz, David wrote:
>> With our data it is a performance difference from 1h16min
>(8.3.8) to
>> 2h43min (8.4.1)
>Can you afford a explain analyze run overnight or so for both?
>
>Andres
>




Your output shows that the xdf_admin_hierarchy tables between versions are drastically different.  8.3.8 only contains 1 row, whereas 8.4.1 contains 84211 rows.

Thom

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
"Schmitz, David"
Date:
Hi Thom,
 
I did a select count(*) from xdf.xdf_admin_hierarchy and it returns 84211 on both databases postgres 8.3.8 and 8.4.1.
The amount of data is exactly the same in both databases as they are restored from the same dump.
 
Regards
 
David


Von: Thom Brown [mailto:thombrown@gmail.com]
Gesendet: Dienstag, 8. Dezember 2009 11:12
An: Schmitz, David
Cc: Andres Freund; pgsql-performance@postgresql.org
Betreff: Re: [PERFORM] performance penalty between Postgresql 8.3.8 and 8.4.1

2009/12/8 Schmitz, David <david.schmitz@harman.com>
Hi Andres,

EXPLAIN ANALYZE
select ah.ORDER8_ID, ah.BUILTUP_ID, rl.LINK_ID, la.SIDE,
                   rl.ROAD_NAME_ID, rl.LEFT_ADDRESS_RANGE_ID, rl.RIGHT_ADDRESS_RANGE_ID,
                   rl.IS_EXIT_NAME, rl.EXPLICATABLE, rl.IS_JUNCTION_NAME,
                   rl.IS_NAME_ON_ROADSIGN, rl.IS_POSTAL_NAME, rl.IS_STALE_NAME,
                   rl.IS_VANITY_NAME, rl.ROAD_LINK_ID, rn.STREET_NAME,
                   rn.ROUTE_TYPE
               from rdf.xdf_ADMIN_HIERARCHY ah
               join xdf.xdf_LINK_ADMIN la
               on ah.ADMIN_PLACE_ID = la.ADMIN_PLACE_ID
               join xdf.xdf_ROAD_LINK rl
               on la.LINK_ID = rl.LINK_ID
               join xdf.xdf_ROAD_NAME rn
               on rl.ROAD_NAME_ID = rn.ROAD_NAME_ID
               where rl.IS_EXIT_NAME = 'N'
                   and rl.IS_JUNCTION_NAME = 'N'
                   and rn.ROAD_NAME_ID between 158348561  and 158348660
               order by rl.ROAD_NAME_ID, ah.ORDER8_ID, ah.BUILTUP_ID, rl.LINK_ID;

On Postgresql 8.4.1

Sort  (cost=129346.71..129498.64 rows=60772 width=61) (actual time=100.358..100.496 rows=1444 loops=1)
 Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
 Sort Method:  quicksort  Memory: 252kB
 ->  Hash Join  (cost=2603.57..124518.03 rows=60772 width=61) (actual time=62.359..97.268 rows=1444 loops=1)
       Hash Cond: (la.admin_place_id = ah.admin_place_id)
       ->  Nested Loop  (cost=6.82..120781.81 rows=60772 width=57) (actual time=0.318..33.600 rows=1444 loops=1)
             ->  Nested Loop  (cost=6.82..72383.98 rows=21451 width=51) (actual time=0.232..12.359 rows=722 loops=1)
                   ->  Index Scan using pk_xdf_road_name on xdf_road_name rn  (cost=0.00..11.24 rows=97 width=21) (actual time=0.117..0.185 rows=100 loops=1)
                         Index Cond: ((road_name_id >= 158348561) AND (road_name_id <= 158348660))
                   ->  Bitmap Heap Scan on xdf_road_link rl  (cost=6.82..743.34 rows=222 width=34) (actual time=0.025..0.115 rows=7 loops=100)
                         Recheck Cond: (rl.road_name_id = rn.road_name_id)
                         Filter: ((rl.is_exit_name = 'N'::bpchar) AND (rl.is_junction_name = 'N'::bpchar))
                         ->  Bitmap Index Scan on nx_xdfroadlink_roadnameid  (cost=0.00..6.76 rows=222 width=0) (actual time=0.008..0.008 rows=7 loops=100)
                               Index Cond: (rl.road_name_id = rn.road_name_id)
             ->  Index Scan using nx_xdflinkadmin_linkid on xdf_link_admin la  (cost=0.00..2.22 rows=3 width=10) (actual time=0.023..0.028 rows=2 loops=722)
                   Index Cond: (la.link_id = rl.link_id)
       ->  Hash  (cost=1544.11..1544.11 rows=84211 width=12) (actual time=61.924..61.924 rows=84211 loops=1)
             ->  Seq Scan on xdf_admin_hierarchy ah  (cost=0.00..1544.11 rows=84211 width=12) (actual time=0.017..33.442 rows=84211 loops=1)
Total runtime: 101.446 ms


and on Postgresql  8.3.8:

Sort  (cost=3792.75..3792.95 rows=81 width=61) (actual time=28.928..29.074 rows=1444 loops=1)
 Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
 Sort Method:  quicksort  Memory: 252kB
 ->  Nested Loop  (cost=21.00..3790.18 rows=81 width=61) (actual time=0.210..26.098 rows=1444 loops=1)
       ->  Nested Loop  (cost=21.00..3766.73 rows=81 width=57) (actual time=0.172..19.148 rows=1444 loops=1)
             ->  Nested Loop  (cost=21.00..3733.04 rows=14 width=51) (actual time=0.129..6.126 rows=722 loops=1)
                   ->  Index Scan using pk_xdf_road_name on xdf_road_name rn  (cost=0.00..8.32 rows=1 width=21) (actual time=0.059..0.117 rows=100 loops=1)
                         Index Cond: ((road_name_id >= 158348561) AND (road_name_id <= 158348660))
                   ->  Bitmap Heap Scan on xdf_road_link rl  (cost=21.00..3711.97 rows=1020 width=34) (actual time=0.015..0.055 rows=7 loops=100)
                         Recheck Cond: (rl.road_name_id = rn.road_name_id)
                         Filter: ((rl.is_exit_name = 'N'::bpchar) AND (rl.is_junction_name = 'N'::bpchar))
                         ->  Bitmap Index Scan on nx_xdfroadlink_roadnameid  (cost=0.00..20.75 rows=1020 width=0) (actual time=0.007..0.007 rows=7 loops=100)
                               Index Cond: (rl.road_name_id = rn.road_name_id)
             ->  Index Scan using nx_xdflinkadmin_linkid on xdf_link_admin la  (cost=0.00..2.31 rows=8 width=10) (actual time=0.014..0.017 rows=2 loops=722)
                   Index Cond: (la.link_id = rl.link_id)
       ->  Index Scan using pk_xdf_admin_hierarchy on xdf_admin_hierarchy ah  (cost=0.00..0.28 rows=1 width=12) (actual time=0.003..0.004 rows=1 loops=1444)
             Index Cond: (ah.admin_place_id = la.admin_place_id)
Total runtime: 29.366 ms

Hope this gives any clue. Or did I missunderstand you?

Regards

David


>-----Ursprüngliche Nachricht-----
>Von: Andres Freund [mailto:andres@anarazel.de]
>Gesendet: Dienstag, 8. Dezember 2009 00:25
>An: pgsql-performance@postgresql.org
>Cc: Schmitz, David
>Betreff: Re: [PERFORM] performance penalty between Postgresql
>8.3.8 and 8.4.1
>
>Hi David,
>
>On Monday 07 December 2009 23:05:14 Schmitz, David wrote:
>> With our data it is a performance difference from 1h16min
>(8.3.8) to
>> 2h43min (8.4.1)
>Can you afford a explain analyze run overnight or so for both?
>
>Andres
>




Your output shows that the xdf_admin_hierarchy tables between versions are drastically different.  8.3.8 only contains 1 row, whereas 8.4.1 contains 84211 rows.

Thom

 
*******************************************
innovative systems GmbH Navigation-Multimedia
Geschaeftsfuehrung: Edwin Summers - Michael Juergen Mauser
Sitz der Gesellschaft: Hamburg - Registergericht: Hamburg HRB 59980
 
*******************************************
Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und loeschen Sie diese Mail. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet.
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the contents in this e-mail is strictly forbidden.
*******************************************
 

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
Andres Freund
Date:
Hi David,

On Tuesday 08 December 2009 10:59:51 Schmitz, David wrote:
> >> With our data it is a performance difference from 1h16min
> >> (8.3.8) to 2h43min (8.4.1)
> On Postgresql 8.4.1
> Total runtime: 101.446 ms
> and on Postgresql  8.3.8:
> Total runtime: 29.366 ms
Hm. There obviously is more going on than these queries?

> Hash Join  (cost=2603.57..124518.03 rows=60772 width=61) (actual
time=62.359..97.268 rows=1444 loops=1)
> Nested Loop  (cost=21.00..3790.18 rows=81 width=61) (actual
time=0.210..26.098 rows=1444 loops=1)
Both misestimate the resultset quite a bit. It looks like happenstance that
the one on 8.3 turns out to be better...

Andres

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
"Schmitz, David"
Date:
Hi Andres,

this is just one of many of these queries. There are a lot of jobs calculating
stuff for different ranges which are defined via between in the where clause.


When I leave out the between in the where clause it returns:

On Postgresql 8.4.1:

Sort  (cost=5390066.42..5435347.78 rows=18112546 width=61) (actual time=84382.275..91367.983 rows=12742796 loops=1)
Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
Sort Method:  external merge  Disk: 924536kB
->  Hash Join  (cost=1082249.40..2525563.48 rows=18112546 width=61) (actual time=23367.205..52256.209 rows=12742796
loops=1)
Hash Cond: (la.admin_place_id = ah.admin_place_id)
->  Merge Join  (cost=1079652.65..2183356.50 rows=18112546 width=57) (actual time=23306.643..45541.157 rows=12742796
loops=1)
      Merge Cond: (la.link_id = rl.link_id)
      ->  Index Scan using nx_xdflinkadmin_linkid on xdf_link_admin la  (cost=0.00..798398.53 rows=16822372 width=10)
(actualtime=0.098..12622.576 rows=16822399 loops=1) 
      ->  Sort  (cost=1071304.95..1087287.81 rows=6393147 width=51) (actual time=23302.596..25640.559 rows=12742795
loops=1)
        Sort Key: rl.link_id
        Sort Method:  external sort  Disk: 405896kB
        ->  Hash Join  (cost=15735.91..348620.58 rows=6393147 width=51) (actual time=327.064..9189.938 rows=6371398
loops=1)
          Hash Cond: (rl.road_name_id = rn.road_name_id)
          ->  Seq Scan on xdf_road_link rl  (cost=0.00..182236.41 rows=7708159 width=34) (actual time=0.028..2689.085
rows=7709085loops=1) 
            Filter: ((is_exit_name = 'N'::bpchar) AND (is_junction_name = 'N'::bpchar))
          ->  Hash  (cost=9885.96..9885.96 rows=467996 width=21) (actual time=326.740..326.740 rows=467996 loops=1)
            ->  Seq Scan on xdf_road_name rn  (cost=0.00..9885.96 rows=467996 width=21) (actual time=0.019..191.473
rows=467996loops=1) 
->  Hash  (cost=1544.11..1544.11 rows=84211 width=12) (actual time=60.453..60.453 rows=84211 loops=1)
      ->  Seq Scan on xdf_admin_hierarchy ah  (cost=0.00..1544.11 rows=84211 width=12) (actual time=0.019..31.723
rows=84211loops=1) 
Total runtime: 92199.676 ms

On Postgresql 8.3.8:

Sort  (cost=9419546.57..9514635.57 rows=38035597 width=61) (actual time=82790.473..88847.963 rows=12742796 loops=1)
  Sort Key: rl.road_name_id, ah.order8_id, ah.builtup_id, rl.link_id
  Sort Method:  external merge  Disk: 999272kB
  ->  Hash Join  (cost=1079404.97..3200652.85 rows=38035597 width=61) (actual time=22583.059..51197.249 rows=12742796
loops=1)
        Hash Cond: (la.admin_place_id = ah.admin_place_id)
        ->  Merge Join  (cost=1076808.22..2484888.66 rows=38035597 width=57) (actual time=22524.015..44539.246
rows=12742796loops=1) 
              Merge Cond: (la.link_id = rl.link_id)
              ->  Index Scan using nx_xdflinkadmin_linkid on xdf_link_admin la  (cost=0.00..795583.17 rows=16822420
width=10)(actual time=0.086..11725.990 rows=16822399 loops=1) 
              ->  Sort  (cost=1076734.49..1092821.79 rows=6434920 width=51) (actual time=22514.553..25083.253
rows=12742795loops=1) 
                    Sort Key: rl.link_id
                    Sort Method:  external sort  Disk: 443264kB
                    ->  Hash Join  (cost=15743.47..349025.77 rows=6434920 width=51) (actual time=330.211..9014.353
rows=6371398loops=1) 
                          Hash Cond: (rl.road_name_id = rn.road_name_id)
                          ->  Seq Scan on xdf_road_link rl  (cost=0.00..182235.08 rows=7706491 width=34) (actual
time=0.018..2565.983rows=7709085 loops=1) 
                                Filter: ((is_exit_name = 'N'::bpchar) AND (is_junction_name = 'N'::bpchar))
                          ->  Hash  (cost=9890.43..9890.43 rows=468243 width=21) (actual time=329.906..329.906
rows=467996loops=1) 
                                ->  Seq Scan on xdf_road_name rn  (cost=0.00..9890.43 rows=468243 width=21) (actual
time=0.018..190.764rows=467996 loops=1) 
        ->  Hash  (cost=1544.11..1544.11 rows=84211 width=12) (actual time=58.910..58.910 rows=84211 loops=1)
              ->  Seq Scan on xdf_admin_hierarchy ah  (cost=0.00..1544.11 rows=84211 width=12) (actual
time=0.009..28.725rows=84211 loops=1) 
Total runtime: 89612.801 ms

Regards

David

>-----Ursprüngliche Nachricht-----
>Von: Andres Freund [mailto:andres@anarazel.de]
>Gesendet: Dienstag, 8. Dezember 2009 11:29
>An: pgsql-performance@postgresql.org
>Cc: Schmitz, David
>Betreff: Re: [PERFORM] performance penalty between Postgresql
>8.3.8 and 8.4.1
>
>Hi David,
>
>On Tuesday 08 December 2009 10:59:51 Schmitz, David wrote:
>> >> With our data it is a performance difference from 1h16min
>> >> (8.3.8) to 2h43min (8.4.1)
>> On Postgresql 8.4.1
>> Total runtime: 101.446 ms
>> and on Postgresql  8.3.8:
>> Total runtime: 29.366 ms
>Hm. There obviously is more going on than these queries?
>
>> Hash Join  (cost=2603.57..124518.03 rows=60772 width=61) (actual
>time=62.359..97.268 rows=1444 loops=1)
>> Nested Loop  (cost=21.00..3790.18 rows=81 width=61) (actual
>time=0.210..26.098 rows=1444 loops=1)
>Both misestimate the resultset quite a bit. It looks like
>happenstance that the one on 8.3 turns out to be better...
>
>Andres
>

*******************************************
innovative systems GmbH Navigation-Multimedia
Geschaeftsfuehrung: Edwin Summers - Michael Juergen Mauser
Sitz der Gesellschaft: Hamburg - Registergericht: Hamburg HRB 59980

*******************************************
Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat
sindoder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und loeschen Sie diese
Mail.Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. 
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have
receivedthis e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying,
disclosureor distribution of the contents in this e-mail is strictly forbidden. 
*******************************************

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
Craig Ringer
Date:
On 8/12/2009 6:11 PM, Thom Brown wrote:

> Your output shows that the xdf_admin_hierarchy tables between versions
> are drastically different.  8.3.8 only contains 1 row, whereas 8.4.1
> contains 84211 rows.

That's just because one of them is doing a nested loop where it looks up
a single row from xdf_admin_hierarchy via its primary key on each
iteration. The other plan is doing a hash join on a sequential scan over
xdf_admin_hierarchy so it reports all the rows at once.

--
Craig Ringer

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
"Schmitz, David"
Date:
Hi Craig,

that is exactly the problem postgresql 8.4.1 does not consider the primary key but instead calculates
a hash join. This can only result in poorer performance. I think this is a bug.

Regards

David

>-----Ursprüngliche Nachricht-----
>Von: Craig Ringer [mailto:craig@postnewspapers.com.au]
>Gesendet: Dienstag, 8. Dezember 2009 13:12
>An: Thom Brown
>Cc: Schmitz, David; Andres Freund; pgsql-performance@postgresql.org
>Betreff: Re: [PERFORM] performance penalty between Postgresql
>8.3.8 and 8.4.1
>
>On 8/12/2009 6:11 PM, Thom Brown wrote:
>
>> Your output shows that the xdf_admin_hierarchy tables
>between versions
>> are drastically different.  8.3.8 only contains 1 row, whereas 8.4.1
>> contains 84211 rows.
>
>That's just because one of them is doing a nested loop where
>it looks up a single row from xdf_admin_hierarchy via its
>primary key on each iteration. The other plan is doing a hash
>join on a sequential scan over xdf_admin_hierarchy so it
>reports all the rows at once.
>
>--
>Craig Ringer
>

*******************************************
innovative systems GmbH Navigation-Multimedia
Geschaeftsfuehrung: Edwin Summers - Michael Juergen Mauser
Sitz der Gesellschaft: Hamburg - Registergericht: Hamburg HRB 59980

*******************************************
Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat
sindoder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und loeschen Sie diese
Mail.Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. 
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have
receivedthis e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying,
disclosureor distribution of the contents in this e-mail is strictly forbidden. 
*******************************************

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
Robert Haas
Date:
On Tue, Dec 8, 2009 at 7:12 AM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> On 8/12/2009 6:11 PM, Thom Brown wrote:
>
>> Your output shows that the xdf_admin_hierarchy tables between versions
>> are drastically different.  8.3.8 only contains 1 row, whereas 8.4.1
>> contains 84211 rows.
>
> That's just because one of them is doing a nested loop where it looks up a
> single row from xdf_admin_hierarchy via its primary key on each iteration.
> The other plan is doing a hash join on a sequential scan over
> xdf_admin_hierarchy so it reports all the rows at once.

I've been meaning to write a patch to show the places after the
decimal point in that case.  Rounding off to an integer is horribly
misleading and obscures what is really going on.  Although in this
case maybe it would come out 1.000 anyway.

...Robert

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
Robert Haas
Date:
On Tue, Dec 8, 2009 at 8:27 AM, Schmitz, David <david.schmitz@harman.com> wrot
> that is exactly the problem postgresql 8.4.1 does not consider the primary key but instead calculates
> a hash join. This can only result in poorer performance. I think this is a bug.

Your statement that "this can only result in poorer performance" is
flat wrong.  Just because there's a primary key doesn't mean that an
inner-indexscan plan is fastest.  Frequently a hash join is faster.  I
can think of a couple of possible explanations for the behavior you're
seeing:

- Something could be blocking PostgreSQL from using that index at all.
 If you do EXPLAIN SELECT * FROM xdf_admin_hierarchy WHERE
admin_place_id = <some particular value>, does it use the index or
seq-scan the table?

- The index on your 8.4.1 system might be bloated.  You could perhaps
SELECT reltuples FROM pg_class WHERE oid =
'pk_xdf_admin_hierarchy'::regclass on both systems to see if one index
is larger than the other.

- You might have changed the value of the work_mem parameter on one
system vs. the other.  Try "show work_mem;" on each system and see
what you get.

If it's none of those things, it's could be the result of a code
change, but I'm at a loss to think of which one would apply in this
case.  I suppose we could do a bisection search but that's a lot of
work for you.  If you could extract a reproducible test case (complete
with data) that would allow someone else to try to track it down.

...Robert

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I can think of a couple of possible explanations for the behavior you're
> seeing:

The reason it's switching from a nestloop to something else is pretty
obvious: the estimate of the number of rows coming out of the lower
join has gone from 81 to 60772.  Neither of which is real accurate :-(,
but the larger value pretty strongly discourages using a nestloop.

The estimates for the individual scans mostly seem to be better than
before, in the case of xdf_road_name far better: 97 vs 1, against a true
value of 100.  So that's good; I suspect though that it just comes from
the increase in default stats target and doesn't reflect any logic
change.  The bottom line though is that it's gone from a considerable
underestimate of the join size to a considerable overestimate, and that
pushes it to use a different plan that turns out to be inferior.

I don't see any fixable bug here.  This is just a corner case where
the inherent inaccuracies in join size estimation went wrong for us;
but for every one of those there's another one where we'd get the
right answer for the wrong reason.

One thing that might be worth considering is to try to improve the
accuracy of this rowcount estimate:

                  ->  Bitmap Heap Scan on xdf_road_link rl  (cost=6.82..743.34 rows=222 width=34) (actual
time=0.025..0.115rows=7 loops=100) 
                          Recheck Cond: (rl.road_name_id = rn.road_name_id)
                          Filter: ((rl.is_exit_name = 'N'::bpchar) AND (rl.is_junction_name = 'N'::bpchar))
                          ->  Bitmap Index Scan on nx_xdfroadlink_roadnameid  (cost=0.00..6.76 rows=222 width=0)
(actualtime=0.008..0.008 rows=7 loops=100) 
                                Index Cond: (rl.road_name_id = rn.road_name_id)

I think a large part of the inaccuracy here has to do with not having
good stats for the joint effect of the is_exit_name and is_junction_name
conditions.  But to be frank that looks like bad schema design.
Consider merging those and any related flags into one "entry type"
column.

            regards, tom lane

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
"Schmitz, David"
Date:
Hi Robert,

unfortunatley its non of the things :-( see below:

- EXPLAIN SELECT * FROM xdf.xdf_admin_hierarchy
  WHERE admin_place_id = 150738434

  On Postgresql 8.4.1 and 8.3.8
  Index Scan using pk_rdf_admin_hierarchy on rdf_admin_hierarchy  (cost=0.00..8.28 rows=1 width=34)
  Index Cond: (admin_place_id = 150738434)

- SELECT reltuples FROM pg_class WHERE oid = 'pk_xdf_admin_hierarchy'::regclass
  returns 84211 on postgresql 8.4.1 and 8.3.8

- work_mem is 512MB on both systems

- unfortunately I can not hand out any data because of legal issues so we will have to
  do further debugging if necessary

So how should we proceed with this issue?

Regards

David


>-----Ursprüngliche Nachricht-----
>Von: Robert Haas [mailto:robertmhaas@gmail.com]
>Gesendet: Dienstag, 8. Dezember 2009 16:14
>An: Schmitz, David
>Cc: Craig Ringer; Thom Brown; Andres Freund;
>pgsql-performance@postgresql.org
>Betreff: Re: [PERFORM] performance penalty between Postgresql
>8.3.8 and 8.4.1
>
>On Tue, Dec 8, 2009 at 8:27 AM, Schmitz, David
><david.schmitz@harman.com> wrot
>> that is exactly the problem postgresql 8.4.1 does not consider the
>> primary key but instead calculates a hash join. This can
>only result in poorer performance. I think this is a bug.
>
>Your statement that "this can only result in poorer
>performance" is flat wrong.  Just because there's a primary
>key doesn't mean that an inner-indexscan plan is fastest.
>Frequently a hash join is faster.  I can think of a couple of
>possible explanations for the behavior you're
>seeing:
>
>- Something could be blocking PostgreSQL from using that index at all.
> If you do EXPLAIN SELECT * FROM xdf_admin_hierarchy WHERE
>admin_place_id = <some particular value>, does it use the
>index or seq-scan the table?
>
>- The index on your 8.4.1 system might be bloated.  You could
>perhaps SELECT reltuples FROM pg_class WHERE oid =
>'pk_xdf_admin_hierarchy'::regclass on both systems to see if
>one index is larger than the other.
>
>- You might have changed the value of the work_mem parameter
>on one system vs. the other.  Try "show work_mem;" on each
>system and see what you get.
>
>If it's none of those things, it's could be the result of a
>code change, but I'm at a loss to think of which one would
>apply in this case.  I suppose we could do a bisection search
>but that's a lot of work for you.  If you could extract a
>reproducible test case (complete with data) that would allow
>someone else to try to track it down.
>
>...Robert
>

*******************************************
innovative systems GmbH Navigation-Multimedia
Geschaeftsfuehrung: Edwin Summers - Michael Juergen Mauser
Sitz der Gesellschaft: Hamburg - Registergericht: Hamburg HRB 59980

*******************************************
Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen. Wenn Sie nicht der richtige Adressat
sindoder diese E-Mail irrtuemlich erhalten haben, informieren Sie bitte sofort den Absender und loeschen Sie diese
Mail.Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. 
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have
receivedthis e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying,
disclosureor distribution of the contents in this e-mail is strictly forbidden. 
*******************************************

Re: performance penalty between Postgresql 8.3.8 and 8.4.1

From
Robert Haas
Date:
On Tue, Dec 8, 2009 at 11:07 AM, Schmitz, David
<david.schmitz@harman.com> wrote:
> So how should we proceed with this issue?

I think Tom nailed it.

...Robert