good experience with performance in 8.2 for multi column indexes - Mailing list pgsql-general

From Michael Enke
Subject good experience with performance in 8.2 for multi column indexes
Date
Msg-id 48076830.3040301@wincor-nixdorf.com
Whole thread Raw
Responses Re: good experience with performance in 8.2 for multi column indexes  (Thomas Markus <t.markus@proventis.net>)
Re: good experience with performance in 8.2 for multi column indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi lists,
I want to let you take part in my experience of performance boost for delete operations
where more than one column is part of a primary key.

For my setup, in 8.1 a delete query which deletes 200000 entries depending on rows in another table
runs about 7h, in 8.2 (and later) it runs 9s!

I have two tables looking exactly the same, with two columns in the pk, one varchar(20) and one char(1).
Both tables contain the same contents.

Explain produces the following difference:
8.1:

tplinux=> explain delete from pluext1 where (plunmbr,pluexttype) in (select plunmbr,pluexttype from pluext2);
                                        QUERY PLAN
----------------------------------------------------------------------------------------
  Hash Join  (cost=24267.10..155886.35 rows=48236 width=6)
    Hash Cond: ("outer".pluexttype = "inner".pluexttype)
    Join Filter: ("outer".plunmbr = "inner".plunmbr)
    ->  Seq Scan on pluext1  (cost=0.00..6945.00 rows=138900 width=46)
    ->  Hash  (cost=24116.37..24116.37 rows=13891 width=40)
          ->  Unique  (cost=23074.62..24116.37 rows=13891 width=40)
                ->  Sort  (cost=23074.62..23421.87 rows=138900 width=40)
                      Sort Key: pluext2.plunmbr, pluext2.pluexttype
                      ->  Seq Scan on pluext2  (cost=0.00..6945.00 rows=138900 width=40)
(9 rows)

(this delete took 7h)

8.2:

tplinux=> explain delete from pluext1 where (plunmbr,pluexttype) in (select plunmbr,pluexttype from pluext2);
                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------
  Nested Loop IN Join  (cost=0.00..13362.14 rows=41106 width=6)
    ->  Seq Scan on pluext1  (cost=0.00..6411.25 rows=128225 width=46)
    ->  Index Scan using pluext2_pk on pluext2  (cost=0.00..0.50 rows=3 width=40)
          Index Cond: ((pluext1.plunmbr = pluext2.plunmbr) AND (pluext2.pluexttype = pluext1.pluexttype))
(4 rows)

(this delete took 9s)

I could not find an explanation for this in the release notes for 8.2,
I thought it was much earlier that multi column indexes could be used.
Anyway, it saved my life that new version is fast. Many thanks to the developer!

BTW I do not understand the output of the 8.2 explain:
 From my understanding it should do a seq scan on pluext2 (and not pluext1)
and an index scan on pluext1 (and not pluext2).

Regards,
Michael

--
Wincor Nixdorf International GmbH
Sitz der Gesellschaft: Paderborn
Registergericht Paderborn HRB 3507
Gesch�ftsf�hrer: Eckard Heidloff (Vorsitzender), Stefan Auerbach, Dr. J�rgen Wunram
Vorsitzender des Aufsichtsrats: Karl-Heinz Stiller
Steuernummer: 339/5884/0020 - Ust-ID Nr.: DE812927716 - WEEE-Reg.-Nr. DE44477193

Diese E-Mail enth�lt vertrauliche Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrt�mlich
erhaltenhaben, informieren Sie bitte sofort den Absender und vernichten Sie diese E-Mail. Das unerlaubte Kopieren sowie
dieunbefugte Weitergabe dieser E-Mail ist nicht gestattet. 

This e-mail may contain confidential information. If you are not the intended recipient (or have received this e-mail
inerror) please notify the sender immediately and destroy this e-mail. Any unauthorised copying, disclosure or
distributionof the material in this e-mail is strictly forbidden. 


pgsql-general by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: How to recovery data from folder data installation?
Next
From: Paul Boddie
Date:
Subject: Re: Which Python library - psycopg2 or pygresql?