Re: Query slower on 8.0.3 (Windows) vs 7.3 (cygwin) - Mailing list pgsql-performance

From Gurpreet Aulakh
Subject Re: Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)
Date
Msg-id IGEGINLEEDLANNHAAKCMIECGCCAA.gaulakh@ecmarket.com
Whole thread Raw
In response to Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)  ("Gurpreet Aulakh" <gaulakh@ecmarket.com>)
Responses Re: Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)
List pgsql-performance
I have started to break my query down and analyze each piece.
What I have discovered is very interesting.

First here is a small piece of my query.

EXPLAIN ANALYZE SELECT doc.doc_documentid FROM document AS doc
    LEFT JOIN document as root ON doc.doc_internalRootXref =
root.doc_documentId
    LEFT JOIN folder_document ON doc.doc_documentid =
folder_document.doc_documentId LIMIT 500 OFFSET 0

When I run this on Postgres 8.0.3 running under windows this is the result

QUERY PLAN
Limit  (cost=183.49..753.41 rows=500 width=40) (actual time=47.000..79.000
rows=500 loops=1)
  ->  Hash Left Join  (cost=183.49..6702.23 rows=5719 width=40) (actual
time=47.000..79.000 rows=500 loops=1)
        Hash Cond: (("outer".doc_documentid)::text =
("inner".doc_documentid)::text)
        ->  Merge Left Join  (cost=0.00..6432.96 rows=5719 width=40) (actual
time=0.000..16.000 rows=500 loops=1)
              Merge Cond: (("outer".doc_internalrootxref)::text =
("inner".doc_documentid)::text)
              ->  Index Scan using doc_internalrootxref_index on document
doc  (cost=0.00..3172.64 rows=5719 width=80) (actual time=0.000..0.000
rows=500 loops=1)
              ->  Index Scan using pk_document on document root
(cost=0.00..3174.53 rows=5719 width=40) (actual time=0.000..0.000 rows=863
loops=1)
        ->  Hash  (cost=169.19..169.19 rows=5719 width=40) (actual
time=47.000..47.000 rows=0 loops=1)
              ->  Seq Scan on folder_document  (cost=0.00..169.19 rows=5719
width=40) (actual time=0.000..16.000 rows=5719 loops=1)
Total runtime: 79.000 ms

Here is the result of running the same query on the Postgres 7.3 running
under Cygwin

QUERY PLAN
Limit  (cost=183.49..775.31 rows=500 width=160) (actual time=13.00..44.00
rows=500 loops=1)
  ->  Hash Join  (cost=183.49..6952.79 rows=5719 width=160) (actual
time=13.00..44.00 rows=501 loops=1)
        Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
        ->  Merge Join  (cost=0.00..6612.03 rows=5719 width=120) (actual
time=0.00..29.00 rows=775 loops=1)
              Merge Cond: ("outer".doc_internalrootxref =
"inner".doc_documentid)
              ->  Index Scan using doc_internalrootxref_index on document
doc  (cost=0.00..3254.39 rows=5719 width=80) (actual time=0.00..7.00
rows=775 loops=1)
              ->  Index Scan using pk_document on document root
(cost=0.00..3257.88 rows=5719 width=40) (actual time=0.00..15.00 rows=1265
loops=1)
        ->  Hash  (cost=169.19..169.19 rows=5719 width=40) (actual
time=12.00..12.00 rows=0 loops=1)
              ->  Seq Scan on folder_document  (cost=0.00..169.19 rows=5719
width=40) (actual time=0.00..9.00 rows=5719 loops=1)
Total runtime: 45.00 msec

What is really interesting is the time it takes for the Hash to occur. For
the first hash, on the 7.3 it takes only 12ms while on the 8.0 it takes
47ms.
Now the databases are created from the same data and I have run
vacuumdb -f -z on the databases.

Now I have read something on the archives that stated that perhaps the data
is in the filesystem (not database) cache. Would this be the case?. If so
how would I improve the  performance under WIN2K?

Anyone have any ideas?


-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Gurpreet
Aulakh
Sent: September 21, 2005 12:38 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)


I currently have a Postgres 7.3 database running under WIN2K using cygwin
and want to move to Postgres 8.0.3 (native windows version).
I am finding most simple queries are significantly faster on the native
windows version compared to 7.3 (under cygwin).
However, for a complex query, that involve multiple JOINs, the 7.3 version
is actually faster (about 2X faster).

The query that I am running was optimized to run under 7.3. It was
specifically modified to control the planner with explicit JOINs.
When I run the same query on the 8.0.3 version with the join_collapse_limit
set to 1 the query is slower.

Can someone tell me why setting the join_collapse_limit to 1 in the 8.0
version does not produce similar results to the 7.3 version?
Does anyone have any suggestions on what I can do? Do I have to rewrite the
query?


Here are the results of an explain analyze on the query.

Explain analyze Postgres 7.3 running on WIN2K using cygwin.

Hash Join  (cost=21808.27..1946264.80 rows=2982 width=1598) (actual
time=2186.00..2320.00 rows=50 loops=1)
   Hash Cond: ("outer".doc_internalparentomxref = "inner".doc_documentid)
   ->  Hash Join  (cost=20948.78..1945323.29 rows=2982 width=1534) (actual
time=2110.00..2227.00 rows=50 loops=1)
         Hash Cond: ("outer".doc_internalrootxref = "inner".doc_documentid)
         ->  Hash Join  (cost=20089.29..1944381.79 rows=2982 width=1484)
(actual time=2067.00..2179.00 rows=50 loops=1)
               Hash Cond: ("outer".doc_documentid = "inner".doc_documentid)
               Join Filter: ("inner".dc_doccontacttype = 'FROM'::character
varying)
               ->  Hash Join  (cost=7455.14..1928613.59 rows=2982
width=1138) (actual time=1216.00..1539.00 rows=50 loops=1)
                     Hash Cond: ("outer".doc_documentid =
"inner".doc_documentid)
                     Join Filter: ("inner".dc_doccontacttype =
'TO'::character varying)
                     ->  Hash Join  (cost=183.49..1918519.06 rows=2860
width=792) (actual time=64.00..301.00 rows=50 loops=1)
                           Hash Cond: ("outer".doc_documentid =
"inner".doc_documentid)
                           ->  Seq Scan on document finaldoc
(cost=0.00..1918256.94 rows=2860 width=717) (actual time=13.00..254.00
rows=50 loops=1)
                                 Filter: (subplan)
                                 SubPlan
                                   ->  Materialize  (cost=335.27..335.27
rows=50 width=160) (actual time=0.00..0.01 rows=50 loops=5719)
                                         ->  Limit  (cost=0.00..335.27
rows=50 width=160) (actual time=3.00..8.00 rows=50 loops=1)
                                               ->  Nested Loop
(cost=0.00..38347.95 rows=5719 width=160) (actual time=3.00..8.00 rows=51
loops=1)
                                                     ->  Merge Join
(cost=0.00..3910.14 rows=5719 width=120) (actual time=3.00..3.00 rows=51
loops=1)
                                                           Merge Cond:
("outer".doc_documentid = "inner".doc_documentid)
                                                           ->  Index Scan
using pk_document on document doc  (cost=0.00..3256.48 rows=5719 width=80)
(actual time=1.00..1.00 rows=51 loops=1)
                                                           ->  Index Scan
using pk_folder_document on folder_document  (cost=0.00..553.91 rows=5719
width=40) (actual time=2.00..2.00 rows=51 loops=1)
                                                     ->  Index Scan using
pk_document on document root  (cost=0.00..6.01 rows=1 width=40) (actual
time=0.10..0.10 rows=1 loops=51)
                                                           Index Cond:
("outer".doc_internalrootxref = root.doc_documentid)
                           ->  Hash  (cost=169.19..169.19 rows=5719
width=75) (actual time=31.00..31.00 rows=0 loops=1)
                                 ->  Seq Scan on folder_document
(cost=0.00..169.19 rows=5719 width=75) (actual time=0.00..11.00 rows=5719
loops=1)
                     ->  Hash  (cost=1328.80..1328.80 rows=34280 width=346)
(actual time=846.00..846.00 rows=0 loops=1)
                           ->  Seq Scan on doccontact dcto
(cost=0.00..1328.80 rows=34280 width=346) (actual time=0.00..175.00
rows=34280 loops=1)
               ->  Hash  (cost=1328.80..1328.80 rows=34280 width=346)
(actual time=445.00..445.00 rows=0 loops=1)
                     ->  Seq Scan on doccontact dcfrom  (cost=0.00..1328.80
rows=34280 width=346) (actual time=0.00..223.00 rows=34280 loops=1)
         ->  Hash  (cost=845.19..845.19 rows=5719 width=50) (actual
time=42.00..42.00 rows=0 loops=1)
               ->  Seq Scan on document root  (cost=0.00..845.19 rows=5719
width=50) (actual time=0.00..2.00 rows=5719 loops=1)
   ->  Hash  (cost=845.19..845.19 rows=5719 width=64) (actual
time=73.00..73.00 rows=0 loops=1)
         ->  Seq Scan on document parentom  (cost=0.00..845.19 rows=5719
width=64) (actual time=0.00..30.00 rows=5719 loops=1)
   SubPlan
     ->  Limit  (cost=0.00..5.56 rows=1 width=40) (actual time=0.06..0.06
rows=0 loops=50)
           ->  Result  (cost=0.00..7.20 rows=1 width=40) (actual
time=0.06..0.06 rows=0 loops=50)
                 One-Time Filter: ($0 = true)
                 ->  Index Scan using documentevent_index on documentevent
de  (cost=0.00..7.20 rows=1 width=40) (actual time=0.07..0.07 rows=0
loops=44)
                       Index Cond: (($1 = doc_documentid) AND
(de_processedflag = false) AND (de_documenteventstatus = 'ERROR'::character
varying))
     ->  Limit  (cost=0.00..3.86 rows=1 width=40) (actual time=0.10..0.10
rows=0 loops=50)

Explain analyze Postgres 8.0.3 running natively under WIN2K.

Hash IN Join  (cost=5293.09..7121.89 rows=50 width=1369) (actual
time=1062.000..5558.000 rows=50 loops=1)
   Hash Cond: (("outer".doc_documentid)::text =
("inner".doc_documentid)::text)
   ->  Hash Left Join  (cost=4798.24..6199.29 rows=5741 width=1369) (actual
time=751.000..4236.000 rows=5719 loops=1)
         Hash Cond: (("outer".doc_internalparentomxref)::text =
("inner".doc_documentid)::text)
         ->  Hash Left Join  (cost=3956.48..5271.41 rows=5741 width=1345)
(actual time=541.000..3105.000 rows=5719 loops=1)
               Hash Cond: (("outer".doc_internalrootxref)::text =
("inner".doc_documentid)::text)
               ->  Hash Left Join  (cost=3114.72..4343.53 rows=5741
width=1335) (actual time=501.000..2313.000 rows=5719 loops=1)
                     Hash Cond: (("outer".doc_documentid)::text =
("inner".doc_documentid)::text)
                     ->  Hash Left Join  (cost=1649.92..2721.09 rows=5741
width=1039) (actual time=180.000..1342.000 rows=5719 loops=1)
                           Hash Cond: (("outer".doc_documentid)::text =
("inner".doc_documentid)::text)
                           ->  Hash Left Join  (cost=185.13..1098.65
rows=5741 width=743) (actual time=40.000..592.000 rows=5719 loops=1)
                                 Hash Cond: (("outer".doc_documentid)::text
= ("inner".doc_documentid)::text)
                                 ->  Seq Scan on document finaldoc
(cost=0.00..827.41 rows=5741 width=708) (actual time=0.000..41.000 rows=5719
loops=1)
                                 ->  Hash  (cost=170.70..170.70 rows=5770
width=75) (actual time=40.000..40.000 rows=0 loops=1)
                                       ->  Seq Scan on folder_document
(cost=0.00..170.70 rows=5770 width=75) (actual time=0.000..10.000 rows=5719
loops=1)
                           ->  Hash  (cost=1450.50..1450.50 rows=5718
width=336) (actual time=140.000..140.000 rows=0 loops=1)
                                 ->  Seq Scan on doccontact dcto
(cost=0.00..1450.50 rows=5718 width=336) (actual time=0.000..130.000
rows=5718 loops=1)
                                       Filter: ((dc_doccontacttype)::text =
'TO'::text)
                     ->  Hash  (cost=1450.50..1450.50 rows=5718 width=336)
(actual time=321.000..321.000 rows=0 loops=1)
                           ->  Seq Scan on doccontact dcfrom
(cost=0.00..1450.50 rows=5718 width=336) (actual time=10.000..291.000
rows=5718 loops=1)
                                 Filter: ((dc_doccontacttype)::text =
'FROM'::text)
               ->  Hash  (cost=827.41..827.41 rows=5741 width=50) (actual
time=40.000..40.000 rows=0 loops=1)
                     ->  Seq Scan on document root  (cost=0.00..827.41
rows=5741 width=50) (actual time=0.000..30.000 rows=5719 loops=1)
         ->  Hash  (cost=827.41..827.41 rows=5741 width=64) (actual
time=210.000..210.000 rows=0 loops=1)
               ->  Seq Scan on document parentom  (cost=0.00..827.41
rows=5741 width=64) (actual time=0.000..160.000 rows=5719 loops=1)
   ->  Hash  (cost=494.73..494.73 rows=50 width=42) (actual
time=261.000..261.000 rows=0 loops=1)
         ->  Subquery Scan "IN_subquery"  (cost=185.13..494.73 rows=50
width=42) (actual time=101.000..261.000 rows=50 loops=1)
               ->  Limit  (cost=185.13..494.23 rows=50 width=40) (actual
time=101.000..261.000 rows=50 loops=1)
                     ->  Nested Loop Left Join  (cost=185.13..35676.18
rows=5741 width=40) (actual time=101.000..261.000 rows=50 loops=1)
                           ->  Hash Left Join  (cost=185.13..1098.65
rows=5741 width=80) (actual time=91.000..91.000 rows=50 loops=1)
                                 Hash Cond: (("outer".doc_documentid)::text
= ("inner".doc_documentid)::text)
                                 ->  Seq Scan on document doc
(cost=0.00..827.41 rows=5741 width=80) (actual time=10.000..10.000 rows=50
loops=1)
                                 ->  Hash  (cost=170.70..170.70 rows=5770
width=40) (actual time=81.000..81.000 rows=0 loops=1)
                                       ->  Seq Scan on folder_document
(cost=0.00..170.70 rows=5770 width=40) (actual time=10.000..61.000 rows=5719
loops=1)
                           ->  Index Scan using pk_document on document root
(cost=0.00..6.01 rows=1 width=40) (actual time=3.400..3.400 rows=1 loops=50)
                                 Index Cond:
(("outer".doc_internalrootxref)::text = (root.doc_documentid)::text)
   SubPlan
     ->  Limit  (cost=0.00..1.96 rows=1 width=40) (actual time=0.400..0.400
rows=0 loops=50)
           ->  Seq Scan on followup_document fd  (cost=0.00..3.91 rows=2
width=40) (actual time=0.400..0.400 rows=0 loops=50)
                 Filter: (($1)::text = (doc_documentid)::text)
     ->  Limit  (cost=0.00..6.01 rows=1 width=40) (actual
time=17.620..17.620 rows=0 loops=50)
           ->  Result  (cost=0.00..6.01 rows=1 width=40) (actual
time=17.620..17.620 rows=0 loops=50)
                 One-Time Filter: ($0 = true)
                 ->  Index Scan using documentevent_index on documentevent
de  (cost=0.00..6.01 rows=1 width=40) (actual time=28.419..28.419 rows=0
loops=31)
                       Index Cond: ((($1)::text = (doc_documentid)::text)
AND (de_processedflag = false) AND ((de_documenteventstatus)::text =
'ERROR'::text))
 Total runtime: 5558.000 ms


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings




pgsql-performance by date:

Previous
From: "Gurpreet Aulakh"
Date:
Subject: Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)
Next
From: Tom Lane
Date:
Subject: Re: Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)