Re: 7.3.1 New install, large queries are slow - Mailing list pgsql-performance

From Roman Fail
Subject Re: 7.3.1 New install, large queries are slow
Date
Msg-id 9B1C77393DED0D4B9DAA1AA1742942DA0E4C06@pos_pdc.posportal.com
In response to 7.3.1 New install, large queries are slow  ("Roman Fail" <rfail@posportal.com>)
Responses Re: 7.3.1 New install, large queries are slow
Re: 7.3.1 New install, large queries are slow
List pgsql-performance
> Josh Berkus wrote:
> And MSSQL is returning results in 3 seconds?    I find that a bit hard
> to believe, unless this query is called repeatedly and that's the
> figure for the last call, where the records are being cached.   I'll
> have to look at your hardware descriptions again.

Hardware-wise, the Postgres server is a hot rod and MSSQL is a basic vanilla server.  I changed all the WHERE clauses to radically different values and couldn't get it to take more than 5 seconds on MSSQL.  Most of its cost savings seem to come from some kind of "Table Spool/Lazy Spool" in its execution plan, which looks to me like it only exists for the life of the query.  You can read more about this at:
 
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/optimsql/odp_tun_1_1m7g.asp
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/optimsql/odp_tun_1_7rjg.asp
Maybe there are some good ideas here for Postgres.  Unfortunately, the MSSQL Execution Plan is displayed graphically, and I can't figure out a way to get it to text without typing it all.  I could do some screen shots if you really want to see it.
 
 
> Stephan Szabo wrote:
> I'd assume that tranamount values are fairly randomly distributed
> throughout the table, right?  It takes about 5 minutes for the
> system to read the entire table and more for the index scan, so
> you're probably reading most of the table randomly and the index
> as well.
> What values on batchdetail do you use in query where clauses regularly? 
 
Yes, tranamount values are randomly distributed.  I don't understand why an index scan would be "random"; isn't the whole point of an index to have an ordered reference into the data?  batchdetail has 5 columns that can be in the WHERE clause, all of which are indexed.  None is more likely than the others to be searched, so a clustered index doesn't make much sense to me.  The whole thing needs to be fast.
 
 
>> Nope.  This was a misimpression caused by batchdetail waiting for a
>> bunch of other processes to complete.  Sometimes the parallelizing
>> gives me a wrong impression of what's holding up the query.  Sorry if I
>> confused you.
>
>I'm still not sure that it isn't a big part given that the time went down
>by a factor of about 4 when index scans were disabled and a sequence scan
>was done and that a sequence scan over the table with no other tables
>joined looked to take about 5 minutes itself and the difference between
>that seqscan and the big query was only about 20 seconds when
>enable_indexscan was off unless I'm misreading those results.

You are not misreading the results.  There was a huge difference.  Nobody has ever made note of it, but this still seems very odd to me:
 
*** 'sar -b' during the query
with index scan: 6,000 block reads/sec
with seq scan: 95,000 block reads/sec
 
 
Tom, here is the EXPLAIN for your suggested version of the query with enable_indexscan=on.  I performed the same query with enable_indexscan=off and total runtime was *much* better: 296174.60 msec.  By the way, thank you for your detailed description of how the JOIN order affects the outcome - I understand much better now.
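For the enable_indexscan=off run I just toggled the planner setting in my session before re-running the same statement, something like:

SET enable_indexscan = off;
-- ...same EXPLAIN ANALYZE as below...
SET enable_indexscan = on;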
 
EXPLAIN ANALYZE
SELECT ss.batchdate, ss.batchdetailid, ss.bankno, ss.trandate, ss.tranamount,
ss.submitinterchange, ss.authamount, ss.authno, ss.cardtypeid, ss.mcccode,
ss.name AS merchantname, ss.cardtype, ss.merchid,
p1.localtaxamount, p1.productidentifier, dr.avsresponse,
cr.checkoutdate, cr.noshowindicator, ck.checkingacctno,
ck.abaroutingno, ck.checkno
FROM
  (SELECT b.batchdate, d.batchdetailid, t.bankno, d.trandate, d.tranamount,
       d.submitinterchange, d.authamount, d.authno, d.cardtypeid, d.mcccode,
       m.name, c.cardtype, m.merchid
   FROM tranheader t, batchheader b, merchants m, cardtype c, batchdetail d
   WHERE t.tranheaderid=b.tranheaderid
   AND m.merchantid=b.merchantid
   AND d.batchid=b.batchid
   AND c.cardtypeid=d.cardtypeid
   AND t.clientid = 6
   AND d.tranamount BETWEEN 500.0 AND 700.0
   AND b.batchdate > '2002-12-15'
   AND m.merchid = '701252267') ss
  LEFT JOIN purc1 p1 on p1.batchdetailid=ss.batchdetailid
  LEFT JOIN direct dr ON dr.batchdetailid = ss.batchdetailid
  LEFT JOIN carrental cr ON cr.batchdetailid = ss.batchdetailid
  LEFT JOIN checks ck ON ck.batchdetailid = ss.batchdetailid
ORDER BY ss.batchdate DESC
LIMIT 50
Limit  (cost=1601637.75..1601637.75 rows=1 width=285) (actual time=1221606.41..1221606.42 rows=5 loops=1)
  ->  Sort  (cost=1601637.75..1601637.75 rows=1 width=285) (actual time=1221606.40..1221606.41 rows=5 loops=1)
        Sort Key: b.batchdate
        ->  Nested Loop  (cost=1543595.18..1601637.74 rows=1 width=285) (actual time=1204815.02..1221606.27 rows=5 loops=1)
              Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
              ->  Nested Loop  (cost=1543595.18..1601581.23 rows=1 width=247) (actual time=1204792.38..1221560.42 rows=5 loops=1)
                    Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
                    ->  Nested Loop  (cost=1543595.18..1601581.22 rows=1 width=230) (actual time=1204792.35..1221560.27 rows=5 loops=1)
                          Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
                          ->  Nested Loop  (cost=1543595.18..1601581.21 rows=1 width=221) (actual time=1204792.31..1221560.09 rows=5 loops=1)
                                Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
                                ->  Nested Loop  (cost=1543595.18..1545529.17 rows=1 width=202) (actual time=1195376.48..1195578.86 rows=5 loops=1)
                                      Join Filter: ("inner".tranheaderid = "outer".tranheaderid)
                                      ->  Nested Loop  (cost=1543595.18..1545449.98 rows=1 width=186) (actual time=1195370.72..1195536.53 rows=5 loops=1)
                                            Join Filter: ("inner".cardtypeid = "outer".cardtypeid)
                                            ->  Merge Join  (cost=1543595.18..1545448.76 rows=1 width=172) (actual time=1195311.88..1195477.32 rows=5 loops=1)
                                                  Merge Cond: ("outer".batchid = "inner".batchid)
                                                  ->  Sort  (cost=476.17..476.18 rows=4 width=102) (actual time=30.57..30.59 rows=17 loops=1)
                                                        Sort Key: b.batchid
                                                        ->  Nested Loop  (cost=0.00..476.14 rows=4 width=102) (actual time=25.21..30.47 rows=17 loops=1)
                                                              ->  Index Scan using merchants_ix_merchid_idx on merchants m  (cost=0.00..5.65 rows=1 width=78) (actual time=23.81..23.82 rows=1 loops=1)
                                                                    Index Cond: (merchid = '701252267'::character varying)
                                                              ->  Index Scan using batchheader_ix_merchantid_idx on batchheader b  (cost=0.00..470.30 rows=15 width=24) (actual time=1.38..6.55 rows=17 loops=1)
                                                                    Index Cond: ("outer".merchantid = b.merchantid)
                                                                    Filter: (batchdate > '2002-12-15 00:00:00'::timestamp without time zone)
                                                  ->  Sort  (cost=1543119.01..1544045.79 rows=370710 width=70) (actual time=1194260.51..1194892.79 rows=368681 loops=1)
                                                        Sort Key: d.batchid
                                                        ->  Index Scan using batchdetail_ix_tranamount_idx on batchdetail d  (cost=0.00..1489103.46 rows=370710 width=70) (actual time=5.26..1186051.44 rows=370307 loops=1)
                                                              Index Cond: ((tranamount >= 500.0) AND (tranamount <= 700.0))
                                            ->  Seq Scan on cardtype c  (cost=0.00..1.10 rows=10 width=14) (actual time=11.77..11.79 rows=10 loops=5)
                                      ->  Seq Scan on tranheader t  (cost=0.00..55.15 rows=1923 width=16) (actual time=0.02..5.46 rows=1923 loops=5)
                                            Filter: (clientid = 6)
                                ->  Seq Scan on purc1 p1  (cost=0.00..44285.35 rows=941335 width=19) (actual time=10.79..3763.56 rows=938770 loops=5)
                          ->  Seq Scan on direct dr  (cost=0.00..0.00 rows=1 width=9) (actual time=0.00..0.00 rows=0 loops=5)
                    ->  Seq Scan on carrental cr  (cost=0.00..0.00 rows=1 width=17) (actual time=0.00..0.00 rows=0 loops=5)
              ->  Seq Scan on checks ck  (cost=0.00..40.67 rows=1267 width=38) (actual time=0.77..7.15 rows=1267 loops=5)
Total runtime: 1221645.52 msec

 
> Tomasz Myrta wrote:
> Are there any where clauses which all of these variations have?

Yes.....WHERE clientid = ? will appear in every query.  The others are present based on user input.

 
> Ron Johnson wrote:
> What are the indexes on batchdetail?
> There's one on batchid and a separate one on tranamount?
> If so, what about dropping them and creating a single multi-segment
> index on "batchid, tranamount".  (A constraint can then enforce
> uniqueness on batchid.)
 
There is no index on batchid; I think it is a good idea to create one.  Stephan also suggested this.  After I try the single batchid index, I might try the multi-segment index idea as well.  I'll post results later today.
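Roughly what I have in mind (untested, and the index names below are just placeholders I made up):

CREATE INDEX batchdetail_ix_batchid_idx ON batchdetail (batchid);

-- and if that alone doesn't help, Ron's multi-segment version:
CREATE INDEX batchdetail_ix_batchid_tranamount_idx
    ON batchdetail (batchid, tranamount);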
 
 
> Stephan Szabo wrote:
> Then I realized that batchheader.batchid and
> batchdetail.batchid don't even have the same
> type, and that's probably something else you'd
> need to fix.

Yes, that's a mistake on my part....batchdetail(batchid) should be an int8.  It looks to me like converting this datatype can't be done with a single ALTER TABLE ALTER COLUMN statement.....so I guess I'll work around it with an ADD, UPDATE, DROP, and RENAME.
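The workaround I'm picturing is something along these lines (an untested sketch; the temporary column name is my own, and any indexes or constraints on batchid would have to be recreated afterwards):

BEGIN;
ALTER TABLE batchdetail ADD COLUMN batchid_new int8;      -- temporary column
UPDATE batchdetail SET batchid_new = batchid;             -- copy the values over
ALTER TABLE batchdetail DROP COLUMN batchid;
ALTER TABLE batchdetail RENAME COLUMN batchid_new TO batchid;
COMMIT;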
 
 
> Josh Berkus wrote:
> Um, Roman, keep in mind this is a mailing list.   I'm sure that
> everyone here is happy to give you the tools to figure out how to fix
> things, but only in a DIY fashion, and not on your schedule. 

I hate being defensive, but I don't remember saying that I expect anyone to fix my problems for me on my schedule.  *I* hope that *I* can get this done by Friday, because otherwise my boss is going to tell me to dump Postgres and install MSSQL on the server.  I only mention this fact because it's a blow against PostgreSQL's reputation if I have to give up.
There is no pressure on you, and I apologize if something I said sounded like whining.
 
 
I am VERY grateful for the time that all of you have given to this problem.
 
Roman Fail
Sr. Web Application Programmer
POS Portal, Inc.
 
