Re: 7.3.1 New install, large queries are slow - Mailing list pgsql-performance

From Tom Lane
Subject Re: 7.3.1 New install, large queries are slow
Date
Msg-id 14736.1042730017@sss.pgh.pa.us
Whole thread Raw
In response to Re: 7.3.1 New install, large queries are slow  ("Roman Fail" <rfail@posportal.com>)
Responses Re: 7.3.1 New install, large queries are slow  ("Josh Berkus" <josh@agliodbs.com>)
List pgsql-performance
"Roman Fail" <rfail@posportal.com> writes:
> SELECT ...
> FROM tranheader t, batchheader b, merchants m, cardtype c, (batchdetail d
> LEFT JOIN purc1 p1 on p1.batchdetailid=d.batchdetailid
> LEFT JOIN direct dr ON dr.batchdetailid = d.batchdetailid
> LEFT JOIN carrental cr ON cr.batchdetailid = d.batchdetailid
> LEFT JOIN checks ck ON ck.batchdetailid = d.batchdetailid)
> WHERE t.tranheaderid=b.tranheaderid
> AND m.merchantid=b.merchantid
> AND d.batchid=b.batchid
> AND c.cardtypeid=d.cardtypeid
> AND t.clientid = 6
> AND d.tranamount BETWEEN 500.0 AND 700.0
> AND b.batchdate > '2002-12-15'
> AND m.merchid = '701252267'

No no no ... this is even worse than before.  Your big tables are
batchdetail (d) and purc1 (p1).  What you've got to do is arrange the
computation so that those are trimmed to just the interesting records as
soon as possible.  The constraint on d.tranamount helps, but after that
you proceed to join d to p1 *first*, before any of the other constraints
can be applied.  That's a huge join that you then proceed to throw away
most of, as shown by the row counts in the EXPLAIN output.

Note the parentheses I added above to show how the system interprets
your FROM clause.  Since dr,cr,ck are contributing nothing to
elimination of records, you really want them joined last, not first.

What would probably work better is

SELECT ...
FROM
  (SELECT ...
   FROM tranheader t, batchheader b, merchants m, cardtype c, batchdetail d
   WHERE t.tranheaderid=b.tranheaderid
   AND m.merchantid=b.merchantid
   AND d.batchid=b.batchid
   AND c.cardtypeid=d.cardtypeid
   AND t.clientid = 6
   AND d.tranamount BETWEEN 500.0 AND 700.0
   AND b.batchdate > '2002-12-15'
   AND m.merchid = '701252267') ss
  LEFT JOIN purc1 p1 on p1.batchdetailid=ss.batchdetailid
  LEFT JOIN direct dr ON dr.batchdetailid = ss.batchdetailid
  LEFT JOIN carrental cr ON cr.batchdetailid = ss.batchdetailid
  LEFT JOIN checks ck ON ck.batchdetailid = ss.batchdetailid

which lets the system get the useful restrictions applied before it has
to finish expanding out the star query.  Since cardtype isn't
contributing any restrictions, you might think about moving it into the
LEFT JOIN series too (although I think the planner will choose to join
it last in the subselect, anyway).

            regards, tom lane

pgsql-performance by date:

Previous
From: Ron Johnson
Date:
Subject: Re: schema/db design wrt performance
Next
From: Andrew Sullivan
Date:
Subject: Re: schema/db design wrt performance