Re: I feel the need for speed. What am I doing wrong? - Mailing list pgsql-hackers

From Dann Corbit
Subject Re: I feel the need for speed. What am I doing wrong?
Date
Msg-id D90A5A6C612A39408103E6ECDD77B829408A25@voyager.corporate.connx.com
Whole thread Raw
Responses Re: I feel the need for speed. What am I doing wrong?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Monday, January 06, 2003 5:26 PM
> To: Nigel J. Andrews
> Cc: Dann Corbit; pgsql-hackers@postgresql.org;
> pgsql-general@postgresql.org
> Subject: Re: [HACKERS] I feel the need for speed. What am I
> doing wrong?
>
>
> "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> >> select a."RT_REC_KEY", a."cnxarraycolumn", a."CRC" from
> >> "CNX_DS_53_SIS_STU_OPT_FEE_TB" a left outer join
> >> "CNX_DS2_53_SIS_STU_OPT_FEE_TB" b on ( a."RT_REC_KEY" =
> >> b."RT_REC_KEY" and a."cnxarraycolumn" = b."cnxarraycolumn") where
> >> b.oid is null ;
>
> > I suspect you get no results because it's unlikely b.oid
> will be null.
>
> Try "it's impossible for b.oid to be null --- unless a dummy
> b row is being provided by the LEFT JOIN".  I interpret the
> purpose of the query to be to look for "a" rows that have no
> matching "b" row.
>
> Using OID for this is kind of cute, I guess, since it is
> guaranteed not-null in a real row; he doesn't have to think
> about whether any of his regular columns are not-null.

I am very happy to report that PostgreSQL now easily beats MS Access for
speed!
:-)

Here are the MS Access results, where I create on index and then a
second:
Access Outer Join         2 column index on 1st table
02:29.9    2 column index on 2nd table + 2 column index on 1st
table        02:18.3

For PostgreSQL, I originally created a 3 column index (because I have
also 3 column joins in other places) and got this result:
Postgres Outer Join                           enable_seqscan  = 1    3 column Index    1    12:43.9

Showing the plan showed that the indexes were being ignored.

Yikes!  Five times slower!  But then I took Tom's incredibly helpful
suggestion to disable the sequential scan:

Postgres Outer Join                           enable_seqscan  = 0    3 column Index    0    05:17.5

Changed to a 2 column index:    2 column index    0    04:58.3

Added an index to the second table:    2 column index on 2nd table + 2 column index on 1st
table    0    01:53.6

PostgreSQL is now 22% faster than Access (HAPPY DAYS)!

For my application, I happen to know that the data will be approximately
physically clustered, and that the indexes will always be very near
matches for data sets big enough to matter.  Benchmarking also showed
that adding the 3rd column to the index was counter productive, even
when the join criteria was a.f1 = b.f1, a.f2=b.f2, a.f3 = b.f3.


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: OS/400 support?
Next
From: Tom Lane
Date:
Subject: Re: I feel the need for speed. What am I doing wrong?