Home > mailing lists

Re: PostgreSQL runs a query much slower than BDE and MySQL - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: PostgreSQL runs a query much slower than BDE and MySQL
Date	August 16, 2006 19:52:06
Msg-id	29246.1155768717@sss.pgh.pa.us Whole thread Raw
In response to	PostgreSQL runs a query much slower than BDE and MySQL ("Peter Hardman" <peter@ssbg.zetnet.co.uk>)
Responses	Re: PostgreSQL runs a query much slower than BDE and MySQL Re: PostgreSQL runs a query much slower than BDE and MySQL
List	pgsql-performance

Tree view

"Peter Hardman" <peter@ssbg.zetnet.co.uk> writes:
> I'm in the process of migrating a Paradox 7/BDE 5.01 database from single-user
> Paradox to a web based interface to either MySQL or PostgreSQL.
> The query I run is:

> /* Select all sheep who's most recent transfer was into the subject flock */
> SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
> FROM SHEEP_FLOCK f1 JOIN
>     /* The last transfer date for each sheep */
>     (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
>     FROM  SHEEP_FLOCK f
>     GROUP BY f.regn_no) f2
> ON f1.regn_no = f2.regn_no
> WHERE f1.flock_no = '1359'
> AND f1.transfer_date = f2.last_xfer_date

This seems pretty closely related to this recent thread:
http://archives.postgresql.org/pgsql-performance/2006-08/msg00220.php
in which the OP is doing a very similar kind of query in almost exactly
the same way.

I can't help thinking that there's probably a better way to phrase this
type of query in SQL, though it's not jumping out at me what that is.

What I find interesting though is that it sounds like both MSSQL and
Paradox know something we don't about how to optimize it.  PG doesn't
have any idea how to do the above query without forming the full output
of the sub-select, but I suspect that the commercial DBs know a
shortcut; perhaps they are able to automatically derive a restriction
in the subquery similar to what you did by hand.  Does Paradox have
anything comparable to EXPLAIN that would give a hint about the query
plan they are using?

Also, just as in the other thread, I'm thinking that a seqscan+hash
aggregate would be a better idea than this bit:

>                    ->  GroupAggregate  (cost=0.00..3924.91 rows=33676 width=13) (actual time=0.324..473.131
rows=38815loops=1) 
>                          ->  Index Scan using sheep_flock_pkey on sheep_flock f (cost=0.00..3094.95 rows=81802
width=13)(actual time=0.295..232.156) 

Possibly you need to raise work_mem to get it to consider the hash
aggregation method.

BTW, are you *sure* you are testing PG 8.1?  The "Subquery Scan f2" plan
node looks unnecessary to me, and I'd have expected 8.1 to drop it out.
8.0 and before would have left it in the plan though.  This doesn't make
all that much difference performance-wise in itself, but it does make me
wonder what you are testing.

            regards, tom lane

pgsql-performance by date:

From: "Peter Hardman"
Date: 16 August 2006, 16:44:01
Subject: Re: PostgreSQL runs a query much slower than BDE and MySQL

From: Steve Poe
Date: 16 August 2006, 23:10:46
Subject: Re: Postgresql Performance on an HP DL385 and

Re: PostgreSQL runs a query much slower than BDE and MySQL - Mailing list pgsql-performance

Previous

Next