Good results from a DB2 user - Mailing list pgsql-advocacy

From Christopher Kings-Lynne
Subject Good results from a DB2 user
Date
Msg-id 068301c3451e$355a6d80$2800a8c0@mars
Whole thread Raw
List pgsql-advocacy
I will post my next email to him in a minute...I just thought this was a
good bit of support for new IN list performance...

Chris

----- Original Message -----
From: "Eirik Oeverby" <ltning@anduin.net>
To: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
Sent: Monday, July 07, 2003 8:35 PM
Subject: Re: IN list processing performance (yet again)

Hey,

Thanks a lot for your reply.
I have taken the hard route - I've gotten 7.4-CVS running here. It seems
to work nicely, apart from some small changes it required to some
functions I defined in 7.3.3.

It is now in fact considerably faster than DB2, by roughly 65% on
average. In particular the recursive functions (that I had to create by
hand as functions in pg because there is no recursive query support) are
very fast, and with 7.4 even the replication is faster than it used to
be with DB2.

Thanks!
/Eirik

On Mon, 7 Jul 2003 19:14:14 +0800 (WST)
Christopher Kings-Lynne <chriskl@familyhealth.com.au> wrote:

> Hi Elrick
>
> >  DELETE FROM articles WHERE id IN (SELECT id FROM articles_tmp)
> > to clean out any overlapping data (to avoid clashes when inserting).
>
> The problem with optimising this to an EXISTS is that you can't use
> table aliases in DELETE queries.  Give this a shot and tell me if it
> improves things:
>
> DELETE FROM articles WHERE EXISTS (SELECT id FROM articles_tmp AS at
> WHERE at.id=articles.id);
>
> Make sure there is an index on articles_tmp(id).  Don't run this on
> live data, just in case :)
>
> > This is incredibly slow. With about 80.000 rows in the articles
> > table and 2.500 rows in articles_tmp, this query takes upwards of 5
> > minutes (I have never actually allowed it to finish). The id field
> > is obviously also the primary key.
>
> The problem is that in 7.3, IN processing basically generates a list
> 2500 integers long and then for each row in the outer table, it does a
> linear search of all 2500 entries in the list. Very slow.  In 7.4CVS,
> this has been fixed and a very fast hash algorithm is used instead.
> If you're brave, you could try using 7.4CVS version :)  Otherwise,
> wait a month or two (cos we're in feature freeze) and 7.4 release will
> be out.
>
> You should be able to match DB2's speed in 7.4...
>
> Chris
>




pgsql-advocacy by date:

Previous
From: "scott.marlowe"
Date:
Subject: Re: About the default performance
Next
From: "Christopher Kings-Lynne"
Date:
Subject: Follow up on previous email