Re: need help with a query - Mailing list pgsql-performance

From Jonah H. Harris
Subject Re: need help with a query
Date
Msg-id 36e682920710210944m4ba6b073yde7ef284aa7e1d74@mail.gmail.com
Whole thread Raw
In response to Re: need help with a query  (Pavel Velikhov <pvelikhov@yahoo.com>)
List pgsql-performance
On 10/20/07, Pavel Velikhov <pvelikhov@yahoo.com> wrote:
> Left the query running for 10+ hours and had to kill it. I guess there
> really was no need to have lots of
> shared buffers (the hope was that postgresql will cache the whole table). I
> ended up doing this step inside
> the application as a pre-processing step. Can't have postgres running with
> different fsych options since this
> will be part of an "easy to install and run" app, that should just require a
> typical PosgreSQL installation.

Is the size always different?  If not, you could limit the updates:

UPDATE links
  SET target_size = size
 FROM articles
 WHERE articles.article_id = links.article_to
        AND links.target_size != articles.size;

Since this is a huge operation, what about trying:

CREATE TABLE links_new AS SELECT l.col1, l.col2, a.size as
target_size, l.col4, ... FROM links l, articles a WHERE a.article_id =
l.article_to;

Then truncate links, copy the data from links_new.  Alternatively, you
could drop links, rename links_new to links, and recreate the
constraints.

I guess the real question is application design.  Why doesn't this app
store size at runtime instead of having to batch this huge update?

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation                | fax: 732.331.1301
499 Thornall Street, 2nd Floor          | jonah.harris@enterprisedb.com
Edison, NJ 08837                        | http://www.enterprisedb.com/

pgsql-performance by date:

Previous
From: Erik Jones
Date:
Subject: Re: [SQL] two queryes in a single tablescan
Next
From: Lee Keel
Date:
Subject: Memory Settings....