Re: need help with a query - Mailing list pgsql-performance

From Pavel Velikhov
Subject Re: need help with a query
Date
Msg-id 226961.97083.qm@web56401.mail.re3.yahoo.com
In response to need help with a query  (Pavel Velikhov <pvelikhov@yahoo.com>)
List pgsql-performance

On 10/20/07, Pavel Velikhov <pvelikhov@yahoo.com> wrote:
> Left the query running for 10+ hours and had to kill it. I guess there
> really was no need to have lots of shared buffers (the hope was that
> PostgreSQL would cache the whole table). I ended up doing this step inside
> the application as a pre-processing step. Can't have postgres running with
> different fsync options since this will be part of an "easy to install
> and run" app that should just require a typical PostgreSQL installation.

>Is the size always different?  If not, you could limit the updates:

>UPDATE links
>  SET target_size = size
>FROM articles
>WHERE articles.article_id = links.article_to
>        AND links.target_size != articles.size;

Ah, this sounds better for sure! But it's probably only as good as the scan with an index-scan subquery I was getting before...
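
A side note on the limited UPDATE quoted above: if target_size starts out NULL (an assumption; the thread does not say), then `links.target_size != articles.size` evaluates to NULL for those rows and they are silently skipped. A minimal sketch of a NULL-safe variant, using PostgreSQL's IS DISTINCT FROM and the same table and column names as the quoted query:

```sql
-- Update only the rows whose stored size is missing or out of date.
-- IS DISTINCT FROM treats NULL as an ordinary comparable value,
-- so rows with target_size IS NULL are still picked up.
UPDATE links
   SET target_size = articles.size
  FROM articles
 WHERE articles.article_id = links.article_to
   AND links.target_size IS DISTINCT FROM articles.size;
```

If most rows still need changing, this filter saves little; it mainly helps when the update is re-run after a partial load.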

>Since this is a huge operation, what about trying:

>CREATE TABLE links_new AS SELECT l.col1, l.col2, a.size as
>target_size, l.col4, ... FROM links l, articles a WHERE a.article_id =
>l.article_to;

>Then truncate links, copy the data from links_new.  Alternatively, you
>could drop links, rename links_new to links, and recreate the
>constraints.
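
The rebuild-and-swap suggestion above could look roughly like the following. This is only a sketch: `article_from` is a hypothetical stand-in for the columns elided in the quoted query, the index name is made up, and any constraints on the real table would have to be recreated by hand.

```sql
BEGIN;

-- Build the new table in one pass instead of updating rows in place.
-- article_from is a hypothetical column; list the real columns here.
CREATE TABLE links_new AS
SELECT l.article_from,
       l.article_to,
       a.size AS target_size
  FROM links l
  JOIN articles a ON a.article_id = l.article_to;

-- Recreate the index needed at runtime (name is illustrative).
CREATE INDEX links_new_article_to_idx ON links_new (article_to);

-- Swap the tables; DDL is transactional in PostgreSQL, so a failure
-- anywhere above rolls everything back and leaves links untouched.
DROP TABLE links;
ALTER TABLE links_new RENAME TO links;

COMMIT;

-- Refresh planner statistics for the rebuilt table.
ANALYZE links;
```

Note that the inner join, like the quoted query, drops any link whose article_to has no matching row in articles.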

>I guess the real question is application design.  Why doesn't this app
>store size at runtime instead of having to batch this huge update?

This is a link analysis application. I need to materialize all the sizes for target
articles so that the runtime part (as opposed to the loading part) runs efficiently, i.e.
I really want to avoid a join with the articles table at runtime.

I have solved the size problem by other means (I compute it in my loader), but
I still have one query that needs to update a pretty large percentage of the links table...
I have previously used mysql, and for some reason I didn't have a problem with queries
like this (on the other hand mysql was crashing when building an index on article_to in the
links relation, so I had to work without a critical index)...

Thanks!
Pavel


--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation                | fax: 732.331.1301
499 Thornall Street, 2nd Floor          | jonah.harris@enterprisedb.com
Edison, NJ 08837                        | http://www.enterprisedb.com/

