Re: [GENERAL] How to improve: performance of query on postgresql 8.3 takes days - Mailing list pgsql-performance

From John R Pierce
Subject Re: [GENERAL] How to improve: performance of query on postgresql 8.3 takes days
Date
Msg-id 4C51FE04.2050700@hogranch.com
Whole thread Raw
In response to How to improve: performance of query on postgresql 8.3 takes days  (Dino Vliet <dino_vliet@yahoo.com>)
List pgsql-performance
  On 07/29/10 2:58 PM, Dino Vliet wrote:
>
> Dear postgresql list,
>
>
> I have some troubles generating data for a analysis task at hand.
>
>
> I have a table (table A) containing 5 million records and 28 number of
> attributes. This table is 461MB big if I copy it to a csv file.
>
>
> I want to create another table (table B) based on the contents of
> table A plus some 15 extra attributes (in pl/pgsql written functions
> which produce those extra attributes)
>
>
> So my statement looks like this:
>
>
> create tableB as (
>
> select some attributes, function1(A.attribute1)as attributeX+1,
> function2(A.attribute1,A.Attribute2,A.attribute3,A.attribute4,A.attribute5)
> as attribute X+2......function15(A.attribute1,A.attribute9) as
> attributeX+15 from tableA as A)
>
>
> This takes almost 60 hours to finish on my database server running
> debian 5.0 with XFS as filesystem containing 4GB RAM. I'm using
> postgresql server version 8.3 (but am seeing the same phenomena on my
> FreeBSD 8.0 database server running postgresql 8.4 as well)
>
>
> I arrived at 15 functions because I had 7 or 8 joins in the past and
> saw that my disk was getting hid and I had heard someplace that RAM is
> faster so I rewrote those 7 or 8 joins as functions in pl/pgsql. They
> were just simple lookups, although some of the functions are looking
> stuff up in tables containing 78000 records. However, I thought this
> wouldn't be a problem because they are simple functions which look up
> the value of one variable based on a parameter. 3 of the more special
> functions are shown here:
>
> ...
>
>   1.
>
>       What can I do to let the creation of table B go faster?
>
>   2.
>
>       Do you think the use of indices (but where) would help me? I
>       didn't go that route because in fact I don't have a where clause
>       in the create table B statement. I could put indices on the
>       little tables I'm using in the functions.
>
>   3.
>
>       What about the functions? Should I code them differently?
>
>   4.
>
>       What about my server configuration. What could be done over there?
>
>
> Thanks in advanced
>
>

certainly your lookup tables should have a index on the key you're using
to look up values.   without said index, that 78000 row 'little' table
will have to be sequentially scanned for every one of your several
million rows.

with said indexes, you may find that just doing JOINs when you actually
use this data rather than creating a new table will work quite nicely.
you could use a VIEW to do the joins transparently on the fly.



pgsql-performance by date:

Previous
From: Dino Vliet
Date:
Subject: How to improve: performance of query on postgresql 8.3 takes days
Next
From: Mark Rostron
Date:
Subject: what does "initplan" operation in explain output mean?