Re: problem with large inserts - Mailing list pgsql-performance

From Lutz Fischer
Subject Re: problem with large inserts
Date
Msg-id 50CA0358.9040401@staffmail.ed.ac.uk
Whole thread Raw
In response to Re: problem with large inserts  (Filip Rembiałkowski <plk.zuber@gmail.com>)
Responses Re: problem with large inserts  (Filip Rembiałkowski <plk.zuber@gmail.com>)
List pgsql-performance
Thanks a lot you saved my day

create temp table foo AS SELECT DISTINCT ...
did take a mere 77464.744 ms
And an additional
Insert into LPP select * from foo;
Just 576.909 ms

I don't really understand why it's working via a temp table but not
directly (or in any reasonable amount of time) - but at least I have a
solution I can work with.


On 13/12/12 16:09, Filip Rembiałkowski wrote:
> Just an idea - how long does it take to run _only_
> CREATE TEMP TABLE foo AS <your SELECT here>
>
>
>
>
> On Thu, Dec 13, 2012 at 4:37 PM, Lutz Fischer
> <lfischer@staffmail.ed.ac.uk> wrote:
>> Hi
>>
>> I have currently some trouble with inserts into a table
>>
>> INSERT INTO LPP (PPID, LID)
>> SELECT DISTINCT PPid, LID FROM
>>         (SELECT * FROM PP WHERE s_id = sid) pp
>>             INNER JOIN
>>         has_protein hp1
>>             ON pp.p1id = hp1.pid
>>             INNER JOIN
>>         has_protein hp2
>>             ON pp.p2_id = hp2.pid
>>             INNER JOIN
>>         (SELECT * FROM L WHERE s_id = sid) l
>>             ON (hp1.pid = l.p1id AND hp2.pid = l.p2id AND hp1.ppos +
>> pp.s1 = l.s1 AND hp2.ppos + pp.s2 = l.s2)
>>             OR (hp1.pid = l.p2id AND hp2.pid = l.p1id AND hp1.ppos +
>> pp.s1 = l.s2 AND hp2.ppos + pp.s2 = l.s1)
>>     ;
>>
>> If I run only
>>
>> SELECT DISTINCT PPid, LID FROM
>>         (SELECT * FROM PP WHERE s_id = 708) pp
>>             INNER JOIN
>>         has_protein hp1
>>             ON pp.p1id = hp1.pid
>>             INNER JOIN
>>         has_protein hp2
>>             ON pp.p2_id = hp2.pid
>>             INNER JOIN
>>         (SELECT * FROM L WHERE s_id = 708) l
>>             ON (hp1.pid = l.p1id AND hp2.pid = l.p2id AND hp1.ppos +
>> pp.s1 = l.s1 AND hp2.ppos + pp.s2 = l.s2)
>>             OR (hp1.pid = l.p2id AND hp2.pid = l.p1id AND hp1.ppos +
>> pp.s1 = l.s2 AND hp2.ppos + pp.s2 = l.s1)
>>     ;
>>
>> it returns 200620 rows in 170649 ms  ( thats just under 3  minutes). I
>> stopped the actual insert after about 8h.
>>
>> The table that the insert happens to, is following:
>> CREATE TABLE LPP
>> (
>>   ppid bigint NOT NULL,
>>   lid bigint NOT NULL,
>>   CONSTRAINT pk_lpp PRIMARY KEY (ppid,lid)
>> )
>>
>> I also tried without the primary key but that one is still running for
>> more that a day.
>>
>> Currently the table LPP holds 471139 rows. Its linking the PP and the L
>> table.
>>
>> There are no foreign keys referring to that table nor are there any
>> other constraints on it.
>> Previously I had foreign keys on lid and ppid refering to the L and PP
>> table. But in a desperate try to get some speed up I deleted these. -
>> But still...
>>
>> I am running postgresql 9.2 on a windows 2008  R2 server with 256 GB and
>> the database is on something like a raid 1+0 (actually a raid1e)
>> consisting of 3x4TB disks (limit of what could easily be fitted into the
>> server).
>>
>> At the given time  there were no concurrent access to any of the
>> involved tables.
>>
>> Has anybody some idea why the insert takes so long and/or how to speed
>> things up a bit? I could live with something like half an hour - better
>> would be in minutes.
>>
>>
>> Thanks for any responds,
>>
>> Lutz Fischer
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance


--
Lutz Fischer
lfischer@staffmail.ed.ac.uk
+44 131 6517057


The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Limit & offset effect on query plans
Next
From: Filip Rembiałkowski
Date:
Subject: Re: problem with large inserts