Re: problem with large inserts - Mailing list pgsql-general

From Albe Laurenz
Subject Re: problem with large inserts
Date
Msg-id A737B7A37273E048B164557ADEF4A58B05789D15@ntex2010i.host.magwien.gv.at
Whole thread Raw
In response to problem with large inserts  (Lutz Fischer <lfischer@staffmail.ed.ac.uk>)
List pgsql-general
Lutz Fischer wrote:
> I have currently some trouble with inserts into a table
>
> INSERT INTO LPP (PPID, LID)
> SELECT DISTINCT PPid, LID FROM
>         (SELECT * FROM PP WHERE s_id = sid) pp
>             INNER JOIN
>         has_protein hp1
>             ON pp.p1id = hp1.pid
>             INNER JOIN
>         has_protein hp2
>             ON pp.p2_id = hp2.pid
>             INNER JOIN
>         (SELECT * FROM L WHERE s_id = sid) l
>             ON (hp1.pid = l.p1id AND hp2.pid = l.p2id AND hp1.ppos +
> pp.s1 = l.s1 AND hp2.ppos + pp.s2 = l.s2)
>             OR (hp1.pid = l.p2id AND hp2.pid = l.p1id AND hp1.ppos +
> pp.s1 = l.s2 AND hp2.ppos + pp.s2 = l.s1)
>     ;
>
> If I run only
>
> SELECT DISTINCT PPid, LID FROM
>         (SELECT * FROM PP WHERE s_id = 708) pp
>             INNER JOIN
>         has_protein hp1
>             ON pp.p1id = hp1.pid
>             INNER JOIN
>         has_protein hp2
>             ON pp.p2_id = hp2.pid
>             INNER JOIN
>         (SELECT * FROM L WHERE s_id = 708) l
>             ON (hp1.pid = l.p1id AND hp2.pid = l.p2id AND hp1.ppos +
> pp.s1 = l.s1 AND hp2.ppos + pp.s2 = l.s2)
>             OR (hp1.pid = l.p2id AND hp2.pid = l.p1id AND hp1.ppos +
> pp.s1 = l.s2 AND hp2.ppos + pp.s2 = l.s1)
>     ;
>
> it returns 200620 rows in 170649 ms  ( thats just under 3  minutes). I
> stopped the actual insert after about 8h.
>
> The table that the insert happens to, is following:
> CREATE TABLE LPP
> (
>   ppid bigint NOT NULL,
>   lid bigint NOT NULL,
>   CONSTRAINT pk_lpp PRIMARY KEY (ppid,lid)
> )
>
> I also tried without the primary key but that one is still running for
> more that a hour.
>
> Currently the table LPP holds 471139 rows. Its linking the PP and the L
> table.
>
> There are no foreign keys referring to that table nor are there any
> other constraints on it.
> Previously I had foreign keys on lid and ppid refering to the L and PP
> table. But in a desperate try to get some speed up I deleted these. -
> But still...
>
> I am running postgresql 9.2 on a windows 2008  R2 server with 256 GB and
> the database is on something like a raid 1+0 (actually a raid1e)
> consisting of 3x4TB disks (limit of what could easily be fitted into the
> server).
>
> At the given time  there were no concurrent access to any of the
> involved tables.
>
> Has anybody some idea why the insert takes so long and/or how to speed
> things up a bit? I could live with something like half an hour - better
> would be in minutes.

Can you spot the bottleneck? Is it disk I/O? CPU?

Do you see complaints in the log file?

These are the usual suspects to tune to get better insert performance
(if commit is not the problem):

wal_buffers = 16MB
checkpoint_segments = 100
checkpoint_completion_target = 0.9

Yours,
Laurenz Albe


pgsql-general by date:

Previous
From: "Fan, Yi"
Date:
Subject: Re: Looking for cooperators
Next
From: aasat
Date:
Subject: Re: How to remove n-first elements from array?