Re: SELECT INTO large FKyed table is slow - Mailing list pgsql-performance

From Mario Splivalo
Subject Re: SELECT INTO large FKyed table is slow
Date
Msg-id 4CF59017.1090705@megafon.hr
Whole thread Raw
In response to Re: SELECT INTO large FKyed table is slow  (Mladen Gogala <mladen.gogala@vmsinfo.com>)
Responses Re: SELECT INTO large FKyed table is slow  ("Pierre C" <lists@peufeu.com>)
List pgsql-performance
On 11/30/2010 05:26 PM, Mladen Gogala wrote:
> At the beginning of the load, you should defer all of the deferrable
> constraints, setting constraints deferred and issuing the copy statement
> within a transaction block, like this:
>
> scott=# begin; BEGIN
> Time: 0.203 ms
> scott=# set constraints all deferred;
> SET CONSTRAINTS
> Time: 0.201 ms
> scott=# copy test1 from '/tmp/test1.csv';
> COPY 100
> Time: 11.939 ms
> scott=# commit;
> ERROR: insert or update on table "test1" violates foreign key
> constraint "fk_tst1_deptno"
> DETAIL: Key (col1)=(1) is not present in table "dept".
>
>
> Of course, that will require complete rewrite of your load script,
> because the errors will be checked at the commit time and transaction
> can either fail as a whole or succeed as a whole. It's all or nothing

Well, it is like that now. First I load the data from the CSV into the
temporary table (just named temporary, exists on the server). That table
is usualy aroun 10k rows. Then I call the function which does the job.

> situation. How frequently do you see records with an incorrect drone_id?

Seldom.

> If that happens only once in a blue moon, you may need no stinkin'
> foreign keys in the first place, you may be able
> to have a batch job that will flag all the records with an invalid
> drone_id instead.

I did have that idea, yes, but still, I'd like to know what is slowing
postgres down. Because when I look at the disk I/O, it seems very random
- i get around 800k of disk reads and ocasionaly 1500k of writes (during
insert into history table).

> Furthermore, you can make sure that you have enough shared buffers to
> cache the entire "drones" table. Also, do "strace" on the postgres
> process handling your session and see whether the time is spent writing
> to WAL archives. If that is slowing you down, you should consider buying
> a SSD or a high end disk drive. I have never had such problem, but you
> should also check whether pg_loader can do anything for you.
>
> As far as speed is concerned, inserting with deferred foreign keys is
> almost as fast as inserting without foreign keys:
>
> scott=# alter table test1 drop constraint fk_tst1_deptno;
> ALTER TABLE
> Time: 16.219 ms
> scott=# copy test1 from '/tmp/test1.csv';
> COPY 100
> Time: 10.418 ms
>
> If you take a look at the example above, you will see that inserting
> with a deferred FK took 11.939 milliseconds while inserting into the
> same table without the FK took 10.418 milliseconds, the difference of
> 1.5 milliseconds per 100 rows. The timing of 2 seconds per 100
> rows looks suspiciously high. Me thinks that your problem is not just
> the foreign key, there must be something else devouring the time. You
> should have a test instance, compiled with "-g" option and do profiling.

I'll have to. So far I've been doing this only on that dedicated server.
I'll try to download the database to my desktop and try the tests there.

Concerning the shared_buffers, it's 256M, and the drones table is just 15M.

I have tried your recommendation and it yielded no difference.

Now I tried removing the constraints from the history table (including
the PK) and the inserts were fast. After few 'rounds' of inserts I added
constraints back, and several round after that were fast again. But then
all the same. Insert of some 11k rows took 4 seconds (with all
constraints) and now the last one of only 4k rows took one minute. I did
vacuum after each insert.


    Mario

pgsql-performance by date:

Previous
From: Mark Kirkwood
Date:
Subject: Re: SELECT INTO large FKyed table is slow
Next
From: "Pierre C"
Date:
Subject: Re: SELECT INTO large FKyed table is slow