Re: Using multi-row technique with COPY - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Using multi-row technique with COPY
Msg-id 200511291917.jATJHrM08049@candle.pha.pa.us
In response to Using multi-row technique with COPY  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Using multi-row technique with COPY  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Using multi-row technique with COPY  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Simon Riggs wrote:
> As a further enhancement, I would also return to the NOLOGGING option
> for COPY. Previously we had said that COPY LOCK was the way to go -
> taking a full table lock to prevent concurrent inserts to a block from a
> COPY that didn't write WAL and another backend which wanted to write WAL
> about that block. With the above suggested all-inserts-at-once
> optimization, it would no longer be a requirement to lock the table.
> That means we can continue to take advantage of the ability to run
> multiple COPY loads into the same table. Avoiding writing WAL will
> further reduce CPU by about 15% and I/O by about 50%. 
> 
> I would also suggest that pgdump be changed to use the NOLOGGING option
> by default, with an option to work as previously.

Those who have been around know that I dislike having options that 95%
of our users would want, yet are not the default behavior.  I think the
COPY NOLOGGING idea falls into that category.  I would like to explore
whether there is a way to have COPY automatically skip logging by
default, wherever that is possible.

First, I think NOLOGGING is probably the wrong keyword.  I am thinking
SHARE/EXCLUSIVE is best because they are already keywords, and they
explain the effect of the flag on other applications, rather than the
LOGGING capability, which is invisible to applications.

I am thinking we would have:

    COPY WITH [ [ EXCLUSIVE | SHARE ] [ LOCK ] ] ...

An EXCLUSIVE lock would imply NOLOGGING; SHARE would do logging because
other applications could insert into the table at the same time (and
UPDATE/DELETE the inserted rows).

One idea for default behavior would be to use EXCLUSIVE when the table
is zero size.  I think that would cover pg_dump and most of the use
cases, and of course users could override the default by using a
keyword.  We could emit a NOTICE if an exclusive lock is used without an
EXCLUSIVE keyword.  One problem I see is that there is no way to ensure
zero size without a lock that blocks other writers.  Is that reliable?
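
To make the proposed default concrete, here is a rough sketch of the
decision logic in Python. It is not backend code, and every name in it
(CopyLock, choose_copy_lock, writes_wal, needs_notice) is made up for
illustration. It also assumes the caller already knows whether the
table is empty, which is exactly the part that needs a lock blocking
other writers to be trustworthy.

```python
from enum import Enum
from typing import Optional


class CopyLock(Enum):
    """Hypothetical lock modes for the proposed COPY syntax."""
    SHARE = "share"          # other writers allowed, so WAL must be written
    EXCLUSIVE = "exclusive"  # writers blocked, so WAL could be skipped


def choose_copy_lock(requested: Optional[CopyLock],
                     table_is_empty: bool) -> CopyLock:
    """An explicit keyword always wins; otherwise only an empty
    table defaults to EXCLUSIVE."""
    if requested is not None:
        return requested
    return CopyLock.EXCLUSIVE if table_is_empty else CopyLock.SHARE


def writes_wal(lock: CopyLock) -> bool:
    """Under the proposal, only SHARE mode has to log."""
    return lock is CopyLock.SHARE


def needs_notice(requested: Optional[CopyLock], chosen: CopyLock) -> bool:
    """Emit a NOTICE when EXCLUSIVE was picked without the keyword."""
    return requested is None and chosen is CopyLock.EXCLUSIVE


if __name__ == "__main__":
    # A pg_dump-style restore into a freshly created, empty table.
    chosen = choose_copy_lock(requested=None, table_is_empty=True)
    print(chosen.name, writes_wal(chosen), needs_notice(None, chosen))
```

Note that table_is_empty is only meaningful if it was checked while
holding a lock that keeps other writers out; checking first and locking
afterwards would leave a window for a concurrent insert.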

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

