Re: COPY locking - Mailing list pgsql-general

From Tom Lane
Subject Re: COPY locking
Date
Msg-id 17581.989504960@sss.pgh.pa.us
Whole thread Raw
In response to Re: COPY locking  (John Coers <coers@intrinsity.com>)
List pgsql-general
John Coers <coers@intrinsity.com> writes:
> Tom Lane wrote:
>> Are you talking about concurrent copies into the same table?  That would
>> suffer a lot of short-term lock interference, no doubt, since all the
>> copies are going to be writing the same disk page (ie, the current last
>> page of the table).

> Yes, that is exactly the problem.  So there is not a lock per se?  DO
> they each right to their own piece of shared memory and then try and
> flush that to disk and that is when they interfere?

No, they all share the same disk buffer in shared memory for whatever is
currently the last page of the file.  Adding a tuple into that page
requires grabbing a short-term lock on that page (since obviously only
one process can do it at a time, else they'd be trying to insert into
the same physical spot).  That lock is released as soon as the tuple is
physically stored in the buffer, but with enough processes running COPY
into the same table, you'll get contention for the lock.

I suspect the performance issue you're seeing is not so much the need
for a short-term lock (it's awful hard to see how to do without one,
anyway) as it is that our current spinlock implementation is pretty bad
in the face of heavy contention.  On most platforms the backoff when
you can't get the lock on the first try is a ten-millisecond sleep,
which is huge overkill for locks that are only held for periods of
microseconds.  There was discussion in pghackers a few months ago about
reimplementing spinlocks in some more modern way (eg, using Posix
semaphores if available) but no progress has been made yet.

> Are there any suggested techniques or tweaks I can make to avoid this
> interference?

Do you really need to run multiple COPYs in parallel, or would
serializing them be just as good?  You could serialize them without
locking out readers by doing

    BEGIN;
    LOCK TABLE foo IN EXCLUSIVE MODE;
    COPY foo FROM ...
    END;

"EXCLUSIVE" mode isn't quite as exclusive as a plain LOCK TABLE; it
still allows other readers.  See "Table-level locks" in the User's
Guide.

            regards, tom lane

pgsql-general by date:

Previous
From: Tony Grant
Date:
Subject: Re: ER diagrams
Next
From: Justin Clift
Date:
Subject: Re: ER diagrams