Re: postgres 8.4, COPY, and high concurrency - Mailing list pgsql-performance

From Heikki Linnakangas
Subject Re: postgres 8.4, COPY, and high concurrency
Date
Msg-id 50A29F2B.7070604@vmware.com
Whole thread Raw
In response to postgres 8.4, COPY, and high concurrency  (Jon Nelson <jnelson+pgsql@jamponi.net>)
Responses Re: postgres 8.4, COPY, and high concurrency  (Jon Nelson <jnelson+pgsql@jamponi.net>)
List pgsql-performance
On 13.11.2012 21:13, Jon Nelson wrote:
> I was working on a data warehousing project where a fair number of files
> could be COPY'd more or less directly into tables. I have a somewhat nice
> machine to work with, and I ran on 75% of the cores I have (75% of 32 is
> 24).
>
> Performance was pretty bad. With 24 processes going, each backend (in COPY)
> spent 98% of it's time in semop (as identified by strace).  I tried larger
> and smaller shared buffers, all sorts of other tweaks, until I tried
> reducing the number of concurrent processes from 24 to 4.
>
> Disk I/O went up (on average) at least 10X and strace reports that the top
> system calls are write (61%), recvfrom (25%), and lseek (14%) - pretty
> reasonable IMO.
>
> Given that each COPY is into it's own, newly-made table with no indices or
> foreign keys, etc, I would have expected the interaction among the backends
> to be minimal, but that doesn't appear to be the case.  What is the likely
> cause of the semops?

I'd guess it's lock contention on WALInsertLock. That means, the system
is experiencing lock contention on generating WAL records for the
insertions. If that theory is correct, you ought to get a big gain if
you have wal_level=minimal, and you create or truncate the table in the
same transaction with the COPY. That allows the system to skip
WAL-logging the COPY.

Or you could upgrade to 9.2. The WAL-logging of bulk COPY was optimized
in 9.2, it should help precisely the scenario you're facing.

- Heikki


pgsql-performance by date:

Previous
From: Jon Nelson
Date:
Subject: postgres 8.4, COPY, and high concurrency
Next
From: Jeff Janes
Date:
Subject: Re: postgres 8.4, COPY, and high concurrency