Thread: Big copy slowdown

Big copy slowdown

From
Brian Hurt
Date:
I'm not sure if this is the right forum for this question, so I'll start
here and escalate as necessary.

One of the things I do a lot is copy large tables from one database to
another.  I'd started life using pgdump, but this requires I save the
whole table into an intermediate file (or play games with pipes).
Rather than doing either, I instead wrote a quick script which creates a
cursor at the source, and starts schlepping data from one to other.  For
"small" tables (say, a million rows or less), this works just fine, but
when I try to copy a bigger table (tens of millions of rows),  the
performance continually drops off, with the copy becoming slower and slower.

On 8.3-beta1, I see the memory utilization skyrocketing, specifically
the non-shared residential space of the postmaster daemon, driving the
whole system deep into swap (sooner or later), and bringing things to a
screeching halt.  But on 8.1.4, I don't see memory utilization problem-
the memory utilization of a daemon pegs out at just a little larger than
the shared buffers size (i.e. what I'd expect)- but I still see the
slowdown.

It is definately the copy that is the problem- I've tried a variation of
my script with the copy commented out (just throwing the data away) to
test if it's the cursor- and I get 20K+ rows/second.from the cursor.
I've also tried breaking the copy up- finishing the current copy command
and starting a new one every million rows or so, and that doesn't help.
I've also fiddled with transactions, and rate-limiting the copy, and
neither of those help.

Is this a bug in postgres?  If not, as I'm assume, what should I be
doing to make this work fast?

Brian


Re: Big copy slowdown

From
Tom Lane
Date:
Brian Hurt <bhurt@janestcapital.com> writes:
> Is this a bug in postgres?

Well, if you'd provide enough info for someone else to reproduce it, we
could have a look.

            regards, tom lane

Re: Big copy slowdown

From
Brian Hurt
Date:
Tom Lane wrote:
Brian Hurt <bhurt@janestcapital.com> writes: 
Is this a bug in postgres?   
Well, if you'd provide enough info for someone else to reproduce it, we
could have a look. 

Hello- I just wanted to close the book on this problem.  I first noticed the problem when long copies kept slowing down.  It turns out it wasn't a problem with Postgresql at all, as I recreated it with bonnie++ (which explains the long silence on this issue).  After doing a lot of sustained I/O, we see I/O wait times climb until we're getting virtually no I/O performance at all, and it doesn't matter if it's Postgresql or bonnie++ doing the I/O.  iostat would report that we'd be doing only 2MB/sec, and we'd be seeing 90%+ iowait percentages in atop or top.  So our solution is to fix out I/O subsystem.  We're replacing the expensive TLA SAN storage device with a low end Fibre Channel raid (which we were going to do anyways, as even at it's best, the TLA SAN wasn't impressive), upgrading from an old 2.4 linux kernel to a newer 2.6 kernel, changing the I/O Scheduler to deadline, and upgrading the server hardware.  Some combination of the above solves the problem.

Hopefully no one has been lying awake worrying about this problem :-), but I wanted to close it out, and to leave a permanent record in the archives for the next person who has this problem.

Brian