Re: limiting performance impact of wal archiving. - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: limiting performance impact of wal archiving.
Date
Msg-id dcc563d10911101010j71947ba3mb79b6ccca9f5e224@mail.gmail.com
In response to Re: limiting performance impact of wal archiving.  (Greg Smith <greg@2ndquadrant.com>)
List pgsql-performance
On Tue, Nov 10, 2009 at 10:48 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> Scott Marlowe wrote:
>>
>> On some busy systems with lots of small transactions large
>> shared_buffer can cause it to run slower rather than faster due to
>> background writer overhead.
>>
>
> This is only really true in 8.2 and earlier, where background writer
> computations are done as a percentage of shared_buffers.  The rewrite I did
> in 8.3 changes that to where it's proportional to overall system activity
> (specifically, buffer allocations) and you shouldn't see this there.

Nice to know since we converted to 8.3 a few months ago.  I did notice
the huge overall performance improvement from 8.2 to 8.3 and I assume
part of that was the code you wrote for WAL.  Thanks!

>  However, large values for shared_buffers do increase the potential for
> longer checkpoints though, which is similar background overhead starting in
> 8.3.  That's why I mention it hand in hand with decreasing the checkpoint
> frequency, you really need to do that before large shared_buffers values are
> viable.

Yeah.  We run checkpoint_segments = 64, a 30 minute
checkpoint_timeout, and a lower checkpoint_completion_target (0.25 to
0.5) on most of our servers, with good behaviour in 8.3.
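
For a rough sense of what those settings mean on disk, here is a small
Python sketch.  It uses the formula from the 8.3 documentation for the
normal upper bound on WAL segment files, (2 +
checkpoint_completion_target) * checkpoint_segments + 1, and assumes
the default 16MB segment size.  The function name is made up for
illustration; treat the output as a ballpark, not a guarantee.

```python
import math

WAL_SEGMENT_MB = 16  # assumption: stock build with the default segment size


def max_wal_files(checkpoint_segments, completion_target):
    """Normal upper bound on WAL segment files, per the 8.3 docs formula."""
    return math.floor((2 + completion_target) * checkpoint_segments) + 1


# The settings mentioned above: 64 segments, completion target 0.25 to 0.5.
for target in (0.25, 0.5):
    files = max_wal_files(64, target)
    print(f"target={target}: up to ~{files} files, ~{files * WAL_SEGMENT_MB} MB")
```

So with 64 segments the WAL directory normally tops out somewhere
around 2.3 to 2.6 GB, which is worth knowing before sizing the volume
it lives on.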

> This is actually a topic I meant to mention to Laurent:  if you're not
> running at least PG8.3, you really should be considering what it would take
> to upgrade to 8.4.  It's hard to justify the 8.3->8.4 upgrade just based on
> that version's new performance features (unless you delete things a lot),
> but the changes from 8.1 to 8.2 to 8.3 make the database faster at a lot of
> common tasks.

True++.  8.3 is the minimum version of pg we run anywhere at work now.
8.4 isn't compelling for us yet, since we finally got the fsm set up
right.  But for someone upgrading from 8.2 or before, I'd think the
automatic fsm stuff would be a big selling point.

>> Note that if you've got a slow IO subsystem, a large number of
>> checkpoint segments can result in REALLY long restart times after a
>> crash, as well as really long waits for shutdown and / or bgwriter
>> once you've filled them all up.
>>
>
> The setup here, with a decent number of disks and a 3ware controller,
> shouldn't be that bad here.

If he were running RAID-5 I'd agree. :) That's gonna slow down the
write speeds quite a bit during recovery.

> Ultimately you have to ask yourself whether
> it's OK to suffer from the rare recovery issue this introduces if it
> improves things a lot all of the rest of the time, which increasing
> checkpoint_segments does.

Note that every time I've had to wait for recovery on startup, it's
been because something went wrong with a -m fast shutdown and I had to
either kill all the postgres backends and the postmaster by hand or
use -m immediate.  On the machines with 12 disk RAID-10 arrays,
recovery takes seconds.  On the slaves with a pair of 7200RPM SATA
drives, or the one at the office on RAID-6, with 60 to 100+ WAL
segments, it takes a couple of minutes.
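
A back-of-the-envelope way to reason about those recovery times, as a
Python sketch.  It assumes 16MB WAL segments and treats replay as if it
ran at a single effective rate in MB/s.  That rate is a made-up input,
not a measurement from any of these machines, and real replay depends
heavily on random I/O, so this only gives the order of magnitude.

```python
WAL_SEGMENT_MB = 16  # assumption: default WAL segment size


def replay_seconds(wal_segments, replay_mb_per_s):
    """Crude estimate: total WAL volume divided by an assumed replay rate."""
    return wal_segments * WAL_SEGMENT_MB / replay_mb_per_s


# Hypothetical effective replay rates, purely for illustration.
for label, rate in (("slow pair of SATA drives", 15), ("fast RAID-10 array", 200)):
    secs = replay_seconds(100, rate)
    print(f"{label}: ~{secs:.0f} s to replay 100 segments at {rate} MB/s")
```

The point is only that the segment count sets the amount of WAL to
replay, and the IO subsystem decides whether that is seconds or
minutes.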

>> Note that XFS gets a LOT of testing, especially under linux.  That
>> said it's still probably only 1/10th as many dbs (or fewer) as those
>> running on ext3 on linux.  I've used it before and it's a little
>> faster than ext3 at some stuff, especially deleting large files (or in
>> pg's case lots of 1G files) which can make ext3 crawl.
>
> While true, you have to consider whether the things it's better at really
> happen during a regular day.  The whole "faster at deleting large files"
> thing doesn't matter to me on a production DB server at all, so that
> slam-dunk win for XFS doesn't even factor into my filesystem ranking
> computations in that context.

Ahhhh.  I store backups in my pgdata directory, so it does start to
matter there.  Luckily, that's on a slave database, so it's not as
horrible as it could be.  I'm still running ext3 on it because it just
works.
