Re: Warm standby problems: Followup - Mailing list pgsql-admin

From David F. Skoll
Subject Re: Warm standby problems: Followup
Msg-id 4AE75060.40504@roaringpenguin.com
In response to Re: Warm standby problems: Followup  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-admin
Kevin Grittner wrote:
>> shared_buffers = 24MB

> You should probably set that higher.

Nah.  This machine is totally bored; tweaking PostgreSQL would be pointless
since it's so under-utilized.

>> archive_command = '/usr/bin/wal_archive_command.pl %p'

> It would probably be safer to pass in %f, too, and use it for the file
> name rather than plucking off the last portion of the %p -- at a
> minimum it might reduce the chances that you'd use that technique in
> restore_command, where it flat out won't work.

Nah.  Our Perl code is pretty robust; we use the File::Spec module to split
apart path names and that's safe.
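For readers following along, here is a rough Python analogue of the path-splitting idea. It is only a sketch: the real script is Perl using File::Spec and is not shown in this thread, and the function name and sample paths below are made up for illustration. It also shows Kevin's point that the trick only works on the archive side, because in restore_command the %p argument is the destination path rather than the WAL file's own name.

```python
import os.path


def wal_file_name(p_arg: str) -> str:
    """Return the last path component of a %p-style argument.

    Hypothetical helper: mirrors the idea of splitting the path
    instead of passing %f separately.
    """
    return os.path.basename(os.path.normpath(p_arg))


# In archive_command, %p names the WAL segment itself, so the last
# component is the same file name that %f would supply.
print(wal_file_name("pg_xlog/000000010000000000000042"))

# In restore_command, %p is where the restored file must be written
# (typically a temporary name), so the last component is NOT the name
# of the requested segment; only %f carries that.
print(wal_file_name("pg_xlog/RECOVERYXLOG"))
```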

The restore_command is robust (line-wrapped here):

restore_command = '/usr/share/canit/failover/libexec/canit-failover-pg-standby.pl \
-s 2 -t /var/backups/postgres/initiate-failover /var/backups/postgres/wal/ %f %p %r'

and well-tested.

>> autovacuum = off

> Be very careful if you choose to do this -- you've just made yourself
> responsible for preventing bloat which can slowly strangle
> performance.

As I said, the server is completely under-utilized.  Nightly vacuum
takes about 7 minutes:

Oct 26 2009 21:56:53 DETAIL:  A total of 29184 page slots are in use (including overhead).
Oct 26 2009 21:56:53 29184 page slots are required to track all free space.

> There are a few settings which should almost always be overridden for
> production use.  You might want to read this for a start:

I doubt any of those settings is causing the problem.  The one setting
that I *do* suspect is the archive_timeout.  We don't have that set on
other systems, and they do not exhibit the problem.  I tried looking
at the PostgreSQL source to see if a log switch requested because of
archive_timeout could somehow result in the WAL file being changed
during execution of archive_command, but I'm completely unfamiliar
with the Pg source code and quickly got lost. :-(

I'm going to leave the system for a few days and see how many SHA1
mismatches I get.  Then I'm going to disable archive_timeout and see
if that changes anything.
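
One way to test whether a WAL file changes while archive_command is running is to hash it before and after the copy. The sketch below is an assumption about how such a check could look, not a description of how the mismatches are detected on this system; the function names are hypothetical, and it is in Python rather than the Perl the real scripts use.

```python
import hashlib
import shutil


def sha1_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_with_check(src: str, dst: str) -> dict:
    """Copy src to dst and report whether src changed during the copy.

    Hypothetical helper: hashes the source before and after copying,
    then hashes the copy, so a change mid-copy shows up as a mismatch.
    """
    before = sha1_of(src)
    shutil.copyfile(src, dst)
    after = sha1_of(src)
    copied = sha1_of(dst)
    return {
        "sha1": after,
        "source_changed": before != after,
        "copy_matches": copied == after,
    }
```

If source_changed ever comes back true, the file was modified while it was being archived, which is the behaviour suspected here.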

Regards,

David.
