Re: standby questions - Mailing list pgsql-general

From Chander Ganesan
Subject Re: standby questions
Date
Msg-id 47ADA15C.1000105@otg-nc.com
Whole thread Raw
In response to standby questions  ("Roberto Scattini" <roberto.scattini@gmail.com>)
Responses Re: standby questions
List pgsql-general
Hi Roberto,

>
> -a "WAL segment file" is the same that a "log file segment"?
>
A WAL (Write Ahead Log) file is one of the numbered files in the pg_xlog
directory.  Keep in mind that you'll be archiving (in some cases) more
than just WAL files, for example you might see other files appear in
your archive directory when you do a PITR backup.
> -what are the "log file segments" referenced by checkpoint_segments?
> are the "binary logs" where postgres stores lasts transactions, the
> ones in $DATA/pg_xlog dir? if this is true, then:
> what means "Maximum distance between automatic WAL checkpoints"???
> this is how often, in "log file segments", postgres will perform a
> checkpoint, generating a special checkpoint record from which to start
> the redo operation, dont?
>
A single WAL file isn't indicative of a checkpoint.  Rather, PostgreSQL
will fill "checkpoint segments" WAL files and then checkpoint (in normal
operation - though there are exceptions to this).  So if
checkpoint_segments is set to 3, then you'll see 48 MB of WAL files be
generated prior to a checkpoint (3 * 16 MB WAL files).  In the event of
a crash, if PostgreSQL auto-recovers (not the warm-standby server taking
over, the same instance that crashed re-starting) then it processes from
the last checkpoint forward.  Simply put, the WAL files contain the
"differences" between what is in the PostgreSQL shared buffer pool and
what is in the files that represent the database on disk.  When a
checkpoint occurres, the buffers "sync" to disk, so there are no
differences between what is in memory and on disk (so the old WAL files
would then be obsolete for automatic crash recovery, and a new set can
be started).
> -what is the "restartpoint" named in the "Warm Standby Servers for
> High Availability" page?
> (http://www.postgresql.org/docs/8.2/static/warm-standby.html) i cant
> find a definition in any other doc...
>
The restart point would be the time when you issued a pg_start_backup()
to take a PITR backup of your main server.
> -how often a new WAL file is generated? this depends on the server load?
>
Yes.  Server load and checkpoint timeout values would influence this.
Also any manual file switches that you do...
> -in one WAL file i could have one, more than one and even an
> incomplete "transaction"?? (where is the doc i need to read?!)
>
When replay occurs only completed transactions will be replayed.  Of
course, transactions won't span a checkpoint in the WAL files.
> -if i have incomplete transactions on a WAL, how the standby server
> processes that? what if the rest of the transaction never reaches the
> standby server?
>
Any transactions that are not completed will be discarded.
> -how do i know exactly at which point in time (in transactions) my
> standby server is, if i have to switch to primary role?
You really don't.  You can read your log files to find the last WAL that
has been replayed, and if you know the last WAL generated on the primary
(pg_controldata) then you should be able to figure out how many WAL
files away you are.  There are techniques to avoid the loss of any WAL
files in the event of a crash (synchronous warm standby), that we teach
in our performance tuning course...but I'm sure you can find directions
on how to implement these online someplace.
> -how many archive files is safe to keep in the standby server? right
> now, i have "-k 100" in the pg_standby opts but in the pg_standby
> README says:
> "You should be wary against setting this number too low,
> since this may mean you cannot restart the standby. This
> is because the last restartpoint marked in the WAL files
> may be many files in the past and can vary considerably.
> This should be set to a value exceeding the number of WAL
> files that can be recovered in 2*checkpoint_timeout seconds,
> according to the value in the warm standby postgresql.conf.
> It is wholly unrelated to the setting of checkpoint_segments
> on either primary or standby."
>
This number would indicate the number of files you need to "keep" to
ensure that if recovery is re-started it can continue successfully.  I
believe this is the number of WAL files since the last checkpoint, but
if that is so it is a variable value.  That's because a long-running
transaction could end up exceeding checkpoint_segments, in which case
PostgreSQL will exceed that number.  The easiest thing is to use the
'%r' parameter so PostgreSQL can tell pg_standby what to do....  If not,
I would just leave it at 0 and periodically prune old files.  I'm not
100% certain on this (but it certainly makes logical sense).

I don't use pg_standby, I typically use a shell script to do
this...which affords me a greater deal of customization.  The main
advantage to pg_standby is that it's a more-or-less "hands off" approach
that is cross-platform compatible (Windows and Unix variants).
Signalling components could be added to pg_standby at some point...
> i cant use the %r because im in 8.2 and not thinking in upgrade by now...
> this is related to the restartpoint in my previous question... but i
> dont know exactly what it is so i cant make a more detailed analysis.
>
> finally: does anybody have a config like this one working with
> heartbeat? is that recommendable?
>
Yes.  However your configuration could lose transactions in the event of
a crash of the primary (assuming you can't access its pg_xlog directory
after the crash).  If you're using heartbeat then you probably have the
two servers relatively close together, and should consider a more robust
solution if your hardware can support it.

Hope that helps..

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
Phone: 877-258-8987/919-463-0999
http://www.otg-nc.com
Ask me about our Expert Comprehensive PostgreSQL, PostGIS & UMN Mapserver training.


pgsql-general by date:

Previous
From: "Roberto Scattini"
Date:
Subject: Re: standby questions
Next
From: Simon Riggs
Date:
Subject: Re: standby questions