Thread: standby questions

standby questions

From

"Roberto Scattini"

Date:

08 February 2008, 23:07:46

hi list:
im working in the setup of a warm standby server. im using
postgres-8.2.5 in the slave and 8.2.4 in master right now, in
production it will be only 8.2.5, everything in debian etch.

we decided this solution because we want an easy to
migrate/implement/adminstrate backup sever, and we have a few minutes
of data loss (if we can be precise with the "starting point in time"
of that loss).
by now, i have the warm standby server working fine (at least, it logs
that processes the archives...).
if i am undestanding, the "complete" (in broad strokes) process for
WAL archiving and warm standby, is something like this:
        1) the primary server processes some transactions and
generates some "WAL segment files" in the $DATA/pg_xlog/ dir of 16MB
each (by default).
        2) if archive_command is activated and working, the primary
server sends (preferably with rsync or some other "atomic tool") the
NEW WAL files to the standby server. Later, at some point, the primary
server will delete this files when considers that are not necessary
anymore.
        3) the standby server receives the archive files and processes
them somehow (almost everybody does this with pg_standby by now?) and
keep waiting for a new WAL file or the trigger file.

now, as you can see, im not a expert on database tech, maybe only a
"brave admin", so, reading the docs and seeing the warm standby
"working", i have a few questions (in any case, links to docs a
welcome!!):

-a "WAL segment file" is the same that a "log file segment"?

-what are the "log file segments" referenced by checkpoint_segments?
are the "binary logs" where postgres stores lasts transactions, the
ones in $DATA/pg_xlog dir? if this is true, then:
what means "Maximum distance between automatic WAL checkpoints"???
this is how often, in "log file segments", postgres will perform a
checkpoint, generating a special checkpoint record from which to start
the redo operation, dont?

-what is the "restartpoint" named in the "Warm Standby Servers for
High Availability" page?
(http://www.postgresql.org/docs/8.2/static/warm-standby.html) i cant
find a definition in any other doc...

-how often a new WAL file is generated? this depends on the server load?

-in one WAL file i could have one, more than one and even an
incomplete "transaction"?? (where is the doc i need to read?!)

-if i have incomplete transactions on a WAL, how the standby server
processes that? what if the rest of the transaction never reaches the
standby server?

-how do i know exactly at which point in time (in transactions) my
standby server is, if i have to switch to primary role?

-how many archive files is safe to keep in the standby server? right
now, i have "-k 100" in the pg_standby opts but in the pg_standby
README says:
"You should be wary against setting this number too low,
since this may mean you cannot restart the standby. This
is because the last restartpoint marked in the WAL files
may be many files in the past and can vary considerably.
This should be set to a value exceeding the number of WAL
files that can be recovered in 2*checkpoint_timeout seconds,
according to the value in the warm standby postgresql.conf.
It is wholly unrelated to the setting of checkpoint_segments
on either primary or standby."

i cant use the %r because im in 8.2 and not thinking in upgrade by now...
this is related to the restartpoint in my previous question... but i
dont know exactly what it is so i cant make a more detailed analysis.

finally: does anybody have a config like this one working with
heartbeat? is that recommendable?

wow! sorry for the huge post... i think i tied some loose ends just by
writing this email!! :D

thanks in advance!!

--
Roberto Scattini
 ___     _
 ))_) __ )L __
((__)(('(( ((_)

Re: standby questions

From

David Wall

Date:

08 February 2008, 23:23:56

>         2) if archive_command is activated and working, the primary
> server sends (preferably with rsync or some other "atomic tool") the
> NEW WAL files to the standby server. Later, at some point, the primary
> server will delete this files when considers that are not necessary
> anymore.
>
Is 'scp' by itself considered an "atomic tool" for copying files to the
standby server?  Does "atomic" mean that the program should copy a file
over using a temp file name and then renames at the end or does it mean
something else?

David

Re: standby questions

From

Greg Smith

Date:

09 February 2008, 03:12:20

On Fri, 8 Feb 2008, David Wall wrote:

> Is 'scp' by itself considered an "atomic tool" for copying files to the
> standby server?  Does "atomic" mean that the program should copy a file over
> using a temp file name and then renames at the end or does it mean something
> else?

That's correct.  You have to do it that way or the system in recovery mode
can start to consume the new segment file before it has been completely
copied over yet.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: standby questions

From

David Wall

Date:

09 February 2008, 03:17:21

> That's correct.  You have to do it that way or the system in recovery
> mode can start to consume the new segment file before it has been
> completely copied over yet.

Does pg_standby take care of this by checking file sizes or the like?
In my testing with scp, we never experienced any problems, but I wonder
if we were somehow "just lucky."

David

Re: standby questions

From

Greg Smith

Date:

09 February 2008, 03:45:23

On Sat, 9 Feb 2008, Roberto Scattini wrote:

> -a "WAL segment file" is the same that a "log file segment"?

Yes:  WAL="write-ahead log".

> -how often a new WAL file is generated? this depends on the server load?

These are the WAL segment files that are produced by the database, the
ones you're copying to the standby server.  A new one comes out whenever
the current one is filled.  While it's not really exact, if you think of
it as new one appearing after every 16MB of changes to the database that's
the right general idea.

> -what are the "log file segments" referenced by checkpoint_segments?
> are the "binary logs" where postgres stores lasts transactions, the
> ones in $DATA/pg_xlog dir? if this is true, then:
> what means "Maximum distance between automatic WAL checkpoints"???
> this is how often, in "log file segments", postgres will perform a
> checkpoint, generating a special checkpoint record from which to start
> the redo operation, dont?

A checkpoint happens after one of a few things have occurred:

-The checkpoint_timeout has passed
-checkpoint_segments worth of WAL files have been output since the last
one
-One is required to support a command (server shutdown for example)

Because the 2nd of those depends on activity as mentioned above, that's
why the "distance" between checkpoints can vary a bit.

The checkpoint will re-use up to 2*checkpoint_segments+1 files on the
primary server as part of its cleanup.

> -in one WAL file i could have one, more than one and even an
> incomplete "transaction"?? (where is the doc i need to read?!)
> -if i have incomplete transactions on a WAL, how the standby server
> processes that? what if the rest of the transaction never reaches the
> standby server?

Transactions that haven't been completed for whatever reason are rolled
back as part of bringing the standby server up.

> -how do i know exactly at which point in time (in transactions) my
> standby server is, if i have to switch to primary role?

You can run pg_controldata on the standby to see where it's at.

> -what is the "restartpoint" named in the "Warm Standby Servers for
> High Availability" page?

I started to answer this one but realized I don't have a clear enough
description here for you.  Hopefully someone will chime in (and by someone
I mean Simon) with more detail about how restartpoints are tracked and
what that implies for setting -k in pg_standby for 8.2.

Good questions, I've been collecting notes for a FAQ on this subject and
I'll add some of these to it.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: standby questions

From

Greg Smith

Date:

09 February 2008, 03:50:59

On Fri, 8 Feb 2008, David Wall wrote:

> Does pg_standby take care of this by checking file sizes or the like?  In my
> testing with scp, we never experienced any problems, but I wonder if we were
> somehow "just lucky."

pg_standby only processes files of exactly the length they're supposed to
be.  On Windows it even sleeps a bit after that to give time for things to
settle.

The main risky situation you could end up in is if you were using a copy
program that created the whole file at its full size first then wrote the
data to it.  I don't think there are many programs that operate like that
around and certainly scp doesn't do that.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: standby questions

From

"Roberto Scattini"

Date:

09 February 2008, 07:37:46

On Feb 9, 2008 5:50 AM, Greg Smith <gsmith@gregsmith.com> wrote:
> On Fri, 8 Feb 2008, David Wall wrote:
>
> > Does pg_standby take care of this by checking file sizes or the like?  In my
> > testing with scp, we never experienced any problems, but I wonder if we were
> > somehow "just lucky."
>
> pg_standby only processes files of exactly the length they're supposed to
> be.  On Windows it even sleeps a bit after that to give time for things to
> settle.
>
> The main risky situation you could end up in is if you were using a copy
> program that created the whole file at its full size first then wrote the
> data to it.  I don't think there are many programs that operate like that
> around and certainly scp doesn't do that.
>

"atomic tool":

The reason rsync is used in the archive_command is that rsync features
an 'atomic copy' - that is, the in-progress destination file is
created as a temp file, and then renamed when the copy is complete. In
the situation above, where segments are archived straight to the
directory that the slave reads from, 'cp' can cause an error whereby
the slave attempts to process a partially-copied WAL segment. If this
happens, postgres will emit an error like:

PANIC:  archive file "000000010000000000000031" has wrong size:
1810432 instead of 16777216
LOG:  startup process (PID 11356) was terminated by signal 6
LOG:  aborting startup due to startup process failure

taken from http://archives.postgresql.org/sydpug/2006-10/msg00001.php


thanks everybody!!

--
Roberto Scattini
 ___     _
 ))_) __ )L __
((__)(('(( ((_)

Re: standby questions

From

Chander Ganesan

Date:

09 February 2008, 08:49:48

Hi Roberto,

>
> -a "WAL segment file" is the same that a "log file segment"?
>
A WAL (Write Ahead Log) file is one of the numbered files in the pg_xlog
directory.  Keep in mind that you'll be archiving (in some cases) more
than just WAL files, for example you might see other files appear in
your archive directory when you do a PITR backup.
> -what are the "log file segments" referenced by checkpoint_segments?
> are the "binary logs" where postgres stores lasts transactions, the
> ones in $DATA/pg_xlog dir? if this is true, then:
> what means "Maximum distance between automatic WAL checkpoints"???
> this is how often, in "log file segments", postgres will perform a
> checkpoint, generating a special checkpoint record from which to start
> the redo operation, dont?
>
A single WAL file isn't indicative of a checkpoint.  Rather, PostgreSQL
will fill "checkpoint segments" WAL files and then checkpoint (in normal
operation - though there are exceptions to this).  So if
checkpoint_segments is set to 3, then you'll see 48 MB of WAL files be
generated prior to a checkpoint (3 * 16 MB WAL files).  In the event of
a crash, if PostgreSQL auto-recovers (not the warm-standby server taking
over, the same instance that crashed re-starting) then it processes from
the last checkpoint forward.  Simply put, the WAL files contain the
"differences" between what is in the PostgreSQL shared buffer pool and
what is in the files that represent the database on disk.  When a
checkpoint occurres, the buffers "sync" to disk, so there are no
differences between what is in memory and on disk (so the old WAL files
would then be obsolete for automatic crash recovery, and a new set can
be started).
> -what is the "restartpoint" named in the "Warm Standby Servers for
> High Availability" page?
> (http://www.postgresql.org/docs/8.2/static/warm-standby.html) i cant
> find a definition in any other doc...
>
The restart point would be the time when you issued a pg_start_backup()
to take a PITR backup of your main server.
> -how often a new WAL file is generated? this depends on the server load?
>
Yes.  Server load and checkpoint timeout values would influence this.
Also any manual file switches that you do...
> -in one WAL file i could have one, more than one and even an
> incomplete "transaction"?? (where is the doc i need to read?!)
>
When replay occurs only completed transactions will be replayed.  Of
course, transactions won't span a checkpoint in the WAL files.
> -if i have incomplete transactions on a WAL, how the standby server
> processes that? what if the rest of the transaction never reaches the
> standby server?
>
Any transactions that are not completed will be discarded.
> -how do i know exactly at which point in time (in transactions) my
> standby server is, if i have to switch to primary role?
You really don't.  You can read your log files to find the last WAL that
has been replayed, and if you know the last WAL generated on the primary
(pg_controldata) then you should be able to figure out how many WAL
files away you are.  There are techniques to avoid the loss of any WAL
files in the event of a crash (synchronous warm standby), that we teach
in our performance tuning course...but I'm sure you can find directions
on how to implement these online someplace.
> -how many archive files is safe to keep in the standby server? right
> now, i have "-k 100" in the pg_standby opts but in the pg_standby
> README says:
> "You should be wary against setting this number too low,
> since this may mean you cannot restart the standby. This
> is because the last restartpoint marked in the WAL files
> may be many files in the past and can vary considerably.
> This should be set to a value exceeding the number of WAL
> files that can be recovered in 2*checkpoint_timeout seconds,
> according to the value in the warm standby postgresql.conf.
> It is wholly unrelated to the setting of checkpoint_segments
> on either primary or standby."
>
This number would indicate the number of files you need to "keep" to
ensure that if recovery is re-started it can continue successfully.  I
believe this is the number of WAL files since the last checkpoint, but
if that is so it is a variable value.  That's because a long-running
transaction could end up exceeding checkpoint_segments, in which case
PostgreSQL will exceed that number.  The easiest thing is to use the
'%r' parameter so PostgreSQL can tell pg_standby what to do....  If not,
I would just leave it at 0 and periodically prune old files.  I'm not
100% certain on this (but it certainly makes logical sense).

I don't use pg_standby, I typically use a shell script to do
this...which affords me a greater deal of customization.  The main
advantage to pg_standby is that it's a more-or-less "hands off" approach
that is cross-platform compatible (Windows and Unix variants).
Signalling components could be added to pg_standby at some point...
> i cant use the %r because im in 8.2 and not thinking in upgrade by now...
> this is related to the restartpoint in my previous question... but i
> dont know exactly what it is so i cant make a more detailed analysis.
>
> finally: does anybody have a config like this one working with
> heartbeat? is that recommendable?
>
Yes.  However your configuration could lose transactions in the event of
a crash of the primary (assuming you can't access its pg_xlog directory
after the crash).  If you're using heartbeat then you probably have the
two servers relatively close together, and should consider a more robust
solution if your hardware can support it.

Hope that helps..

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
Phone: 877-258-8987/919-463-0999
http://www.otg-nc.com
Ask me about our Expert Comprehensive PostgreSQL, PostGIS & UMN Mapserver training.

Re: standby questions

From

Simon Riggs

Date:

09 February 2008, 09:05:39

On Sat, 2008-02-09 at 07:49 -0500, Chander Ganesan wrote:

> Signalling components could be added to pg_standby at some point...

What sort of thing are you looking for?

pg_standby accepts a trigger file as well as various types of signal.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

Re: standby questions

From

Chander Ganesan

Date:

09 February 2008, 09:20:59

Simon Riggs wrote:

On Sat, 2008-02-09 at 07:49 -0500, Chander Ganesan wrote:

Signalling components could be added to pg_standby at some point...

What sort of thing are you looking for?

pg_standby accepts a trigger file as well as various types of signal

I didn't see anything about signals in the documentation at (http://www.postgresql.org/docs/8.3/static/pgstandby.html).

We use signals in shell scripts to trigger (in some cases) a WAL copy, and in others an immediate wake-from-sleep to copy a final WAL file to trigger a failover.

-- 
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
Phone: 877-258-8987/919-463-0999
http://www.otg-nc.com