Thread: streamlined standby procedure

streamlined standby procedure

From
Csaba Nagy
Date:
Hi all,

I decided to start implementing a streamlined WAL shipping based standby
building procedure. My aim is fairly simple: to be able to build a
standby as automated as possible. 

The ultimate simplicity would be for me:
 - install postgres on the standby machine;
 - create a directory for the data base files, containing
postgresql.conf and pg_hba.conf, and a standby.conf file;
 - start up the postmaster with a "--build-standby" option;

All the rest should be done automatically by postgres.

The procedure should be something similar to the one available today if
you do it manually. The main difference would be that the standby
postmaster should connect to the primary server, and get all table data
and WAL record stream through normal data base connections...

To facilitate this process, I thought: why not expose the WAL files
through a system view? Something along the lines of:

pg_wal (
  name text,
  walrecords blob,
  iscurrent boolean
)

Then anybody interested in the WAL record stream could easily find out
which is the current WAL file, and get any of the existing WAL files
by streaming the blob. Closed WAL files would be streamed completely,
and the current WAL file could be streamed in realtime as it is
created... this would facilitate a standby that is always as up to
date as possible, as it could get the WAL records in real time.

To make it possible to reliably get closed WAL records, a WAL
subscription system could be created, where a subscriber (the standby)
could signal which is the oldest WAL file it did not get yet. The
primary machine would keep all the WAL files extending back to the
oldest subscribed one. Then each time the subscriber finishes processing
a WAL file, it can signal its interest in the next one. This could be
implemented by a table like:

pg_wal_subscription (
  subscriber text,
  name text
)

The subscribers would insert a record in this table, and update it to
the next WAL file after they processed one. The subscriber names should
be unique across subscribers; this should be managed by the admin who
sets up the subscribers. When a subscriber is not interested anymore,
it can delete its subscription record. That could be done by the DBA
too if things go haywire...
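To make the lifecycle concrete, here is a minimal sketch in Python of
the bookkeeping this subscription scheme implies (the function names
are illustrative, not part of any proposal; the comparison relies on
WAL segment file names sorting lexicographically in creation order
within one timeline):

```python
# Sketch of the proposed WAL subscription bookkeeping.
# Within one timeline, WAL segment file names sort lexicographically
# in creation order, so plain string comparison is enough here.

subscriptions = {}  # subscriber name -> oldest WAL file it still needs

def subscribe(subscriber, oldest_needed):
    """A standby registers the oldest WAL file it has not processed yet."""
    subscriptions[subscriber] = oldest_needed

def advance(subscriber, next_needed):
    """After processing a WAL file, the standby asks for the next one."""
    subscriptions[subscriber] = next_needed

def unsubscribe(subscriber):
    """Drop the subscription (done by the standby, or by the DBA)."""
    subscriptions.pop(subscriber, None)

def recyclable(wal_name):
    """The primary may recycle a WAL file only if every subscriber
    has already moved past it."""
    if not subscriptions:
        return True  # no subscribers: recycle immediately, as today
    return wal_name < min(subscriptions.values())
```

With no subscriptions the primary behaves exactly as today; each
subscription simply raises the floor below which segments may be
recycled.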

To build a standby based on log shipping it is necessary to transfer
all the database files too. That could also be done by exposing them
through some view, which in turn might take advantage of knowledge of
the table structure to compress the data to be transferred. The main
idea is to do all transfers through normal DB connections, so the only
configuration to be done is to point the standby to the master
machine...

So, all this said, I'm not too familiar with either C programming or the
postgres sources, but I'm willing to learn. And the project as a whole
seems a bit too much to do it in one piece, so my first aim is to expose
the WAL records in a system view.

I would really appreciate any comments you have...

Thanks,
Csaba.




Re: streamlined standby procedure

From
Tom Lane
Date:
Csaba Nagy <nagy@ecircle-ag.com> writes:
> The procedure should be something similar to the one available today if
> you do it manually. The main difference would be that the standby
> postmaster should connect to the primary server, and get all table data
> and WAL record stream through normal data base connections...

This is pie-in-the-sky really.  A design like that would mean that the
master could *never* recycle WAL files, because it could never know when
some slave would pop up demanding a copy of ancient history.
        regards, tom lane


Re: streamlined standby procedure

From
Csaba Nagy
Date:
You obviously did not read further down :-)

I was proposing a subscription system, where the slave can specify the
oldest WAL file it is interested in, and keep that up to date as it
processes them.

That could of course cause trouble if a slave dies and never updates
its subscription, but that's not any different than the current setup if
the archive_command starts failing constantly because the archive site
is down. In either case human intervention is needed. The DB connection
based approach has the advantage that you don't have to restart the
server if your slave location changes, and you can have multiple slaves
at the same time if you like, e.g. if you want to smoothly move over the
slave to another machine.

Cheers,
Csaba.

On Tue, 2006-02-07 at 16:18, Tom Lane wrote:
> Csaba Nagy <nagy@ecircle-ag.com> writes:
> > The procedure should be something similar to the one available today if
> > you do it manually. The main difference would be that the standby
> > postmaster should connect to the primary server, and get all table data
> > and WAL record stream through normal data base connections...
> 
> This is pie-in-the-sky really.  A design like that would mean that the
> master could *never* recycle WAL files, because it could never know when
> some slave would pop up demanding a copy of ancient history.
> 
>             regards, tom lane



Re: streamlined standby procedure

From
Tom Lane
Date:
Csaba Nagy <nagy@ecircle-ag.com> writes:
> You obviously did not read further down :-)
> I was proposing a subscription system, where the slave can specify the
> oldest WAL file it is interested in, and keep that up to date as it
> processes them.

And how is that "system view" going to handle subscriptions?
        regards, tom lane


Re: streamlined standby procedure

From
Csaba Nagy
Date:
On Tue, 2006-02-07 at 16:45, Tom Lane wrote:
> Csaba Nagy <nagy@ecircle-ag.com> writes:
> > You obviously did not read further down :-)
> > I was proposing a subscription system, where the slave can specify the
> > oldest WAL file it is interested in, and keep that up to date as it
> > processes them.
> 
> And how is that "system view" going to handle subscriptions?

It's not that THAT "system view" would handle the subscriptions...
the view would do exactly what it says: expose the WAL files which
currently exist, and only those. Of course it must place some kind of
lock on the WAL file it is currently streaming so that file is not
recycled, but other than that the view should not be concerned with
subscription issues.

The subscription system would be a table in which you can insert
(subscriber_id, oldest_WAL_file_name_i'm_interested_in) tuples. When
recycling WAL files, this table will be consulted and only WAL files
older than the oldest entry in the subscription table are allowed to be
recycled.

Slaves will update their subscription line after processing each WAL
file, setting it to the next WAL file name they need. So the oldest WAL
to be kept will actually be in sync with what the slaves really need.

OK, now I start to see what you mean, i.e. if there's no subscription
then all WAL files are immediately recycled, and the view can only show
one entry, the current WAL. But actually that's OK: you can still see
which is the current WAL file, and subscribe starting with it.

Cheers,
Csaba.




Re: streamlined standby procedure

From
Andrew Rawnsley
Date:
IMHO the #1 priority in the current PITR/WAL shipping system is to make the
standby able to tolerate being shut down and restarted, i.e. actually having
a true standby mode and not the current method of doing it only on startup.

While it is a trivial thing to fool postgres into staying in startup/restore
mode with a restore_command that blocks until more files are available, if
the machine needs to be shut down for whatever reason you have to go back to
the last image and replay to the present, which isn't always convenient. Nor
are you able to shut down the standby, copy it to a second instance to use
for testing/development/whatever, and restart the standby.

(Just to be clear - I _really_ like the flexibility of the customizable
archive and restore structure with PITR in PG, but the lack of a standby
mode always reminds me of whacking my forehead at 3am on the too-low doorway
into my son's bedroom...)


On 2/7/06 10:11 AM, "Csaba Nagy" <nagy@ecircle-ag.com> wrote:

> Hi all,
> 
> I decided to start implementing a streamlined WAL shipping based standby
> building procedure. My aim is fairly simple: to be able to build a
> standby as automated as possible.
> 
> The ultimate simplicity would be for me:
>  - install postgres on the standby machine;
>  - create a directory for the data base files, containing
> postgresql.conf and pg_hba.conf, and a standby.conf file;
>  - start up the postmaster with a "--build-standby" option;
> 
> All the rest should be done automatically by postgres.
> 
> The procedure should be something similar to the one available today if
> you do it manually. The main difference would be that the standby
> postmaster should connect to the primary server, and get all table data
> and WAL record stream through normal data base connections...
> 
> To facilitate this process, I thought about why not expose the WAL files
> through a system view ? Something along the lines of:
> 
> pg_wal (
>   name text,
>   walrecords blob,
>   iscurrent boolean
> )
> 
> Then anybody interested in the WAL record stream could easily find out
> which is the current WAL record, and get any of the existing WAL records
> by streaming the blob. Closed WAL files would be streamed completely,
> and the current WAL file could be streamed in realtime as it is
> created... this would facilitate an always as up to date as possible
> standby, as it could get the WAL records in real time.
> 
> To make it possible to reliably get closed WAL records, a WAL
> subscription system could be created, where a subscriber (the standby)
> could signal which is the oldest WAL file it did not get yet. The
> primary machine would keep all the WAL files extending back to the
> oldest subscribed one. Then each time the subscriber finishes processing
> a WAL file, it can signal it's interest in the next one. This could be
> implemented by a table like:
> 
> pg_wal_subscription (
>   subscriber text,
>   name text
> )
> 
> The subscribers would insert a record in this table, and update it to
> the next WAL file after they processed one. The subscriber names should
> be unique across subscribers, this should be managed by the admin who
> sets up the subscribers. When the subscriber is not interested anymore,
> it can delete it's subscription record. That could be done by the DBA
> too if things go haywire...
> 
> To build a stand by based on log shipping it is necessary to get over
> all the data base files too. That could be also done by exposing them
> through some view, which in turn might take advantage of knowledge of
> the table structure to compress the data to be transferred. The main
> idea is to do all transfers through normal DB connections, so the only
> configuration to be done is to point the standby to the master
> machine...
> 
> So, all this said, I'm not too familiar with either C programming or the
> postgres sources, but I'm willing to learn. And the project as a whole
> seems a bit too much to do it in one piece, so my first aim is to expose
> the WAL records in a system view.
> 
> I would really appreciate any comments you have...
> 
> Thanks,
> Csaba.
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly





Re: streamlined standby procedure

From
Csaba Nagy
Date:
On Tue, 2006-02-07 at 16:58, Andrew Rawnsley wrote:
> IMHO the #1 priority in the current PITR/WAL shipping system is to make the
> standby able to tolerate being shut down and restarted, i.e. actually having
> a true standby mode and not the current method of doing it only on startup.

This fits nicely in what I would like to achieve, and it might be
actually a better start.

> While it is a trivial thing to fool postgres into staying in startup/restore
> mode with a restore_command that blocks until more files are available, if
> the machine needs to be shut down for whatever reason you have to go back to
> the last image and replay to the present, which isn't always convenient. Nor
> are you able to shut down the standby, copy it to a second instance to use
> for testing/development/whatever, and restart the standby.

Why would you shut down the standby to copy it? It would be nicer to
be able to build a "standby of the standby" ;-)
Even nicer would be to have a stand-by which allows read only access,
but I guess that's a tough call.

Cheers,
Csaba.




Re: streamlined standby procedure

From
Tom Lane
Date:
Andrew Rawnsley <ronz@investoranalytics.com> writes:
> IMHO the #1 priority in the current PITR/WAL shipping system is to make the
> standby able to tolerate being shut down and restarted, i.e. actually having
> a true standby mode and not the current method of doing it only on startup.

How is shutting down the standby a good idea?  Seems like that will
block the master too --- or at least result in WAL log files piling up
rapidly.  If the standby goes off-line, abandoning it and starting from
a fresh base backup when you are ready to restart it seems like the most
likely recovery path.  For sure I don't see this as the "#1 priority".
        regards, tom lane


Re: streamlined standby procedure

From
Simon Riggs
Date:
On Tue, 2006-02-07 at 16:11 +0100, Csaba Nagy wrote:

> I decided to start implementing a streamlined WAL shipping based standby
> building procedure. My aim is fairly simple: to be able to build a
> standby as automated as possible. 

What do you find difficult about the current method? That's got to be
the first discussion point.

Best Regards, Simon Riggs



Re: streamlined standby procedure

From
Csaba Nagy
Date:
> What do you find difficult about the current method? That's got to be
> the first discussion point.

The main problem I have is the complexity of setup.

It involves a lot of additional scripting which you have to get right
for it to actually be reliable. The documentation gives a rough idea of
how to do it, but it is quite some work to make it work, and you can't
really tell whether it is reliable...

Another issue is that unless you got the archive_command right in the
master server from the beginning, you will have to restart the server
once you decide to build your standby... the archive_command is a
start-up time parameter. This could of course be alleviated by always
using a stub script as the archive command, and let it do nothing if you
don't have a standby, then modify it to start archiving to the right
place once you start building one, or if you want to move it to another
machine. But this is also not documented, and you have to figure it out
for yourself.

And responding to Tom's other post regarding a real standby mode, where
you could stop the standby and then later resume it still in standby
mode: I would actually have a good use for it tonight :-)
We will migrate our application to a new version, which involves some
changes in the database. Now it would be nice to stop the standby
BEFORE doing these changes, and if the migration fails for some reason,
start up the standby and use it with our old application version. But if
the migration succeeds, I want to start up the standby still as standby,
and make it resume standby operation... rebuilding it will take half a
day at least.
So a standby which can be isolated for a while would actually be useful.
OK, now that I'm thinking a bit more about this, I could achieve this by
fiddling with the restore_command so it stops delivering the logs for a
while. But again it is not straightforward.

The whole point of this is that starting up a standby should be as
simple as pointing the standby machine to the primary server, without
shell scripting gimmicks (which are OS specific and therefore hard to
document in a generic way), without the need of fiddling with the
primary's configuration (see archive command), without the need to
restart the primary if the archive command was not right in the first
place. And to make it easy to start up one more standby if needed, or
isolate it for a while when doing some risky work on the primary.

It's about user friendliness and flexibility. It's not that it can't do
the work right now, but it's really hard to do it...

Cheers,
Csaba.





Re: streamlined standby procedure

From
Andrew Rawnsley
Date:


On 2/7/06 1:19 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> Andrew Rawnsley <ronz@investoranalytics.com> writes:
>> IMHO the #1 priority in the current PITR/WAL shipping system is to make the
>> standby able to tolerate being shut down and restarted, i.e. actually having
>> a true standby mode and not the current method of doing it only on startup.
> 
> How is shutting down the standby a good idea?  Seems like that will
> block the master too --- or at least result in WAL log files piling up
> rapidly.  If the standby goes off-line, abandoning it and starting from
> a fresh base backup when you are ready to restart it seems like the most
> likely recovery path.  For sure I don't see this as the "#1 priority".
> 
> regards, tom lane

I wasn't suggesting this in the context of Csaba's auto-ship plan (and, to
be clear, not #1 in the context of the entire database development. Just
PITR). 

For one, sometimes you have no choice about the standby being shut down, but
most of the time you can plan for that. As for Csaba's question of why I
would want to create a copy of a standby: it's the easiest way to create
development and testing snapshots at standby locations, and for making
paranoid operations people confident that your standby procedures are
working. I do it with my Oracle (pardon the 'O' word) installations all the
time, and I despise being able to do something with Oracle that I can't with
PG.

I ship WAL logs around in batches independent of the archive command to
several locations. Either I:

A) let the logs 'pile up' on the standby (crap has to pile up somewhere),
and apply them should the standby be needed (could take some time should the
'pile' be large). The only way here to keep the recover time short is to
re-image the database frequently and ship it around. Not nice with big
databases.

B) Do the blocking restore command to continually apply the logs as
they get moved around. While this can generate good clever points, it's
a rig. Fragile.

To me the question isn't 'How is shutting down the standby a good idea?',
it's 'How is shutting down the standby not a bad idea?'. Different points
of view, I suppose - in my situation the standby going offline is not a
catastrophic event like the primary would be; it's even a useful thing. If
there was some rman-style thing like people have suggested to auto-ship logs
around, then yeah, dealing with an offline standby could be a tricky thing
(but would need some solution anyway). But hell, Slony and Mammoth can
tolerate it, I just would like log shipping to handle it also.

Maybe it isn't the #1 priority, but it's something I view as a
limitation, not just a lacking feature. It's something I can't control.
As I originally mentioned, the customizable archive/restore feature is
great, superior to dealing with it in Oracle. But the standby mode makes
the Oracle setup more bulletproof.



-- 

Andrew Rawnsley
Chief Technology Officer
Investor Analytics, LLC
(740) 587-0114
http://www.investoranalytics.com





Re: streamlined standby procedure

From
Simon Riggs
Date:
On Wed, 2006-02-08 at 11:10 +0100, Csaba Nagy wrote:
> Another issue is that unless you got the archive_command right in the
> master server from the beginning, you will have to restart the server
> once you decide to build your standby... the archive_command is a
> start-up time parameter

Much of your difficulty seems to come from your thinking that this
parameter requires a restart. It doesn't - check it out.

The script need not be complex, you only need to put a wait loop in the
restore script so that it waits for the next log file.

Best Regards, Simon Riggs



Re: streamlined standby procedure

From
Csaba Nagy
Date:
OK, this is news to me; I recall from last looking at the configuration
docs that it was a start-up-time parameter, but I might be wrong.

[looking up the docs]

OK, citing the 8.1 online docs:

17.5.3. Archiving

archive_command (string)

    The shell command to execute to archive a completed segment of the
    WAL file series. If this is an empty string (the default), WAL
    archiving is disabled. Any %p in the string is replaced by the
    absolute path of the file to archive, and any %f is replaced by the
    file name only. Use %% to embed an actual % character in the
    command. For more information see Section 23.3.1. This option can
    only be set at server start or in the postgresql.conf file.
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    It is important for the command to return a zero exit status if
    and only if it succeeds. Examples:

    archive_command = 'cp "%p" /mnt/server/archivedir/"%f"'
    archive_command = 'copy "%p" /mnt/server/archivedir/"%f"'  # Windows

It's at least confusing... it does say "or in the postgresql.conf file"
too, but I must have overlooked that... and the word "only" is really
confusing there.

[looking at: http://www.powerpostgresql.com/Downloads/annotated_conf_80.html]

OK, this is what confused me. The annotated conf file states it's a startup time parameter.

Well, good to know it's not...

Actually, my needs of PITR/standby building are mostly solved by now,
but it's sure not an easy ride, and I really wonder if there is any
readily available script bundle to do it for a Windows server...

Maybe a standby-building tutorial is all that is needed...

Cheers,
Csaba.


> Much of your difficulty seems to come from your thinking that this
> parameter requires a restart. It doesn't - check it out.
> 
> The script need not be complex, you only need to put a wait loop in the
> restore script so that it waits for the next log file.
> 
> Best Regards, Simon Riggs
> 



Re: streamlined standby procedure

From
Marko Kreen
Date:
On 2/7/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Rawnsley <ronz@investoranalytics.com> writes:
> > IMHO the #1 priority in the current PITR/WAL shipping system is to make the
> > standby able to tolerate being shut down and restarted, i.e. actually having
> > a true standby mode and not the current method of doing it only on startup.
>
> How is shutting down the standby a good idea?  Seems like that will
> block the master too --- or at least result in WAL log files piling up
> rapidly.  If the standby goes off-line, abandoning it and starting from
> a fresh base backup when you are ready to restart it seems like the most
> likely recovery path.  For sure I don't see this as the "#1 priority".

For regular recovery it is indeed unnecessary.  But I would also
put this as the #1 TODO for the long-running hot-standby case.  The
requirement to start all over makes the current setup rather cumbersome.

And #2 would be running read-only queries while in recovery :)

--
marko


Re: streamlined standby procedure

From
"Jim C. Nasby"
Date:
On Thu, Feb 09, 2006 at 10:37:34AM +0100, Csaba Nagy wrote:
>         option can only be set at server start or in the postgresql.conf

Yeah, this is something that was actually discussed on -docs recently. I
believe -HEAD was changed so that every parameter that used to have that
text now says: option can be set in postgresql.conf or on the server
command line.

> Maybe a standby-building-tutorial is all what is needed...

Having that would be very handy indeed. In fact, if you want to get
rough notes put together I'd be happy to edit it into a finished
product, though I'm not sure of the best place to put it. I could
certainly post it somewhere on pervasive-postgres.com if nothing else...
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: streamlined standby procedure

From
"Jim C. Nasby"
Date:
On Thu, Feb 09, 2006 at 04:44:20PM +0200, Marko Kreen wrote:
> On 2/7/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andrew Rawnsley <ronz@investoranalytics.com> writes:
> > > IMHO the #1 priority in the current PITR/WAL shipping system is to make the
> > > standby able to tolerate being shut down and restarted, i.e. actually having
> > > a true standby mode and not the current method of doing it only on startup.
> >
> > How is shutting down the standby a good idea?  Seems like that will
> > block the master too --- or at least result in WAL log files piling up
> > rapidly.  If the standby goes off-line, abandoning it and starting from
> > a fresh base backup when you are ready to restart it seems like the most
> > likely recovery path.  For sure I don't see this as the "#1 priority".
> 
> For regular recovery it is indeed unnecessary.  But I would also
> put this as #1 TODO for long-running hot-standby case.  The requirement
> to start all over makes current setup rather cumbersome.

What happens right now when you want to bring the standby up? Do you
have to kill it out of recovery mode and re-start, forcing it to replay
WAL again anyway?

> And #2 would be running read-only queries while in recovery :)

That would be damn handy :)
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: streamlined standby procedure

From
Marko Kreen
Date:
On 2/9/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> What happens right now when you want to bring the standby up? Do you
> have to kill it out of recovery mode and re-start, forcing it to replay
> WAL again anyway?

touch $LOGDIR/STOP  ...

--
marko


Re: streamlined standby procedure

From
Bruce Momjian
Date:
I have updated the 8.1.X documentation to remove the word "only", which
is confusing.

---------------------------------------------------------------------------

Jim C. Nasby wrote:
> On Thu, Feb 09, 2006 at 10:37:34AM +0100, Csaba Nagy wrote:
> >         option can only be set at server start or in the postgresql.conf
> 
> Yeah, this is something that was actually discussed on -docs recently. I
> believe -HEAD was changed so that every parameter that used to have that
> text now says: option can be set in postgresql.conf or on the server
> command line.
> 
> > Maybe a standby-building-tutorial is all what is needed...
> 
> Having that would be very handy indeed. In fact, if you want to get
> rough notes put together I'd be happy to edit it into a finished
> product, though I'm not sure of the best place to put it. I could
> certainly post it somewhere on pervasive-postgres.com if nothing else...
> -- 
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
> 
> 

--
  Bruce Momjian                   |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us          |  (610) 359-1001
  + If your life is a hard drive, |  13 Roberts Road
  + Christ can be your backup.    |  Newtown Square, Pennsylvania 19073


Re: streamlined standby procedure

From
"Jim C. Nasby"
Date:
On Tue, Feb 07, 2006 at 04:11:07PM +0100, Csaba Nagy wrote:
> pg_wal (
>   name text,
>   walrecords blob,
>   iscurrent boolean
> )

ISTM that a current_wal_file() function would be better than a boolean
on the view...

> So, all this said, I'm not too familiar with either C programming or the
> postgres sources, but I'm willing to learn. And the project as a whole
> seems a bit too much to do it in one piece, so my first aim is to expose
> the WAL records in a system view.

Seems a reasonable place to start. IMO there's a lot more info that
could be exposed through system views than is currently being done.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461