Thread: Hot Standby utility and administrator functions
I'm looking to implement the following functions for Hot Standby, to allow those with administrative tools or management applications to have more control during recovery. Please let me know if other functions are required. What else do we need? * pg_is_in_recovery() returns bool (true if in recovery, false if not) * pg_last_recovered_xact_xid() Will throw an ERROR if *not* executed in recovery mode. returns bigint * pg_last_completed_xact_xid() Will throw an ERROR *if* executed in recovery mode. returns bigint (together allows easy arithmetic on xid difference between master and slave). * pg_last_recovered_xact_timestamp() returns timestamp with timezone (allows easy arithmetic with now() to allow derivation of replication delay etc) * pg_freeze_recovery() - freezes recovery after the current record has been applied. The server is still up and queries can happen, but no WAL replay will occur. This is a temporary state change and we keep no record of this, other than making a server log entry. If the server is shutdown or crashes, it will unfreeze itself automatically. Has no effect on master. Will throw an ERROR if not executed in recovery mode. Superusers only. returns text (XLogRecPtr of freeze point) * pg_unfreeze_recovery() - unfreezes recovery. Recovery will begin again at exactly the point recovery was frozen at. Will throw an ERROR is not executed in recovery mode. Superusers only. returns bool (true if unfroze, false if was not frozen when called) * pg_end_recovery() - Will force recovery to end at current location. Recovery mode cannot be easily re-entered, so there is no "restart" function. Will throw an ERROR is not executed in recovery mode. Superusers only. returns text (XLogRecPtr of freeze point) * pg_start_backup()/pg_stop_backup() could work during recovery, but the backup history file would need to be manually inserted into the archive once complete. Is that acceptable? (Note that we don't know where the archive is or how to access that; the information is all in recovery_command. We cannot assume that archive_command points to same archive. So making it happen automatically is too much work for this release, if ever.) If that seems useful, we could do this by avoiding any operation that changes WAL stream during recovery: no checkpoints, log switches etc.. pg_start_backup() would return XLogRecPtr of last restartpoint. pg_stop_backup() would return last known xlrec recovered (we won't keep track of this record by record). * pg_reload_conf() will not force re-read of recovery.conf since that may require extra work and doesn't seem that important, if we have the manual override mentioned above. All desirable? All possible? Any others? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
> * pg_last_recovered_xact_xid() > Will throw an ERROR if *not* executed in recovery mode. > returns bigint > > * pg_last_completed_xact_xid() > Will throw an ERROR *if* executed in recovery mode. > returns bigint Should these return xid? ...Robert
On Mon, 2008-10-20 at 16:22 -0400, Robert Haas wrote: > > * pg_last_recovered_xact_xid() > > Will throw an ERROR if *not* executed in recovery mode. > > returns bigint > > > > * pg_last_completed_xact_xid() > > Will throw an ERROR *if* executed in recovery mode. > > returns bigint > > Should these return xid? Perhaps, but they match txid_current() which returns bigint. http://developer.postgresql.org/pgdocs/postgres/functions-info.html Thanks for checking. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs escribió: > > On Mon, 2008-10-20 at 16:22 -0400, Robert Haas wrote: > > > * pg_last_recovered_xact_xid() > > > Will throw an ERROR if *not* executed in recovery mode. > > > returns bigint > > > > > > * pg_last_completed_xact_xid() > > > Will throw an ERROR *if* executed in recovery mode. > > > returns bigint > > > > Should these return xid? > > Perhaps, but they match txid_current() which returns bigint. > http://developer.postgresql.org/pgdocs/postgres/functions-info.html That's been "extended with an epoch counter" per the docs; I don't think that's appropriate for the new functions, is it? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, 2008-10-20 at 17:44 -0300, Alvaro Herrera wrote: > Simon Riggs escribió: > > > > On Mon, 2008-10-20 at 16:22 -0400, Robert Haas wrote: > > > > * pg_last_recovered_xact_xid() > > > > Will throw an ERROR if *not* executed in recovery mode. > > > > returns bigint > > > > > > > > * pg_last_completed_xact_xid() > > > > Will throw an ERROR *if* executed in recovery mode. > > > > returns bigint > > > > > > Should these return xid? > > > > Perhaps, but they match txid_current() which returns bigint. > > http://developer.postgresql.org/pgdocs/postgres/functions-info.html > > That's been "extended with an epoch counter" per the docs; I don't think > that's appropriate for the new functions, is it? I assumed it was, so you can subtract them easily. It can be done either way, I guess. Happy to provide what people need. I just dreamed up a few that sounded useful. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs <simon@2ndQuadrant.com> writes: > On Mon, 2008-10-20 at 17:44 -0300, Alvaro Herrera wrote: >> That's been "extended with an epoch counter" per the docs; I don't think >> that's appropriate for the new functions, is it? > I assumed it was, so you can subtract them easily. > It can be done either way, I guess. Happy to provide what people need. I > just dreamed up a few that sounded useful. I don't think you should be inventing new functions without clear use-cases in mind. Depending on what the use is, I could see either the xid or the txid definition as being *required*. In any case, do not use the wrong return type for the definition you're implementing. regards, tom lane
On Mon, 2008-10-20 at 18:45 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: > > On Mon, 2008-10-20 at 17:44 -0300, Alvaro Herrera wrote: > >> That's been "extended with an epoch counter" per the docs; I don't think > >> that's appropriate for the new functions, is it? > > > I assumed it was, so you can subtract them easily. > > > It can be done either way, I guess. Happy to provide what people need. I > > just dreamed up a few that sounded useful. > > I don't think you should be inventing new functions without clear > use-cases in mind. Depending on what the use is, I could see either the > xid or the txid definition as being *required*. The use case for the two functions was clearly stated as "together allows easy arithmetic on xid difference between master and slave". In that context, xid plus epoch is appropriate. There are other use cases. We can have both, neither or just one, depending upon what people think. What would you want "xid only" for? Do you think that should replace the txid one? This is everybody's opportunity to say what we need. > In any case, do not use the wrong return type for the definition you're > implementing. err...Why would anyone do that? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
> > * pg_last_recovered_xact_xid() > > Will throw an ERROR if *not* executed in recovery mode. > > returns bigint > > > > * pg_last_completed_xact_xid() > > Will throw an ERROR *if* executed in recovery mode. > > returns bigint > > Should these return xid? And shouldn't these two be folded together ? It seems most usages of this xid(/lsn?) will be agnostic to the recovery mode. Or if not, it seems more convenient to have a function that returns both recovery mode and xid, no ? Andreas
Simon Riggs <simon@2ndQuadrant.com> writes: > On Mon, 2008-10-20 at 18:45 -0400, Tom Lane wrote: >> In any case, do not use the wrong return type for the definition you're >> implementing. > err...Why would anyone do that? That's what I wanted to know ;-). If these functions are really going to return txid, then they should be named to reflect that. regards, tom lane
On Tue, 2008-10-21 at 09:44 +0200, Zeugswetter Andreas OSB sIT wrote: > > > * pg_last_recovered_xact_xid() > > > Will throw an ERROR if *not* executed in recovery mode. > > > returns bigint > > > > > > * pg_last_completed_xact_xid() > > > Will throw an ERROR *if* executed in recovery mode. > > > returns bigint > > > > Should these return xid? > > And shouldn't these two be folded together ? > It seems most usages of this xid(/lsn?) will be agnostic to the > recovery mode. Or if not, it seems more convenient to have a function > that returns both recovery mode and xid, no ? You are right that it would be better to have a single function. Functions that return multiple values are a pain to use and develop, plus we can always run the other function if we are in doubt. txid_last_completed() returns bigint (txid) seems better. I am more than happy to add an id version as well, if anybody sees the need for that. Just say. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Mon, 2008-10-20 at 10:25 +0100, Simon Riggs wrote: > What else do we need? > * pg_freeze_recovery() > * pg_unfreeze_recovery() Two more functions pg_freeze_recovery_cleanup() pg_unfreeze_recovery_cleanup() These would allow recovery to continue normally, except for row removal operations which would halt the progress of recovery. It would eventually be possible to have a function that halts recovery whenever row removal takes place for a list of tables. Not planning on implementing that initially though. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Monday 20 October 2008 05:25:29 Simon Riggs wrote: > I'm looking to implement the following functions for Hot Standby, to > allow those with administrative tools or management applications to have > more control during recovery. Please let me know if other functions are > required. > > What else do we need? > Is it possible to give the master/slave knowledge about each other? -- Robert Treat Conjecture: http://www.xzilla.net Consulting: http://www.omniti.net
On Mon, 2008-10-27 at 11:42 -0400, Robert Treat wrote: > On Monday 20 October 2008 05:25:29 Simon Riggs wrote: > > I'm looking to implement the following functions for Hot Standby, to > > allow those with administrative tools or management applications to have > > more control during recovery. Please let me know if other functions are > > required. > > > > What else do we need? > > > > Is it possible to give the master/slave knowledge about each other? Yes, but for what reason? The project I'm working on is Hot Standby, not streaming replication. That will link things together better than they are now, so I'd probably rather not prejudge/duplicate that. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > On Mon, 2008-10-27 at 11:42 -0400, Robert Treat wrote: > > >> On Monday 20 October 2008 05:25:29 Simon Riggs wrote: >> >>> I'm looking to implement the following functions for Hot Standby, to >>> allow those with administrative tools or management applications to have >>> more control during recovery. Please let me know if other functions are >>> required. >>> >>> What else do we need? >>> >>> >> Is it possible to give the master/slave knowledge about each other? >> > > Yes, but for what reason? > > The project I'm working on is Hot Standby, not streaming replication. > That will link things together better than they are now, so I'd probably > rather not prejudge/duplicate that. > I think this could make sense in the case of a network partition (split brain). If we need to perform a reconciliation after a partition we will need to know when a slave has switched to master and what transactions have been processed on each side. Another use case is probably to implement failback once the former master has been restarted to just send the diff since failover happened. Actually to complement pg_last_recovered_xact_xid(), pg_last_completed_xact_xid() or pg_last_recovered_xact_timestamp(), I would like to have something like pg_xact_xid_status(txid) that would return something of a xid_status type that can be completed, recovered or not_found with a timestamp that would only be meaningful if the status is recovered or completed. Regards, Emmanuel -- Emmanuel Cecchet FTO @ Frog Thinker Open Source Development & Consulting -- Web: http://www.frogthinker.org email: manu@frogthinker.org Skype: emmanuel_cecchet
On Monday 27 October 2008 12:12:18 Simon Riggs wrote: > On Mon, 2008-10-27 at 11:42 -0400, Robert Treat wrote: > > On Monday 20 October 2008 05:25:29 Simon Riggs wrote: > > > I'm looking to implement the following functions for Hot Standby, to > > > allow those with administrative tools or management applications to > > > have more control during recovery. Please let me know if other > > > functions are required. > > > > > > What else do we need? > > > > Is it possible to give the master/slave knowledge about each other? > > Yes, but for what reason? > Was thinking that admin tools that show hot standby information might also want to show the corresponding slave information (from the point of view of the master). It might also allow tools to not have to be configured for all servers... ie connect to one and lookup the other. > The project I'm working on is Hot Standby, not streaming replication. > That will link things together better than they are now, so I'd probably > rather not prejudge/duplicate that. It's possible this type of information isn't appropriate for our Hot Standby implementation, but it is somewhat common in asynchronous and/or master/slave systems. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Mon, 2008-10-27 at 13:08 -0400, Robert Treat wrote: > Was thinking that admin tools that show hot standby information might > also want to show the corresponding slave information (from the point > of view of the master). Well, the standby might be persuaded to know something about the master, but not the other way around. The master:standby relationship is 1:Many, and not restricted in the way things are daisy-chained. So it's more than just a pair of servers. All the slaves will say they are the same if you ask them, so you can't rely on that to identify them. So you need to specify which two servers you're interested in comparing and how to identify them externally. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support