Thread: Command to prune archive at restartpoints

Command to prune archive at restartpoints

From

Heikki Linnakangas

Date:

17 March 2010, 06:37:23

One awkward omission in the new built-in standby mode, mainly used for
streaming replication, is that there is no easy way to delete old
archived files like you do with the %r parameter to restore_command.
This was discussed at
http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php, among
other things.

Per discussion, attached patch adds a new restartpoint_command option to
recovery.conf. That's an external shell command just like
recovery_end_command that's executed at every restartpoint. You can use
the %r parameter to pass the filename of the oldest WAL file that needs
to be retained.

While developing this I noticed that %r in recovery_end_command is not
working correctly:

LOG:  redo done at 0/14000C10
LOG:  last completed transaction was at log time 2000-01-01
02:21:08.816445+02
cp: cannot stat
`/home/hlinnaka/pgsql.cvshead/walarchive/000000010000000000000014': No
such file or directory
cp: cannot stat
`/home/hlinnaka/pgsql.cvshead/walarchive/00000002.history': No such file
or directory
LOG:  selected new timeline ID: 2
cp: cannot stat
`/home/hlinnaka/pgsql.cvshead/walarchive/00000001.history': No such file
or directory
LOG:  archive recovery complete
LOG:  checkpoint starting: end-of-recovery immediate wait
LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log
file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s,
total=0.003 s
LOG:  executing recovery_end_command "echo recovery_end_command %r"
recovery_end_command 000000000000000000000000
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

Note how %r is always expanded to 000000000000000000000000. That's
because %r is expanded only when InRedo is true, which makes sense for
restore_command where that piece of code was copy-pasted from, but it's
never true anymore when recovery_end_command is run. The attached patch
fixes that too.

Barring objections, I will commit this later today.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 6404768..7b7a8bd 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -59,14 +59,15 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
       </listitem>
      </varlistentry>

-     <varlistentry id="recovery-end-command" xreflabel="recovery_end_command">
-      <term><varname>recovery_end_command</varname> (<type>string</type>)</term>
+     <varlistentry id="restartpoint-command" xreflabel="restartpoint_command">
+      <term><varname>restartpoint_command</varname> (<type>string</type>)</term>
       <listitem>
        <para>
-        This parameter specifies a shell command that will be executed once only
-        at the end of recovery. This parameter is optional. The purpose of the
-        <varname>recovery_end_command</> is to provide a mechanism for cleanup
-        following replication or recovery.
+        This parameter specifies a shell command that will be executed at
+        every restartpoint. This parameter is optional. The purpose of the
+        <varname>restartpoint_command</> is to provide a mechanism for cleaning
+        up old archived WAL files that are no longer needed by the standby
+        server.
         Any <literal>%r</> is replaced by the name of the file
         containing the last valid restart point. That is the earliest file that
         must be kept to allow a restore to be restartable, so this information
@@ -79,6 +80,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
        </para>
        <para>
         If the command returns a non-zero exit status then a WARNING log
+        message will be written.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry id="recovery-end-command" xreflabel="recovery_end_command">
+      <term><varname>recovery_end_command</varname> (<type>string</type>)</term>
+      <listitem>
+       <para>
+        This parameter specifies a shell command that will be executed once only
+        at the end of recovery. This parameter is optional. The purpose of the
+        <varname>recovery_end_command</> is to provide a mechanism for cleanup
+        following replication or recovery.
+        Any <literal>%r</> is replaced by the name of the file containing the
+        last valid restart point, like in <xref linkend="restartpoint-command">.
+       </para>
+       <para>
+        If the command returns a non-zero exit status then a WARNING log
         message will be written and the database will proceed to start up
         anyway.  An exception is that if the command was terminated by a
         signal, the database will not proceed with startup.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 995794a..519526e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -171,6 +171,7 @@ static bool restoredFromArchive = false;
 /* options taken from recovery.conf for archive recovery */
 static char *recoveryRestoreCommand = NULL;
 static char *recoveryEndCommand = NULL;
+static char *restartPointCommand = NULL;
 static bool recoveryTarget = false;
 static bool recoveryTargetExact = false;
 static bool recoveryTargetInclusive = true;
@@ -370,6 +371,11 @@ typedef struct XLogCtlData
     int            XLogCacheBlck;    /* highest allocated xlog buffer index */
     TimeLineID    ThisTimeLineID;
     TimeLineID    RecoveryTargetTLI;
+    /*
+     * restartPointCommand is read from recovery.conf but needs to be in
+     * shared memory so that the bgwriter process can access it.
+     */
+    char        restartPointCommand[MAXPGPATH];

     /*
      * SharedRecoveryInProgress indicates if we're still in crash or archive
@@ -520,7 +526,8 @@ static bool XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt,
 static void XLogFileClose(void);
 static bool RestoreArchivedFile(char *path, const char *xlogfname,
                     const char *recovername, off_t expectedSize);
-static void ExecuteRecoveryEndCommand(void);
+static void ExecuteRecoveryCommand(char *command, char *commandName,
+                       bool failOnerror);
 static void PreallocXlogFiles(XLogRecPtr endptr);
 static void RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr);
 static void ValidateXLOGDirectoryStructure(void);
@@ -2990,12 +2997,19 @@ not_available:
 }

 /*
- * Attempt to execute the recovery_end_command.
+ * Attempt to execute an external recovery command.
+ *
+ * 'command' is the shell command to be executed, 'commandName' is a human-
+ * readable name describing the command emitted in the logs. If 'failonSignal'
+ * is true and the command is killed by a signal, a FATAL error is thrown.
+ * Otherwise a WARNING is emitted.
+ *
+ * This is currently used for restore_end_command and restartpoint_command.
  */
 static void
-ExecuteRecoveryEndCommand(void)
+ExecuteRecoveryCommand(char *command, char *commandName, bool failOnSignal)
 {
-    char        xlogRecoveryEndCmd[MAXPGPATH];
+    char        xlogRecoveryCmd[MAXPGPATH];
     char        lastRestartPointFname[MAXPGPATH];
     char       *dp;
     char       *endp;
@@ -3005,7 +3019,7 @@ ExecuteRecoveryEndCommand(void)
     uint32        restartLog;
     uint32        restartSeg;

-    Assert(recoveryEndCommand);
+    Assert(command && commandName);

     /*
      * Calculate the archive file cutoff point for use during log shipping
@@ -3023,25 +3037,22 @@ ExecuteRecoveryEndCommand(void)
      * flags to signify the point when we can begin deleting WAL files from
      * the archive.
      */
-    if (InRedo)
-    {
-        XLByteToSeg(ControlFile->checkPointCopy.redo,
-                    restartLog, restartSeg);
-        XLogFileName(lastRestartPointFname,
-                     ControlFile->checkPointCopy.ThisTimeLineID,
-                     restartLog, restartSeg);
-    }
-    else
-        XLogFileName(lastRestartPointFname, 0, 0, 0);
+    LWLockAcquire(ControlFileLock, LW_SHARED);
+    XLByteToSeg(ControlFile->checkPointCopy.redo,
+                restartLog, restartSeg);
+    XLogFileName(lastRestartPointFname,
+                 ControlFile->checkPointCopy.ThisTimeLineID,
+                 restartLog, restartSeg);
+    LWLockRelease(ControlFileLock);

     /*
      * construct the command to be executed
      */
-    dp = xlogRecoveryEndCmd;
-    endp = xlogRecoveryEndCmd + MAXPGPATH - 1;
+    dp = xlogRecoveryCmd;
+    endp = xlogRecoveryCmd + MAXPGPATH - 1;
     *endp = '\0';

-    for (sp = recoveryEndCommand; *sp; sp++)
+    for (sp = command; *sp; sp++)
     {
         if (*sp == '%')
         {
@@ -3075,13 +3086,12 @@ ExecuteRecoveryEndCommand(void)
     *dp = '\0';

     ereport(DEBUG3,
-            (errmsg_internal("executing recovery end command \"%s\"",
-                             xlogRecoveryEndCmd)));
+            (errmsg_internal("executing %s \"%s\"", commandName, command)));

     /*
      * execute the constructed command
      */
-    rc = system(xlogRecoveryEndCmd);
+    rc = system(xlogRecoveryCmd);
     if (rc != 0)
     {
         /*
@@ -3091,9 +3101,13 @@ ExecuteRecoveryEndCommand(void)
          */
         signaled = WIFSIGNALED(rc) || WEXITSTATUS(rc) > 125;

-        ereport(signaled ? FATAL : WARNING,
-                (errmsg("recovery_end_command \"%s\": return code %d",
-                        xlogRecoveryEndCmd, rc)));
+        /*
+         * translator: First %s represents a recovery.conf parameter name like
+         * "recovery_end_command", and the 2nd is the value of that parameter.
+         */
+        ereport((signaled && failOnSignal) ? FATAL : WARNING,
+                (errmsg("%s \"%s\": return code %d", commandName,
+                        command, rc)));
     }
 }

@@ -4936,6 +4950,13 @@ readRecoveryCommandFile(void)
                     (errmsg("recovery_end_command = '%s'",
                             recoveryEndCommand)));
         }
+        else if (strcmp(tok1, "restartpoint_command") == 0)
+        {
+            restartPointCommand = pstrdup(tok2);
+            ereport(DEBUG2,
+                    (errmsg("restartpoint_command = '%s'",
+                            restartPointCommand)));
+        }
         else if (strcmp(tok1, "recovery_target_timeline") == 0)
         {
             rtliGiven = true;
@@ -5505,8 +5526,14 @@ StartupXLOG(void)
                         recoveryTargetTLI,
                         ControlFile->checkPointCopy.ThisTimeLineID)));

-    /* Save the selected recovery target timeline ID in shared memory */
+    /*
+     * Save the selected recovery target timeline ID and restartpoint_command
+     * in shared memory so that other processes can see them
+     */
     XLogCtl->RecoveryTargetTLI = recoveryTargetTLI;
+    strncpy(XLogCtl->restartPointCommand,
+            restartPointCommand ? restartPointCommand : "",
+            sizeof(XLogCtl->restartPointCommand));

     if (read_backup_label(&checkPointLoc))
     {
@@ -6129,7 +6156,9 @@ StartupXLOG(void)
          * And finally, execute the recovery_end_command, if any.
          */
         if (recoveryEndCommand)
-            ExecuteRecoveryEndCommand();
+            ExecuteRecoveryCommand(recoveryEndCommand,
+                                   "recovery_end_command",
+                                   true);
     }

     /*
@@ -7318,6 +7347,15 @@ CreateRestartPoint(int flags)
              timestamptz_to_str(GetLatestXLogTime()))));

     LWLockRelease(CheckpointLock);
+
+    /*
+     * Finally, execute restartpoint_command, if any.
+     */
+    if (XLogCtl->restartPointCommand[0])
+        ExecuteRecoveryCommand(XLogCtl->restartPointCommand,
+                               "restartpoint_command",
+                               false);
+
     return true;
 }

Re: Command to prune archive at restartpoints

From

Greg Stark

Date:

17 March 2010, 07:02:18

On Wed, Mar 17, 2010 at 9:37 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> One awkward omission in the new built-in standby mode, mainly used for
> streaming replication, is that there is no easy way to delete old
> archived files like you do with the %r parameter to restore_command.

I'm still finding this kind of narrow-minded. I'm picturing a system
with multiple replicas -- obvious no one replica can take it upon
itself to delete archived log files based only on its own
restartpoint. And besides, if you're using the archived log files for
backups you also need to take into account the backup policy and only
delete files that aren't needed for a consistent backup and aren't
needed for the replica.

What we need is a program which can take all this information from all
your slaves and backup labels into account and implement your backup
policies. It probably won't exist in time for the release and in any
case doesn't really have to ship with Postgres. There might even be
more than one.

But do we have all the information that such a program would need? Is
there a way to connect to a replica and ask it what the restart point
is? I suppose with this new command you could always just make it a
command which wakes up this demon and sends it the restart point and
the replica id and it can update its internal state and recalculate
what archives are needed. It is a bit nerve-wracking that it's
dependent on its internal state remembering the restart points it's
been given though.

-- 
greg

Re: Command to prune archive at restartpoints

From

Heikki Linnakangas

Date:

17 March 2010, 07:14:59

Greg Stark wrote:
> On Wed, Mar 17, 2010 at 9:37 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> One awkward omission in the new built-in standby mode, mainly used for
>> streaming replication, is that there is no easy way to delete old
>> archived files like you do with the %r parameter to restore_command.
> 
> I'm still finding this kind of narrow-minded. I'm picturing a system
> with multiple replicas -- obvious no one replica can take it upon
> itself to delete archived log files based only on its own
> restartpoint. And besides, if you're using the archived log files for
> backups you also need to take into account the backup policy and only
> delete files that aren't needed for a consistent backup and aren't
> needed for the replica.

That's why we provide options that take any shell command you want,
rather than e.g a path to an archive directory that's pruned automatically.

For example, if you have multiple standbys sharing one archive, you
could do something like this:

In each standby, have a restartpoint_command along the lines of:
"echo %r > <archivedirectory>/standby1_location; archive_cleanup.sh"

Where '1' is different for every standby

and in archive_cleanup.sh, scan through all the standbyX_location files,
take the minimum, and delete all files smaller than that.

You'll need some care with locking etc., but the point is that the
current hooks allow you to implement complex setups like that.

> What we need is a program which can take all this information from all
> your slaves and backup labels into account and implement your backup
> policies. It probably won't exist in time for the release and in any
> case doesn't really have to ship with Postgres. There might even be
> more than one.

I guess I just described such a program :-). Yeah, I'd imagine that to
become part of toolkits like skytools.

> But do we have all the information that such a program would need? Is
> there a way to connect to a replica and ask it what the restart point
> is?

Hmm, Greg Smith opened a thread on exposing the fields in the control
file as user-defined functions. IIRC last restartpoint location was the
piece of information that triggered the discussion this time. Perhaps we
should indeed add a function to expose that in 9.0.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: Command to prune archive at restartpoints

From

Heikki Linnakangas

Date:

18 March 2010, 06:19:39

Committed.

Heikki Linnakangas wrote:
> One awkward omission in the new built-in standby mode, mainly used for
> streaming replication, is that there is no easy way to delete old
> archived files like you do with the %r parameter to restore_command.
> This was discussed at
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php, among
> other things.
> 
> Per discussion, attached patch adds a new restartpoint_command option to
> recovery.conf. That's an external shell command just like
> recovery_end_command that's executed at every restartpoint. You can use
> the %r parameter to pass the filename of the oldest WAL file that needs
> to be retained.
> 
> While developing this I noticed that %r in recovery_end_command is not
> working correctly:
> 
> LOG:  redo done at 0/14000C10
> LOG:  last completed transaction was at log time 2000-01-01
> 02:21:08.816445+02
> cp: cannot stat
> `/home/hlinnaka/pgsql.cvshead/walarchive/000000010000000000000014': No
> such file or directory
> cp: cannot stat
> `/home/hlinnaka/pgsql.cvshead/walarchive/00000002.history': No such file
> or directory
> LOG:  selected new timeline ID: 2
> cp: cannot stat
> `/home/hlinnaka/pgsql.cvshead/walarchive/00000001.history': No such file
> or directory
> LOG:  archive recovery complete
> LOG:  checkpoint starting: end-of-recovery immediate wait
> LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log
> file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s,
> total=0.003 s
> LOG:  executing recovery_end_command "echo recovery_end_command %r"
> recovery_end_command 000000000000000000000000
> LOG:  database system is ready to accept connections
> LOG:  autovacuum launcher started
> 
> Note how %r is always expanded to 000000000000000000000000. That's
> because %r is expanded only when InRedo is true, which makes sense for
> restore_command where that piece of code was copy-pasted from, but it's
> never true anymore when recovery_end_command is run. The attached patch
> fixes that too.
> 
> Barring objections, I will commit this later today.
> 
> 


--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: Command to prune archive at restartpoints

From

Simon Riggs

Date:

18 March 2010, 06:43:44

On Wed, 2010-03-17 at 11:37 +0200, Heikki Linnakangas wrote:

> One awkward omission in the new built-in standby mode, mainly used for
> streaming replication, is that there is no easy way to delete old
> archived files like you do with the %r parameter to restore_command.
> This was discussed at
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg01003.php, among
> other things.

...

> Barring objections, I will commit this later today.

Would it be better to call this "archive_cleanup_command"? That might
help people understand the need for and the use of this parameter.

-- Simon Riggs           www.2ndQuadrant.com

Re: Command to prune archive at restartpoints

From

Greg Stark

Date:

22 March 2010, 12:58:50

On Thu, Mar 18, 2010 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2010-03-17 at 11:37 +0200, Heikki Linnakangas wrote:
>
>> One awkward omission in the new built-in standby mode, mainly used for
>> streaming replication, is that there is no easy way to delete old
>> archived files like you do with the %r parameter to restore_command.
>
> Would it be better to call this "archive_cleanup_command"? That might
> help people understand the need for and the use of this parameter.

This is bikeshedding but fwiw I like Simon's suggestion.

-- 
greg

Re: Command to prune archive at restartpoints

From

Robert Haas

Date:

08 June 2010, 18:17:44

On Mon, Mar 22, 2010 at 11:58 AM, Greg Stark <gsstark@mit.edu> wrote:
> On Thu, Mar 18, 2010 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Wed, 2010-03-17 at 11:37 +0200, Heikki Linnakangas wrote:
>>
>>> One awkward omission in the new built-in standby mode, mainly used for
>>> streaming replication, is that there is no easy way to delete old
>>> archived files like you do with the %r parameter to restore_command.
>>
>> Would it be better to call this "archive_cleanup_command"? That might
>> help people understand the need for and the use of this parameter.
>
> This is bikeshedding but fwiw I like Simon's suggestion.

So, this thread is hanging out on our list of open items for 9.0.  My
personal opinion on it is that I don't really care much one way or the
other.  archive_cleanup_command does seem easier to understand, but
restartpoint_command has the advantage of describing exactly when it
gets run from a technical perspective, which might be a good thing,
too.  Since nobody's felt motivated to do anything about this for two
and a half months and we've now been through two betas with it the way
it is, I'm inclined to say we should just leave it alone.  On the
other hand, both of the people who voted in favor of changing it are
committers, and if one of them feels like putting in the effort to
change it, it won't bother me much, except that I feel it should get
done RSN.  But one way or the other we need to make a decision and get
this off the list.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Command to prune archive at restartpoints

From

Simon Riggs

Date:

08 June 2010, 18:35:37

On Tue, 2010-06-08 at 17:17 -0400, Robert Haas wrote:
> On Mon, Mar 22, 2010 at 11:58 AM, Greg Stark <gsstark@mit.edu> wrote:
> > On Thu, Mar 18, 2010 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> On Wed, 2010-03-17 at 11:37 +0200, Heikki Linnakangas wrote:
> >>
> >>> One awkward omission in the new built-in standby mode, mainly used for
> >>> streaming replication, is that there is no easy way to delete old
> >>> archived files like you do with the %r parameter to restore_command.
> >>
> >> Would it be better to call this "archive_cleanup_command"? That might
> >> help people understand the need for and the use of this parameter.
> >
> > This is bikeshedding but fwiw I like Simon's suggestion.
> 
> So, this thread is hanging out on our list of open items for 9.0.  My
> personal opinion on it is that I don't really care much one way or the
> other.  archive_cleanup_command does seem easier to understand, but
> restartpoint_command has the advantage of describing exactly when it
> gets run from a technical perspective, which might be a good thing,
> too.  Since nobody's felt motivated to do anything about this for two
> and a half months and we've now been through two betas with it the way
> it is, I'm inclined to say we should just leave it alone.  On the
> other hand, both of the people who voted in favor of changing it are
> committers, and if one of them feels like putting in the effort to
> change it, it won't bother me much, except that I feel it should get
> done RSN.  But one way or the other we need to make a decision and get
> this off the list.

Yes, restartpoint_command is exactly correct, and I do understand it; I
just don't think anyone else will. If there's another use for a
restartpoint_command other than for clearing up an archive, then it
would be sufficient to destroy the name change idea. 

-- Simon Riggs           www.2ndQuadrant.com

Re: Command to prune archive at restartpoints

From

Andrew Dunstan

Date:

08 June 2010, 19:31:12


Robert Haas wrote:
> On Mon, Mar 22, 2010 at 11:58 AM, Greg Stark <gsstark@mit.edu> wrote:
>   
>> On Thu, Mar 18, 2010 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>     
>>> On Wed, 2010-03-17 at 11:37 +0200, Heikki Linnakangas wrote:
>>>
>>>       
>>>> One awkward omission in the new built-in standby mode, mainly used for
>>>> streaming replication, is that there is no easy way to delete old
>>>> archived files like you do with the %r parameter to restore_command.
>>>>         
>>> Would it be better to call this "archive_cleanup_command"? That might
>>> help people understand the need for and the use of this parameter.
>>>       
>> This is bikeshedding but fwiw I like Simon's suggestion.
>>     
>
> So, this thread is hanging out on our list of open items for 9.0.  My
> personal opinion on it is that I don't really care much one way or the
> other.  archive_cleanup_command does seem easier to understand, but
> restartpoint_command has the advantage of describing exactly when it
> gets run from a technical perspective, which might be a good thing,
> too.  Since nobody's felt motivated to do anything about this for two
> and a half months and we've now been through two betas with it the way
> it is, I'm inclined to say we should just leave it alone.  On the
> other hand, both of the people who voted in favor of changing it are
> committers, and if one of them feels like putting in the effort to
> change it, it won't bother me much, except that I feel it should get
> done RSN.  But one way or the other we need to make a decision and get
> this off the list.
>
>   

I prefer archive_cleanup_command. We should name things after their 
principal function, not an implementation detail, IMNSHO.

More importantly, we should include an example in the docs. I created 
one the other day  when this was actually bothering me a bit (see 
<http://people.planetpostgresql.org/andrew/index.php?/archives/85-Keeping-a-hot-standby-log-archive-clean.html>). 
That seemed to work ok, but maybe it's too long, and maybe people would 
prefer a shell script to perl.

cheers

andrew

Re: Command to prune archive at restartpoints

From

Tom Lane

Date:

08 June 2010, 19:48:43

Andrew Dunstan <andrew@dunslane.net> writes:
> I prefer archive_cleanup_command. We should name things after their 
> principal function, not an implementation detail, IMNSHO.

Weak preference for archive_cleanup_command here.

> More importantly, we should include an example in the docs. I created 
> one the other day  when this was actually bothering me a bit (see 
> <http://people.planetpostgresql.org/andrew/index.php?/archives/85-Keeping-a-hot-standby-log-archive-clean.html>). 
> That seemed to work ok, but maybe it's too long, and maybe people would 
> prefer a shell script to perl.

Short is good.  Maybe you could remove the logging stuff from the
example.

As for the language choice, my first thought is +1 for perl over shell,
mainly because it might be directly useful to people on Windows while
shell never would be.  On the other hand, if it's possible to do a
useful one-liner in shell then let's do it that way.
        regards, tom lane

Re: Command to prune archive at restartpoints

From

Andrew Dunstan

Date:

08 June 2010, 20:07:22


Tom Lane wrote:
> As for the language choice, my first thought is +1 for perl over shell,
> mainly because it might be directly useful to people on Windows while
> shell never would be.  On the other hand, if it's possible to do a
> useful one-liner in shell then let's do it that way.
>   

I don't think it is, reasonably. But here is fairly minimal version that 
might suit the docs:
   use strict;   my ($dir, $num) = @ARGV;   foreach my $file (glob("$dir/*"))   {           my $name = basename($file);
         unlink $file if (-f $file && $name =~ /^[0-9A-Z]{24}$/ && $name lt $num);   }     
 



cheers

andrew

Re: Command to prune archive at restartpoints

From

Robert Haas

Date:

08 June 2010, 21:47:29

On Tue, Jun 8, 2010 at 6:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> I prefer archive_cleanup_command. We should name things after their
>> principal function, not an implementation detail, IMNSHO.
>
> Weak preference for archive_cleanup_command here.

OK, sounds like we have consensus on that.  Who wants to do it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Command to prune archive at restartpoints

From

Takahiro Itagaki

Date:

08 June 2010, 22:44:52

Robert Haas <robertmhaas@gmail.com> wrote:

> On Tue, Jun 8, 2010 at 6:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andrew Dunstan <andrew@dunslane.net> writes:
> >> I prefer archive_cleanup_command. We should name things after their
> >> principal function, not an implementation detail, IMNSHO.
> >
> > Weak preference for archive_cleanup_command here.
>
> OK, sounds like we have consensus on that.  Who wants to do it?

Do we just need to replace all of them? If so, patch attached.
I replaced 3 terms: recovery_end_command, recovery-end-command,
and recoveryEndCommand.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

Attachment

archive_cleanup_command.patch

Re: Command to prune archive at restartpoints

From

Robert Haas

Date:

08 June 2010, 22:52:09

On Tue, Jun 8, 2010 at 9:45 PM, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:
> Robert Haas <robertmhaas@gmail.com> wrote:
>
>> On Tue, Jun 8, 2010 at 6:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > Andrew Dunstan <andrew@dunslane.net> writes:
>> >> I prefer archive_cleanup_command. We should name things after their
>> >> principal function, not an implementation detail, IMNSHO.
>> >
>> > Weak preference for archive_cleanup_command here.
>>
>> OK, sounds like we have consensus on that.  Who wants to do it?
>
> Do we just need to replace all of them? If so, patch attached.
> I replaced 3 terms: recovery_end_command, recovery-end-command,
> and recoveryEndCommand.

I think we're replacing restartpoint_command, not recovery_end_command.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Command to prune archive at restartpoints

From

Fujii Masao

Date:

08 June 2010, 23:00:52

On Wed, Jun 9, 2010 at 10:45 AM, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:
>
> Robert Haas <robertmhaas@gmail.com> wrote:
>
>> On Tue, Jun 8, 2010 at 6:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > Andrew Dunstan <andrew@dunslane.net> writes:
>> >> I prefer archive_cleanup_command. We should name things after their
>> >> principal function, not an implementation detail, IMNSHO.
>> >
>> > Weak preference for archive_cleanup_command here.
>>
>> OK, sounds like we have consensus on that.  Who wants to do it?
>
> Do we just need to replace all of them? If so, patch attached.
> I replaced 3 terms: recovery_end_command, recovery-end-command,
> and recoveryEndCommand.

s/recovery_end_command/restartpoint_command?

I prefer restartpoint_command over archive_cleanup_command because
not only restartpoint_command but also recovery_end_command is used
for archive cleanup.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: Command to prune archive at restartpoints

From

Takahiro Itagaki

Date:

08 June 2010, 23:17:27

Robert Haas <robertmhaas@gmail.com> wrote:
> I think we're replacing restartpoint_command, not recovery_end_command.

Ah, sorry. I did the same replacement for restartpoint_command
in _, -, and camel case words.

BTW, should we also have a release note for the command?
I added a simple description for it in the patch.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

Attachment

estartpoint-to-archive_cleanup.patch

Re: Command to prune archive at restartpoints

From

Robert Haas

Date:

08 June 2010, 23:58:43

On Tue, Jun 8, 2010 at 10:18 PM, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:
>
> Robert Haas <robertmhaas@gmail.com> wrote:
>> I think we're replacing restartpoint_command, not recovery_end_command.
>
> Ah, sorry. I did the same replacement for restartpoint_command
> in _, -, and camel case words.

Gah.  Perhaps one of these days we will stop spelling every identifier
in multiple different ways.

> BTW, should we also have a release note for the command?
> I added a simple description for it in the patch.

Yeah, it should be definitely mentioned in the release notes somewhere, I think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Command to prune archive at restartpoints

From

Simon Riggs

Date:

09 June 2010, 17:15:10

On Wed, 2010-06-09 at 11:18 +0900, Takahiro Itagaki wrote:
> Robert Haas <robertmhaas@gmail.com> wrote:
> > I think we're replacing restartpoint_command, not recovery_end_command.
> 
> Ah, sorry. I did the same replacement for restartpoint_command
> in _, -, and camel case words.
> 
> BTW, should we also have a release note for the command?
> I added a simple description for it in the patch.

I don't think so, its not a separate feature. It's a change as part of
SR.

-- Simon Riggs           www.2ndQuadrant.com

Re: Command to prune archive at restartpoints

From

Simon Riggs

Date:

09 June 2010, 17:18:28

On Tue, 2010-06-08 at 18:30 -0400, Andrew Dunstan wrote:

> I prefer archive_cleanup_command. We should name things after their 
> principal function, not an implementation detail, IMNSHO.
> 
> More importantly, we should include an example in the docs. I created 
> one the other day  when this was actually bothering me a bit (see 
> <http://people.planetpostgresql.org/andrew/index.php?/archives/85-Keeping-a-hot-standby-log-archive-clean.html>). 
> That seemed to work ok, but maybe it's too long, and maybe people would 
> prefer a shell script to perl.

I submitted a patch to make the command "pg_standby -a %r"

That's a more portable solution, ISTM.

I'll commit that and fix the docs.

-- Simon Riggs           www.2ndQuadrant.com

Re: Command to prune archive at restartpoints

From

Heikki Linnakangas

Date:

10 June 2010, 04:18:23

On 09/06/10 10:21, Simon Riggs wrote:
> On Tue, 2010-06-08 at 18:30 -0400, Andrew Dunstan wrote:
>
>> I prefer archive_cleanup_command. We should name things after their
>> principal function, not an implementation detail, IMNSHO.
>>
>> More importantly, we should include an example in the docs. I created
>> one the other day  when this was actually bothering me a bit (see
>> <http://people.planetpostgresql.org/andrew/index.php?/archives/85-Keeping-a-hot-standby-log-archive-clean.html>).
>> That seemed to work ok, but maybe it's too long, and maybe people would
>> prefer a shell script to perl.
>
> I submitted a patch to make the command "pg_standby -a %r"
>
> That's a more portable solution, ISTM.
>
> I'll commit that and fix the docs.

Huh, wait. There's no -a option in pg_standby, so I presume you're 
planning to add that too. I don't like confusing pg_standby into this, 
the docs are currently quite clear that if you want to use the built-in 
standby mode, you can't use pg_standby, and this would muddy the waters.

Maybe we could add a new pg_cleanuparchive binary, but we'll need some 
discussion...

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: Command to prune archive at restartpoints

From

Simon Riggs

Date:

10 June 2010, 04:29:20

On Thu, 2010-06-10 at 10:18 +0300, Heikki Linnakangas wrote:
> On 09/06/10 10:21, Simon Riggs wrote:
> > On Tue, 2010-06-08 at 18:30 -0400, Andrew Dunstan wrote:
> >
> >> I prefer archive_cleanup_command. We should name things after their
> >> principal function, not an implementation detail, IMNSHO.
> >>
> >> More importantly, we should include an example in the docs. I created
> >> one the other day  when this was actually bothering me a bit (see
> >> <http://people.planetpostgresql.org/andrew/index.php?/archives/85-Keeping-a-hot-standby-log-archive-clean.html>).
> >> That seemed to work ok, but maybe it's too long, and maybe people would
> >> prefer a shell script to perl.
> >
> > I submitted a patch to make the command "pg_standby -a %r"
> >
> > That's a more portable solution, ISTM.
> >
> > I'll commit that and fix the docs.
> 
> Huh, wait. There's no -a option in pg_standby, so I presume you're 
> planning to add that too. I don't like confusing pg_standby into this, 
> the docs are currently quite clear that if you want to use the built-in 
> standby mode, you can't use pg_standby, and this would muddy the waters.

It won't kill us to change that sentence. "pg_standby is only used now
within the cleanup command" etc

pg_standby already contains the exact logic we need here. Having two
sets of code for the same thing isn't how we do things.

> Maybe we could add a new pg_cleanuparchive binary, but we'll need some 
> discussion...

Which will go nowhere, as we both already know.

-- Simon Riggs           www.2ndQuadrant.com

Re: Command to prune archive at restartpoints

From

Robert Haas

Date:

10 June 2010, 11:17:31

On Thu, Jun 10, 2010 at 3:28 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, 2010-06-10 at 10:18 +0300, Heikki Linnakangas wrote:
>> On 09/06/10 10:21, Simon Riggs wrote:
>> > On Tue, 2010-06-08 at 18:30 -0400, Andrew Dunstan wrote:
>> >
>> >> I prefer archive_cleanup_command. We should name things after their
>> >> principal function, not an implementation detail, IMNSHO.
>> >>
>> >> More importantly, we should include an example in the docs. I created
>> >> one the other day  when this was actually bothering me a bit (see
>> >> <http://people.planetpostgresql.org/andrew/index.php?/archives/85-Keeping-a-hot-standby-log-archive-clean.html>).
>> >> That seemed to work ok, but maybe it's too long, and maybe people would
>> >> prefer a shell script to perl.
>> >
>> > I submitted a patch to make the command "pg_standby -a %r"
>> >
>> > That's a more portable solution, ISTM.
>> >
>> > I'll commit that and fix the docs.
>>
>> Huh, wait. There's no -a option in pg_standby, so I presume you're
>> planning to add that too. I don't like confusing pg_standby into this,
>> the docs are currently quite clear that if you want to use the built-in
>> standby mode, you can't use pg_standby, and this would muddy the waters.
>
> It won't kill us to change that sentence. "pg_standby is only used now
> within the cleanup command" etc
>
> pg_standby already contains the exact logic we need here. Having two
> sets of code for the same thing isn't how we do things.
>
>> Maybe we could add a new pg_cleanuparchive binary, but we'll need some
>> discussion...
>
> Which will go nowhere, as we both already know.

I have a feeling that I may be poking my nose into an incipient
shouting match, but FWIW I agree with Heikki that it would be
preferable to keep this separate from pg_standby.  Considering that
Andrew wrote this in 24 lines of Perl code (one-third of which are
basically just there for logging purposes), I'm not that worried about
code duplication, unless what we actually need is significantly more
complicated.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Command to prune archive at restartpoints

From

Andrew Dunstan

Date:

10 June 2010, 11:38:37

Robert Haas wrote:
>> It won't kill us to change that sentence. "pg_standby is only used now
>> within the cleanup command" etc
>>
>> pg_standby already contains the exact logic we need here. Having two
>> sets of code for the same thing isn't how we do things.
>>     

Well, we could factor out that part of the code so it could be used in 
two binaries. But ...

>>> Maybe we could add a new pg_cleanuparchive binary, but we'll need some
>>> discussion...
>>>       
>> Which will go nowhere, as we both already know.
>>     
>
> I have a feeling that I may be poking my nose into an incipient
> shouting match, but FWIW I agree with Heikki that it would be
> preferable to keep this separate from pg_standby.  Considering that
> Andrew wrote this in 24 lines of Perl code (one-third of which are
> basically just there for logging purposes), I'm not that worried about
> code duplication, unless what we actually need is significantly more
> complicated.
>
>   

I think my logic needs a tiny piece of adjustment, to ignore the 
timeline segment of the file name. But that will hardly involve a great 
deal of extra code - just chop off the first 8 chars. It's not like the 
code for this in pg_standby.c is terribly complex.

The virtue of a perl script is that it's very easily customizable, e.g. 
you might only delete files if they are older than a certain age.

cheers

andrew

Re: Command to prune archive at restartpoints

From

Heikki Linnakangas

Date:

10 June 2010, 12:19:34

On 10/06/10 17:38, Andrew Dunstan wrote:
> I think my logic needs a tiny piece of adjustment, to ignore the
> timeline segment of the file name.

I'm not sure you should ignore it. Presumably anything in an older 
timeline is indeed not required anymore and can be removed, and anything 
in a newer timeline... how did it get there? Seems safer not remove it.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: Command to prune archive at restartpoints

From

Andrew Dunstan

Date:

10 June 2010, 12:33:36


Heikki Linnakangas wrote:
> On 10/06/10 17:38, Andrew Dunstan wrote:
>> I think my logic needs a tiny piece of adjustment, to ignore the
>> timeline segment of the file name.
>
> I'm not sure you should ignore it. Presumably anything in an older 
> timeline is indeed not required anymore and can be removed, and 
> anything in a newer timeline... how did it get there? Seems safer not 
> remove it.
>

Well, I was just following the logic in pg-standby.c:
               /*                * We ignore the timeline part of the XLOG segment 
identifiers                * in deciding whether a segment is still needed.  This                * ensures that we
won'tprematurely remove a segment from a                * parent timeline. We could probably be a little more
    * proactive about removing segments of non-parent 
 
timelines,                * but that would be a whole lot more complicated.                *                * We use
thealphanumeric sorting property of the filenames                * to decide which ones are earlier than the
   * exclusiveCleanupFileName file. Note that this means files                * are not removed in the order they were
originally
 
written,                * in case this worries you.                */               if (strlen(xlde->d_name) ==
XLOG_DATA_FNAME_LEN&&                   strspn(xlde->d_name, "0123456789ABCDEF")                       ==
XLOG_DATA_FNAME_LEN&&                 strcmp(xlde->d_name + 8, exclusiveCleanupFileName + 8) 
 
< 0)


cheers

andrew

Re: Command to prune archive at restartpoints

From

Dimitri Fontaine

Date:

10 June 2010, 16:24:53

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Maybe we could add a new pg_cleanuparchive binary, but we'll need some
> discussion...

Would this binary ever be used manually, not invoked by PostgreSQL? As
it depends on the %r option to be given and to be right, I don't think
so.

Therefore my take on this problem is to provide internal commands here,
that maybe wouldn't need to be explicitly passed any argument. If
they're internal they certainly can access to the information they need?

As a user, I'd find it so much better to trust PostgreSQL for proposing
sane defaults. As a developer, you will certainly find it easier to
maintain, document and distribute.

While at it, the other internal command we need is pg_archive_bypass for
the archive_command so that windows users have the /usr/bin/true option
too.

Regards,
-- 
dim

Re: Command to prune archive at restartpoints

From

Heikki Linnakangas

Date:

10 June 2010, 16:49:18

On 10/06/10 22:24, Dimitri Fontaine wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>> Maybe we could add a new pg_cleanuparchive binary, but we'll need some
>> discussion...
>
> Would this binary ever be used manually, not invoked by PostgreSQL? As
> it depends on the %r option to be given and to be right, I don't think
> so.

Hmm, actually it would be pretty handy. To make use of a base backup, 
you need all the WAL files following the one where pg_start_backup() was 
called. We create a .backup file in the archive to indicate that 
location, like:

00000001000000000000002F.00000020.backup

So to clean up all WAL files older than those needed by that base 
backup, you would simply copy-paste that location and call 
pg_cleanuparchive:

pg_cleanuparchive /walarchive/ 00000001000000000000002F

Of course, if there's a perl one-liner to do that, we can just put that 
in the docs and don't really need pg_cleanuparchive at all.

> Therefore my take on this problem is to provide internal commands here,
> that maybe wouldn't need to be explicitly passed any argument. If
> they're internal they certainly can access to the information they need?

You want more flexibility in more advanced cases. Like if you have 
multiple standbys sharing the archive, you only want to remove old WAL 
files after they're not needed by *any* of the standbys anymore. Doing 
the cleanup directly in the archive_cleanup_command would cause the old 
WAL files to be removed prematurely, but you could put a shell script 
there to store the location to a file, and call pg_cleanuparchive with 
the max() of the locations reported by all standby servers.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: Command to prune archive at restartpoints

From

Simon Riggs

Date:

10 June 2010, 17:10:55

On Thu, 2010-06-10 at 22:49 +0300, Heikki Linnakangas wrote:
> On 10/06/10 22:24, Dimitri Fontaine wrote:
> > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
> >> Maybe we could add a new pg_cleanuparchive binary, but we'll need some
> >> discussion...
> >
> > Would this binary ever be used manually, not invoked by PostgreSQL? As
> > it depends on the %r option to be given and to be right, I don't think
> > so.
>
> Hmm, actually it would be pretty handy. To make use of a base backup,
> you need all the WAL files following the one where pg_start_backup() was
> called. We create a .backup file in the archive to indicate that
> location, like:
>
> 00000001000000000000002F.00000020.backup
>
> So to clean up all WAL files older than those needed by that base
> backup, you would simply copy-paste that location and call
> pg_cleanuparchive:
>
> pg_cleanuparchive /walarchive/ 00000001000000000000002F

OK, sounds like we're on the same thought train.

Here's the code.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services

Attachment

pg_archivecleanup.tar

Re: Command to prune archive at restartpoints

From

Dimitri Fontaine

Date:

11 June 2010, 15:19:08

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> So to clean up all WAL files older than those needed by that base backup,
> you would simply copy-paste that location and call pg_cleanuparchive:
>
> pg_cleanuparchive /walarchive/ 00000001000000000000002F

Ok, idle though: what about having a superuser-only SRF doing the same?
So that we have internal command for simple case, and SRF for use in
scripts in more complex case.

> Of course, if there's a perl one-liner to do that, we can just put that in
> the docs and don't really need pg_cleanuparchive at all.

psql -c "SELECT * FROM pg_cleanup_archive('00000001000000000000002F');"

>> Therefore my take on this problem is to provide internal commands here,
>> that maybe wouldn't need to be explicitly passed any argument. If
>> they're internal they certainly can access to the information they need?
>
> You want more flexibility in more advanced cases. Like if you have multiple
> standbys sharing the archive, you only want to remove old WAL files after
> they're not needed by *any* of the standbys anymore. Doing the cleanup
> directly in the archive_cleanup_command would cause the old WAL files to be
> removed prematurely, but you could put a shell script there to store the
> location to a file, and call pg_cleanuparchive with the max() of the
> locations reported by all standby servers.

Yes you still need to support external commands. That was not at all
what I'm proposing: I'm just after having the simple case dead simple to
setup. Like you don't write any script.

Regards,
-- 
dim

Re: Command to prune archive at restartpoints

From

Dimitri Fontaine

Date:

12 June 2010, 15:00:13

Dimitri Fontaine <dfontaine@hi-media.com> writes:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> So to clean up all WAL files older than those needed by that base backup,
>> you would simply copy-paste that location and call pg_cleanuparchive:
>>
>> pg_cleanuparchive /walarchive/ 00000001000000000000002F
>
> Ok, idle though: what about having a superuser-only SRF doing the
> same?

So I'm looking at what it'd take to have that code, and it seems it
would be quite easy. I wonder if we want to return only a boolean
(command success status) or the list of files we're pruning (that's the
SRF part), but other than that, it's all about having the code provided
by Simon in another place and some internal command support.

Something strange though: I notice that the error and signal handling in
pgarch.c::pgarch_archiveXlog (lines 551 and following) and in
xlog.c::ExecuteRecoveryCommand (lines 3143 and following) are very
different for no reason that I can see.

Why is that?

Also, should I try to send a patch implementing my proposal (internal
command exposed as a function at the SQL level, and while at it, maybe
the internal command "pg_archive_bypass" to mimic /usr/bin/true as an
archive_command)?

Regards,
-- 
dim

Re: Command to prune archive at restartpoints

From

Dimitri Fontaine

Date:

12 June 2010, 17:52:06

Dimitri Fontaine <dfontaine@hi-media.com> writes:
> Also, should I try to send a patch implementing my proposal (internal
> command exposed as a function at the SQL level, and while at it, maybe
> the internal command "pg_archive_bypass" to mimic /usr/bin/true as an
> archive_command)?

I had to have a try at it, even if quick and dirty. I've not tried to
code the pg_archive_bypass internal command for lack of discussion, but
I still think it would be great to have it.

So here's a "see my idea in code" patch, that put the previous code by
Simon into a backend function. As the goal was not to adapt the existing
code intended as external to use the internal APIs, you'll find it quite
ugly I'm sure.

For example, this #define XLOG_DATA_FNAME_LEN has to go away, but that
won't help having the idea accepted or not, and as I'm only warming up,
I didn't tackle the problem. If you want me to do it, I'd appreciate
some guidance as how to, though.

It goes like this:

dim=# select pg_switch_xlog();
 pg_switch_xlog
----------------
 0/1000098
(1 row)

dim=# select pg_archive_cleanup('0/1000098');
DEBUG:  removing "pg_xlog/000000010000000000000000"
DEBUG:  removing "pg_xlog/000000010000000000000001"
 pg_archive_cleanup
--------------------
 t
(1 row)

I hope you too will find this way of interfacing is easier to deal with
for everybody (from code maintenance to user settings).

Regards,
--
dim

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 599,605 **** static bool read_backup_label(XLogRecPtr *checkPointLoc);
  static void rm_redo_error_callback(void *arg);
  static int    get_sync_bit(int method);

-
  /*
   * Insert an XLOG record having the specified RMID and info bytes,
   * with the body of the record being the data chunk(s) described by
--- 599,604 ----
***************
*** 3056,3062 **** not_available:
  }

  /*
!  * Attempt to execute an external shell command during recovery.
   *
   * 'command' is the shell command to be executed, 'commandName' is a
   * human-readable name describing the command emitted in the logs. If
--- 3055,3065 ----
  }

  /*
!  * Attempt to execute an external shell command during recovery, or an
!  * internal one if given one.
!  *
!  * There's only one supported internal commands today, which is named
!  * "pg_archive_cleanup".
   *
   * 'command' is the shell command to be executed, 'commandName' is a
   * human-readable name describing the command emitted in the logs. If
***************
*** 3094,3145 **** ExecuteRecoveryCommand(char *command, char *commandName, bool failOnSignal)
      LWLockRelease(ControlFileLock);

      /*
!      * construct the command to be executed
       */
!     dp = xlogRecoveryCmd;
!     endp = xlogRecoveryCmd + MAXPGPATH - 1;
!     *endp = '\0';

!     for (sp = command; *sp; sp++)
      {
!         if (*sp == '%')
          {
!             switch (sp[1])
              {
!                 case 'r':
!                     /* %r: filename of last restartpoint */
!                     sp++;
!                     StrNCpy(dp, lastRestartPointFname, endp - dp);
!                     dp += strlen(dp);
!                     break;
!                 case '%':
!                     /* convert %% to a single % */
!                     sp++;
!                     if (dp < endp)
!                         *dp++ = *sp;
!                     break;
!                 default:
!                     /* otherwise treat the % as not special */
!                     if (dp < endp)
!                         *dp++ = *sp;
!                     break;
              }
          }
!         else
!         {
!             if (dp < endp)
!                 *dp++ = *sp;
!         }
!     }
!     *dp = '\0';

!     ereport(DEBUG3,
!             (errmsg_internal("executing %s \"%s\"", commandName, command)));

-     /*
-      * execute the constructed command
-      */
-     rc = system(xlogRecoveryCmd);
      if (rc != 0)
      {
          /*
--- 3097,3164 ----
      LWLockRelease(ControlFileLock);

      /*
!      * if the command is internal, call the function
       */
!     if( strcmp(command, "pg_archive_cleanup") == 0 )
!     {
!         text *restart_filename = cstring_to_text(lastRestartPointFname);
!         rc = DatumGetBool(DirectFunctionCall1(pg_archive_cleanup,
!                                               restart_filename)) ? 0 : 1;

!         ereport(DEBUG3,
!                 (errmsg_internal("calling %s \"%s\"", commandName, command)));
!     }
!     else
      {
!         /*
!          * construct the command to be executed
!          */
!         dp = xlogRecoveryCmd;
!         endp = xlogRecoveryCmd + MAXPGPATH - 1;
!         *endp = '\0';
!
!         for (sp = command; *sp; sp++)
          {
!             if (*sp == '%')
              {
!                 switch (sp[1])
!                 {
!                     case 'r':
!                         /* %r: filename of last restartpoint */
!                         sp++;
!                         StrNCpy(dp, lastRestartPointFname, endp - dp);
!                         dp += strlen(dp);
!                         break;
!                     case '%':
!                         /* convert %% to a single % */
!                         sp++;
!                         if (dp < endp)
!                             *dp++ = *sp;
!                         break;
!                     default:
!                         /* otherwise treat the % as not special */
!                         if (dp < endp)
!                             *dp++ = *sp;
!                         break;
!                 }
!             }
!             else
!             {
!                 if (dp < endp)
!                     *dp++ = *sp;
              }
          }
!         *dp = '\0';

!         ereport(DEBUG3,
!                 (errmsg_internal("executing %s \"%s\"", commandName, command)));
!
!         /*
!          * execute the constructed command
!          */
!         rc = system(xlogRecoveryCmd);
!     }

      if (rc != 0)
      {
          /*
***************
*** 8916,8921 **** pg_xlogfile_name(PG_FUNCTION_ARGS)
--- 8935,9008 ----
  }

  /*
+  * Cleanup XLOGDIR given a specific WAL file name, which we keep. Typically
+  * used as an archive_cleanup_command.
+  */
+ Datum
+ pg_archive_cleanup(PG_FUNCTION_ARGS)
+ {
+     text       *location = PG_GETARG_TEXT_P(0);
+     char        *exclusiveCleanupFileName = text_to_cstring(location);
+     char        WALFilePath[MAXPGPATH]; /* the file path including archive */
+     int            rc = 0;                 /* nothing to do ain't an error */
+     DIR           *xldir;
+     struct dirent *xlde;
+
+     if ((xldir = opendir(XLOGDIR)) != NULL)
+     {
+         while ((xlde = readdir(xldir)) != NULL)
+         {
+             /*
+              * We ignore the timeline part of the XLOG segment identifiers
+              * in deciding whether a segment is still needed.  This
+              * ensures that we won't prematurely remove a segment from a
+              * parent timeline. We could probably be a little more
+              * proactive about removing segments of non-parent timelines,
+              * but that would be a whole lot more complicated.
+              *
+              * We use the alphanumeric sorting property of the filenames
+              * to decide which ones are earlier than the
+              * exclusiveCleanupFileName file. Note that this means files
+              * are not removed in the order they were originally written,
+              * in case this worries you.
+              */
+             #define XLOG_DATA_FNAME_LEN 24
+             if (strlen(xlde->d_name) == XLOG_DATA_FNAME_LEN &&
+                 strspn(xlde->d_name, "0123456789ABCDEF") == XLOG_DATA_FNAME_LEN &&
+                 strcmp(xlde->d_name + 8, exclusiveCleanupFileName + 8) < 0)
+             {
+ #ifdef WIN32
+                 snprintf(WALFilePath, MAXPGPATH, "%s\\%s", XLOGDIR, xlde->d_name);
+ #else
+                 snprintf(WALFilePath, MAXPGPATH, "%s/%s", XLOGDIR, xlde->d_name);
+ #endif
+
+                 elog(DEBUG1, "removing \"%s\"", WALFilePath);
+
+                 rc = unlink(WALFilePath);
+                 if (rc != 0)
+                 {
+                     elog(ERROR,
+                             "failed to remove \"%s\": %s",
+                             WALFilePath, strerror(errno));
+                     break;
+                 }
+             }
+         }
+     }
+     else
+         /*
+          * How much do we want to manage that? ereport?
+          */
+         elog(ERROR, "Can't open \"%s\"", XLOGDIR);
+
+     closedir(xldir);
+     fflush(stderr);
+
+     PG_RETURN_BOOL(rc == 0);
+ }
+
+ /*
   * read_backup_label: check to see if a backup_label file is present
   *
   * If we see a backup_label during recovery, we assume that we are recovering
*** a/src/include/access/xlog_internal.h
--- b/src/include/access/xlog_internal.h
***************
*** 273,278 **** extern Datum pg_last_xlog_receive_location(PG_FUNCTION_ARGS);
--- 273,279 ----
  extern Datum pg_last_xlog_replay_location(PG_FUNCTION_ARGS);
  extern Datum pg_xlogfile_name_offset(PG_FUNCTION_ARGS);
  extern Datum pg_xlogfile_name(PG_FUNCTION_ARGS);
+ extern Datum pg_archive_cleanup(PG_FUNCTION_ARGS);
  extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);

  #endif   /* XLOG_INTERNAL_H */
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 3314,3319 **** DATA(insert OID = 2850 ( pg_xlogfile_name_offset    PGNSP PGUID 12 1 0 0 f f f t f
--- 3314,3321 ----
  DESCR("xlog filename and byte offset, given an xlog location");
  DATA(insert OID = 2851 ( pg_xlogfile_name            PGNSP PGUID 12 1 0 0 f f f t f i 1 0 25 "25" _null_ _null_
_null__null_ pg_xlogfile_name _null_ _null_ _null_ )); 
  DESCR("xlog filename, given an xlog location");
+ DATA(insert OID = 3822 ( pg_archive_cleanup            PGNSP PGUID 12 1 0 0 f f f t f i 1 0 16 "25" _null_ _null_
_null__null_ pg_archive_cleanup _null_ _null_ _null_ )); 
+ DESCR("cleanup archive up to given xlog location");

  DATA(insert OID = 3810 (  pg_is_in_recovery        PGNSP PGUID 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_
_null_pg_is_in_recovery _null_ _null_ _null_ )); 
  DESCR("true if server is in recovery");

Re: Command to prune archive at restartpoints

From

Robert Haas

Date:

13 June 2010, 13:22:40

On Thu, Jun 10, 2010 at 4:09 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Here's the code.

I haven't more than glanced at this, but +1 for committing it if
you're confident it DTRT.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Command to prune archive at restartpoints

From

Robert Haas

Date:

13 June 2010, 13:31:41

On Sat, Jun 12, 2010 at 4:51 PM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> Dimitri Fontaine <dfontaine@hi-media.com> writes:
>> Also, should I try to send a patch implementing my proposal (internal
>> command exposed as a function at the SQL level, and while at it, maybe
>> the internal command "pg_archive_bypass" to mimic /usr/bin/true as an
>> archive_command)?
>
> I had to have a try at it, even if quick and dirty. I've not tried to
> code the pg_archive_bypass internal command for lack of discussion, but
> I still think it would be great to have it.
>
> So here's a "see my idea in code" patch, that put the previous code by
> Simon into a backend function. As the goal was not to adapt the existing
> code intended as external to use the internal APIs, you'll find it quite
> ugly I'm sure.
>
> For example, this #define XLOG_DATA_FNAME_LEN has to go away, but that
> won't help having the idea accepted or not, and as I'm only warming up,
> I didn't tackle the problem. If you want me to do it, I'd appreciate
> some guidance as how to, though.
>
> It goes like this:
>
> dim=# select pg_switch_xlog();
>  pg_switch_xlog
> ----------------
>  0/1000098
> (1 row)
>
> dim=# select pg_archive_cleanup('0/1000098');
> DEBUG:  removing "pg_xlog/000000010000000000000000"
> DEBUG:  removing "pg_xlog/000000010000000000000001"
>  pg_archive_cleanup
> --------------------
>  t
> (1 row)
>
> I hope you too will find this way of interfacing is easier to deal with
> for everybody (from code maintenance to user settings).

I'm a bit perplexed here.  The archive cleanup has to run on the
standby, not the master, right?  Whereas pg_switch_xlog() can only run
on the master.  The purpose of making this a standalone executable is
so that people who have, for example, multiple standbys, can customize
the logic without having to hack the backend.  Pushing this into the
backend would defeat that goal; plus, it wouldn't be usable at all for
people who aren't running Hot Standby.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Command to prune archive at restartpoints

From

Dimitri Fontaine

Date:

13 June 2010, 14:04:39

Robert Haas <robertmhaas@gmail.com> writes:
> I'm a bit perplexed here.  The archive cleanup has to run on the
> standby, not the master, right?  Whereas pg_switch_xlog() can only run
> on the master.

I used it just to show a possible use case, easy to grasp. Sorry if
that's confusing instead.

>  The purpose of making this a standalone executable is
> so that people who have, for example, multiple standbys, can customize
> the logic without having to hack the backend.  Pushing this into the
> backend would defeat that goal; plus, it wouldn't be usable at all for
> people who aren't running Hot Standby.

In the simple cases, what you want to be able to easily choose is just
the first XLOG file you're NOT cleaning. And this is the only argument
you give the function. 

So you can either use the backend function as your internal command for
archive cleanup, or use a script that choose where to stop cleaning then
call it with that as an argument (it's SQL callable).

What it does is unlink the file. If that behavior doesn't suit you, it's
still possible to use an external command and tune some already proposed
scripts. I just don't see how an external binary has more to offer than
a backend function here. It's more code to maintain, it's harder to
setup for people, and if it does not suit you, you still have to make
you own script but you can not use what we ship easily (you have to get
the sources and code in C for that).

What I'm after is being able to tell people to just setup a GUC to a
given value, not to copy/paste a (perl or bash) script from the docs,
make it executable under their system, then test it and run it in
production. We can do better than that, and it's not even hard.

Regards,
-- 
dim

Re: Command to prune archive at restartpoints

From

Robert Haas

Date:

13 June 2010, 15:26:03

On Sun, Jun 13, 2010 at 1:04 PM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I'm a bit perplexed here.  The archive cleanup has to run on the
>> standby, not the master, right?  Whereas pg_switch_xlog() can only run
>> on the master.
>
> I used it just to show a possible use case, easy to grasp. Sorry if
> that's confusing instead.
>
>>  The purpose of making this a standalone executable is
>> so that people who have, for example, multiple standbys, can customize
>> the logic without having to hack the backend.  Pushing this into the
>> backend would defeat that goal; plus, it wouldn't be usable at all for
>> people who aren't running Hot Standby.
>
> In the simple cases, what you want to be able to easily choose is just
> the first XLOG file you're NOT cleaning. And this is the only argument
> you give the function.
>
> So you can either use the backend function as your internal command for
> archive cleanup, or use a script that choose where to stop cleaning then
> call it with that as an argument (it's SQL callable).
>
> What it does is unlink the file. If that behavior doesn't suit you, it's
> still possible to use an external command and tune some already proposed
> scripts. I just don't see how an external binary has more to offer than
> a backend function here. It's more code to maintain, it's harder to
> setup for people, and if it does not suit you, you still have to make
> you own script but you can not use what we ship easily (you have to get
> the sources and code in C for that).
>
> What I'm after is being able to tell people to just setup a GUC to a
> given value, not to copy/paste a (perl or bash) script from the docs,
> make it executable under their system, then test it and run it in
> production. We can do better than that, and it's not even hard.

We're not going to make them cut/paste anything from the docs.  We're
going to provide a production-ready executable they can just use,
which should be installed (presumably, already with the correct
permissions) by their packaging system if they install
postgresql-contrib or the equivalent.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: Command to prune archive at restartpoints

From

Dimitri Fontaine

Date:

13 June 2010, 15:51:24

Robert Haas <robertmhaas@gmail.com> writes:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>>  The purpose of making this a standalone executable is
>>> so that people who have, for example, multiple standbys, can customize
>>> the logic without having to hack the backend.  Pushing this into the
>>> backend would defeat that goal; plus, it wouldn't be usable at all for
>>> people who aren't running Hot Standby.
>
> We're not going to make them cut/paste anything from the docs.  We're
> going to provide a production-ready executable they can just use,
> which should be installed (presumably, already with the correct
> permissions) by their packaging system if they install
> postgresql-contrib or the equivalent.

I still run against people not wanting to trust contrib. I still read
here from time to time that contrib's chapter is maintaining working
examples of extensibility, not maintaining production ready add-ons.

Other than that, you proposed something flexible and easy to customize,
and you end up with an executable binary that will only offer one
behavior (unlink), the only option is where to stop (%r).

The backend function I'm proposing uses the same option, but is easier
to call from a script, should you need to customize. You don't even have
to run the script locally or remember where is the XLOG directory of
that instance. You could operate over a JDBC connection, e.g.

I now realize that my proposal ain't helping if Streaming Replication is
filling the standby's pg_xlog and hot_standby = off. I don't remember
that SR rebuilds pg_xlog on the standby though, does it?

The proposed script will only cleanup XLOGDIR in fact, so if you use a
common archive elsewhere then you still need some external command not
provided by the project. So we still need the script example in the
docs.

I think that the pg_archivecleanup binary is a good solution, all the
more if not shipped in contrib, but that the SQL callable function is
better.

Regards,
--
dim

Re: Command to prune archive at restartpoints

From

Andrew Dunstan

Date:

13 June 2010, 16:04:09

Dimitri Fontaine wrote:
> I still read
> here from time to time that contrib's chapter is maintaining working
> examples of extensibility, not maintaining production ready add-ons.
>
>   

Even if this were true, and I don't believe it is, ISTM the solution 
would be to have a utility command alongside the other utility commands 
like pg_controldata.

cheers

andrew

Re: Command to prune archive at restartpoints

From

Fujii Masao

Date:

14 June 2010, 01:03:00

On Mon, Jun 14, 2010 at 3:51 AM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> I now realize that my proposal ain't helping if Streaming Replication is
> filling the standby's pg_xlog and hot_standby = off. I don't remember
> that SR rebuilds pg_xlog on the standby though, does it?

In SR, WAL files in the pg_xlog directory on the standby are recycled
by every restartpoints. So your proposed function seems not to be helpful
even if hot_standby = on.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: Command to prune archive at restartpoints

From

Dimitri Fontaine

Date:

14 June 2010, 07:33:46

Fujii Masao <masao.fujii@gmail.com> writes:
> In SR, WAL files in the pg_xlog directory on the standby are recycled
> by every restartpoints. So your proposed function seems not to be helpful
> even if hot_standby = on.

Then I guess I'm at a loss here: what is the pg_archivecleanup utility
good for in a standby?

-- 
dim

Re: Command to prune archive at restartpoints

From

Simon Riggs

Date:

14 June 2010, 08:34:53

On Mon, 2010-06-14 at 12:21 +0200, Dimitri Fontaine wrote:
> Fujii Masao <masao.fujii@gmail.com> writes:
> > In SR, WAL files in the pg_xlog directory on the standby are recycled
> > by every restartpoints. So your proposed function seems not to be helpful
> > even if hot_standby = on.
> 
> Then I guess I'm at a loss here: what is the pg_archivecleanup utility
> good for in a standby?

Cleaning the archive directory, not the pg_xlog directory.

-- Simon Riggs           www.2ndQuadrant.com

Re: Command to prune archive at restartpoints

From

Dimitri Fontaine

Date:

14 June 2010, 09:37:19

Simon Riggs <simon@2ndQuadrant.com> writes:
> Cleaning the archive directory, not the pg_xlog directory.

Hence the choice of the directory where to act. I was slow on that,
sorry guys.

I guess my main problem here is that I still picture PostgreSQL has
being able to maintain an archive with no external script in the simple
case.

Regards,
-- 
dim