Re: Missing clog, PITR - Mailing list pgsql-general

From Greg Smith
Subject Re: Missing clog, PITR
Date
Msg-id 4B85E4A4.9030500@2ndquadrant.com
Whole thread Raw
In response to Missing clog, PITR  (Patryk Sidzina <ps@heuthes.pl>)
Responses Re: Missing clog, PITR
List pgsql-general
Patryk Sidzina wrote:
> 1) how do the clogs relate to wal shipping based replication? Clearly
> the master doesn't need that clog but the slave does.
>
They should just be kept in sync.  There's some useful background on
this topic at
http://old.nabble.com/control-the-number-of-clog-files-and-xlog-files-td19173165.html

> 2) could 'pg_clearxlogtail' in archive_command be a cause of this? This
> is our archive_command:
> 'cat %p | pg_clearxlogtail | gzip -c |
> ssh slavehost "cat > /var/lib/postgresql/replication/in/%f"'
>

Kind of doubt that.  If pg_clearxlogtail were mangling your data, I'd
expect a more dramatic failure to restore.

> 3) is there a faster way to debug this problem? Clogs fill slowly. It
> takes about a month on a very busy production server for a clog to be
> removed by master DB.
>

You could create a bunch of transactions and then freeze things,
following the ideas in the reference I suggested above.

> More info:
> PostgreSQL 8.2.14 64-bit (though this happened in older versions also)
> pg_standby from PostgreSQL 8.3.6
>

There was a bug in this area fixed in 8.2.10:
http://www.postgresql.org/docs/8.2/static/release-8-2-10.html

"Fix potential miscalculation of datfrozenxid (Alvaro)

    *

      This error may explain some recent reports of failure to remove
      old pg_clog data."

If you were running this database with a version before that, I wonder
if maybe there's still some junk left behind from that old, buggy
version that's causing your issues.  You might try doing some manual
VACUUM or VACUUM FREEZE work to remove any lingering issues and then
re-create your standby systems afterwards.  I'm not quite familiar
enough with this specific bug to suggest a clearer resolution path, or
if in fact this is the same issue you're seeing.  It sure seems possible
they're related though.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


pgsql-general by date:

Previous
From: Greg Smith
Date:
Subject: Re: Possible causes for database corruption and solutions
Next
From: Greg Smith
Date:
Subject: Re: archive_timeout in postgresql.conf