Thread: CLOG read problem after pg_basebackup

CLOG read problem after pg_basebackup

From
Petr Novak
Date:
Hi all,

I'd like to ask for help clarifying an issue I'm having.

I've recently prepared new servers in another datacenter for some of our databases which I want to set up as a streaming replicas. There are several instances(clusters) with size ranging from 50-150GB. Some of them were set up with no issue. Three of them failed to start after pg_basebackup completed with:

FATAL:  could not access status of transaction 923709700
DETAIL:  Could not read from file "pg_clog/0370" at offset 237568: Success.

(the clog file differed in each case of course..)

As for PG versions one is 9.1.14 (on both master and replica) and the other two 9.2.9 (also on both)

I've checked each file in question on the master server and in each case it was last changed during the run of the pg_basebackup. I've copied the file from master to the slave and it started normally, succesfully connected to primary and everything seems ok since then.

My question is, why this is happening? Is it because the clog file was copied by pg_basebackup before the trasaction status had been written to it? Couldn't this information be recovered from xlog files?
Is my current solution safe for the database (I suppose so, but I'd rather have it confirmed :) )

I'm puzzled as I haven't found this issue discussed at all so far and it happened to me three times in one day on three different servers..

Thanks for any help.

Cheers.
Petr

 

Re: CLOG read problem after pg_basebackup

From
Adrian Klaver
Date:
On 01/23/2015 06:31 AM, Petr Novak wrote:
> Hi all,
>
> I'd like to ask for help clarifying an issue I'm having.
>
> I've recently prepared new servers in another datacenter for some of our
> databases which I want to set up as a streaming replicas. There are
> several instances(clusters) with size ranging from 50-150GB. Some of
> them were set up with no issue. Three of them failed to start after
> pg_basebackup completed with:
>
> FATAL:  could not access status of transaction 923709700
> DETAIL:  Could not read from file "pg_clog/0370" at offset 237568: Success.
>
> (the clog file differed in each case of course..)
>
> As for PG versions one is 9.1.14 (on both master and replica) and the
> other two 9.2.9 (also on both)
>
> I've checked each file in question on the master server and in each case
> it was last changed during the run of the pg_basebackup. I've copied the
> file from master to the slave and it started normally, succesfully
> connected to primary and everything seems ok since then.
>
> My question is, why this is happening? Is it because the clog file was
> copied by pg_basebackup before the trasaction status had been written to
> it? Couldn't this information be recovered from xlog files?
> Is my current solution safe for the database (I suppose so, but I'd
> rather have it confirmed :) )
>
> I'm puzzled as I haven't found this issue discussed at all so far and it
> happened to me three times in one day on three different servers..

Not sure what is going on, but it would help to see the pg_basebackup
command issued.

>
> Thanks for any help.
>
> Cheers.
> Petr
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: CLOG read problem after pg_basebackup

From
Petr Novak
Date:
Hi Adrian,

sure the command is as follows:

pg_basebackup -h <hostname> -p <port> -D /data2/pgsql/baseb -P -v -U replicator -x -c fast

After that I moved the content of /data2/pgsql/baseb to actual datadir and tried to start up the cluster. On some servers it worked on the three didn't. The servers are of the same HW config and the prerequisites are on all servers set by puppet.
 
P

2015-01-23 16:57 GMT+01:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 01/23/2015 06:31 AM, Petr Novak wrote:
Hi all,

I'd like to ask for help clarifying an issue I'm having.

I've recently prepared new servers in another datacenter for some of our
databases which I want to set up as a streaming replicas. There are
several instances(clusters) with size ranging from 50-150GB. Some of
them were set up with no issue. Three of them failed to start after
pg_basebackup completed with:

FATAL:  could not access status of transaction 923709700
DETAIL:  Could not read from file "pg_clog/0370" at offset 237568: Success.

(the clog file differed in each case of course..)

As for PG versions one is 9.1.14 (on both master and replica) and the
other two 9.2.9 (also on both)

I've checked each file in question on the master server and in each case
it was last changed during the run of the pg_basebackup. I've copied the
file from master to the slave and it started normally, succesfully
connected to primary and everything seems ok since then.

My question is, why this is happening? Is it because the clog file was
copied by pg_basebackup before the trasaction status had been written to
it? Couldn't this information be recovered from xlog files?
Is my current solution safe for the database (I suppose so, but I'd
rather have it confirmed :) )

I'm puzzled as I haven't found this issue discussed at all so far and it
happened to me three times in one day on three different servers..

Not sure what is going on, but it would help to see the pg_basebackup command issued.



Thanks for any help.

Cheers.
Petr




--
Adrian Klaver
adrian.klaver@aklaver.com

Re: CLOG read problem after pg_basebackup

From
Adrian Klaver
Date:
On 01/23/2015 08:21 AM, Petr Novak wrote:
> Hi Adrian,
>
> sure the command is as follows:
>
> pg_basebackup -h <hostname> -p <port> -D /data2/pgsql/baseb -P -v -U
> replicator -x -c fast
>
> After that I moved the content of /data2/pgsql/baseb to actual datadir
> and tried to start up the cluster.

Where there left over files in actual datadir from initial cluster init?

On some servers it worked on the
> three didn't. The servers are of the same HW config and the
> prerequisites are on all servers set by puppet.
>
> P
>



--
Adrian Klaver
adrian.klaver@aklaver.com


Re: CLOG read problem after pg_basebackup

From
Petr Novak
Date:

Just the config files for the replica, all other dirs were removed.

P

On Jan 23, 2015 5:30 PM, "Adrian Klaver" <adrian.klaver@aklaver.com> wrote:
On 01/23/2015 08:21 AM, Petr Novak wrote:
Hi Adrian,

sure the command is as follows:

pg_basebackup -h <hostname> -p <port> -D /data2/pgsql/baseb -P -v -U
replicator -x -c fast

After that I moved the content of /data2/pgsql/baseb to actual datadir
and tried to start up the cluster.

Where there left over files in actual datadir from initial cluster init?

On some servers it worked on the
three didn't. The servers are of the same HW config and the
prerequisites are on all servers set by puppet.

P




--
Adrian Klaver
adrian.klaver@aklaver.com

Re: CLOG read problem after pg_basebackup

From
Adrian Klaver
Date:
On 01/23/2015 08:36 AM, Petr Novak wrote:
> Just the config files for the replica, all other dirs were removed.

Alright, lets look at the other end. You are using -x which is:

Using this option is equivalent of using -X with method fetch.

and

-X fetch is:

f
fetch

The transaction log files are collected at the end of the backup.
Therefore, it is necessary for the wal_keep_segments parameter to be set
high enough that the log is not removed before the end of the backup. If
the log has been rotated when it's time to transfer it, the backup will
fail and be unusable.


So are the three servers that failed pulling from parents that are
seeing heavy use and do not have a sufficiently large wal_keep_segments set?

>
> P
>
> On Jan 23, 2015 5:30 PM, "Adrian Klaver" <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 01/23/2015 08:21 AM, Petr Novak wrote:
>
>         Hi Adrian,
>
>         sure the command is as follows:
>
>         pg_basebackup -h <hostname> -p <port> -D /data2/pgsql/baseb -P -v -U
>         replicator -x -c fast
>
>         After that I moved the content of /data2/pgsql/baseb to actual
>         datadir
>         and tried to start up the cluster.
>
>
>     Where there left over files in actual datadir from initial cluster init?
>
>     On some servers it worked on the
>
>         three didn't. The servers are of the same HW config and the
>         prerequisites are on all servers set by puppet.
>
>         P
>
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: CLOG read problem after pg_basebackup

From
Petr Novak
Date:

I've checked wal_keep_segments before i've started the backup and the xlog dir on the master contained logs from several hours ago. Which was sufficient.
Also if this would be the case, then copying affected clog file from the master wouldn't solve the problem as the requires wals would be still missing..

P

On Jan 23, 2015 5:42 PM, "Adrian Klaver" <adrian.klaver@aklaver.com> wrote:
On 01/23/2015 08:36 AM, Petr Novak wrote:
Just the config files for the replica, all other dirs were removed.

Alright, lets look at the other end. You are using -x which is:

Using this option is equivalent of using -X with method fetch.

and

-X fetch is:

f
fetch

The transaction log files are collected at the end of the backup. Therefore, it is necessary for the wal_keep_segments parameter to be set high enough that the log is not removed before the end of the backup. If the log has been rotated when it's time to transfer it, the backup will fail and be unusable.


So are the three servers that failed pulling from parents that are seeing heavy use and do not have a sufficiently large wal_keep_segments set?


P

On Jan 23, 2015 5:30 PM, "Adrian Klaver" <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

    On 01/23/2015 08:21 AM, Petr Novak wrote:

        Hi Adrian,

        sure the command is as follows:

        pg_basebackup -h <hostname> -p <port> -D /data2/pgsql/baseb -P -v -U
        replicator -x -c fast

        After that I moved the content of /data2/pgsql/baseb to actual
        datadir
        and tried to start up the cluster.


    Where there left over files in actual datadir from initial cluster init?

    On some servers it worked on the

        three didn't. The servers are of the same HW config and the
        prerequisites are on all servers set by puppet.

        P




    --
    Adrian Klaver
    adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>



--
Adrian Klaver
adrian.klaver@aklaver.com

Re: CLOG read problem after pg_basebackup

From
Adrian Klaver
Date:
On 01/23/2015 08:55 AM, Petr Novak wrote:
> I've checked wal_keep_segments before i've started the backup and the
> xlog dir on the master contained logs from several hours ago. Which was
> sufficient.
> Also if this would be the case, then copying affected clog file from the
> master wouldn't solve the problem as the requires wals would be still
> missing..

Hmm, I am out of ideas.



--
Adrian Klaver
adrian.klaver@aklaver.com


Re: CLOG read problem after pg_basebackup

From
David G Johnston
Date:
Petr Novak wrote
> Three of them failed to start after pg_basebackup completed with:
>
> FATAL:  could not access status of transaction 923709700
> DETAIL:  Could not read from file "pg_clog/0370" at offset 237568:
> Success.
>
> (the clog file differed in each case of course..)
>
> As for PG versions one is 9.1.14 (on both master and replica) and the
> other
> two 9.2.9 (also on both)

To clarify, the pg_basebackup against the master failed with the above
message?

You confirmed that the archive did not contain the named clog file?

But when you went and checked the running cluster's pg_clog directory the
file was present?

What was the timestamp of the file in the running cluster relative to the
start of the pg_basebackup?

Did you attempt another pg_basebackup against any of the failing servers -
i.e., is the error now a constant for the server or was it transient?

I am somewhat at a loss to explain how pg_basebackup works with pg_clog
given this quote from the wiki:

https://wiki.postgresql.org/wiki/Hint_Bits

"CLOG pages don't make their way out to disk until the internal CLOG buffers
are filled, at which point the least recently used buffer there is evicted
to permanent storage."

Either pg_clog should be course-corrected by WAL, in which case you
shouldn't get a fatal error if an incomplete clog file is found to exist, or
there must something being done to avoid a race condition in this area.  If
that isn't happening then your error could potentially be explained - though
damn bad luck getting it on three servers...

The last observation leads one to wonder if there some kind of transaction
volume or I/O difference that makes the failing servers special (more prone
to getting hit by said race condition)?

I may be just blowing smoke here but maybe it will spark an idea in someone
more knowledgeable.

David J.




--
View this message in context: http://postgresql.nabble.com/CLOG-read-problem-after-pg-basebackup-tp5835204p5835296.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: CLOG read problem after pg_basebackup

From
Adrian Klaver
Date:
On 01/23/2015 05:18 PM, David G Johnston wrote:
> Petr Novak wrote
>> Three of them failed to start after pg_basebackup completed with:
>>
>> FATAL:  could not access status of transaction 923709700
>> DETAIL:  Could not read from file "pg_clog/0370" at offset 237568:
>> Success.
>>
>> (the clog file differed in each case of course..)
>>
>> As for PG versions one is 9.1.14 (on both master and replica) and the
>> other
>> two 9.2.9 (also on both)
>
> To clarify, the pg_basebackup against the master failed with the above
> message?
>
> You confirmed that the archive did not contain the named clog file?
>
> But when you went and checked the running cluster's pg_clog directory the
> file was present?

In the initial post the OP said that the pg_clog file was present on
both the master and replica and that copying the presumably updated file
from the master, after the pg_basebackup, to the replica 'cured' the
problem. This would seem to explain the could not read from offset
error. Initially the replicated Postgres was looking for data in the
pg_clog file at an offset that existed only in the file version on the
master. Once the replica was provided with the updated file it was
happy. The question being how it got in that state?  Your observations
below are better then anything I could come up with.

>
> What was the timestamp of the file in the running cluster relative to the
> start of the pg_basebackup?
>
> Did you attempt another pg_basebackup against any of the failing servers -
> i.e., is the error now a constant for the server or was it transient?
>
> I am somewhat at a loss to explain how pg_basebackup works with pg_clog
> given this quote from the wiki:
>
> https://wiki.postgresql.org/wiki/Hint_Bits
>
> "CLOG pages don't make their way out to disk until the internal CLOG buffers
> are filled, at which point the least recently used buffer there is evicted
> to permanent storage."
>
> Either pg_clog should be course-corrected by WAL, in which case you
> shouldn't get a fatal error if an incomplete clog file is found to exist, or
> there must something being done to avoid a race condition in this area.  If
> that isn't happening then your error could potentially be explained - though
> damn bad luck getting it on three servers...
>
> The last observation leads one to wonder if there some kind of transaction
> volume or I/O difference that makes the failing servers special (more prone
> to getting hit by said race condition)?
>
> I may be just blowing smoke here but maybe it will spark an idea in someone
> more knowledgeable.
>
> David J.
>
>
>
>
> --
> View this message in context:
http://postgresql.nabble.com/CLOG-read-problem-after-pg-basebackup-tp5835204p5835296.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: CLOG read problem after pg_basebackup

From
Andres Freund
Date:
Hi,

On 2015-01-23 15:31:15 +0100, Petr Novak wrote:
> I'd like to ask for help clarifying an issue I'm having.
>
> I've recently prepared new servers in another datacenter for some of our
> databases which I want to set up as a streaming replicas. There are several
> instances(clusters) with size ranging from 50-150GB. Some of them were set
> up with no issue. Three of them failed to start after pg_basebackup
> completed with:
>
> FATAL:  could not access status of transaction 923709700
> DETAIL:  Could not read from file "pg_clog/0370" at offset 237568: Success.
>
> (the clog file differed in each case of course..)

How exactly are you starting the the standby after the basebackups? Any
chance you removed backup.label? This sounds like the typical symptoms
of doing that.

Greetings,

Andres Freund