Re: WAL scenario valid? - Mailing list pgsql-admin

From prakhar jauhari
Subject Re: WAL scenario valid?
Msg-id CAEd0_=8hg_LgFg+a7nNdeXvDfs0BbvmM8C42WtV9XOsfujDZJA@mail.gmail.com
In response to Re: WAL scenario valid?  (Albe Laurenz <laurenz.albe@wien.gv.at>)
List pgsql-admin
Hi,

Yes, the error logs came, but only once DB1 was corrupted. Here is an excerpt from the logs.

LOG:  database system was shut down at 2013-06-19 09:40:55 UTC
LOG:  restored log file "00000004.history" from archive
cp: cannot stat `/data/pgsql/archivedir/00000005.history': No such file or directory
LOG:  restored log file "00000004.history" from archive
LOG:  entering standby mode
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such file or directory
cp: cannot stat `/data/pgsql/archivedir/000000030000000000000013': No such file or directory
LOG:  consistent recovery state reached at 0/13000080
LOG:  record with zero length at 0/13000080
LOG:  database system is ready to accept read only connections
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such file or directory
cp: cannot stat `/data/pgsql/archivedir/000000030000000000000013': No such file or directory
cp: cannot stat `/data/pgsql/archivedir/00000005.history': No such file or directory
LOG:  streaming replication successfully connected to primary
ERROR:  cannot execute CREATE ROLE in a read-only transaction
STATEMENT:  CREATE USER replicationuser REPLICATION;
LOG:  invalid record length at 0/13000080
FATAL:  terminating walreceiver process due to administrator command
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such file or directory
cp: cannot stat `/data/pgsql/archivedir/000000040000000000000013': No such file or directory
cp: cannot stat `/data/pgsql/archivedir/00000005.history': No such file or directory
LOG:  invalid record length at 0/13000080


This is what happened: when DB1 came up as the new standby, it connected via streaming replication to DB2 (the new master), and after some time it terminated the walreceiver process.
The "invalid record length" messages then kept repeating until I took a base backup on DB1 and restarted it.
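For completeness, rebuilding DB1 from a base backup can be done roughly like this (a sketch only; the host name db2 is an assumption, and replicationuser must already exist on DB2 with the REPLICATION attribute):

   # stop DB1 and move its old data directory aside, then take a fresh base backup from DB2
   pg_basebackup -h db2 -U replicationuser -D /data/pgsql/9.2/data -X stream -P
   # create a recovery.conf with primary_conninfo pointing at DB2, then start DB1 as a standby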


Also, on DB1 (before DB1 was started in standby mode): pg_controldata /data/pgsql/9.2/data/

pg_control version number:            922
Catalog version number:               201204301
Database system identifier:           5891091665573732008
Database cluster state:               shut down
pg_control last modified:             Wed Jun 19 09:40:55 2013
Latest checkpoint location:           0/13000020
Prior checkpoint location:            0/124F1BC0
Latest checkpoint's REDO location:    0/13000020
Latest checkpoint's TimeLineID:       3
Latest checkpoint's full_page_writes: off
Latest checkpoint's NextXID:          0/8409
Latest checkpoint's NextOID:          18470
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        669
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  0
Time of latest checkpoint:            Wed Jun 19 09:40:54 2013
Minimum recovery ending location:     0/0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
Current wal_level setting:            hot_standby
Current max_connections setting:      300
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   64
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value


Would it be wrong to compare the "last replayed xlog location" on DB2 (the new master) with the "Latest checkpoint location" on DB1 before starting DB1 in standby mode, and to take a base backup whenever the "Latest checkpoint location" on DB1 (the old master) is greater than the "last replayed xlog location" on DB2 (the new master)?
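Something like the following is what I have in mind (a rough sketch only; it assumes DB1 is shut down, DB2 is reachable as host db2, and relies on pg_xlog_location_diff(), which is available from 9.2):

   # on DB1: read the latest checkpoint location from the control file
   DB1_LSN=$(pg_controldata /data/pgsql/9.2/data | awk -F': *' '/Latest checkpoint location/ {print $2}')

   # on DB2: a negative result means DB2 has replayed less WAL than DB1's last checkpoint,
   # so DB1 needs a base backup before it can be started as a standby
   # (pg_last_xlog_replay_location() is NULL if DB2 was never in recovery in its current run)
   psql -h db2 -Atc "SELECT pg_xlog_location_diff(pg_last_xlog_replay_location(), '$DB1_LSN')"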


regards,
Prakhar.


On Wed, Jun 19, 2013 at 1:11 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
prakhar jauhari wrote:
> I am facing a similar kind of problem, but in a two-node setup. Streaming replication is being used
> with a high wal_keep_segments, using log shipping to bump up timelines on the standby server to set up
> SR (streaming replication).
> DB1 - master
>
> DB2 - standby
>
>
> When a switchover happens, DB2 becomes the new master, and when DB1 comes up it will act as the standby
> to the new master (.history files from the new master are copied to DB1 to bump up its timeline so as to
> set up SR). DB1 is not recreated from scratch. This runs fine in a normal switchover, but there seems to
> be a problem in the following situation, leading to database corruption:
>
>
> Current state :
> DB1 - master
> DB2 - standby
>
>
> Now the failing scenario:
>
>
>
> 1. DB2 machine goes down.
>
> 2. After some time DB1 machine also goes down (DB2 is still down).
>
> 3. Now DB2 comes up (it will join the cluster as master as DB1 is still down).
>
> 4. DB2 is started as master postgresql.
>
> 5. Now DB1 comes up (it will join the cluster as standby to DB2)
>
> 6. Now when DB1 attempts to SR with DB2, DB1 gets corrupted.
>
>
> Looked into the issue and found that when DB1 went down initially, it created some WAL records which were
> not synced to DB2 as it was already down.
>
> Now when DB2 started as master it still had not replayed the last few WAL records created by DB1 (when it was
> master). DB2 starts as master properly.
>
> When DB1 came up as standby to DB2, it bumped its timeline using the history file from DB2, but when SR was
> set up with DB2, DB1 got corrupted.
>
>
> Now the question is:
>
>
> 1. Is this a theoretically valid approach?
>
> 2. If it is a valid approach, then how can I detect such a scenario (where SR will corrupt the DB), so
> that I can go for a base backup in such a situation?

If you want to use the old primary as the new standby without a new backup,
you have to ascertain that all transactions from the former have
been replayed on the latter.

To figure out where the primary currently is, you can
   SELECT pg_current_xlog_location();

To figure out how far the standby has replayed, you can
   SELECT pg_last_xlog_replay_location();

Of course this only works if both are up.

I think that it would be tricky to automate that; I'd choose
to make a new backup after each failover.

In the event of a controlled failover it might be an option.

I am surprised that the scenario you described leads to
corruption; I would have expected an error message.

Yours,
Laurenz Albe
