Re: Standby corruption after master is restarted - Mailing list pgsql-bugs

From Emre Hasegeli
Subject Re: Standby corruption after master is restarted
Msg-id CAE2gYzwDDnYLMW999NGyTjB9Vr84eARJ5-uWdctKq5zPtGeruQ@mail.gmail.com
In response to Re: Standby corruption after master is restarted  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Standby corruption after master is restarted  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-bugs
> Can you check if the "incorrect" part of the WAL segment matches some
> previous segment? Verifying that shouldn't be very difficult (just cut a
> bunch of bytes using hexdump, compare to the incorrect data). Assuming
> you still have the WAL archive, of course. That would tell us that the
> corrupted part comes from an old recycled segment.

I had found and saved the recycled WAL file from the archive after the
incident.  Here is the hexdump of it at the same position:

0bddfc0 3253 4830 616f 5034 5243 4d79 664f 6164
0bddfd0 3967 592d 7963 7967 5541 4a59 3066 4f50
0bddfe0 2d55 346e 4254 3559 6a4e 726b 4e30 6f52
0bddff0 3876 4751 4a38 5956 5f32 7234 4b55 7045
0bde000 d087 0005 0005 0000 e000 66bd 1dfb 0000
0bde010 1931 0000 0000 0000 5a43 7746 7166 6e34
0bde020 304e 764e 9c32 0158 5400 e709 0900 6f66
0bde030 0765 7375 6111 646e 6f72 6469 370d 312e

If you compare it with the other two I have posted, you will notice
that the corrupted file on the standby is a combination of the two.
It starts with the data from the master and continues with the data
from the recycled file.  The switch is at position 0bddff8, which is
the position printed as "Minimum recovery ending location" by
pg_controldata.
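
For anyone who wants to repeat the comparison without reading hexdumps by
eye, here is a minimal sketch of the same check in Python.  It is not what I
ran during the incident; the function names and the file paths in the usage
note are hypothetical.  It reads the same byte window from the three
segment files and reports the offset at which the standby copy stops
matching the master copy and matches the recycled file instead.

```python
from typing import Optional


def read_window(path: str, offset: int, length: int) -> bytes:
    """Read `length` bytes starting at `offset` from a WAL segment file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def find_switch_offset(standby: bytes, master: bytes, recycled: bytes,
                       base: int = 0) -> Optional[int]:
    """Return the absolute offset where `standby` stops matching `master`
    and matches `recycled` for the rest of the window.

    `base` is the file offset of the first byte of the three windows.
    Returns None if the standby window matches the master completely, or
    if the remainder does not match the recycled data.
    """
    n = min(len(standby), len(master), len(recycled))

    # Longest common prefix of the standby and master windows.
    i = 0
    while i < n and standby[i] == master[i]:
        i += 1

    if i == n:
        # No divergence from the master inside this window.
        return None

    # If master and recycled happen to share bytes right after the real
    # switch, this is the latest offset that is consistent with the data.
    if standby[i:n] == recycled[i:n]:
        return base + i
    return None
```

Usage would look something like this, with the three paths being
placeholders for the master, standby and archived copies of the segment:

```python
base = 0x0bddfc0
length = 0x80
off = find_switch_offset(
    read_window("standby/SEGMENT", base, length),
    read_window("master/SEGMENT", base, length),
    read_window("archive/RECYCLED_SEGMENT", base, length),
    base=base,
)
print(hex(off) if off is not None else "no clean switch in this window")
```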

> Hmmm, I see you're using SSL. I don't think that could break affect
> anything, but maybe I should try mimicking this aspect too.

This is the connection information.  Note that the master shows SSL
compression as disabled, despite it being explicitly requested.

> primary_conninfo = 'host=MASTER_NODE port=5432 dbname=repmgr user=repmgr connect_timeout=10 sslcompression=1'

