Re: [BUGS] BUG #14702: Streaming replication broken after server closed connection unexpectedly - Mailing list pgsql-bugs

From Palle Girgensohn
Subject Re: [BUGS] BUG #14702: Streaming replication broken after server closed connection unexpectedly
Date
Msg-id 3029E88A-83B1-49BC-B2BA-DB3709AA26F7@pingpong.net
In response to Re: [BUGS] BUG #14702: Streaming replication broken after server closed connection unexpectedly (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-bugs
> On 13 June 2017, at 03:57, Michael Paquier <michael.paquier@gmail.com> wrote:
>
> On Tue, Jun 13, 2017 at 6:52 AM,  <girgen@pingpong.net> wrote:
>> Setup is simple streaming replication: master -> slave. There is a
>> replication slot at the master, so xlogs should not be removed until the
>> client has received them properly.
>
> Hm. There has been the following discussion as well, which refers to a
> legit bug where WAL segments could be removed even if a slot is used:
> https://www.postgresql.org/message-id/CACJqAM3xVz0JY1XFDKPP+JoJAjoGx=GNuOAshEDWCext7BFvCQ@mail.gmail.com
> The circumstances to trigger the problem are quite particular though
> as it needs an incomplete WAL record at the end of a segment.
>
>> After this, the slave could not be started again, each time the same error
>> about "invalid memory alloc request size 1600487424".
>
> Hm. That smells of data corruption.. Be careful going forward.

I believe that corruption was in the broken WAL file, though. I saw some notes pointing in that direction on the list,
but sure, I could be mistaken.

>
>> Looking more closely, the last xlog file, let's call it 0000EB, is corrupt
>> on the slave, having a different checksum from the proper one at the master.
>
> To which checksum are you referring here? Did you do yourself a
> calculation using what is on-disk? Note that during streaming
> replication the content of an unfinished segment may be different than
> what is on the primary.

I calculated that myself using sha256 from the command line.
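
For reference, the same comparison can be sketched in Python instead of a command-line tool. This is only a minimal
sketch: the function names and the file paths in the usage note are hypothetical and not taken from the report. As
noted in the quoted reply, an unfinished segment may legitimately differ between the two servers, so a mismatch on the
last segment alone does not prove corruption.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a 16 MB segment is not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def segments_match(path_a, path_b):
    """True if both files have identical content according to SHA-256."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Usage would look like segments_match("/copy/from/master/SEGMENT", "/copy/from/slave/SEGMENT"), where both paths are
placeholders for copies of the same segment file taken from each server.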

As you say, it was probably an unfinished segment. The problem is that the slave expects the *previous* WAL file to
still be saved on the master, but it had already been removed. The slave *has* it, though, so why would it require it
to be transferred again? 0000EA was requested, although it had already been completely transferred to the slave. I had
to copy that 0000EA back to the master so it could be transferred again.

>
>> Now, I don't know exactly what happened when the slave lost track, but the
>> bug, I think, is that the streamed WAL was corrupt, and still accepted by
>> the slave *and* hence removed from the master. It required too much
>> experience to fix that. The slave should not accept a not fully transported
>> WAL file. It seems it happened during some connection failure between the
>> slave and master, but still it should preferably fail more gracefully. Are
>> the mechanisms implemented to support that, and they failed, or is it just
>> not implemented?
>
> There is a per-record CRC calculation to check the validity of each
> record, and this is processed when fetching each record at recovery as
> a sanity check. That's one way to prevent applying an incorrect
> record. In the event of such an error you would have seen "incorrect
> resource manager data checksum in record at" or similar. It seems to
> me that you should be careful with the primary as well.
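
To illustrate the idea of a per-record CRC check, here is a minimal sketch in Python. It assumes CRC-32C (the
Castagnoli polynomial, which PostgreSQL has used for WAL records since 9.5) and a hypothetical record made of a
payload plus a stored checksum. It does not reproduce PostgreSQL's actual record layout or the order in which header
and data are fed into the CRC; the function names are invented for the example.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (reflected polynomial 0x82F63B78)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial if the low bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def record_is_valid(payload: bytes, stored_crc: int) -> bool:
    """Hypothetical check: recompute the CRC and compare with the stored one."""
    return crc32c(payload) == stored_crc


# A record written intact passes; the same record with one byte changed fails.
payload = b"example record payload"
stored = crc32c(payload)
damaged = b"exbmple record payload"
intact_ok = record_is_valid(payload, stored)
damaged_ok = record_is_valid(damaged, stored)
```

A check of this kind catches a damaged record when it is read back, which is why a failure would normally surface as a
checksum error during recovery rather than as a silently applied record.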

OK. "Be careful" is somewhat vague, but I get it. Would a pg_dump + pg_restore, for example, reveal any data
corruption? Or is it just not possible to be totally sure unless checksums had been activated (they're not; this is
an old database)?



> --
> Michael

