Re: BUGFIX: standby disconnect can corrupt serialized reorder buffers - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: BUGFIX: standby disconnect can corrupt serialized reorder buffers
Date
Msg-id CAD21AoB6yNcz5ThU8qVHx38OJ8Vx2J7DaiSjZksEO-_TPVU3Uw@mail.gmail.com
Whole thread Raw
In response to Re: BUGFIX: standby disconnect can corrupt serialized reorder buffers  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: BUGFIX: standby disconnect can corrupt serialized reorder buffers  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
On Tue, Dec 26, 2017 at 12:49 AM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> Hi,
>
> thanks for writing the patch.
>
> On 05/12/17 06:58, Craig Ringer wrote:
>> Hi all
>>
>> [...]
>>> The cause appears to be that walsender.c's ProcessRepliesIfAny writes a
>> LOG for unexpected EOF then calls proc_exit(0). But  serialized txn
>> cleanup is done by
>> ReorderBufferRestoreCleanup, as called by ReorderBufferCleanupTXN, which
>> is invoked from the PG_CATCH() in ReorderBufferCommit() and on various
>> normal exits. It won't get called if we proc_exit() without an ERROR, so
>> we leave stale data lying around.

Thank you for the details. I could reproduce this bug. This bug also
happens if pq_flush_if_writable called by WalSndWriteData could not
write data (for example, the case where replicated data violate unique
constraint on the subscriber).

>>
>> It's not a problem on crash restart because StartupReorderBuffer already
>> does the required delete.
>>
>> ReorderBufferSerializeTXN, which spills the txns to disk, doesn't appear
>> to have any guard to ensure that the segment files don't already exist
>> when it goes to serialize a snapshot. Adding one there would probably be
>> expensive; we don't know the last lsn of the txn yet, so to be really
>> safe we'd have to do a directory listing and scan for any txn-$OURXID-*
>> entries.
>>
>> So to fix, I suggest that we should do a
>> slot-specific StartupReorderBuffer-style deletion when we start a new
>> decoding session on a slot, per attached patch.
>>
>> It might be nice to also add a hook on proc exit, so we don't have stale
>> buffers lying around until next decoding session, but I didn't want to
>> add new global state to reorderbuffer.c so I've left that for now.
>
>
> Hmm, can't we simply call the new cleanup function in
> ReplicationSlotRelease()? That's called during process exit and we know
> there about slot so no extra global variables are needed.
>

I guess that ReplicationSlotRelease() currently might not get called
if walsender exits by proc_exit(). ReplicationSlotRelease() can is
called by some functions such as WalSndErrorCleanup(), but at least in
the case where wal sender exits due to failed to write data to socket,
ReplicationSlotRelease() didn't get called as far as I tested.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


pgsql-hackers by date:

Previous
From: "Michelle Konzack"
Date:
Subject: [table partitioning] How many partitions are possibel?
Next
From: Masahiko Sawada
Date:
Subject: Re: [HACKERS] Replication status in logical replication