Hi,
I noticed that in the CreateDecodingContext() function, we update both
slot->data.two_phase and slot->data.two_phase_at without acquiring the
slot's spinlock:

    /* Mark slot to allow two_phase decoding if not already marked */
    if (ctx->twophase && !slot->data.two_phase)
    {
        slot->data.two_phase = true;
        slot->data.two_phase_at = start_lsn;
        ReplicationSlotMarkDirty();
        ReplicationSlotSave();
        SnapBuildSetTwoPhaseAt(ctx->snapshot_builder, start_lsn);
    }

I think we should acquire the spinlock when updating fields of the
replication slot, even when the update is made by the slot's owner.
Otherwise, concurrent readers could see inconsistent values. The other
place where we update two_phase_at does acquire the spinlock:

    SpinLockAcquire(&slot->mutex);
    slot->data.confirmed_flush = ctx->reader->EndRecPtr;
    if (slot->data.two_phase)
        slot->data.two_phase_at = ctx->reader->EndRecPtr;
    SpinLockRelease(&slot->mutex);

This seems to me to be an oversight in commit a8fd13cab0b. I've attached a
small patch to fix it.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com