
Problem with synchronous replication - Mailing list pgsql-hackers

From	Dongming Liu
Subject	Problem with synchronous replication
Date	October 25, 2019 07:18:34
Msg-id	a0806273-8bbb-43b3-bbe1-c45a58f6ae21.lingce.ldm@alibaba-inc.com
Responses	Re: Problem with synchronous replication
List	pgsql-hackers

Hi,

I recently discovered two possible bugs in synchronous replication.

1. SyncRepCleanupAtProcExit may delete an element that has already been deleted

SyncRepCleanupAtProcExit first checks whether the queue element is detached and, if it is not, acquires SyncRepLock and deletes it. If the walsender has already deleted this element by the time the lock is acquired, it is deleted a second time and SHMQueueDelete crashes with a segmentation fault.

IMO, like SyncRepCancelWait, we should acquire SyncRepLock first and then check whether the element is detached.
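To illustrate the lock-then-check ordering, here is a minimal sketch with a toy doubly linked queue. It is not the attached patch and not PostgreSQL code: Node, queue_lock, and cleanup_at_exit are hypothetical names, and a pthread mutex stands in for SyncRepLock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a queue link; hypothetical, not the real SHM_QUEUE. */
typedef struct Node {
    struct Node *prev;
    struct Node *next;
} Node;

/* Plays the role of SyncRepLock in this sketch. */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

static void node_init(Node *n) { n->prev = n->next = NULL; }

/* An element counts as detached when its links are cleared. */
static bool node_is_detached(const Node *n) { return n->prev == NULL; }

static void queue_init(Node *head) { head->prev = head->next = head; }

static void queue_insert(Node *head, Node *n) {
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlinks n. It assumes n is linked; calling it on a detached
 * element would dereference NULL, which is the crash described above. */
static void queue_delete(Node *n) {
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Take the lock first, then test for detachment, so nobody else can
 * unlink the element between the check and the delete.
 * Returns true if this call performed the unlink. */
static bool cleanup_at_exit(Node *n) {
    bool removed = false;

    pthread_mutex_lock(&queue_lock);
    if (!node_is_detached(n)) {
        queue_delete(n);
        removed = true;
    }
    pthread_mutex_unlock(&queue_lock);
    return removed;
}
```

With this ordering, calling cleanup_at_exit a second time on the same element is a harmless no-op, because the detachment test and the delete happen under the same lock.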

2. SyncRepWaitForLSN may not call SyncRepCancelWait if ereport processes an interrupt

For SyncRepWaitForLSN, if a query cancel interrupt arrives, we just terminate the wait with a suitable warning, as follows:

a. set QueryCancelPending to false

b. ereport outputs a warning

c. call SyncRepCancelWait to delete the element from the queue

If another cancel interrupt arrives while we are outputting the warning at step b, errfinish will call CHECK_FOR_INTERRUPTS, which raises an ERROR such as "canceling autovacuum task", and the process jumps back to the sigsetjmp point. Unfortunately, step c is then skipped, and the element that SyncRepCancelWait should have deleted remains in the queue.

The easiest way to fix this is to swap the order of step b and step c. On the other hand, letting the sigsetjmp error-recovery path clean up the queue may also be a good choice. What do you think?
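The effect of the ordering can be sketched with plain setjmp/longjmp. This is only a toy model, not the attached patch: report_warning, cancel_wait, and run_cancel_path are hypothetical names, and the longjmp stands in for the ERROR raised from CHECK_FOR_INTERRUPTS inside errfinish.

```c
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf error_context;   /* stands in for the sigsetjmp target */
static bool interrupt_pending;  /* a second cancel arrived during the warning */
static bool in_queue;           /* is our element still in the wait queue? */

/* Models step b: after emitting the warning, a pending interrupt
 * is turned into an error that jumps out of the normal code path. */
static void report_warning(void) {
    if (interrupt_pending) {
        interrupt_pending = false;
        longjmp(error_context, 1);
    }
}

/* Models step c: remove our element from the queue. */
static void cancel_wait(void) { in_queue = false; }

/* Runs the cancel path in either order and returns true if the
 * element was left behind in the queue. */
static bool run_cancel_path(bool cleanup_first, bool second_interrupt) {
    in_queue = true;
    interrupt_pending = second_interrupt;

    if (setjmp(error_context) == 0) {
        if (cleanup_first) {
            cancel_wait();      /* step c before step b */
            report_warning();
        } else {
            report_warning();   /* original order: step b may never return */
            cancel_wait();
        }
    }
    /* Reached normally or via longjmp; no cleanup happens here. */
    return in_queue;
}
```

In this model, run_cancel_path(false, true) leaves the element queued, while run_cancel_path(true, true) does not, which is the argument for swapping the two steps. Cleaning up in the error-recovery path would correspond to calling cancel_wait after the setjmp branch instead.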

The patch is attached; any feedback is greatly appreciated.

Best regards,

Dongming Liu
