回复: BUG #18213: Standby's repeatable read isolation level transaction encountered a "nonrepeatable read" problem - Mailing list pgsql-bugs

From 费长红
Subject 回复: BUG #18213: Standby's repeatable read isolation level transaction encountered a "nonrepeatable read" problem
Date
Msg-id tencent_2F3B410655A7D447BF092A470CC297D2DA06@qq.com
Whole thread Raw
In response to Re: BUG #18213: Standby's repeatable read isolation level transaction encountered a "nonrepeatable read" problem  (zhihuifan1213@163.com)
List pgsql-bugs


Indeed, I simply implemented and verified the solution. In the above test, the "create index" command on RW will hang until the transaction on standby is committed or aborted.
In addition, even if there is no select query on the standby, RW's "create index" command may wait for a period of time, which affected by the wal_receiver_status_interval parameter.
You can see attachment for the patch.


------------------ 原始邮件 ------------------
发件人: "zhihuifan1213" <zhihuifan1213@163.com>;
发送时间: 2023年11月29日(星期三) 中午1:26
收件人: "费长红"<feichanghong@qq.com>;"pgsql-bugs"<pgsql-bugs@lists.postgresql.org>;
主题: Re: BUG #18213: Standby's repeatable read isolation level transaction encountered a "nonrepeatable read" problem


Hi,

 Thanks for the report!
 
PG Bug reporting form <noreply@postgresql.org> writes:

> Standby executes select again (still within the existing repeatable read
> transaction). We expect that the results of the two select should be the
> same, but the second select did not get any data.
> ```
> postgres=*# select a, b from t where a = 5;
>   a | b
> ---+---
>   5 | 5
> (1 row)
>
> postgres=*# set enable_seqscan to off;
> SET
> postgres=*# select a, b from t where a = 5;
>   a | b
> ---+---
> (0 rows)
> ```

I can confirm this bug in the current master and continue with your test:

postgres=*# set enable_seqscan to off;
SET
postgres=*# select a, b from t where a = 5;
 a | b
---+---
(0 rows)

postgres=*# set enable_seqscan to  on;
SET
postgres=*# select a, b from t where a = 5;
 a | b
---+---
 5 | 5
(1 row)

Different plan yieds different result is bad.

> We have several ideas to fix this problem, but they all have obvious
> flaws:
> 1. In the WaitForOlderSnapshots function in the final stage of "create index
> concurrently", wait for replication_slot_xmin to exceed limitXmin. In this
> solution, when there are long transactions on Standby, "create index
> concurrently" operations may be blocked.

+1 for this one. The current code doesn't take the advice from the
comments for validate_index, where it says..

 * ....  Also, we index only tuples that are valid
 * as of the start of the scan (see table_index_build_scan), whereas a normal
 * build takes care to include recently-dead tuples.  This is OK because
 * we won't mark the index valid until all transactions that might be able
 * to see those tuples are gone.  The reason for doing that is ...

In the current code, it doesn't consider the sessiones in standby.

I guess you have worked out a fix for this, if so, could you provide
one for review / test?

--
Best Regards
Andy Fan
Attachment

pgsql-bugs by date:

Previous
From: zhihuifan1213@163.com
Date:
Subject: Re: BUG #18213: Standby's repeatable read isolation level transaction encountered a "nonrepeatable read" problem
Next
From: zhihuifan1213@163.com
Date:
Subject: Re: 回复: BUG #18213: Standby's repeatable read isolation level transaction encountered a "nonrepeatable read" problem