Indeed, I simply implemented and verified the solution. In the above test, the "create index" command on RW will hang until the transaction on standby is committed or aborted.
In addition, even if there is no select query on the standby, RW's "create index" command may wait for a period of time, which affected by the wal_receiver_status_interval parameter.
You can see attachment for the patch.
------------------ 原始邮件 ------------------
发件人: "zhihuifan1213" <zhihuifan1213@163.com>;
发送时间: 2023年11月29日(星期三) 中午1:26
收件人: "费长红"<feichanghong@qq.com>;"pgsql-bugs"<pgsql-bugs@lists.postgresql.org>;
主题: Re: BUG #18213: Standby's repeatable read isolation level transaction encountered a "nonrepeatable read" problem
Hi,
Thanks for the report!
PG Bug reporting form <noreply@postgresql.org> writes:
> Standby executes select again (still within the existing repeatable read
> transaction). We expect that the results of the two select should be the
> same, but the second select did not get any data.
> ```
> postgres=*# select a, b from t where a = 5;
> a | b
> ---+---
> 5 | 5
> (1 row)
>
> postgres=*# set enable_seqscan to off;
> SET
> postgres=*# select a, b from t where a = 5;
> a | b
> ---+---
> (0 rows)
> ```
I can confirm this bug in the current master and continue with your test:
postgres=*# set enable_seqscan to off;
SET
postgres=*# select a, b from t where a = 5;
a | b
---+---
(0 rows)
postgres=*# set enable_seqscan to on;
SET
postgres=*# select a, b from t where a = 5;
a | b
---+---
5 | 5
(1 row)
Different plan yieds different result is bad.
> We have several ideas to fix this problem, but they all have obvious
> flaws:
> 1. In the WaitForOlderSnapshots function in the final stage of "create index
> concurrently", wait for replication_slot_xmin to exceed limitXmin. In this
> solution, when there are long transactions on Standby, "create index
> concurrently" operations may be blocked.
+1 for this one. The current code doesn't take the advice from the
comments for validate_index, where it says..
* .... Also, we index only tuples that are valid
* as of the start of the scan (see table_index_build_scan), whereas a normal
* build takes care to include recently-dead tuples. This is OK because
* we won't mark the index valid until all transactions that might be able
* to see those tuples are gone. The reason for doing that is ...
In the current code, it doesn't consider the sessiones in standby.
I guess you have worked out a fix for this, if so, could you provide
one for review / test?
--
Best Regards
Andy Fan