RE: Re:Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication. - Mailing list pgsql-bugs

From Hayato Kuroda (Fujitsu)
Subject RE: Re:Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
Date
Msg-id TY3PR01MB98897597238BA6283A2A5B64F56C2@TY3PR01MB9889.jpnprd01.prod.outlook.com
Whole thread Raw
In response to BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
Dear Song,

>
I see that snapshot changes is added to the tail of txn->changes list in
SnapBuildDistributeNewCatalogSnapshot. When the DML transaction is
committed, the corresponding entry in RelationSyncCache has already been
invalidated. However, snapshot hasn't been changed as this time (The newly added
snapshot change is still in the list and hasn't been invoked.), so as you say: "
pgoutput_change continues to see 'relentry->pubactions.pubinsert' as false,
even after re fetching the relation entry after the invalidation."

Therefore, there is one solution, as Andres Freund[1] states, that if a transaction is
found to have modified catalog during logical decoding, invalidations are forced to
be assigned to all concurrent in-progress transactions, just as my patch did.

[1] https://www.postgresql.org/message-id/20231118025445.crhaeeuvoe2g5dv6%40awork3.anarazel.de
>

Thanks for making a patch, but your patch cannot be built:

```
reorderbuffer.c: In function ‘ReorderBufferDistributeInvalidation’:
../../../../src/include/lib/ilist.h:594:2: error: expected identifier before ‘(’ token
  (AssertVariableIsOfTypeMacro(ptr, dlist_node *),      \
  ^
reorderbuffer.c:2648:8: note: in expansion of macro ‘dlist_container’
   txn->dlist_container(ReorderBufferTXN, node, txn_i.cur);
        ^~~~~~~~~~~~~~~
```

Based on your arguments, I revised your patch. Is it same as your expectations?
This also fixes some coding conventions.

>
Although my approach seems to be effective,
>

I ran my reproducer and confirmed that the issue was fixed.

>
it may bring additional overhead to logical decoding
under special circumstances. I think upgrading lock mentioned in [1] is also an effective solution,
and it seems to have less impact. But I don't quite understand why it may cause deadlocks mentioned
in [1].
>

IIUC, no one pointed out a concrete case for deadlocks, but anyone could not say
that it is safe, neither. We must understand the thread more and must decide how
we fix: modify the lock level or send invalidation messages to all the reorderbuffer
transactions. How do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




Attachment

pgsql-bugs by date:

Previous
From: Devrim Gündüz
Date:
Subject: Re: BUG #18293: Rocky 8 postgres install failing as nothing provides libarmadillo.so.10()(64bit)
Next
From: Devrim Gündüz
Date:
Subject: Re: BUG #18289: postgresql14-devel-14.10-2PGDG.rhel8.x86_64.rpm Contains invalid cLang option in Makefile.global