Re: Bugs in pgoutput.c - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Bugs in pgoutput.c
Date
Msg-id CAA4eK1KvyaDpJq7LHAOeZPE76Gf9ZDCnBzPK4_SDGiCj9AySng@mail.gmail.com
Whole thread Raw
In response to Bugs in pgoutput.c  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bugs in pgoutput.c
List pgsql-hackers
On Thu, Jan 6, 2022 at 3:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Commit 6ce16088b caused me to look at pgoutput.c's handling of
> cache invalidations, and I was pretty appalled by what I found.
>
> * rel_sync_cache_relation_cb does the wrong thing when called for
> a cache flush (i.e., relid == 0).  Instead of invalidating all
> RelationSyncCache entries as it should, it will do nothing.
>
> * When rel_sync_cache_relation_cb does invalidate an entry,
> it immediately zaps the entry->map structure, even though that
> might still be in use (as per the adjacent comment that carefully
> explains why this isn't safe).  I'm not sure if this could lead
> to a dangling-pointer core dump, but it sure seems like it could
> lead to failing to translate tuples that are about to be sent.
>
> * Similarly, rel_sync_cache_publication_cb is way too eager to
> reset the pubactions flags, which would likely lead to failing
> to transmit changes that we should transmit.
>
> The attached patch fixes these things, but I'm still pretty
> unhappy with the general design of the data structures in
> pgoutput.c, because there is this weird random mishmash of
> static variables along with a palloc'd PGOutputData struct.
> This cannot work if there are ever two active LogicalDecodingContexts
> in the same process.  I don't think serial use of LogicalDecodingContexts
> (ie, destroy one and then make another) works very well either,
> because pgoutput_shutdown is a mere fig leaf that ignores all the
> junk the module previously made (in CacheMemoryContext no less).
> So I wonder whether either of those scenarios is possible/supported/
> expected to be needed in future.
>
> Also ... maybe I'm not looking in the right place, but I do not
> see anything anywhere in logical decoding that is taking any lock
> on the relation being processed.  How can that be safe?
>

We don't need to acquire a lock on relation while decoding changes
from WAL because it uses a historic snapshot to build a relcache entry
and all the later changes to the rel are absorbed while decoding WAL.

It is important to not acquire a lock on user-defined relations during
decoding otherwise it could lead to deadlock as explained in the email
[1].

* Would it be better if we move all the initialization done by patch
in get_rel_sync_entry() to a separate function as I expect future
patches might need to reset more things?

*
  * logical decoding callback calls - but invalidation events can come in
- * *during* a callback if we access the relcache in the callback. Because
- * of that we must mark the cache entry as invalid but not remove it from
- * the hash while it could still be referenced, then prune it at a later
- * safe point.
- *
- * Getting invalidations for relations that aren't in the table is
- * entirely normal, since there's no way to unregister for an invalidation
- * event. So we don't care if it's found or not.
+ * *during* a callback if we do any syscache or table access in the
+ * callback.

As we don't take locks on tables, can invalidation events be accepted
during table access? I could be missing something but I see relation.c
accepts invalidation messages only when lock mode is not 'NoLock'.

[1] - https://www.postgresql.org/message-id/CAA4eK1Ks%2Bp8wDbzhDr7yMYEWDbWFRJAd_uOY-moikc%2Bzr9ER%2Bg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes
Next
From: Amit Kapila
Date:
Subject: Re: row filtering for logical replication