Re: long-standing data loss bug in initial sync of logical replication - Mailing list pgsql-hackers

From Andres Freund
Subject Re: long-standing data loss bug in initial sync of logical replication
Date
Msg-id 20231118210519.yborkczu2aln6zdd@awork3.anarazel.de
Whole thread Raw
In response to Re: long-standing data loss bug in initial sync of logical replication  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: long-standing data loss bug in initial sync of logical replication
List pgsql-hackers
Hi,

On 2023-11-18 21:45:35 +0100, Tomas Vondra wrote:
> On 11/18/23 19:12, Andres Freund wrote:
> >> If we increase the locks from ShareUpdateExclusive to ShareRowExclusive,
> >> we're making it conflict with RowExclusive. Which is just DML, and I
> >> think we need to do that.
> > 
> > From what I can tell it needs to to be an AccessExlusiveLock. Completely
> > independent of logical decoding. The way the cache stays coherent is catalog
> > modifications conflicting with anything that builds cache entries. We have a
> > few cases where we do use lower level locks, but for those we have explicit
> > analysis for why that's ok (see e.g. reloptions.c) or we block until nobody
> > could have an old view of the catalog (various CONCURRENTLY) operations.
> > 
> 
> Yeah, I got too focused on the issue I triggered, which seems to be
> fixed by using SRE (still don't understand why ...). But you're probably
> right there may be other cases where SRE would not be sufficient, I
> certainly can't prove it'd be safe.

I think it makes sense here: SRE prevents the problematic "scheduling" in your
test - with SRE no DML started before ALTER PUB ... ADD can commit after.

I'm not sure there are any cases where using SRE instead of AE would cause
problems for logical decoding, but it seems very hard to prove. I'd be very
surprised if just using SRE would not lead to corrupted cache contents in some
situations. The cases where a lower lock level is ok are ones where we just
don't care that the cache is coherent in that moment.

In a way, the logical decoding cache-invalidation situation is a lot more
atomic than the "normal" situation. During normal operation locking is
strictly required to prevent incoherent states when building a cache entry
after a transaction committed, but before the sinval entries have been
queued. But in the logical decoding case that window doesn't exist.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: long-standing data loss bug in initial sync of logical replication
Next
From: Andres Freund
Date:
Subject: Re: Use of backup_label not noted in log