Thread: Re: BF member drongo doesn't like

Re: BF member drongo doesn't like

Bertrand Drouvot

On Fri, Jan 24, 2025 at 02:44:21PM -0500, Andres Freund wrote:
> Hm, maybe I'm missing something, but isn't it possible for the active slot to
> actually progress decoding past the conflict point? It's an active slot, with
> the consumer running in the background, so all that needs to happen for that
> is that logical decoding progresses past the conflict point. That requires
> there be some reference to a newer xid to be in the WAL, but there's nothing
> preventing that afaict?
> In fact, I now saw this comment:
> # Note that pg_current_snapshot() is used to get the horizon.  It does
> # not generate a Transaction/COMMIT WAL record, decreasing the risk of
> # seeing a xl_running_xacts that would advance an active replication slot's
> # catalog_xmin.  Advancing the active replication slot's catalog_xmin
> # would break some tests that expect the active slot to conflict with
> # the catalog xmin horizon.

Yeah, that comes from 46d8587b504 (where we tried to reduce as much as possible
the risk of an unwanted xl_running_xacts record being generated).

> Which seems precisely what's happening here?

Most probably, yes.
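
To spell out the suspected race, here is a toy model in Python (not PostgreSQL
code). The Slot class, the helper names and the xid values are all made up for
illustration, xid wraparound is ignored, and the "catalog_xmin <= horizon"
rule is an assumption about how the conflict check behaves:

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Hypothetical stand-in for an active logical slot on the standby."""
    catalog_xmin: int
    invalidated: bool = False


def decode_running_xacts(slot: Slot, oldest_running_xid: int) -> None:
    # Decoding a running-xacts record may move catalog_xmin forward
    # (never backward in this toy model).
    slot.catalog_xmin = max(slot.catalog_xmin, oldest_running_xid)


def replay_conflict(slot: Slot, conflict_horizon: int) -> bool:
    # Assumed rule: the slot conflicts only if it still needs catalog rows
    # at or below the horizon that VACUUM removed.
    if slot.catalog_xmin <= conflict_horizon:
        slot.invalidated = True
    return slot.invalidated


# Ordering the test expects: the conflict is replayed first.
expected = Slot(catalog_xmin=740)
expected_conflict = replay_conflict(expected, conflict_horizon=745)

# Ordering suspected on drongo: decoding progresses first.
raced = Slot(catalog_xmin=740)
decode_running_xacts(raced, oldest_running_xid=750)
raced_conflict = replay_conflict(raced, conflict_horizon=745)

print(expected_conflict, raced_conflict)
```

In the second ordering the slot has already moved past the horizon, so the
conflict the test waits for never shows up.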

> If that's the issue, I think we need to find a way to block logical decoding
> from making forward progress during the test.
> The easiest way would be to stop pg_recvlogical and emit a bunch of changes,
> so that the backend is stalled sending out data. But that'd require a hard to
> predict amount of data to be emitted, which isn't great.

What about using an injection point instead to block pg_recvlogical until
we want it to resume?
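
The idea can be pictured as a gate that parks the decoding consumer until the
test explicitly releases it. Below is a toy Python sketch with threads; the
Gate class and the consumer function are hypothetical stand-ins, not the
injection point API, and where exactly such a gate would sit is left open:

```python
import threading


class Gate:
    """Hypothetical stand-in for a wait point that a test can release."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def wait(self) -> None:
        # Blocks the caller until wakeup() is called.
        self._event.wait()

    def wakeup(self) -> None:
        self._event.set()


def consumer(gate: Gate, progress: list) -> None:
    gate.wait()  # parked here while the conflict is generated
    progress.append("decoded past conflict point")


gate = Gate()
progress = []
worker = threading.Thread(target=consumer, args=(gate, progress))
worker.start()

# The test would generate and replay the conflict here, while the
# consumer cannot make any forward progress.
blocked_during_conflict = (progress == [])

gate.wakeup()  # let the consumer resume once the conflict has been seen
worker.join()
print(blocked_during_conflict, progress)
```

The point of the design is that progress is gated by an explicit signal
instead of by a hard-to-predict amount of emitted data.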

> But perhaps we could do something smarter, by starting a session on the
> primary that acquires an access exclusive lock on a relation that logical
> decoding will need to access?  The tricky bit likely would be that it'd
> somehow need to *not* prevent VACUUM on the primary.

Hm, I'm not sure how we could do that.

> If we could trigger VACUUM in a transaction on the primary this would be
> easy, but we can't.

Another idea that I had ([1]) was to make use of injection points
around places where RUNNING_XACTS is emitted. IIRC I tried to work on this, but
it was not as simple as it sounds, as we need the startup process not to be blocked.



Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:

Re: BF member drongo doesn't like

Bertrand Drouvot

On Mon, Jan 27, 2025 at 07:13:01AM +0000, Bertrand Drouvot wrote:
> On Fri, Jan 24, 2025 at 02:44:21PM -0500, Andres Freund wrote:
> > If we could trigger VACUUM in a transaction on the primary this would be
> > easy, but we can't.
> Another idea that I had ([1]) was to make use of injection points
> around places where RUNNING_XACTS is emitted. IIRC I tried to work on this, but
> it was not as simple as it sounds, as we need the startup process not to be blocked.

I just proposed a patch that uses an injection point to prevent the catalog_xmin
of a logical slot from advancing past the conflict point ([1]). That does not fix
the issue on v16, though.


