Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Skipping logical replication transactions on subscriber side
Msg-id CAKFQuwYJ7dsW+Stsw5+ZVoY3nwQ9j6pPt-7oYjGddH-h7uVb+g@mail.gmail.com
In response to Re: Skipping logical replication transactions on subscriber side  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side
List pgsql-hackers
On Mon, Jan 24, 2022 at 12:59 AM David G. Johnston <david.g.johnston@gmail.com> wrote:

> 5(out). wait for the user to manually restart the replication stream

Do you mean that there always is user intervention after error so the
replication stream can resume?
 
That is my working assumption.  It doesn't seem like the system would auto-resume without a DBA doing something (I'll attribute a server crash to the DBA for convenience).

Apparently I need to read more about how the system works today to understand how this varies from and integrates with today's user experience.


I've done some code reading.  My understanding is that the background worker performing the main apply for a given subscription is created by the launcher code (not reviewed), which is initialized at server startup (or as needed sometime thereafter).  The worker enters a for(;;) loop in LogicalRepApplyLoop, under a PG_TRY in ApplyWorkerMain.  When applying a message provokes an error, the PG_CATCH() in ApplyWorkerMain takes over and the worker dies.  While in that PG_CATCH() we have an aborted transaction and so are limited in what we can change.  We PG_RE_THROW() back to the background worker infrastructure and let it perform logging and cleanup, which includes destroying this instance of the background worker.  The destroyed worker is replaced, and its replacement is identical to the original so far as the statistics collector is concerned.
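
To make that control flow concrete, here is a minimal standalone sketch of it. It is not PostgreSQL code: it imitates PG_TRY / PG_CATCH / PG_RE_THROW with plain setjmp/longjmp, and every name in it (error_stack, throw_error, apply_loop, worker_main, run_worker_once) is hypothetical. The stream is an array of made-up XIDs terminated by 0, and bad_xid marks the transaction whose apply fails.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an exception stack; not a PostgreSQL symbol. */
static jmp_buf *error_stack = NULL;

/* Jump to the innermost active handler (plays the role of raising an error). */
static void throw_error(void)
{
    assert(error_stack != NULL);
    longjmp(*error_stack, 1);
}

/* The for(;;) apply loop: an XID of 0 ends the stream, bad_xid fails. */
static void apply_loop(const int *xids, int bad_xid, int *applied)
{
    for (;;)
    {
        int xid = *xids++;

        if (xid == 0)
            return;
        if (xid == bad_xid)
            throw_error();
        (*applied)++;
    }
}

/* Worker entry point: catch the error, then re-throw it to the caller. */
static void worker_main(const int *xids, int bad_xid, int *applied)
{
    jmp_buf local;
    jmp_buf *saved = error_stack;

    if (setjmp(local) == 0)
    {
        error_stack = &local;               /* "PG_TRY" */
        apply_loop(xids, bad_xid, applied);
        error_stack = saved;
    }
    else
    {
        error_stack = saved;                /* "PG_CATCH" */
        throw_error();                      /* "PG_RE_THROW": the worker dies */
    }
}

/* Worker infrastructure: true on a clean exit, false if the worker died. */
static bool run_worker_once(const int *xids, int bad_xid, int *applied)
{
    jmp_buf outer;
    jmp_buf *saved = error_stack;

    if (setjmp(outer) == 0)
    {
        error_stack = &outer;
        worker_main(xids, bad_xid, applied);
        error_stack = saved;
        return true;
    }
    error_stack = saved;
    return false;
}
```

Calling run_worker_once() twice on the same stream with the same bad_xid fails at the same transaction both times, which is the restart-and-fail-again behaviour in question.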

I haven't traced out when the replacement apply worker gets created.  Creating it immediately, only for it to encounter the same error, seems like an undesirable choice, so I've assumed that is not what happens.  But I also wasn't expecting the apply worker to PG_RE_THROW(); I expected it to continue running in a different for(;;) loop, waiting for some signal from the system that something has changed that may avoid the error that put it in timeout.

So my more detailed goal would be to get rid of PG_RE_THROW() (I assume doing so would entail transaction rollback) and stay in the worker.  Update pg_subscription with the error information (having removed PG_RE_THROW() we have new things to consider re: pg_stat_subscription_workers).  Go into a for(;;) loop, maybe polling pg_subscription for an indication that it is OK to retry applying the last transaction (can an inter-process signal be sent from a normal backend process to a background worker process?).  The SKIP command then matches XID values on pg_subscription; the resumption sees the subskipxid, updates pg_subscription to remove the error info and subskipxid, skips the next transaction assuming it has the matching XID, and then continues applying as normal.  Adapt to deal with crash conditions as needed, though clearing before reapplying seems like a safe default.  Again, upon worker startup maybe they should be cleared too (making pg_dump and other backup considerations moot, as noted in my P.S. in the previous email).

I'm not sure we are paranoid enough regarding the locking of pg_subscription for purposes of reading and writing subskipxid.  I'd probably rather serialize access to it, and maybe even not allow changing from one non-zero XID to another non-zero XID.  It shouldn't be needed in practice (more so if the XID has to be the one that is present from current_error_xid) and the user can always reset first.

In worker.c I was and still am confused as to the meaning of 'c' and 'w' in LogicalRepApplyLoop.  In apply_dispatch in that file, enums are used to compare against the message byte; it would be helpful for the inexperienced reader if 'c' and 'w' were done as enums as well.
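
For illustration, the kind of enum meant here might look like the sketch below. It rests on an assumption: that the bytes tested in that loop are the streaming replication protocol's 'w' (XLogData) and 'k' (primary keepalive) message types, with c being the local variable that holds the byte just read. The enum and function names are hypothetical and are not taken from worker.c.

```c
#include <assert.h>
#include <string.h>     /* only needed by callers that compare the result */

/* Hypothetical names for the stream message type bytes. */
typedef enum ReplStreamMsgType
{
    REPL_STREAM_MSG_WAL_DATA = 'w',     /* XLogData: carries a logical change */
    REPL_STREAM_MSG_KEEPALIVE = 'k'     /* primary keepalive */
} ReplStreamMsgType;

/* Map the message byte to a readable label instead of a bare literal. */
static const char *describe_stream_message(int c)
{
    switch (c)
    {
        case REPL_STREAM_MSG_WAL_DATA:
            return "wal data";
        case REPL_STREAM_MSG_KEEPALIVE:
            return "keepalive";
        default:
            return "unknown";
    }
}
```

With named constants, a test such as c == REPL_STREAM_MSG_WAL_DATA documents itself in the same way the apply_dispatch message enum does.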

David J.
