Re: Design of pg_stat_subscription_workers vs pgstats - Mailing list pgsql-hackers

From	David G. Johnston
Subject	Re: Design of pg_stat_subscription_workers vs pgstats
Date	January 28, 2022 01:35:57
Msg-id	CAKFQuwYS_EUe+sR6MS3aiR9UXtUJfDcmHoDjrXAeDnY5w_9bnw@mail.gmail.com
In response to	Re: Design of pg_stat_subscription_workers vs pgstats (Andres Freund <andres@anarazel.de>)
List	pgsql-hackers

On Thu, Jan 27, 2022 at 2:15 PM Andres Freund <andres@anarazel.de> wrote:

Another related thing is that using a 32-bit xid for allowing skipping is a bad
idea anyway - we shouldn't be adding new interfaces with xid wraparound dangers -
it's getting more and more common to have multiple wraparounds a day. An
easily better alternative would be the LSN at which a transaction starts.
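
To illustrate the wraparound concern: a 32-bit xid is a counter modulo 2^32, so two transactions exactly 2^32 apart carry the same xid. The following is a minimal standalone sketch, not PostgreSQL code; `xid_of` and `xids_collide` are hypothetical helpers, and the 64-bit counter is only an assumed stand-in for "how many transactions have been assigned so far".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: truncate an ever-increasing 64-bit transaction
 * counter to the 32-bit xid a user would see in an error message.
 */
static uint32_t
xid_of(uint64_t full_counter)
{
    return (uint32_t) full_counter;
}

/*
 * True if two distinct transactions cannot be told apart by their
 * 32-bit xid alone, i.e. they differ by a multiple of 2^32.
 */
static bool
xids_collide(uint64_t a, uint64_t b)
{
    return a != b && xid_of(a) == xid_of(b);
}
```

A 64-bit value such as an LSN avoids this ambiguity in practice because it does not wrap on any realistic timescale.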

Interesting idea. I do not think a well-designed skipping feature need worry about wrap-around though. The XID to be skipped was just seen by a worker, and because it failed, it will continue to be the same XID encountered by that worker until it is resolved. There is no effective progression in time, while the subscriber is stuck, for wrap-around to happen. Since we want to skip the transaction as a whole, adding a layer of hidden indirection to the process seems undesirable. I'm not against the idea though - to the user it is basically "copy this value from the error message in order to skip the transaction that caused the error". Then the system verifies the value and ensures it skips one, and only one, transaction.
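
A rough sketch of the "verify the value, then skip one and only one transaction" behavior described above, written as standalone C rather than PostgreSQL code. The `SkipState` struct and both functions are hypothetical names invented for illustration; the same shape would apply if the identifier were an LSN instead of an XID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-subscription skip state; not a PostgreSQL structure. */
typedef struct SkipState
{
    bool     armed;      /* has a verified skip request been recorded? */
    uint32_t skip_xid;   /* value the user copied from the error message */
} SkipState;

/*
 * Verify the user-supplied value against the XID that actually failed.
 * The request is only recorded if the two match.
 */
static bool
request_skip(SkipState *st, uint32_t user_xid, uint32_t failed_xid)
{
    if (user_xid != failed_xid)
        return false;
    st->armed = true;
    st->skip_xid = user_xid;
    return true;
}

/*
 * Called when the worker begins applying a remote transaction.
 * Returns true for the first matching transaction only, then disarms,
 * so exactly one transaction is skipped per request.
 */
static bool
should_skip(SkipState *st, uint32_t incoming_xid)
{
    if (st->armed && st->skip_xid == incoming_xid)
    {
        st->armed = false;
        return true;
    }
    return false;
}
```

Disarming on the first match is what keeps a later transaction that happens to reuse the same 32-bit value from being skipped by a stale request.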

It's pretty easy from the POV of getting into a new transaction.

PG_CATCH():

/* get us out of the failed transaction */
AbortOutOfAnyTransaction();

StartTransactionCommand();
/* do something to remember the error we just got */
CommitTransactionCommand();

Thank you.

It may be a bit harder afterwards to not just error out the whole
worker, because we'd need to know what to do instead.

I imagine the launcher and worker startup code can be made to deal with the restart adequately. Just wait if the last thing seen was an error. Require the user to manually resume the worker - unless we really think a try-until-you-succeed with a backoff protocol is superior. Upon system restart all error information is cleared and we start from scratch and let the errors happen (or not, depending) as they will.
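
The two restart policies mentioned above could be sketched roughly as follows. This is standalone C under stated assumptions, not the launcher's actual logic: `WorkerStatus` and every function here are hypothetical, and the 1 second base delay and 60 second cap are arbitrary illustrative values.

```c
#include <stdbool.h>

/* Hypothetical in-memory status the launcher keeps per worker. */
typedef struct WorkerStatus
{
    bool last_was_error;    /* last thing seen from the worker was an error */
    bool resume_requested;  /* user has manually asked to resume */
} WorkerStatus;

/* The worker hit an error: remember it and require a fresh resume. */
static void
record_error(WorkerStatus *ws)
{
    ws->last_was_error = true;
    ws->resume_requested = false;
}

/* The user manually resumes the worker. */
static void
manual_resume(WorkerStatus *ws)
{
    ws->resume_requested = true;
}

/* System restart: all error information is cleared. */
static void
system_restart(WorkerStatus *ws)
{
    ws->last_was_error = false;
    ws->resume_requested = false;
}

/* Policy 1: wait after an error until the user resumes the worker. */
static bool
may_start_worker(const WorkerStatus *ws)
{
    return !ws->last_was_error || ws->resume_requested;
}

/*
 * Policy 2 (alternative): retry automatically with exponential backoff.
 * The delay doubles with each consecutive failure and is capped.
 */
static unsigned
backoff_delay_ms(unsigned consecutive_failures)
{
    const unsigned cap_ms = 60000;
    unsigned delay_ms = 1000;

    while (consecutive_failures > 1 && delay_ms < cap_ms)
    {
        delay_ms *= 2;
        consecutive_failures--;
    }
    return delay_ms > cap_ms ? cap_ms : delay_ms;
}
```

Because the status lives only in memory in this sketch, a restart naturally starts from scratch and lets the error recur if its cause is still present.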

David J.
