Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | David G. Johnston |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAKFQuwYmGrGu63NZ-xKsb5be53sTyYSdDhaqScMuS=aJs_VF1w@mail.gmail.com Whole thread Raw |
In response to | Re: Skipping logical replication transactions on subscriber side (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Skipping logical replication transactions on subscriber side
|
List | pgsql-hackers |
On Fri, Jan 21, 2022 at 10:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 21, 2022 at 10:00 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> Apart from this, I have changed a few comments and ran pgindent. Do
>> let me know what you think of the changes?
>
>
> The paragraph describing ALTER SUBSCRIPTION SKIP seems unnecessarily repetitive. Consider:
> """
> Skips applying all changes of the specified remote transaction, whose value should be obtained from pg_stat_subscription_workers.last_error_xid.
>
Here, you can also say that the value can be found from server logs as well.
subscriber's server logs, right? I would agree that adding that for completeness is warranted.
> Then change the subskipxid column description to be:
> """
> ID of the transaction whose changes are to be skipped. It is 0 when there are no pending skips. This is set by issuing ALTER SUBSCRIPTION SKIP and resets back to 0 when the identified transactions passes through the subscription stream and is successfully ignored.
> """
>
Users can manually reset it by specifying NONE, so that should be
covered in the above text, otherwise, looks good.
I agree with incorporating "reset" into the paragraph somehow - does not have to mention NONE, just that ALTER SUBSCRIPTION SKIP (not a family friendly abbreviation...) is what does it.
> I don't understand why/how ", if a valid transaction ID;" comes into play (how would we know whether it is valid, or if we do ALTER SUBSCRIPTION SKIP should prohibit the invalid value from being chosen).
>
What do you mean by invalid value here? Is it the value lesser than
FirstNormalTransactionId or a value that is of the non-error
transaction? For the former, we already have a check in the patch and
for later we can't identify it with any certainty because the error
stats are collected by the stats collector.
The original proposal qualifies the non-zero transaction id in subskipxid as being a "valid transaction ID" and that invalid ones (which is how "otherwise" is interpreted given the "valid" qualification preceding it) are shown as 0. As an end-user that makes me wonder what it means for a transaction ID to be invalid. My point is that dropping the mention of "valid transaction ID" avoids that and lets the reader operate with an understanding that things should "just work". If I see a non-zero in the column I have a pending skip and if I see zero I do not. My wording assumes it is that simple. If it isn't I would need some clarity as to why it is not in order to write something I could read and understand from my inexperienced user-centric point-of-view.
I get that I may provide a transaction ID that is invalid such that the system could never see it (or at least not for a long while) - say we error on transaction 102 and I typo it as 1002 or 101. But I would expect either an error where I make the typo or the numbers 1002 or 101 to appear on the table. I would not expect my 101 typo to result in a 0 appearing on the table (and if it does so today I argue that is a POLA violation). Thus, "if a valid transaction ID" from the original text just doesn't make sense to me.
In typical usage it would seem strange to allow a skip to be recorded if there is no existing error in the subscription. Should we (do we, haven't read the code) warn in that situation?
Or, why even force them to specify a number instead of just saying SKIP and if there is a current error we skip its transaction, otherwise we warn them that nothing happened because there is no last error.
Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped. Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks? Seems like it should reset, which really makes this more of an "active_error" instead of a "last_error". This system is linear, we are stuck until this error is resolved, making it active.
> I'm against mentioning subtransactions in the skip_option description.
>
We have mentioned that because currently, we don't support it but in
the future one can come up with an idea to support it. What problem do
you see with it?
If you ever get around to implementing the feature then by all means add it. My main issue is that we basically never talk about subtransactions in the user-facing documentation and it doesn't seem desirable to do so here. Knowing that a whole transaction is skipped is all I need to care about as a user. I believe that no users will be asking "what about subtransactions (savepoints)" but by mentioning it less experienced ones will now have something to be curious about that they really do not need to be.
> The Logical Replication page changes provide good content overall but I dislike going into detail about how to perform conflict resolution in the third paragraph and then summarize the various forms of conflict resolution in the newly added forth. Maybe re-work things like:
Personally, I don't see much value in the split (especially giving
context like "Prior to v15 ..) but specifying the circumstances where
each of the options could be useful.
Yes, I've been reminded of the desire to avoid mentioning versions and agree doing so here is correct. The added context is desired, the style depends on the content.
David J.
pgsql-hackers by date: