Re: Optionally automatically disable logical replication subscriptions on error - Mailing list pgsql-hackers
From | Mark Dilger |
---|---|
Subject | Re: Optionally automatically disable logical replication subscriptions on error |
Date | |
Msg-id | DE7E13B7-DC76-416A-A98F-3BC3F80E6BE9@enterprisedb.com Whole thread Raw |
In response to | Re: Optionally automatically disable logical replication subscriptions on error (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Optionally automatically disable logical replication subscriptions on error
Re: Optionally automatically disable logical replication subscriptions on error Re: Optionally automatically disable logical replication subscriptions on error |
List | pgsql-hackers |
> On Jun 19, 2021, at 3:17 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: > > Right, but there are things that could be common from the design > perspective. I went to reconcile my patch with that from [1] only to discover there is no patch on that thread. Is there one in progressthat I can see? I don't mind trying to reconcile this patch with what you're discussing in [1], but I am a bit skeptical about [1] becominga reality and I don't want to entirely hitch this patch to that effort. This can be committed with or without anysolution to the idea in [1]. The original motivation for this patch was that the TAP tests don't have a great way todeal with a subscription getting into a fail-retry infinite loop, which makes it harder for me to make progress on [2]. That doesn't absolve me of the responsibility of making this patch a good one, but it does motivate me to keep it simple. > For example, why is it mandatory to update this conflict > ( error) information in the system catalog instead of displaying it > via some stats view? The catalog must be updated to disable the subscription, so placing the error information in the same row doesn't requireany independent touching of the catalogs. Likewise, the catalog must be updated to re-enable the subscription, soclearing the error from that same row doesn't require any independent touching of the catalogs. The error information does not *need* to be included in the catalog, but placing the information in any location that won'tsurvive server restart leaves the user no information about why the subscription got disabled after a restart (or crash+ restart) happens. Furthermore, since v2 removed the "disabled_by_error" field in favor of just using subenabled + suberrmsg to determine ifthe subscription was automatically disabled, not having the information in the catalog would make it ambiguous whetherthe subscription was manually or automatically disabled. > Also, why not also log the xid of the failed > transaction? We could also do that. Reading [1], it seems you are overly focused on user-facing xids. The errdetail in the examplesI've been using for testing, and the one mentioned in [1], contain information about the conflicting data. I thinkusers are more likely to understand that a particular primary key value cannot be replicated because it is not uniquethan to understand that a particular xid cannot be replicated. (Likewise for permissions errors.) For example: 2021-06-18 16:25:20.139 PDT [56926] ERROR: duplicate key value violates unique constraint "s1_tbl_unique" 2021-06-18 16:25:20.139 PDT [56926] DETAIL: Key (i)=(1) already exists. 2021-06-18 16:25:20.139 PDT [56926] CONTEXT: COPY tbl, line 2 This tells the user what they need to clean up before they can continue. Telling them which xid tried to apply the change,but not the change itself or the conflict itself, seems rather unhelpful. So at best, the xid is an additional pieceof information. I'd rather have both the ERROR and DETAIL fields above and not the xid than have the xid and lack oneof those two fields. Even so, I have not yet included the DETAIL field because I didn't want to bloat the catalog. For the problem in [1], having the xid is more important than it is in my patch, because the user is expected in [1] to usethe xid as a handle. But that seems like an odd interface to me. Imagine that a transaction on the publisher side inserteda batch of data, and only a subset of that data conflicts on the subscriber side. What advantage is there in skippingthe entire transaction? Wouldn't the user rather skip just the problematic rows? I understand that on the subscriberside it is difficult to do so, but if you are going to implement this sort of thing, it makes more sense to allowthe user to filter out data that is problematic rather than filtering out xids that are problematic, and the filtershouldn't just be an in-or-out filter, but rather a mapping function that can redirect the data someplace else or rewriteit before inserting or change the pre-existing conflicting data prior to applying the problematic data or whatever. That's a huge effort, of course, but if the idea in [1] goes in that direction, I don't want my patch to have alreadyadded an xid field which ultimately nobody wants. [1] - https://www.postgresql.org/message-id/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK%3D30xJfUVihNZDA%40mail.gmail.com [2] - https://www.postgresql.org/message-id/flat/915B995D-1D79-4E0A-BD8D-3B267925FCE9%40enterprisedb.com#dbbce39c9e460183b67ee44b647b1209 — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: