Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAD21AoDoQ6pUdXN=wx2UoB5_uWR=24w0q+YwYDr4LEcEjeqxKA@mail.gmail.com |
In response to | Re: Skipping logical replication transactions on subscriber side (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Skipping logical replication transactions on subscriber side
Re: Skipping logical replication transactions on subscriber side |
List | pgsql-hackers |
On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jul 12, 2021 at 8:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Jul 12, 2021 at 11:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
> > > > >
> > > > > On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >>
> > > > >> >
> > > > >> > Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
> > > > >> >
> > > > >>
> > > > >> We can't proceed unless the first error is resolved, so there
> > > > >> shouldn't be multiple unresolved errors.
> > > > >
> > > > >
> > > > > Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
> > > > >
> > > > >
> > > >
> > > > Yeah, that is possible but that covers under the second condition
> > > > mentioned by me and in such cases I think we should have separate rows
> > > > for each tablesync. Is that right, Sawada-san or do you have something
> > > > else in mind?
> > >
> > > Yeah, I agree to have separate rows for each table sync. The table
> > > should not be processed by both the table sync worker and the apply
> > > worker at a time so the pair of subscription OID and relation OID will
> > > be unique. I think that we have a boolean column in the view,
> > > indicating whether the error entry is reported by the table sync
> > > worker or the apply worker, or maybe we also can have the action
> > > column show "TABLE SYNC" if the error is reported by the table sync
> > > worker.
> > >
> >
> > Or similar to backend_type (text) in pg_stat_activity, we can have
> > something like error_source (text) which will display apply worker or
> > tablesync worker? I think if we have this column then even if there is
> > a chance that both apply and sync worker operates on the same
> > relation, we can identify it via this column.
>
> Sounds good. I'll incorporate this in the next version patch that I'm
> planning to submit this week.

Sorry, I could not make it this week. I'll submit them early next week.

While updating the patch, I thought we need more design discussion on
two points about clearing error details after the error is resolved:

1. How to clear apply worker errors. IIUC we've discussed that once the
apply worker has skipped the transaction, we leave the error entry
itself but clear its fields, except for some fields such as failure
counts. But given that stats messages could be lost, how can we ensure
that those error details are cleared? For table sync workers' errors,
we can have autovacuum workers periodically check the entries of
pg_subscription_rel and clear the error entry if the table sync worker
has completed table sync (e.g., by checking if srsubstate = 'r'). But
there is no such information for the apply workers and subscriptions.
In addition to sending the message that clears the error details just
after skipping the transaction, I thought that we could have apply
workers periodically send that message, but that does not seem like a
good approach.

2. Do we really want to leave the table sync worker's error entry even
after the error is resolved and the table sync completes? Unlike the
apply worker error, the number of table sync worker errors could be
very large, for example, if a subscriber subscribes to many tables. If
we leave those errors in the stats view, they use more memory and could
slow down writing and reading the stats file. If such leftover table
sync error entries are not helpful in practice, I think we can remove
them rather than clearing some of their fields.

What do you think?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
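
For illustration only, the periodic cleanup idea from point 1, combined with the removal option from point 2, could be modelled roughly as below. This is a Python sketch of the logic, not PostgreSQL code and not taken from the patch. `ErrorEntry`, `prune_tablesync_errors`, and the plain dict standing in for a scan of pg_subscription_rel are all hypothetical names; only `error_source`, the (subscription OID, relation OID) key, and the srsubstate value 'r' come from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class ErrorEntry:
    """Hypothetical model of one row in the proposed error stats view."""
    subid: int          # subscription OID
    relid: int          # relation OID
    error_source: str   # "apply worker" or "tablesync worker"
    failure_count: int
    last_error: str


# srsubstate value meaning the relation's table sync has completed.
SUBREL_STATE_READY = "r"


def prune_tablesync_errors(entries, subrel_states):
    """Return a copy of entries without resolved tablesync errors.

    entries: dict mapping (subid, relid) -> ErrorEntry.
    subrel_states: dict mapping (subid, relid) -> srsubstate character,
        an assumed stand-in for reading pg_subscription_rel.
    """
    kept = {}
    for key, entry in entries.items():
        resolved = (
            entry.error_source == "tablesync worker"
            and subrel_states.get(key) == SUBREL_STATE_READY
        )
        # Apply worker entries are always kept: there is no catalog
        # state that says their error has been resolved.
        if not resolved:
            kept[key] = entry
    return kept
```

In this model, apply worker entries are never pruned, which mirrors the open question in point 1: nothing comparable to srsubstate tells the cleanup that an apply worker error is resolved.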