Re: Design of pg_stat_subscription_workers vs pgstats - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Design of pg_stat_subscription_workers vs pgstats |
Date | |
Msg-id | CAD21AoCOG-BOUor3HyYk9KWxfmaosLq8ANu9Nhk0vyS7Rfd4GQ@mail.gmail.com Whole thread Raw |
In response to | Re: Design of pg_stat_subscription_workers vs pgstats ("David G. Johnston" <david.g.johnston@gmail.com>) |
List | pgsql-hackers |
On Sat, Feb 19, 2022 at 4:47 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Fri, Feb 18, 2022 at 1:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> >> >> Here is the summary of the discussion, changes, and plan. >> >> 1. Move some error information such as the error message to a new >> system catalog, pg_subscription_error. The pg_subscription_error table >> would have the following columns: >> >> * sesubid : subscription Oid. >> * serelid : relation Oid (NULL for apply worker). >> * seerrlsn : commit-LSN or the error transaction. >> * seerrcmd : command (INSERT, UPDATE, etc.) of the error transaction. >> * seerrmsg : error message > > > Not a fan of the "se" prefix but overall yes. We should include a timestamp. > >> >> The tuple is inserted or updated when an apply worker or a tablesync >> worker raises an error. If the same error occurs in a row, the update >> is skipped. > > > I disagree with this - I would treat every new instance of an error to be important and insert on conflict (sesubid, serelid)the new entry, updating lsn/cmd/msg with the new values. > >> The tuple is removed in the following cases: >> >> * the subscription is dropped. >> * the table is dropped. >> >> * the table is removed from the subscription. >> * the worker successfully committed a non-empty transaction. > > > Correct. This handles the "end of sync worker" just fine since its final action should be a successful commit of a non-emptytransaction. >> >> >> With this change, pg_stat_subscription_workers will be like: >> >> * subid >> * subname >> * subrelid >> * error_count >> * last_error_timestamp >> >> This view will be extended by adding transaction statistics proposed >> on another thread[1]. > > > I haven't reviewed that thread but in-so-far as this one goes I would just drop this altogether. The error count, if desired,can be added to pg_subscription_error, and the timestamp should be added there as noted above. > > If this is useful for the feature on the other thread it can be reconstituted from there. > >> >> 2. Fix a bug in pg_stat_subscription_workers. As pointed out by >> Andres, there is a bug in pg_stat_subscription_workers; it doesn't >> drop entries for already-dropped tables. We need to fix it. > > > Given the above, this becomes an N/A. > >> >> 3. Show commit-LSN of the error transaction in errcontext. Currently, >> we show XID and commit timestamp in the errcontext. But given that we >> use LSN in pg_subscriptoin_error, it's better to show commit-LSN as >> well (or instead of error-XID). > > > Agreed, I think: what "errcontext" is this referring to? The CONTEXT part in a log message. The apply worker and the tablesync worker write the details of the changes on an error in the CONTEXT part as follow: ERROR: duplicate key value violates unique constraint "test_pkey" DETAIL: Key (id)=(1) already exists. CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00 >> >> >> 5. Record skipped data to the system catalog, say >> pg_subscription_conflict_history so that there is a chance to analyze >> and recover. > > >> We can discuss the >> details in a new thread. > > Agreed - the "before skipping" consideration seems considerably more helpful; but wouldn't need to be persistent, it couldjust be a view. A permanent record probably would best be recorded in the logs - though if we get the pre-skip functionalitythe user can view directly and save the results wherever they wish or we can provide a function to spool theinformation to the log. I don't see persistent in-database storage being that desirable here. > > If we only do something after the transaction has been skipped it may be useful to add an option to the skipping commandto auto-disable the subscription after the transaction has been skipped and before any subsequent transactions areapplied. This will give the user a chance to process the post-skipped information before restoring sync and having thesystem begin changing underneath them again. > >> >> >> 4 and 5 might be better introduced together but I think since the user >> is able to check what changes will be skipped on the publisher side we >> can do 5 for v16. > > > How would one go about doing that (checking what changes will be skipped on the publisher side)? We can copy the replication slot while changing the plugin by using pg_copy_replication_slot(). Then we can check the changes by using pg_logical_slot_peek_changes() with the copied slot. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
pgsql-hackers by date: