Re: Design of pg_stat_subscription_workers vs pgstats - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Design of pg_stat_subscription_workers vs pgstats |
Date | |
Msg-id | CAD21AoCM0RZuLk+dOvTOGWJwxtgzSmTOKkvmGGWC6gytWSERFg@mail.gmail.com Whole thread Raw |
In response to | Design of pg_stat_subscription_workers vs pgstats (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Design of pg_stat_subscription_workers vs pgstats
|
List | pgsql-hackers |
Hi, On Tue, Jan 25, 2022 at 3:31 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > I was looking the shared memory stats patch again. The rebase of which > collided fairly heavily with the addition of pg_stat_subscription_workers. > > I'm concerned about the design of pg_stat_subscription_workers. The view was > introduced in > > > commit 8d74fc96db5fd547e077bf9bf4c3b67f821d71cd > Author: Amit Kapila <akapila@postgresql.org> > Date: 2021-11-30 08:54:30 +0530 > > Add a view to show the stats of subscription workers. > > This commit adds a new system view pg_stat_subscription_workers, that > shows information about any errors which occur during the application of > logical replication changes as well as during performing initial table > synchronization. The subscription statistics entries are removed when the > corresponding subscription is removed. > > It also adds an SQL function pg_stat_reset_subscription_worker() to reset > single subscription errors. > > The contents of this view can be used by an upcoming patch that skips the > particular transaction that conflicts with the existing data on the > subscriber. > > This view can be extended in the future to track other xact related > statistics like the number of xacts committed/aborted for subscription > workers. > > Author: Masahiko Sawada > Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip Kumar, Takamichi Osumi, Amit Kapila > Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com > > > I tried to skim-read the discussion leading to its introduction, but it's > extraordinarily long: 474 messages in [1], 131 messages in [2], as well as a > few other associated threads. > > > From the commit message alone I am concerned that this appears to be intended > to be used to store important state in pgstats. For which pgstats is > fundamentally unsuitable (pgstat can loose state during normal operation, > always looses state during crash restarts, the state can be reset). The information on pg_stat_subscription_workers view, especially last_error_xid, can be used to specify XID to "ALTER SUBSCRIPTION ... SKIP (xid = XXX)" command which is proposed on the same thread, but it doesn't mean that the new SKIP command relies on this information. The failure XID is written in the server logs as well and the user specifies XID manually. > > I don't really understand the name "pg_stat_subscription_workers" - what > workers are stats kept about exactly? The columns don't obviously refer to a > single worker or such? From the contents it should be name > pg_stat_subscription_table_stats or such. But no, that'd not quite right, > because apply errors are stored per-susbcription, while initial sync stuff is > per-(subscription, table). This stores stats for subscription workers namely apply and tablesync worker, so named as pg_stat_subscription_workers. Also, there is another proposal to add transaction statistics for logical replication subscribers[1], and it's reasonable to merge these statistics and this error information rather than having separate views[2]. There also was an idea to add the transaction statistics to pg_stat_subscription view, but it doesn't seem a good idea because the pg_stat_subscription shows dynamic statistics whereas the transaction statistics are accumulative statistics[3]. > > The pgstat entries are quite wide (292 bytes), because of the error message > stored. That's nearly twice the size of PgStat_StatTabEntry. And as far as I > can tell, once there was an error, we'll just keep the stats entry around > until the subscription is dropped. We can drop the particular statistics by pg_stat_reset_subscription_worker() function. > And that includes stats for long dropped > tables, as far as I can see - except that they're hidden from view, due to a > join to pg_subscription_rel. We are planning to drop this after successfully apply[4]. > To me this looks like it's using pgstat as an extremely poor IPC mechanism. > > > Why isn't this just storing data in pg_subscription_rel? These need to be updated on error which means for a failed xact and we don't want to update the system catalog in that state. There will be some challenges in a case where updating pg_subscription_rel also failed too (what to report to the user, etc.). And moreover, we don't want to consume space for temporary information in the system catalog. Regards, [1] https://www.postgresql.org/message-id/OSBPR01MB48887CA8F40C8D984A6DC00CED199%40OSBPR01MB4888.jpnprd01.prod.outlook.com [2] https://www.postgresql.org/message-id/CAD21AoDF7LmSALzMfmPshRw_xFcRz3WvB-me8T2gO6Ht%3D3zL2w%40mail.gmail.com [3] https://www.postgresql.org/message-id/CAA4eK1JqwpsvjhLxV8CMYQ3NrhimZ8AFhWHh0Qn1FrL%3DLXfY6Q%40mail.gmail.com [4] https://www.postgresql.org/message-id/CAA4eK1%2B9yXkWkJSNtWYV2rG7QNAnoAt%2BeNH0PexoSP9ZQmXKPg%40mail.gmail.com -- Masahiko Sawada EDB: https://www.enterprisedb.com/
pgsql-hackers by date: