Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAA4eK1+oj008H4p3cfdvxYADaMJGWCEZ0iWdVLs72t+T0FeQvg@mail.gmail.com Whole thread Raw |
In response to | Re: Skipping logical replication transactions on subscriber side (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Skipping logical replication transactions on subscriber side
|
List | pgsql-hackers |
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > Right. I've fixed this issue and attached an updated patch. > Few comments/questions: ===================== 1. + <para> + The <structname>pg_stat_subscription_workers</structname> view will contain + one row per subscription error reported by workers applying logical + replication changes and workers handling the initial data copy of the + subscribed tables. The statistics entry is removed when the subscription + the worker is running on is removed. + </para> The last line of this paragraph is not clear to me. First "the" before "worker" in the following part of the sentence seems unnecessary "..when the subscription the worker..". Then the part "running on is removed" is unclear because it could also mean that we remove the entry when a subscription is disabled. Can we rephrase it to: "The statistics entry is removed when the corresponding subscription is dropped"? 2. Between v20 and v23 versions of patch the size of hash table PGSTAT_SUBWORKER_HASH_SIZE is increased from 32 to 256. I might have missed the comment which lead to this change, can you point me to the same or if you changed it for some other reason, can you let me know the same? 3. + + /* + * Repeat for subscription workers. Similarly, we needn't bother + * in the common case where no function stats are being collected. + */ /function/subscription workers' 4. + <para> + Name of command being applied when the error occurred. This field + is always NULL if the error was reported during the initial data + copy. + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>xid</structfield> <type>xid</type> + </para> + <para> + Transaction ID of the publisher node being applied when the error + occurred. This field is always NULL if the error was reported + during the initial data copy. + </para></entry> Is it important to stress on 'always' in the above two descriptions? 5. The current description of first/last_error_time seems sliglthy misleading as one can interpret that these are about different errors. Let's slightly change the description of first/last_error_time as follows or something on those lines: </para> + <para> + Time at which the first error occurred + </para></entry> + </row> First time at which this error occurred <structfield>last_error_time</structfield> <type>timestamp with time zone</type> + </para> + <para> + Time at which the last error occurred Last time at which this error occurred. This will be the same as first_error_time except when the same error occurred more than once consecutively. 6. + </indexterm> + <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <optional> <parameter>relid</parameter> <type>oid</type> </optional> ) + <returnvalue>void</returnvalue> + </para> + <para> + Resets the statistics of a single subscription worker running on the + subscription with <parameter>subid</parameter> shown in the + <structname>pg_stat_subscription_worker</structname> view. If the + argument <parameter>relid</parameter> is not <literal>NULL</literal>, + resets statistics of the subscription worker handling the initial data + copy of the relation with <parameter>relid</parameter>. Otherwise, + resets the subscription worker statistics of the main apply worker. + If the argument <parameter>relid</parameter> is omitted, resets the + statistics of all subscription workers running on the subscription + with <parameter>subid</parameter>. + </para> The first line of this description seems to indicate that we can only reset the stats of a single worker but the later part indicates that we can reset stats of all subscription workers. Can we change the first line as: "Resets the statistics of subscription workers running on the subscription with <parameter>subid</parameter> shown in the <structname>pg_stat_subscription_worker</structname> view.". 7. pgstat_vacuum_stat() { .. + pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE); + spmsg.m_databaseid = MyDatabaseId; + spmsg.m_nentries = 0; .. } Do we really need to set the header here? It seems to be getting set in pgstat_send_subscription_purge() while sending this message. 8. pgstat_vacuum_stat() { .. + + if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL) + != NULL) + continue; + + /* This subscription is dead, add the subid to the message */ + spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid; .. } I think it is better to use a separate variable here for subid as we are using for funcid and tableid. That will make this part of the code easier to follow and look consistent. 9. +/* ---------- + * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to + * report the error occurred during logical replication. + * ---------- In this comment "during logical replication" sounds too generic. Can we instead use "while processing changes." or something like that to make it a bit more specific? -- With Regards, Amit Kapila.
pgsql-hackers by date: