Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAA4eK1JKf2Jejg1abuuU-Gx8Dkw0tey4Jiz5ACyQJ4QzKDvHSg@mail.gmail.com
Whole thread Raw
In response to Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side
List pgsql-hackers
On Tue, Sep 21, 2021 at 10:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached the updated version patches. Please review them.
>

Review comments for v14-0001-Add-pg_stat_subscription_errors-statistics-view
==============================================================
1.
<entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+        Name of command being applied when the error occurred.  This
+        field is always NULL if the error is reported by the
+        <literal>tablesync</literal> worker.
+      </para></entry>
+     </row>
..
..
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>xid</structfield> <type>xid</type>
+      </para>
+      <para>
+        Transaction ID of the publisher node being applied when the error
+        occurred.  This field is always NULL if the error is reported
+        by the <literal>tablesync</literal> worker.
+      </para></entry>

Shouldn't we display command and transaction id even for table sync
worker if it occurs during sync phase (syncing with apply worker
position)

2.
+ /*
+ * The number of not-ready relations can be high for example right
+ * after creating a subscription, so we load the list of
+ * SubscriptionRelState into the hash table for faster lookups.
+ */

I am not sure this optimization of converting to not-ready relations
list to hash table is worth it. Are we expecting thousands of
relations per subscription? I think that will be a rare case even if
it is there.

3.
+static void
+pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
+{
+ if (subscriptionHash == NULL)
+ return;
+
+ for (int i = 0; i < msg->m_nentries; i++)
+ {
+ PgStat_StatSubEntry *subent;
+
+ subent = pgstat_get_subscription_entry(msg->m_subids[i], false);
+
+ /*
+ * Nothing to do if the subscription entry is not found.  This could
+ * happen when the subscription is dropped and the message for
+ * dropping subscription entry arrived before the message for
+ * reporting the error.
+ */
+ if (subent == NULL)

Is the above comment true even during the purge? I can think of this
during normal processing but not during the purge.

4.
+typedef struct PgStat_MsgSubscriptionErr
+{
+ PgStat_MsgHdr m_hdr;
+
+ /*
+ * m_subid and m_subrelid are used to determine the subscription and the
+ * reporter of this error.  m_subrelid is InvalidOid if reported by the
+ * apply worker, otherwise by the table sync worker.  In table sync worker
+ * case, m_subrelid must be the same as m_relid.
+ */
+ Oid m_subid;
+ Oid m_subrelid;
+
+ /* Error information */
+ Oid m_relid;

Is m_subrelid is used only to distinguish the type of worker? I think
it could be InvalidOid during the syncing phase in the table sync
worker.

5.
+/*
+ * Subscription error statistics kept in the stats collector, representing
+ * an error that occurred during application of logical replication or

The part of the message " ... application of logical replication ..."
sounds a little unclear. Shall we instead write: " ... application of
logical message ..."?

6.
+typedef struct PgStat_StatSubEntry
+{
+ Oid subid; /* hash table key */
+
+ /*
+ * Statistics of errors that occurred during logical replication.  While
+ * having the hash table for table sync errors we have a separate
+ * statistics value for apply error (apply_error), because we can avoid
+ * building a nested hash table for table sync errors in the case where
+ * there is no table sync error, which is the common case in practice.
+ *

The above comment is not clear to me. Why do you need to have a
separate hash table for table sync errors? And what makes it avoid
building nested hash table?

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Skipping logical replication transactions on subscriber side
Next
From: Ranier Vilela
Date:
Subject: Re: Empty string in lexeme for tsvector