Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAHut+PtsDYPbg7qM1nGWtJcSQBQ5JH=LmgyqwqBPL9k+z8f5Ew@mail.gmail.com
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
Here are some review comments for v67-0002.

======
src/backend/replication/logical/slotsync.c

1.
+/* The sleep time (ms) between slot-sync cycles varies dynamically
+ * (within a MIN/MAX range) according to slot activity. See
+ * wait_for_slot_activity() for details.
+ */
+#define MIN_WORKER_NAPTIME_MS  200
+#define MAX_WORKER_NAPTIME_MS  30000 /* 30s */
+
+static long sleep_ms = MIN_WORKER_NAPTIME_MS;

In my previous review for this, I meant for there to be no whitespace
between the #defines and the static long sleep_ms so the prior comment
then clearly belongs to all 3 lines

~~~

2. synchronize_one_slot

+ /*
+ * Sanity check: Make sure that concerned WAL is received and flushed
+ * before syncing slot to target lsn received from the primary server.
+ *
+ * This check should never pass as on the primary server, we have waited
+ * for the standby's confirmation before updating the logical slot.
+ */
+ latestFlushPtr = GetStandbyFlushRecPtr(NULL);
+ if (remote_slot->confirmed_lsn > latestFlushPtr)
+ {
+ ereport(LOG,
+ errmsg("skipping slot synchronization as the received slot sync"
+    " LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X",
+    LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
+    remote_slot->name,
+    LSN_FORMAT_ARGS(latestFlushPtr)));
+
+ return false;
+ }

Previously in v65 this was an elog, but now it is an ereport. But
since this is a sanity check condition that "should never pass" wasn't
the elog the more appropriate choice?

~~~

3. synchronize_one_slot

+ /*
+ * We don't want any complicated code while holding a spinlock, so do
+ * namestrcpy() and get_database_oid() outside.
+ */
+ namestrcpy(&plugin_name, remote_slot->plugin);
+ dbid = get_database_oid(remote_slot->database, false);

IMO just simplify the whole comment, here and for the other similar
comment in local_slot_update().

SUGGESTION
/* Avoid expensive operations while holding a spinlock. */

~~~

4. synchronize_slots

+ /* Construct the remote_slot tuple and synchronize each slot locally */
+ tupslot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, tupslot))
+ {
+ bool isnull;
+ RemoteSlot *remote_slot = palloc0(sizeof(RemoteSlot));
+ Datum d;
+
+ remote_slot->name = TextDatumGetCString(slot_getattr(tupslot, 1, &isnull));
+ Assert(!isnull);
+
+ remote_slot->plugin = TextDatumGetCString(slot_getattr(tupslot, 2, &isnull));
+ Assert(!isnull);
+
+ /*
+ * It is possible to get null values for LSN and Xmin if slot is
+ * invalidated on the primary server, so handle accordingly.
+ */
+ d = slot_getattr(tupslot, 3, &isnull);
+ remote_slot->confirmed_lsn = isnull ? InvalidXLogRecPtr :
+ DatumGetLSN(d);
+
+ d = slot_getattr(tupslot, 4, &isnull);
+ remote_slot->restart_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d);
+
+ d = slot_getattr(tupslot, 5, &isnull);
+ remote_slot->catalog_xmin = isnull ? InvalidTransactionId :
+ DatumGetTransactionId(d);
+
+ remote_slot->two_phase = DatumGetBool(slot_getattr(tupslot, 6, &isnull));
+ Assert(!isnull);
+
+ remote_slot->database = TextDatumGetCString(slot_getattr(tupslot,
+ 7, &isnull));
+ Assert(!isnull);
+
+ d = slot_getattr(tupslot, 8, &isnull);
+ remote_slot->invalidated = isnull ? RS_INVAL_NONE :
+ get_slot_invalidation_cause(TextDatumGetCString(d));

Would it be better to get rid of the hardwired column numbers and then
be able to use the SLOTSYNC_COLUMN_COUNT already defined as a sanity
check?

SUGGESTION
int col = 0;
...
remote_slot->name = TextDatumGetCString(slot_getattr(tupslot, ++col, &isnull));
...
remote_slot->plugin = TextDatumGetCString(slot_getattr(tupslot, ++col,
&isnull));
...
d = slot_getattr(tupslot, ++col, &isnull);
remote_slot->confirmed_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d);
...
d = slot_getattr(tupslot, ++col, &isnull);
remote_slot->restart_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d);
...
d = slot_getattr(tupslot, ++col, &isnull);
remote_slot->catalog_xmin = isnull ? InvalidTransactionId :
DatumGetTransactionId(d);
...
remote_slot->two_phase = DatumGetBool(slot_getattr(tupslot, ++col, &isnull));
...
remote_slot->database = TextDatumGetCString(slot_getattr(tupslot,
++col, &isnull));
...
d = slot_getattr(tupslot, ++col, &isnull);
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
get_slot_invalidation_cause(TextDatumGetCString(d));

/* Sanity check */
Asert(col == SLOTSYNC_COLUMN_COUNT);

~~~

5.
+static char *
+validate_parameters_and_get_dbname(void)
+{
+ char    *dbname;

These are configuration issues, so probably all these ereports could
also set errcode(ERRCODE_INVALID_PARAMETER_VALUE).

======
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Schema variables - new implementation for Postgres 15
Next
From: Masahiko Sawada
Date:
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations