Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | CAHut+PtsDYPbg7qM1nGWtJcSQBQ5JH=LmgyqwqBPL9k+z8f5Ew@mail.gmail.com Whole thread Raw |
In response to | Re: Synchronizing slots from primary to standby (shveta malik <shveta.malik@gmail.com>) |
Responses |
Re: Synchronizing slots from primary to standby
|
List | pgsql-hackers |
Here are some review comments for v67-0002. ====== src/backend/replication/logical/slotsync.c 1. +/* The sleep time (ms) between slot-sync cycles varies dynamically + * (within a MIN/MAX range) according to slot activity. See + * wait_for_slot_activity() for details. + */ +#define MIN_WORKER_NAPTIME_MS 200 +#define MAX_WORKER_NAPTIME_MS 30000 /* 30s */ + +static long sleep_ms = MIN_WORKER_NAPTIME_MS; In my previous review for this, I meant for there to be no whitespace between the #defines and the static long sleep_ms so the prior comment then clearly belongs to all 3 lines ~~~ 2. synchronize_one_slot + /* + * Sanity check: Make sure that concerned WAL is received and flushed + * before syncing slot to target lsn received from the primary server. + * + * This check should never pass as on the primary server, we have waited + * for the standby's confirmation before updating the logical slot. + */ + latestFlushPtr = GetStandbyFlushRecPtr(NULL); + if (remote_slot->confirmed_lsn > latestFlushPtr) + { + ereport(LOG, + errmsg("skipping slot synchronization as the received slot sync" + " LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X", + LSN_FORMAT_ARGS(remote_slot->confirmed_lsn), + remote_slot->name, + LSN_FORMAT_ARGS(latestFlushPtr))); + + return false; + } Previously in v65 this was an elog, but now it is an ereport. But since this is a sanity check condition that "should never pass" wasn't the elog the more appropriate choice? ~~~ 3. synchronize_one_slot + /* + * We don't want any complicated code while holding a spinlock, so do + * namestrcpy() and get_database_oid() outside. + */ + namestrcpy(&plugin_name, remote_slot->plugin); + dbid = get_database_oid(remote_slot->database, false); IMO just simplify the whole comment, here and for the other similar comment in local_slot_update(). SUGGESTION /* Avoid expensive operations while holding a spinlock. */ ~~~ 4. synchronize_slots + /* Construct the remote_slot tuple and synchronize each slot locally */ + tupslot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple); + while (tuplestore_gettupleslot(res->tuplestore, true, false, tupslot)) + { + bool isnull; + RemoteSlot *remote_slot = palloc0(sizeof(RemoteSlot)); + Datum d; + + remote_slot->name = TextDatumGetCString(slot_getattr(tupslot, 1, &isnull)); + Assert(!isnull); + + remote_slot->plugin = TextDatumGetCString(slot_getattr(tupslot, 2, &isnull)); + Assert(!isnull); + + /* + * It is possible to get null values for LSN and Xmin if slot is + * invalidated on the primary server, so handle accordingly. + */ + d = slot_getattr(tupslot, 3, &isnull); + remote_slot->confirmed_lsn = isnull ? InvalidXLogRecPtr : + DatumGetLSN(d); + + d = slot_getattr(tupslot, 4, &isnull); + remote_slot->restart_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d); + + d = slot_getattr(tupslot, 5, &isnull); + remote_slot->catalog_xmin = isnull ? InvalidTransactionId : + DatumGetTransactionId(d); + + remote_slot->two_phase = DatumGetBool(slot_getattr(tupslot, 6, &isnull)); + Assert(!isnull); + + remote_slot->database = TextDatumGetCString(slot_getattr(tupslot, + 7, &isnull)); + Assert(!isnull); + + d = slot_getattr(tupslot, 8, &isnull); + remote_slot->invalidated = isnull ? RS_INVAL_NONE : + get_slot_invalidation_cause(TextDatumGetCString(d)); Would it be better to get rid of the hardwired column numbers and then be able to use the SLOTSYNC_COLUMN_COUNT already defined as a sanity check? SUGGESTION int col = 0; ... remote_slot->name = TextDatumGetCString(slot_getattr(tupslot, ++col, &isnull)); ... remote_slot->plugin = TextDatumGetCString(slot_getattr(tupslot, ++col, &isnull)); ... d = slot_getattr(tupslot, ++col, &isnull); remote_slot->confirmed_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d); ... d = slot_getattr(tupslot, ++col, &isnull); remote_slot->restart_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d); ... d = slot_getattr(tupslot, ++col, &isnull); remote_slot->catalog_xmin = isnull ? InvalidTransactionId : DatumGetTransactionId(d); ... remote_slot->two_phase = DatumGetBool(slot_getattr(tupslot, ++col, &isnull)); ... remote_slot->database = TextDatumGetCString(slot_getattr(tupslot, ++col, &isnull)); ... d = slot_getattr(tupslot, ++col, &isnull); remote_slot->invalidated = isnull ? RS_INVAL_NONE : get_slot_invalidation_cause(TextDatumGetCString(d)); /* Sanity check */ Asert(col == SLOTSYNC_COLUMN_COUNT); ~~~ 5. +static char * +validate_parameters_and_get_dbname(void) +{ + char *dbname; These are configuration issues, so probably all these ereports could also set errcode(ERRCODE_INVALID_PARAMETER_VALUE). ====== Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: