Re: pg_receivewal starting position - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: pg_receivewal starting position
Date
Msg-id CALj2ACW6YA0SfqwPm1-FRRdeNj1e9isiSax3jZ8MmyQYYQQWNw@mail.gmail.com
Whole thread Raw
In response to Re: pg_receivewal starting position  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Responses Re: pg_receivewal starting position  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Re: pg_receivewal starting position  (Ronan Dunklau <ronan.dunklau@aiven.io>)
List pgsql-hackers
On Thu, Aug 26, 2021 at 5:45 PM Ronan Dunklau <ronan.dunklau@aiven.io> wrote:
> order to fail early if the replication slot doesn't exist, so please find
> attached v2 for that.

Thanks for the patches. Here are some comments:

1) While the intent of these patches looks good, I have following
concern with new replication command READ_REPLICATION_SLOT: what if
the pg_receivewal exits (because user issued a SIGINT or for some
reason) after flushing  the received WAL to disk, before it sends
sendFeedback to postgres server's walsender so that it doesn't get a
chance to update the restart_lsn in the replication slot via
PhysicalConfirmReceivedLocation. If the pg_receivewal is started
again, isn't it going to get the previous restart_lsn and receive the
last chunk of flushed WAL again?

2) What is the significance of READ_REPLICATION_SLOT for logical
replication slots? I read above that somebody suggested to restrict
the walsender to handle READ_REPLICATION_SLOT for physical replication
slots so that the callers will see a command failure. But I tend to
think that it is clean to have this command common for both physical
and logical replication slots and the callers can have an Assert(type
== 'physical').

3) Isn't it useful to send active, active_pid info of the replication
slot via READ_REPLICATION_SLOT? pg_receivewal can use Assert(active ==
true && active_pid == getpid()) as an assertion to ensure that it is
the sole owner of the replication slot? Also, is it good send
wal_status info

4) I think below messages should start with lower case letter and also
there are some typos:
+ pg_log_warning("Could not fetch the replication_slot \"%s\" information "
+ pg_log_warning("Server does not suport fetching the slot's position, "
something like:
+ pg_log_warning("could not fetch replication slot \"%s\" information, "
+    "resuming from current server position instead", replication_slot);
+ pg_log_warning("server does not support fetching replication slot
information, "
+    "resuming from current server position instead");

5) How about emitting the above messages in case of "verbose"?

6) How about an assertion like below?
+ if (stream.startpos == InvalidXLogRecPtr)
+ {
+ stream.startpos = serverpos;
+ stream.timeline = servertli;
+ }
+
+Assert(stream.startpos != InvalidXLogRecPtr)>>

7) How about we let pg_receivewal use READ_REPLICATION_SLOT as an option?

8) Just an idea, how about we store pg_receivewal's lastFlushPosition
in a file before pg_receivewal exits and compare it with the
restart_lsn that it received from the replication slot, if
lastFlushPosition == received_restart_lsn well and good, if not, then
something would have happened and we always start at the
lastFlushPosition ?

Regards,
Bharath Rupireddy.



pgsql-hackers by date:

Previous
From: Zhihong Yu
Date:
Subject: Re: Gather performance analysis
Next
From: Bharath Rupireddy
Date:
Subject: pg_receivewal: remove extra conn = NULL; in StreamLog