Re: pg_receivewal starting position - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: pg_receivewal starting position
Date
Msg-id CALj2ACWW2E4drNd5FzVBBoR-_5jUWUvBeoy7LUOkny6sxaRs0A@mail.gmail.com
Whole thread Raw
In response to Re: pg_receivewal starting position  (Michael Paquier <michael@paquier.xyz>)
Responses Re: pg_receivewal starting position  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Mon, Oct 25, 2021 at 12:21 PM Michael Paquier <michael@paquier.xyz> wrote:
> Attached is the updated patch I am finishing with, which is rather
> clean now.  I have tweaked a couple of things while on it, and
> documented better the new GetSlotInformation() in streamutil.c.

Thanks for the v11 patch, here are some comments:

1) Remove the extra whitespace in between "t" and ":"
+ pg_log_error("could not read replication slot : %s",

2) I think we should tweak the below error message
+ pg_log_error("could not read replication slot \"%s\": %s",
                                slot_name, PQerrorMessage(conn));
to
        pg_log_error("could not read replication slot \"%s\": %s",
                     "IDENTIFY_SYSTEM", PQerrorMessage(conn));
Having slot name in the error message helps to isolate the error
message from tons of server logs that gets generated.

3) Also for the same reasons stated as above, change the below error message
pg_log_error("could not read replication slot: got %d rows and %d
fields, expected %d rows and %d or more fields",
to
pg_log_error("could not read replication slot  \"%s\": got %d rows and
%d fields, expected %d rows and %d or more fields", slot_name,....

4) Also for the same reasons, change below
+ pg_log_error("could not parse slot's restart_lsn \"%s\"",
to
pg_log_error("could not parse replicaton slot  \"%s\" restart_lsn \"%s\"",
                         slot_name, PQgetvalue(res, 0, 1));

5) I think we should also have assertion for the timeline id:
    Assert(stream.startpos != InvalidXLogRecPtr);
    Assert(stream.timeline!= 0);

6) Why do we need these two assignements?
+ if (*restart_lsn)
+ *restart_lsn = lsn_loc;
+ if (restart_tli != NULL)
+ *restart_tli = tli_loc;

+ /* Assign results if requested */
+ if (restart_lsn)
+ *restart_lsn = lsn_loc;
+ if (restart_tli)
+ *restart_tli = tli_loc;

I think we can just get rid of lsn_loc and tli_loc, initlaize
*restart_lsn = InvalidXLogRecPtr and *restart_tli = 0 at the start of
the function and directly assign the requrested values to *restart_lsn
and *restart_tli, also see comment (8).

7) Let's be consistent, change the following
+
+ if (*restart_lsn)
+ *restart_lsn = lsn_loc;
+ if (restart_tli != NULL)
+ *restart_tli = tli_loc;
to
+
+ if (restart_lsn)
+ *restart_lsn = lsn_loc;
+ if (restart_tli != NULL)
+ *restart_tli = tli_loc;

8) Let's extract the values asked by the caller, change:
+ /* restart LSN */
+ if (!PQgetisnull(res, 0, 1))

+ /* current TLI */
+ if (!PQgetisnull(res, 0, 2))
+ tli_loc = (TimeLineID) atol(PQgetvalue(res, 0, 2));

to

+ /* restart LSN */
+ if (restart_lsn && !PQgetisnull(res, 0, 1))

+ /* current TLI */
+ if (restart_tli && !PQgetisnull(res, 0, 2))

9) 80char limit crossed:
+GetSlotInformation(PGconn *conn, const char *slot_name, XLogRecPtr
*restart_lsn, TimeLineID *restart_tli)

10) Missing word "command", and use "issued to the server", so change the below:
+      <command>READ_REPLICATION_SLOT</command> is issued to retrieve the
to
+      <command>READ_REPLICATION_SLOT</command> command is issued to
the server to retrieve the

11) Will replication_slot ever be NULL? If it ever be null, then we
don't reach this far right? We see the pg_log_error("%s needs a slot
to be specified using --slot". Please revmove below if condition:
+ * server may not support this option.
+ */
+ if (replication_slot != NULL)

We can just add Assert(slot_name); in GetSlotInformation().

12) How about following:
"If a starting point cannot be calculated with the previous method,
<command>READ_REPLICATION_SLOT</command> command with the provided
slot is issued to the server for retreving the slot's restart_lsn and
timelineid"
instead of
+      and if a replication slot is used, an extra
+      <command>READ_REPLICATION_SLOT</command> is issued to retrieve the
+      slot's <literal>restart_lsn</literal> to use as starting point.

The IDENTIFY_SYSTEM descritption starts with "If a starting point
cannot be calculated....":
    <listitem>
     <para>
      If a starting point cannot be calculated with the previous method,
      the latest WAL flush location is used as reported by the server from
      a <literal>IDENTIFY_SYSTEM</literal> command.

Regards,
Bharath Rupireddy.



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg_receivewal starting position
Next
From: Amul Sul
Date:
Subject: Re: TAP test for recovery_end_command