Re: [Proposal] Add foreign-server health checks infrastructure - Mailing list pgsql-hackers

From Peter Smith
Subject Re: [Proposal] Add foreign-server health checks infrastructure
Date
Msg-id CAHut+Puqtm1tYJNtggNKANDy7foJD2zuJog4Vb6u9YBE3vLgKA@mail.gmail.com
Whole thread Raw
In response to RE: [Proposal] Add foreign-server health checks infrastructure  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses RE: [Proposal] Add foreign-server health checks infrastructure
List pgsql-hackers
Here is a code review only for patch v31-0001.

======
General Comment

1.
PQcanConnCheck seemed like a strange API name. Maybe it can have the
same prefix as the other?

e.g.

- PQconnCheck()
- PGconnCheckSupported()

or

- PQconnCheck()
- PGconnCheckable()

======
Commit Message

2.
PqconnCheck() function allows to check the status of socket by polling
the socket. This function is currently available only on systems that
support the non-standard POLLRDHUP extension to the poll system call,
including Linux.

PqcanConnCheck() checks whether above function is available or not.

~

2a.
"status of socket" --> "status of the connection"

~

2b.
"above function" --> "the above function"

======
doc/src/sgml/libpq.sgml

3. PQconnCheck

Returns the health of the socket.

int PQconnCheck(PGconn *conn);

Unlike PQstatus, this function checks socket health. This check is
performed by polling the socket. This function is currently available
only on systems that support the non-standard POLLRDHUP extension to
the poll system call, including Linux. PQconnCheck returns greater
than zero if the remote peer seems to be closed, returns 0 if the
socket is valid, and returns -1 if the connection has been already
invalid or an error is error occurred.

~

3a.
Should these descriptions be referring to the health of the
*connection* rather than the health of the socket?

~

3b.
"has been already invalid" ?? wording

~~~

4. PQcanConnCheck

Returns whether PQconnCheck is available on this platform.
PQcanConnCheck returns 1 if the function is supported, otherwise
returns 0.

~

I thought this should be worded using "true" and "false" same as other
boolean functions on this page.

SUGGESTION
Returns true (1) or false (0) to indicate if the PQconnCheck function
is supported on this platform.

======
src/interfaces/libpq/fe-misc.c

5.
-static int pqSocketCheck(PGconn *conn, int forRead, int forWrite,
-   time_t end_time);
+static int pqSocketIsReadableOrWritableOrValid(PGconn *conn, int forRead,
+ int forWrite, time_t end_time);

I was not 100% sure overloading this API is the right thing to do.
Doesn't this introduce a subtle side-effect on some of the existing
callers? e.g. Previously pqWaitTimed would ALWAYS return 0 if
forRead/forWrite were both false. But now other return values like
errors will be possible. Is that OK?

~~~

6. pqSocketPoll

/*
 * Check a file descriptor for read and/or write data, possibly waiting.
 * If neither forRead nor forWrite are set, immediately return a timeout
 * condition (without waiting).  Return >0 if condition is met, 0
 * if a timeout occurred, -1 if an error or interrupt occurred.
 *
 * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
 * if end_time is 0 (or indeed, any time before now).
 *
 * Moreover, when neither forRead nor forWrite is requested and timeout is
 * disabled, try to check the health of socket.
 */


The new comment "Moreover..." is contrary to the earlier part of the
same comment which already said, "If neither forRead nor forWrite are
set, immediately return a timeout condition (without waiting)."

There might be side-effects to previous/existing callers of this
function (e.g. pqWaitTimed via pqSocketCheck)

~~~

7.
  if (!forRead && !forWrite)
- return 0;
+ {
+ /* Try to check the health if requested */
+ if (end_time == 0)
+#if defined(POLLRDHUP)
+ input_fd.events = POLLRDHUP | POLLHUP | POLLNVAL;
+#else
+ return 0;
+#endif /* defined(POLLRDHUP) */
+ else
+ return 0;
+ }

FYI - I think the new code can be simpler without needing #else by
calling your other new function.

SUGGESTION
if (!forRead && !forWrite)
{
if (!PQcanConnCheck() || end_time != 0)
return 0;

/* Check the connection health when end_time is 0 */
Assert(PQcanConnCheck() && end_time == 0);
#if defined(POLLRDHUP)
input_fd.events = POLLRDHUP | POLLHUP | POLLNVAL;
#endif
}

~~~

8. PQconnCheck
+/*
+ * Check whether PQconnCheck() can work well on this platform.
+ *
+ * Returns 1 if this can use PQconnCheck(), otherwise 0.
+ */
+int
+PQcanConnCheck(void)
+{
+#if (defined(HAVE_POLL) && defined(POLLRDHUP))
+ return true;
+#else
+ return false;
+#endif
+}

~

8a.
"can work well" --> "works"

~

8b.
Maybe better to say "true (1)" and "otherwise false (0)"

------
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: cfbot failures
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Time delayed LR (WAS Re: logical replication restrictions)