Re: [HACKERS] Patch to implement pg_current_logfile() function - Mailing list pgsql-hackers

From Gilles Darold
Subject Re: [HACKERS] Patch to implement pg_current_logfile() function
Date
Msg-id 24ebcb40-1ec7-9148-03db-ac498bb0cc7f@dalibo.com
Whole thread Raw
In response to Re: Patch to implement pg_current_logfile() function  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Patch to implement pg_current_logfile() function
List pgsql-hackers
Le 08/12/2016 à 00:52, Robert Haas a écrit :
On Tue, Dec 6, 2016 at 6:11 PM, Gilles Darold <gilles.darold@dalibo.com> wrote:
It seems that all fixes have been included in this patch.
Yikes.  This patch has certainly had a lot of review, but it has lots
of basic inconsistencies with PostgreSQL coding style, which one would
think somebody would have noticed by now, and there are multiple
places where the logic seems to do in a complicated way something that
could be done significantly more simply.  I don't have any objection
to the basic idea of this patch, but it's got to look and feel like
the surrounding PostgreSQL code.  And not be unnecessarily
complicated.

Detailed comments below:

The SGML doesn't match the surrounding style - for example, <para>
typically appears on a line by itself.
Fixed.

+    if (!Logging_collector) {

Formatting.
Fixed.

rm_log_metainfo() could be merged into logfile_writename(), since
they're always called in conjunction and in the same pattern.  The
function is poorly named; it should be something like
update_metainfo_datafile().
Merge and logfile_writename() function renamed as suggested.

+        if (errno == ENFILE || errno == EMFILE)
+            ereport(LOG,
+                (errmsg("system is too busy to write logfile meta
info, %s will be updated on next rotation (or use SIGHUP to retry)",
LOG_METAINFO_DATAFILE)));

This seems like a totally unprincipled reaction to a purely arbitrary
subset of errors.  EMFILE or ENFILE doesn't represent a general "too
busy" condition, and logfile_open() has already logged the real error.
Removed.

+    snprintf(tempfn, sizeof(tempfn), "%s.tmp", LOG_METAINFO_DATAFILE);

You don't really need to use snprintf() here.  You could #define
LOG_METAINFO_DATAFILE_TMP LOG_METAINFO_DATAFILE ".tmp" and compute
this at compile time instead of runtime.
Done and added to syslogger.h

+    if (PG_NARGS() == 1) {
+        fmt = PG_ARGISNULL(0) ? NULL : PG_GETARG_TEXT_PP(0);
+        if (fmt != NULL) {
+            logfmt = text_to_cstring(fmt);
+            if ( (strcmp(logfmt, "stderr") != 0) &&
+                (strcmp(logfmt, "csvlog") != 0) ) {
+                ereport(ERROR,
+        (errmsg("log format %s not supported, possible values are
stderr or csvlog", logfmt)));
+                PG_RETURN_NULL();
+            }
+        }
+    } else {
+        logfmt = NULL;
+    }

Formatting.  This is unlike PostgreSQL style in multiple ways,
including cuddled braces, extra parentheses and spaces, and use of
braces around a single-line block.  Also, this could be written in a
much less contorted way, like:

if (PG_NARGS() == 0 || PG_ARGISNULL(0))   logfmt = NULL;
else
{   logfmt = text_to_cstring(PG_GETARG_TEXT_PP(0));   if (strcmp(logfmt, "stderr") != 0 && strcmp(logfmt, "csvlog") != 0)        ereport(...);
}
Applied.

Also, the ereport() needs an errcode(), not just an errmsg().
Otherwise it defaults to ERRCODE_INTERNAL_ERROR, which is not correct.
Add errcode(ERRCODE_INVALID_PARAMETER_VALUE) to the ereport call.

+    /* Check if current log file is present */
+    if (stat(LOG_METAINFO_DATAFILE, &stat_buf) != 0)
+        PG_RETURN_NULL();

Useless test.  The code that follows catches this case anyway and
handles it the same way.  Which is good, because this has an inherent
race condition. The previous if (!Logging_collector) test sems fairly
useless too; unless there's a bug in your syslogger.c code, the file
won't exist anyway, so we'll return NULL for that reason with no extra
code needed here.
That's right, both test have been removed.

+    /*
+    * Read the file to gather current log filename(s) registered
+    * by the syslogger.
+    */

Whitespace isn't right.

+    while (fgets(lbuffer, sizeof(lbuffer), fd) != NULL) {
+        char    log_format[10];
+        int     i = 0, space_pos = 0;
+
+        /* Check for a read error. */
+        if (ferror(fd)) {

More coding style issues.

+            ereport(ERROR,
+                (errcode_for_file_access(),
+                errmsg("could not read file \"%s\": %m",
LOG_METAINFO_DATAFILE)));
+            FreeFile(fd);
+            break;

ereport(ERROR) doesn't return, so the code that follows is pointless.
All issues above are fixed.

+        if (strchr(lbuffer, '\n') != NULL)
+            *strchr(lbuffer, '\n') = '\0';

Probably worth adding a local variable instead of doing this twice.
Local variables are cheap, and the code would be more readable.
Removed

+            if ((space_pos == 0) && (isspace(lbuffer[i]) != 0))

Too many parentheses.  Also, I think the whole loop in which this is
contained could be eliminated entirely.  Just search for the first ' '
character with strchr(); you don't need to are about isspace() because
you know the code that writes this file uses ' ' specifically.
Overwrite it with '\0'.  And then use a pointer to lbuffer for the log
format and a pointer to lbuffer + strchr_result + 1 for the pathname.
Rewritten, not exactly the way you describe but like this:

              /* extract log format and log file path from the line */
              log_filepath = strchr(lbuffer, ' ');
              log_filepath++;
              lbuffer[log_filepath-lbuffer-1] = '\0';
              log_format = lbuffer;
              *strchr(log_filepath, '\n') = '\0';

it was possible to not use the additional log_format variable and just keep the log format information in lbuffer, then use it the next code, but I have kept the log_format variable for readability.


+        if ((space_pos != (int)strlen("stderr")) &&
+            (space_pos != (int)strlen("csvlog")))
+        {
+            ereport(ERROR,
+                (errmsg("unexpected line format in file %s",
LOG_METAINFO_DATAFILE)));
+            break;
+        }

I think this is pointless.  Validating the length of the log format
but not the content seems kind of silly, and why do either?  The only
way it's going to do the wrong thing is if the file contents are
corrupt, and that should only happen if we have a bug.  Which we
probably won't, and if we do we'll fix it and then we won't, and this
will just be deadweight -- and one more thing to update if we ever add
support for another log format.
Removed.

+    /* Return null when csvlog is requested but we have a stderr log */
+    if ( (logfmt != NULL) && (strcmp(logfmt, "csvlog") == 0)
+            && !(Log_destination & LOG_DESTINATION_CSVLOG) )
+        PG_RETURN_NULL();
+
+    /* Return null when stderr is requested but we have a csv log */
+    if ( (logfmt != NULL) && (strcmp(logfmt, "stderr") == 0)
+            && !(Log_destination & LOG_DESTINATION_STDERR) )
+        PG_RETURN_NULL();

Seems complicated.  Instead, you could declare char *result = NULL.
When you find the row you're looking for, set result and break out of
the loop.  Then here you can just do if (result == NULL)
PG_RETURN_NULL() and this complexity goes away.
Fixed.

+/* in backend/utils/adt/misc.c and backend/postmaster/syslogger.c */
+/*
+ * Name of file holding the paths, names, and types of the files where current
+ * log messages are written, when the log collector is enabled.  Useful
+ * outside of Postgres when finding the name of the current log file in the
+ * case of time-based log rotation.
+ */
+#define LOG_METAINFO_DATAFILE  "current_logfiles"

syslogger.h seems more appropriate.  miscadmin.h is quite widely
included and it would be better not to bloat it with things that
aren't widely needed.  It makes sense to use it for widely-used stuff
that has no other obvious home, but since this is clearly a creature
of the syslogger, it could logically go in syslogger.h.
Moved to syslogger.h

This might not be exhaustive but I think fixing these things would get
us much closer to a committable patch.

I have also run the code through pgindent as Tom suggest to fix all pending coding style issues.  I've just change on line from pgindent work to prevent git apply to complain about a space in tabulation issue.  I don't know which is the best, let things like pgindent want to write it or prevent git to complain.  Attached is the v17 of the patch.

Thanks a lot for this synthetic review.

Regards

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org
Attachment

pgsql-hackers by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: [HACKERS] [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation
Next
From: Keith Fiske
Date:
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Implement table partitioning.