Allowing printf("%m") only where it actually works - Mailing list pgsql-hackers

From Tom Lane
Subject Allowing printf("%m") only where it actually works
Date
Msg-id 2975.1526862605@sss.pgh.pa.us
Whole thread Raw
Responses Re: Allowing printf("%m") only where it actually works  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
<digression>

For amusement's sake, I was playing around with NetBSD-current (9-to-be)
today, and tried to compile Postgres on it.  It works OK --- and I can
even confirm that our new code for using ARM v8 CRC instructions works
there --- but I got a boatload of compile warnings like this:

latch.c:1180:4: warning: %m is only allowed in syslog(3) like functions [-Wformat=]
    ereport(ERROR,
    ^~~~~~~

A bit of googling turned up the patch that caused this [1], which was
soon followed by some well-reasoned push-back [2]; but the warning's
still there, so evidently the forces of bullheadedness won.  I was
ready to discount the whole thing as being another badly designed
no-wonder-gcc-upstream-won't-take-it compiler warning, when I noticed that
the last few warnings in my output were pointing out a live bug, to wit
using %m with plain old printf rather than elog/ereport.  So I fixed
that [3], but I'm thinking that we need to take a bit more care here.

</digression>

Looking around, we have circa seventy-five functions declared with
pg_attribute_printf in our tree right now, and of those, *only* the
elog/ereport support functions can be relied on to process %m correctly.
However, anyone who's accustomed to working in backend code is likely to
not think hard about using %m in an error message, as indeed the authors
and reviewers of pg_verify_checksums did not.  Worse yet, such cases
actually will work as long as you're testing on glibc platforms, only
to fail everywhere else.

So I think we need to try to make provisions for getting compiler warnings
when %m is used in a function that doesn't support it.  gcc on Linux-ish
platforms isn't going to be very helpful with this, but that doesn't mean
that we should confuse %m-supporting and not-%m-supporting functions,
as we do right now.

Hence, I think we need something roughly like the attached patch, which
arranges to use "gnu_printf" (if available) as the format archetype for
the elog/ereport functions, and plain "printf" for all the rest.  With
some additional hackery not included here, this can be ju-jitsu'd into
compiling warning-free on NetBSD-current.  (The basic idea is to extend
PGAC_C_PRINTF_ARCHETYPE so it will select __syslog__ as the archetype
if available; but then you need some hack to suppress the follow-on
warnings complained of in [2].  I haven't decided what's the least ugly
solution for the latter, so I'm not proposing such a patch yet.)

What I'm mainly concerned about at this stage is what effects this'll
have on Windows builds.  The existing comment for PGAC_C_PRINTF_ARCHETYPE
claims that using gnu_printf silences complaints about "%lld" and related
formats on Windows, but I wonder whether that is still true on Windows
versions we still support.  As I mentioned in [4], I don't think we really
work any longer on platforms that don't use "%lld" for "long long" values,
and it seems that Windows does accept that in post-XP versions --- but has
gcc gotten the word?

If this does work as desired on Windows, then that would be a fairly
mainstream platform that can produce warnings about wrong uses of %m,
even if gcc-on-Linux doesn't.  If worst comes to worst, somebody could
crank up a buildfarm machine with a newer NetBSD release.

Anyway, I don't feel a need to cram this into v11, since I just fixed
the live bugs of this ilk in HEAD; it seems like it can wait for v12.
So I'll add this to the next commitfest.

            regards, tom lane

[1] https://mail-index.netbsd.org/tech-userlevel/2015/08/21/msg009282.html
[2] https://mail-index.netbsd.org/tech-userlevel/2015/10/23/msg009371.html
[3] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a13b47a59ffce6f3c13c8b777738a3aab1db10d3
[4] https://www.postgresql.org/message-id/13103.1526749980@sss.pgh.pa.us

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index ba5c40d..5cd8994 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -19,10 +19,10 @@ fi])# PGAC_C_SIGNED

 # PGAC_C_PRINTF_ARCHETYPE
 # -----------------------
-# Set the format archetype used by gcc to check printf type functions.  We
-# prefer "gnu_printf", which includes what glibc uses, such as %m for error
-# strings and %lld for 64 bit long longs.  GCC 4.4 introduced it.  It makes a
-# dramatic difference on Windows.
+# Set the format archetype used by gcc to check elog/ereport functions.
+# This should accept %m, whether or not the platform's printf does.
+# We use "gnu_printf" if possible, which does that, although in some cases
+# it might do more than we could wish.
 AC_DEFUN([PGAC_PRINTF_ARCHETYPE],
 [AC_CACHE_CHECK([for printf format archetype], pgac_cv_printf_archetype,
 [ac_save_c_werror_flag=$ac_c_werror_flag
@@ -34,8 +34,8 @@ __attribute__((format(gnu_printf, 2, 3)));], [])],
                   [pgac_cv_printf_archetype=gnu_printf],
                   [pgac_cv_printf_archetype=printf])
 ac_c_werror_flag=$ac_save_c_werror_flag])
-AC_DEFINE_UNQUOTED([PG_PRINTF_ATTRIBUTE], [$pgac_cv_printf_archetype],
-                   [Define to gnu_printf if compiler supports it, else printf.])
+AC_DEFINE_UNQUOTED([PG_PRINTF_ATTRIBUTE_M], [$pgac_cv_printf_archetype],
+                   [Define as a format archetype that accepts %m, if available, else printf.])
 ])# PGAC_PRINTF_ARCHETYPE


diff --git a/configure b/configure
index b244fc3..df12bb5 100755
--- a/configure
+++ b/configure
@@ -13305,7 +13305,7 @@ fi
 $as_echo "$pgac_cv_printf_archetype" >&6; }

 cat >>confdefs.h <<_ACEOF
-#define PG_PRINTF_ATTRIBUTE $pgac_cv_printf_archetype
+#define PG_PRINTF_ATTRIBUTE_M $pgac_cv_printf_archetype
 _ACEOF


diff --git a/src/include/c.h b/src/include/c.h
index 1e50103..0a4757e 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -126,10 +126,14 @@
 /* GCC and XLC support format attributes */
 #if defined(__GNUC__) || defined(__IBMC__)
 #define pg_attribute_format_arg(a) __attribute__((format_arg(a)))
-#define pg_attribute_printf(f,a) __attribute__((format(PG_PRINTF_ATTRIBUTE, f, a)))
+/* Use for functions wrapping stdio's printf, which often doesn't take %m: */
+#define pg_attribute_printf(f,a) __attribute__((format(printf, f, a)))
+/* Use for elog/ereport, which implement %m for themselves: */
+#define pg_attribute_printf_m(f,a) __attribute__((format(PG_PRINTF_ATTRIBUTE_M, f, a)))
 #else
 #define pg_attribute_format_arg(a)
 #define pg_attribute_printf(f,a)
+#define pg_attribute_printf_m(f,a)
 #endif

 /* GCC, Sunpro and XLC support aligned, packed and noreturn */
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index c3320f2..92daabc 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -799,8 +799,8 @@
 /* PostgreSQL major version as a string */
 #undef PG_MAJORVERSION

-/* Define to gnu_printf if compiler supports it, else printf. */
-#undef PG_PRINTF_ATTRIBUTE
+/* Define as a format archetype that accepts %m, if available, else printf. */
+#undef PG_PRINTF_ATTRIBUTE_M

 /* PostgreSQL version as a string */
 #undef PG_VERSION
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 7a9ba7f..4f4091d 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -133,25 +133,25 @@ extern int    errcode(int sqlerrcode);
 extern int    errcode_for_file_access(void);
 extern int    errcode_for_socket_access(void);

-extern int    errmsg(const char *fmt,...) pg_attribute_printf(1, 2);
-extern int    errmsg_internal(const char *fmt,...) pg_attribute_printf(1, 2);
+extern int    errmsg(const char *fmt,...) pg_attribute_printf_m(1, 2);
+extern int    errmsg_internal(const char *fmt,...) pg_attribute_printf_m(1, 2);

 extern int errmsg_plural(const char *fmt_singular, const char *fmt_plural,
-              unsigned long n,...) pg_attribute_printf(1, 4) pg_attribute_printf(2, 4);
+              unsigned long n,...) pg_attribute_printf_m(1, 4) pg_attribute_printf_m(2, 4);

-extern int    errdetail(const char *fmt,...) pg_attribute_printf(1, 2);
-extern int    errdetail_internal(const char *fmt,...) pg_attribute_printf(1, 2);
+extern int    errdetail(const char *fmt,...) pg_attribute_printf_m(1, 2);
+extern int    errdetail_internal(const char *fmt,...) pg_attribute_printf_m(1, 2);

-extern int    errdetail_log(const char *fmt,...) pg_attribute_printf(1, 2);
+extern int    errdetail_log(const char *fmt,...) pg_attribute_printf_m(1, 2);

 extern int errdetail_log_plural(const char *fmt_singular,
                      const char *fmt_plural,
-                     unsigned long n,...) pg_attribute_printf(1, 4) pg_attribute_printf(2, 4);
+                     unsigned long n,...) pg_attribute_printf_m(1, 4) pg_attribute_printf_m(2, 4);

 extern int errdetail_plural(const char *fmt_singular, const char *fmt_plural,
-                 unsigned long n,...) pg_attribute_printf(1, 4) pg_attribute_printf(2, 4);
+                 unsigned long n,...) pg_attribute_printf_m(1, 4) pg_attribute_printf_m(2, 4);

-extern int    errhint(const char *fmt,...) pg_attribute_printf(1, 2);
+extern int    errhint(const char *fmt,...) pg_attribute_printf_m(1, 2);

 /*
  * errcontext() is typically called in error context callback functions, not
@@ -165,7 +165,7 @@ extern int    errhint(const char *fmt,...) pg_attribute_printf(1, 2);

 extern int    set_errcontext_domain(const char *domain);

-extern int    errcontext_msg(const char *fmt,...) pg_attribute_printf(1, 2);
+extern int    errcontext_msg(const char *fmt,...) pg_attribute_printf_m(1, 2);

 extern int    errhidestmt(bool hide_stmt);
 extern int    errhidecontext(bool hide_ctx);
@@ -222,13 +222,13 @@ extern int    getinternalerrposition(void);
 #endif                            /* HAVE__VA_ARGS */

 extern void elog_start(const char *filename, int lineno, const char *funcname);
-extern void elog_finish(int elevel, const char *fmt,...) pg_attribute_printf(2, 3);
+extern void elog_finish(int elevel, const char *fmt,...) pg_attribute_printf_m(2, 3);


 /* Support for constructing error strings separately from ereport() calls */

 extern void pre_format_elog_string(int errnumber, const char *domain);
-extern char *format_elog_string(const char *fmt,...) pg_attribute_printf(1, 2);
+extern char *format_elog_string(const char *fmt,...) pg_attribute_printf_m(1, 2);


 /* Support for attaching context information to error reports */
@@ -407,9 +407,9 @@ extern void set_syslog_parameters(const char *ident, int facility);
 #endif

 /*
- * Write errors to stderr (or by equal means when stderr is
- * not available). Used before ereport/elog can be used
- * safely (memory context, GUC load etc)
+ * Write errors to stderr (or by comparable means when stderr is not
+ * available).  Used before ereport/elog can be used safely (memory context,
+ * GUC load etc).  Note that this does *not* accept "%m".
  */
 extern void write_stderr(const char *fmt,...) pg_attribute_printf(1, 2);


pgsql-hackers by date:

Previous
From: Carter Thaxton
Date:
Subject: Add --include-table-data-where option to pg_dump, to export only asubset of table data
Next
From: Andrew Gierth
Date:
Subject: Re: Fix for FETCH FIRST syntax problems