Re: Allowing printf("%m") only where it actually works - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Allowing printf("%m") only where it actually works
Date
Msg-id 1437.1534100930@sss.pgh.pa.us
Whole thread Raw
In response to Re: Allowing printf("%m") only where it actually works  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Allowing printf("%m") only where it actually works
Re: Allowing printf("%m") only where it actually works
Re: Allowing printf("%m") only where it actually works
List pgsql-hackers
I wrote:
> So this is all pretty much of a mess.  If we annotate the elog functions
> differently from printf's annotation then we risk getting these complaints
> in elog.c, but if we don't do that then we can't really describe their
> semantics correctly.  We could possibly mark the replacement snprintf
> functions with gnu_printf, but that's a lie with respect to the very
> point at issue about %m.  Unless we were to teach snprintf.c about %m
> ... which probably wouldn't be hard, but I'm not sure I want to go there.

Actually ... the more I think about this, the less insane that idea seems.
Consider the following approach:

1. Teach src/port/snprintf.c about %m.  While I've not written a patch
for this, it looks pretty trivial.

2. Teach configure to test for %m and if it's not there, use the
replacement snprintf.  (Note: we're already forcing snprintf replacement
in cross-compiles, so the added run-time test isn't losing anything.)

3. Get rid of elog.c's hand-made substitution of %m strings, and instead
just let it pass the correct errno value down.  (We'd likely need to do
some fooling in appendStringInfoVA and related functions to preserve
call-time errno, but that's not complicated, nor expensive.)

4. (Optional) Get rid of strerror(errno) calls in favor of %m, even in
frontend code.

Once we've done this, we have uniform printf semantics across all
platforms, which is kind of nice from a testing standpoint, as well as
being less of a cognitive load for developers.  And we can stick with
the existing approach of using the gnu_printf archetype across the board;
that's no longer a lie for the snprintf.c code.

One objection to this is the possible performance advantage of the native
printf functions over snprintf.c.  I did a bit of investigation of that
using the attached testbed, and found that the quality of implementation
of the native functions seems pretty variable:

RHEL6's glibc on x86_64 (this is just a comparison point, since we'd not
be replacing glibc's printf anyway):

snprintf time = 756.795 ms total, 0.000756795 ms per iteration
pg_snprintf time = 824.643 ms total, 0.000824643 ms per iteration

macOS High Sierra on x86_64:

snprintf time = 264.071 ms total, 0.000264071 ms per iteration
pg_snprintf time = 348.41 ms total, 0.00034841 ms per iteration

FreeBSD 11.0 on x86_64:

snprintf time = 628.873 ms total, 0.000628873 ms per iteration
pg_snprintf time = 606.684 ms total, 0.000606684 ms per iteration

OpenBSD 6.0 on x86_64 (same hardware as FreeBSD test):

snprintf time = 331.245 ms total, 0.000331245 ms per iteration
pg_snprintf time = 539.849 ms total, 0.000539849 ms per iteration

NetBSD 8.99 on armv6:

snprintf time = 2423.39 ms total, 0.00242339 ms per iteration
pg_snprintf time = 3769.16 ms total, 0.00376916 ms per iteration

So we would be taking a hit on most platforms, but I've not really
seen sprintf as a major component of very many profiles.  Moreover,
at least for the elog/ereport use-case, we'd be buying back some
nontrivial part of that hit by getting rid of expand_fmt_string().
Also worth noting is that we've never made any effort at all to
micro-optimize snprintf.c, so maybe there's some gold to be mined
there to reduce the penalty.

A different objection, possibly more serious than the performance
one, is that if we get in the habit of using %m in frontend code
then at some point we'd inevitably back-patch such a usage.  (Worse,
it'd pass testing on glibc platforms, only to fail elsewhere.)
I don't see a bulletproof answer to that except to back-patch this
set of changes, which might be a bridge too far.

Aside from the back-patching angle, though, this seems pretty
promising.  Thoughts?

            regards, tom lane

PS: here's the testbed I used for the above numbers.  Feel free to
try other platforms or other test-case formats.  Compile this with
something like

gcc -Wall -O2 -I pgsql/src/include -I pgsql/src/port timeprintf.c

(point the -I switches into a configured PG source tree); you might
need to add "-lrt" or some such to get clock_gettime().  Then run
with "./a.out 1000000" or so.

#include "postgres_fe.h"

#include "portability/instr_time.h"

#include "snprintf.c"

int
main(int argc, char **argv)
{
    int count = 0;
    char buffer[1000];
    instr_time    start;
    instr_time    stop;
    double elapsed;
    int i;

    if (argc > 1)
        count = atoi(argv[1]);
    if (count <= 0)
        count = 1000000;

    INSTR_TIME_SET_CURRENT(start);

    for (i = 0; i < count; i++)
    {
        snprintf(buffer, sizeof(buffer),
                 "%d %g %s",
                 42, 42.2,
"01234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890");
    }

    INSTR_TIME_SET_CURRENT(stop);
    INSTR_TIME_SUBTRACT(stop, start);
    elapsed = INSTR_TIME_GET_MILLISEC(stop);

    printf("snprintf time = %g ms total, %g ms per iteration\n",
           elapsed, elapsed / count);

    INSTR_TIME_SET_CURRENT(start);

    for (i = 0; i < count; i++)
    {
        pg_snprintf(buffer, sizeof(buffer),
                 "%d %g %s",
                 42, 42.2,
"01234567890012345678900123456789001234567890012345678900123456789001234567890012345678900123456789001234567890");
    }

    INSTR_TIME_SET_CURRENT(stop);
    INSTR_TIME_SUBTRACT(stop, start);
    elapsed = INSTR_TIME_GET_MILLISEC(stop);

    printf("pg_snprintf time = %g ms total, %g ms per iteration\n",
           elapsed, elapsed / count);

    return 0;
}

pgsql-hackers by date:

Previous
From: Arseny Sher
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Next
From: Tom Lane
Date:
Subject: Re: Allowing printf("%m") only where it actually works