Re: Allowing printf("%m") only where it actually works - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Allowing printf("%m") only where it actually works
Msg-id 20180926174645.nsyj77lx2mvtz4kx@alap3.anarazel.de
In response to Re: Allowing printf("%m") only where it actually works  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Allowing printf("%m") only where it actually works
List pgsql-hackers
Hi,

On 2018-09-24 13:18:47 -0400, Tom Lane wrote:
> 0002 changes things so that we always use our snprintf, removing all the
> configure logic associated with that.

In the commit message you wrote:

> Preliminary performance testing suggests that as it stands, snprintf.c is
> faster than the native printf functions for some tasks on some platforms,
> and slower for other cases.  A pending patch will improve that, though
> cases with floating-point conversions will doubtless remain slower unless
> we want to put a *lot* of effort into that.  Still, we've not observed
> that *printf is really a performance bottleneck for most workloads, so
> I doubt this matters much.

I seriously doubt the last sentence. I've *many* times seen *printf be a
significant bottleneck. In particular, just about any pg_dump of a
database that has large tables with even a single float column is
commonly bottlenecked on float -> string conversion.

A trivial, admittedly crude benchmark:

CREATE TABLE somefloats(id serial, data1 float8, data2 float8, data3 float8);
INSERT INTO somefloats(data1, data2, data3) SELECT random(), random(), random() FROM generate_series(1, 10000000);
VACUUM FREEZE somefloats;


postgres[12850][1]=# \dt+ somefloats
                       List of relations
┌────────┬────────────┬───────┬────────┬────────┬─────────────┐
│ Schema │    Name    │ Type  │ Owner  │  Size  │ Description │
├────────┼────────────┼───────┼────────┼────────┼─────────────┤
│ public │ somefloats │ table │ andres │ 575 MB │             │
└────────┴────────────┴───────┴────────┴────────┴─────────────┘

96bf88d52711ad3a0a4cc2d1d9cb0e2acab85e63:

COPY somefloats TO '/dev/null';
COPY 10000000
Time: 24575.770 ms (00:24.576)

96bf88d52711ad3a0a4cc2d1d9cb0e2acab85e63^ (the parent commit):

COPY somefloats TO '/dev/null';
COPY 10000000
Time: 12877.037 ms (00:12.877)

IOW, we regress copy performance by about 2x. And one int and three
floats isn't a particularly insane table layout.
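
To separate the float -> string cost from the rest of COPY, a standalone
microbenchmark along the following lines could be built once against the
platform's snprintf and once against the replacement. This is only a
sketch: the function name time_float_formatting is made up for
illustration, and the "%.15g" format is an assumption meant to approximate
the default float8 text output, not something taken from the patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: format n pseudo-random doubles with snprintf and
 * return the CPU seconds spent in the formatting loop, or -1.0 if the
 * input array cannot be allocated.  The total number of characters
 * produced is stored in *bytes_out so the work cannot be optimized away.
 */
static double
time_float_formatting(size_t n, size_t *bytes_out)
{
    char        buf[64];
    size_t      total = 0;
    double     *vals = malloc(n * sizeof(double));
    clock_t     start;
    double      elapsed;

    *bytes_out = 0;
    if (vals == NULL)
        return -1.0;

    /* Generate the inputs up front so rand() is not part of the timing. */
    srand(42);
    for (size_t i = 0; i < n; i++)
        vals[i] = (double) rand() / ((double) RAND_MAX + 1.0);

    start = clock();
    for (size_t i = 0; i < n; i++)
    {
        /* Assumed format: 15 significant digits, as a stand-in for float8 output. */
        int         len = snprintf(buf, sizeof(buf), "%.15g", vals[i]);

        if (len > 0)
            total += (size_t) len;
    }
    elapsed = (double) (clock() - start) / CLOCKS_PER_SEC;

    free(vals);
    *bytes_out = total;
    return elapsed;
}
```

Calling it with n = 30000000 roughly matches the number of float values in
the table above (10 million rows, three float8 columns each), although it
leaves out everything else COPY does per row.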


I'm not saying we shouldn't default to our printf - in fact I think we're
probably past due to use a faster float->string conversion than we
portably get from the OS - but I don't think we can default to our
sprintf without doing something about the float conversion performance.


Greetings,

Andres Freund

