Re: Performance improvements for src/port/snprintf.c - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Performance improvements for src/port/snprintf.c
Msg-id 20181003155207.b3lqmovuv2c5c4id@alap3.anarazel.de
In response to Re: Performance improvements for src/port/snprintf.c  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Performance improvements for src/port/snprintf.c  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2018-10-03 08:20:14 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> >> While there might be value in implementing our own float printing code,
> >> I have a pretty hard time getting excited about the cost/benefit ratio
> >> of that.  I think that what we probably really ought to do here is hack
> >> float4out/float8out to bypass the extra overhead, as in the 0002 patch
> >> below.
> 
> > I'm thinking we should do a bit more than just that hack. I'm thinking
> > of something (barely tested) like
> 
> Meh.  The trouble with that is that it relies on the platform's snprintf,
> not sprintf, and that brings us right back into a world of portability
> hurt.  I don't feel that the move to C99 gets us out of worrying about
> noncompliant snprintfs --- we're only requiring a C99 *compiler*, not
> libc.  See buildfarm member gharial for a counterexample.

Oh, we could just use sprintf() and tell strfromd the buffer is large
enough. I only used snprintf because it seemed more symmetric, and
because I was at most 1/3 awake.
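A minimal sketch of that idea, not the actual patch: the name sketch_double_to_string, the glibc version check, and the 0..99 precision limit are assumptions made for illustration. It builds the "%.<N>g" format by hand (the "ASCII-izing" step), calls strfromd() where glibc provides it, and otherwise falls back to plain sprintf(), trusting the caller that the buffer is large enough.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* strfromd() first appeared in glibc 2.25; elsewhere we fall back to sprintf(). */
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
#if __GLIBC_PREREQ(2, 25)
#define SKETCH_HAVE_STRFROMD 1
/* Declared explicitly so the sketch does not depend on feature-test macros. */
extern int strfromd(char *restrict str, size_t n,
                    const char *restrict format, double fp);
#endif
#endif

/*
 * Hypothetical helper: format val with "%.<precision>g" into buf.
 * Returns the number of characters written, excluding the trailing NUL.
 * The caller must supply a buffer that is large enough for the result.
 */
int
sketch_double_to_string(char *buf, size_t buflen, int precision, double val)
{
    char        fmt[6];         /* "%.NNg" plus terminating NUL */
    char       *p = fmt;

    assert(precision >= 0 && precision <= 99);

    /* ASCII-ize the precision without going through a printf-style call. */
    *p++ = '%';
    *p++ = '.';
    if (precision >= 10)
        *p++ = (char) ('0' + precision / 10);
    *p++ = (char) ('0' + precision % 10);
    *p++ = 'g';
    *p = '\0';

#ifdef SKETCH_HAVE_STRFROMD
    return strfromd(buf, buflen, fmt, val);
#else
    (void) buflen;              /* sprintf() cannot check the buffer size */
    return sprintf(buf, fmt, val);
#endif
}
```

A caller would pass a stack buffer, for example char buf[128], together with the desired number of significant digits.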


> I'm happy to look into whether using strfromd when available buys us
> anything over using sprintf.  I'm not entirely convinced that it will,
> because of the need to ASCII-ize and de-ASCII-ize the precision, but
> it's worth checking.

It's definitely faster.  It's not a full-blown format parser, so I guess
the cost of the conversion isn't too bad:
https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/strfrom-skeleton.c;hb=HEAD#l68
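To illustrate why the "de-ASCII-izing" side is cheap, here is a rough sketch of a parser for the restricted grammar that strfromd() accepts: '%', an optional '.' followed by digits, and one floating-point conversion character. This is not glibc's code; the function name, the -1 "no precision" convention, and the rejection of trailing characters are assumptions of the sketch.

```c
#include <stdbool.h>

/*
 * Hypothetical parser for a strfromd-style format such as "%.15g".
 * On success stores the precision (-1 if none was given) and the
 * conversion character, and returns true.
 */
bool
sketch_parse_strfrom_format(const char *fmt, int *precision, char *conv)
{
    int         prec = -1;

    if (*fmt++ != '%')
        return false;

    if (*fmt == '.')
    {
        fmt++;
        prec = 0;               /* "%.g" means precision zero */
        while (*fmt >= '0' && *fmt <= '9')
        {
            if (prec > 100000)  /* crude overflow guard for the sketch */
                return false;
            prec = prec * 10 + (*fmt - '0');
            fmt++;
        }
    }

    switch (*fmt)
    {
        case 'a': case 'A':
        case 'e': case 'E':
        case 'f': case 'F':
        case 'g': case 'G':
            break;
        default:
            return false;       /* flags, widths, other conversions */
    }

    if (fmt[1] != '\0')
        return false;           /* this sketch rejects trailing characters */

    *precision = prec;
    *conv = *fmt;
    return true;
}
```

There are no flags, field widths, length modifiers, or positional arguments to handle, so the whole parse is a handful of character comparisons.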

CREATE TABLE somefloats(id serial, data1 float8, data2 float8, data3 float8);
INSERT INTO somefloats(data1, data2, data3) SELECT random(), random(), random() FROM generate_series(1, 10000000);
VACUUM FREEZE somefloats;

I'm comparing the times of:
COPY somefloats TO '/dev/null';

master (including your commit):
16177.202 ms

snprintf using sprintf via pg_double_to_string:
16195.787 ms

snprintf using strfromd via pg_double_to_string:
14856.974 ms

float8out using sprintf via pg_double_to_string:
16176.169 ms

float8out using strfromd via pg_double_to_string:
13532.698 ms



FWIW, it seems that using a local buffer and then pstrdup'ing that in
float8out_internal is a bit faster, and would probably save a bit of
memory on average:

float8out using sprintf via pg_double_to_string, pstrdup:
15370.774 ms

float8out using strfromd via pg_double_to_string, pstrdup:
13498.331 ms
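
The local-buffer variant can be sketched roughly as follows. This is a standalone illustration, not the float8out_internal code: malloc() and memcpy() stand in for pstrdup(), and the function name, the 128-byte buffer size, and the clamping of the digit count to 1..17 are assumptions of the sketch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAXDOUBLEWIDTH 128   /* assumed worst-case output width */

/*
 * Hypothetical helper: format val into a stack buffer, then return a
 * heap copy sized to the actual string length.  Returns NULL on failure.
 * The caller frees the result with free().
 */
char *
sketch_float8_to_cstring(double val, int ndig)
{
    char        local[SKETCH_MAXDOUBLEWIDTH + 1];
    char       *result;
    int         len;

    /* Keep the output comfortably inside the local buffer. */
    if (ndig < 1)
        ndig = 1;
    if (ndig > 17)
        ndig = 17;

    len = sprintf(local, "%.*g", ndig, val);
    if (len < 0)
        return NULL;

    /* Copy only len + 1 bytes instead of allocating the worst case. */
    result = malloc((size_t) len + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, local, (size_t) len + 1);
    return result;
}
```

Because most values format to far fewer characters than the worst case, the returned allocation is typically much smaller than a fixed-size one, which is where the memory saving would come from.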


Greetings,

Andres Freund

