Thread: Avoid overhead with fprintf related functions

Avoid overhead with fprintf related functions

From
Ranier Vilela
Date:
Based on work in [1].
According to https://cplusplus.com/reference/cstdio/fprintf/
The use of fprintf is related to the need to generate a string based on a format, which should be different from "%s".
Since fprintf has overhead when parsing the "format" parameter, plus all the trouble of checking the va_arg parameters.
I think this is one of the low fruits available and easy to reap.
By replacing fprintf with its equivalents, fputs and fputc,
we avoid overhead and increase security [2] and [3].

The downside is a huge big churm, which unfortunately will occur.
But, IHMO, I think the advantages are worth it.
Note that behavior remains the same, since fputs and fputc do not change the expected behavior of fprintf.

A small performance gain is expected, mainly for the client, since there are several occurrences in some critical places, such as (usr/src/fe_utils/print.c).

Patch attached.
This pass check-world.

regards,
Ranier Vilela

Attachment

Re: Avoid overhead with fprintf related functions

From
Nathan Bossart
Date:
On Fri, Sep 09, 2022 at 10:45:37AM -0300, Ranier Vilela wrote:
> Based on work in [1].
> According to https://cplusplus.com/reference/cstdio/fprintf/
> The use of fprintf is related to the need to generate a string based on a
> format, which should be different from "%s".
> Since fprintf has overhead when parsing the "format" parameter, plus all
> the trouble of checking the va_arg parameters.
> I think this is one of the low fruits available and easy to reap.
> By replacing fprintf with its equivalents, fputs and fputc,
> we avoid overhead and increase security [2] and [3].
> 
> The downside is a huge big churm, which unfortunately will occur.
> But, IHMO, I think the advantages are worth it.
> Note that behavior remains the same, since fputs and fputc do not change
> the expected behavior of fprintf.
> 
> A small performance gain is expected, mainly for the client, since there
> are several occurrences in some critical places, such as
> (usr/src/fe_utils/print.c).

I agree with David [0].  But if you can demonstrate a performance gain,
perhaps it's worth considering a subset of these changes in hot paths.

[0] https://postgr.es/m/CAApHDvp2THseLvCc%2BTcYFBC7FKHpHTs1JyYmd2JghtOVhb5WGA%40mail.gmail.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Avoid overhead with fprintf related functions

From
Ranier Vilela
Date:

Em sex., 9 de set. de 2022 às 13:20, Nathan Bossart <nathandbossart@gmail.com> escreveu:
On Fri, Sep 09, 2022 at 10:45:37AM -0300, Ranier Vilela wrote:
> Based on work in [1].
> According to https://cplusplus.com/reference/cstdio/fprintf/
> The use of fprintf is related to the need to generate a string based on a
> format, which should be different from "%s".
> Since fprintf has overhead when parsing the "format" parameter, plus all
> the trouble of checking the va_arg parameters.
> I think this is one of the low fruits available and easy to reap.
> By replacing fprintf with its equivalents, fputs and fputc,
> we avoid overhead and increase security [2] and [3].
>
> The downside is a huge big churm, which unfortunately will occur.
> But, IHMO, I think the advantages are worth it.
> Note that behavior remains the same, since fputs and fputc do not change
> the expected behavior of fprintf.
>
> A small performance gain is expected, mainly for the client, since there
> are several occurrences in some critical places, such as
> (usr/src/fe_utils/print.c).

I agree with David [0].  But if you can demonstrate a performance gain,
perhaps it's worth considering a subset of these changes in hot paths.
Simple benchmark with David sort example.

1. make data
create table t (a bigint not null, b bigint not null, c bigint not
null, d bigint not null, e bigint not null, f bigint not null);

insert into t select x,x,x,x,x,x from generate_Series(1,140247142) x; -- 10GB!
vacuum freeze t;

2. client run
\timing on
\pset pager off
select * from t limit 1000000;

head:
Time: 418,210 ms
Time: 419,588 ms
Time: 424,713 ms

fprintf patch:
Time: 416,919 ms
Time: 416,246 ms
Time: 416,237 ms

regards,
Ranier Vilela

Re: Avoid overhead with fprintf related functions

From
Ranier Vilela
Date:
Em sex., 9 de set. de 2022 às 10:45, Ranier Vilela <ranier.vf@gmail.com> escreveu:
Based on work in [1].
According to https://cplusplus.com/reference/cstdio/fprintf/
The use of fprintf is related to the need to generate a string based on a format, which should be different from "%s".
Since fprintf has overhead when parsing the "format" parameter, plus all the trouble of checking the va_arg parameters.
I think this is one of the low fruits available and easy to reap.
By replacing fprintf with its equivalents, fputs and fputc,
we avoid overhead and increase security [2] and [3].

The downside is a huge big churm, which unfortunately will occur.
But, IHMO, I think the advantages are worth it.
Note that behavior remains the same, since fputs and fputc do not change the expected behavior of fprintf.

A small performance gain is expected, mainly for the client, since there are several occurrences in some critical places, such as (usr/src/fe_utils/print.c).

Patch attached.
This pass check-world.
Rechecked for the hundredth time.
One typo.

regards,
Ranier Vilela
Attachment

Re: Avoid overhead with fprintf related functions

From
Tom Lane
Date:
Ranier Vilela <ranier.vf@gmail.com> writes:
> Em sex., 9 de set. de 2022 às 13:20, Nathan Bossart <
> nathandbossart@gmail.com> escreveu:
>> I agree with David [0].  But if you can demonstrate a performance gain,
>> perhaps it's worth considering a subset of these changes in hot paths.

> head:
> Time: 418,210 ms
> Time: 419,588 ms
> Time: 424,713 ms

> fprintf patch:
> Time: 416,919 ms
> Time: 416,246 ms
> Time: 416,237 ms

That is most certainly not enough gain to justify a large amount
of code churn.  In fact, given that this is probably pretty
platform-dependent and you've checked only one platform, I don't
think I'd call this a sufficient case for even a one-line change.

            regards, tom lane



Re: Avoid overhead with fprintf related functions

From
Ranier Vilela
Date:
Em sex., 9 de set. de 2022 às 18:53, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
Ranier Vilela <ranier.vf@gmail.com> writes:
> Em sex., 9 de set. de 2022 às 13:20, Nathan Bossart <
> nathandbossart@gmail.com> escreveu:
>> I agree with David [0].  But if you can demonstrate a performance gain,
>> perhaps it's worth considering a subset of these changes in hot paths.

> head:
> Time: 418,210 ms
> Time: 419,588 ms
> Time: 424,713 ms

> fprintf patch:
> Time: 416,919 ms
> Time: 416,246 ms
> Time: 416,237 ms

That is most certainly not enough gain to justify a large amount
of code churn.  In fact, given that this is probably pretty
platform-dependent and you've checked only one platform, I don't
think I'd call this a sufficient case for even a one-line change.
Of course, base these changes not on performance gain, but on correct style and increased security.
But out-vote is out-vote, case closed.

Regards,
Ranier Vilela

Re: Avoid overhead with fprintf related functions

From
Nathan Bossart
Date:
On Fri, Sep 09, 2022 at 05:53:54PM -0400, Tom Lane wrote:
> Ranier Vilela <ranier.vf@gmail.com> writes:
>> Em sex., 9 de set. de 2022 às 13:20, Nathan Bossart <
>> nathandbossart@gmail.com> escreveu:
>>> I agree with David [0].  But if you can demonstrate a performance gain,
>>> perhaps it's worth considering a subset of these changes in hot paths.
> 
>> head:
>> Time: 418,210 ms
>> Time: 419,588 ms
>> Time: 424,713 ms
> 
>> fprintf patch:
>> Time: 416,919 ms
>> Time: 416,246 ms
>> Time: 416,237 ms
> 
> That is most certainly not enough gain to justify a large amount
> of code churn.  In fact, given that this is probably pretty
> platform-dependent and you've checked only one platform, I don't
> think I'd call this a sufficient case for even a one-line change.

Agreed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com