Home > mailing lists

Thread: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

14 May 2022, 21:46:53

Hi hackers,

At function load_relcache_init_file, there is an unnecessary function call,

to initialize pgstat_info pointer to NULL.

MemSet(&rel->pgstat_info, 0, sizeof(rel->pgstat_info));

I think that intention with use of MemSet was:

MemSet(&rel->pgstat_info, 0, sizeof(*rel->pgstat_info));

Initialize with sizeof of Struct size, not with sizeof pointer size.

But so it breaks.

Attached a tiny patch.

regards,

Ranier Vilela

Attachment

avoid_unecessary_memset_call.patch

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

David Rowley

Date:

16 May 2022, 23:26:04

On Sun, 15 May 2022 at 09:47, Ranier Vilela <ranier.vf@gmail.com> wrote:
> At function load_relcache_init_file, there is an unnecessary function call,
> to initialize pgstat_info pointer to NULL.
>
> MemSet(&rel->pgstat_info, 0, sizeof(rel->pgstat_info));

What seems to have happened here is the field was changed to become a
pointer in 77947c51c.  It's not incorrect to use MemSet() to zero out
the pointer field.  What it does probably do is confuse the casual
reader into thinking the field is a struct rather than a pointer to
one.   It's probably worth making that consistent with the other
fields so nobody gets confused.

Can you add a CF entry for PG16 for this so we come back to it after we branch?

David

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

17 May 2022, 13:33:57

Em seg., 16 de mai. de 2022 às 20:26, David Rowley <dgrowleyml@gmail.com> escreveu:

On Sun, 15 May 2022 at 09:47, Ranier Vilela <ranier.vf@gmail.com> wrote:
> At function load_relcache_init_file, there is an unnecessary function call,
> to initialize pgstat_info pointer to NULL.
>
> MemSet(&rel->pgstat_info, 0, sizeof(rel->pgstat_info));

What seems to have happened here is the field was changed to become a
pointer in 77947c51c. It's not incorrect to use MemSet() to zero out
the pointer field. What it does probably do is confuse the casual
reader into thinking the field is a struct rather than a pointer to
one. It's probably worth making that consistent with the other
fields so nobody gets confused.

Can you add a CF entry for PG16 for this so we come back to it after we branch?

Of course.

I will add it.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

17 May 2022, 22:52:30

Em ter., 17 de mai. de 2022 às 10:33, Ranier Vilela <ranier.vf@gmail.com> escreveu:

Em seg., 16 de mai. de 2022 às 20:26, David Rowley <dgrowleyml@gmail.com> escreveu:
On Sun, 15 May 2022 at 09:47, Ranier Vilela <ranier.vf@gmail.com> wrote:
> At function load_relcache_init_file, there is an unnecessary function call,
> to initialize pgstat_info pointer to NULL.
>
> MemSet(&rel->pgstat_info, 0, sizeof(rel->pgstat_info));

What seems to have happened here is the field was changed to become a
pointer in 77947c51c. It's not incorrect to use MemSet() to zero out
the pointer field. What it does probably do is confuse the casual
reader into thinking the field is a struct rather than a pointer to
one. It's probably worth making that consistent with the other
fields so nobody gets confused.

Can you add a CF entry for PG16 for this so we come back to it after we branch?
Of course.
I will add it.

Created https://commitfest.postgresql.org/38/3640/

However, I would like to add more.
I found, I believe, a serious problem of incorrect usage of the memset api.
Historically, people have relied on using memset or MemSet, using the variable name as an argument for the sizeof.
While it works correctly, for arrays, when it comes to pointers to structures, things go awry.

#include <stdio.h>

struct test_t
{
double b;
int a;
char c;
};

typedef struct test_t Test;

int main()
{
Test * my_test;

printf("Sizeof pointer=%u\n", sizeof(my_test));
printf("Sizeof struct=%u\n", sizeof(Test));
}

Output:
Sizeof pointer=8
Sizeof struct=16

So throughout the code there are these misuses.

So, taking advantage of this CF I'm going to add one more big patch, with suggestions to fix the calls.

This pass vcregress check.

regards,

Ranier Vilela

Attachment

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Justin Pryzby

Date:

17 May 2022, 23:18:55

On Tue, May 17, 2022 at 07:52:30PM -0300, Ranier Vilela wrote:
> I found, I believe, a serious problem of incorrect usage of the memset api.
> Historically, people have relied on using memset or MemSet, using the
> variable name as an argument for the sizeof.
> While it works correctly, for arrays, when it comes to pointers to
> structures, things go awry.

Knowing how sizeof() works is required before using it - the same is true for
pointers.

> So throughout the code there are these misuses.

Why do you think it's a misuse ?

Take the first one as an example.  It says:

        GenericCosts costs;
        MemSet(&costs, 0, sizeof(costs));

You sent a patch to change it to sizeof(GenericCosts).

But it's not a pointer, so they are the same.

Is that true for every change in your patch ?

-- 
Justin

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Tom Lane

Date:

17 May 2022, 23:22:20

Ranier Vilela <ranier.vf@gmail.com> writes:
> I found, I believe, a serious problem of incorrect usage of the memset api.
> Historically, people have relied on using memset or MemSet, using the
> variable name as an argument for the sizeof.
> While it works correctly, for arrays, when it comes to pointers to
> structures, things go awry.

You'll have to convince people that any of these places are in
fact incorrect.  Everyone who's used C for any length of time
is well aware of the possibility of getting sizeof() wrong in
this sort of context, and I think we've been careful about it.

Also, as a stylistic matter I think it's best to write
"memset(&x, 0, sizeof(x))" where we can.  Replacing sizeof(x)
with sizeof(some type name) has its own risks of error, and
therefore is not automatically an improvement.

            regards, tom lane

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

18 May 2022, 00:08:59

Em ter., 17 de mai. de 2022 às 20:18, Justin Pryzby <pryzby@telsasoft.com> escreveu:

On Tue, May 17, 2022 at 07:52:30PM -0300, Ranier Vilela wrote:
> I found, I believe, a serious problem of incorrect usage of the memset api.
> Historically, people have relied on using memset or MemSet, using the
> variable name as an argument for the sizeof.
> While it works correctly, for arrays, when it comes to pointers to
> structures, things go awry.

Knowing how sizeof() works is required before using it - the same is true for
pointers.

> So throughout the code there are these misuses.

Why do you think it's a misuse ?

Take the first one as an example. It says:

GenericCosts costs;
MemSet(&costs, 0, sizeof(costs));

You sent a patch to change it to sizeof(GenericCosts).

But it's not a pointer, so they are the same.

Is that true for every change in your patch ?

It seems true, sorry.

Thanks Justin for pointing out my big mistake.

I hope this isn't all wasted work, but should I remove the 002 patch.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Alvaro Herrera

Date:

18 May 2022, 08:54:58

This one caught my attention:

diff --git a/contrib/pgcrypto/crypt-blowfish.c b/contrib/pgcrypto/crypt-blowfish.c
index a663852ccf..63fcef562d 100644
--- a/contrib/pgcrypto/crypt-blowfish.c
+++ b/contrib/pgcrypto/crypt-blowfish.c
@@ -750,7 +750,7 @@ _crypt_blowfish_rn(const char *key, const char *setting,
 /* Overwrite the most obvious sensitive data we have on the stack. Note
  * that this does not guarantee there's no sensitive data left on the
  * stack and/or in registers; I'm not aware of portable code that does. */
-    px_memset(&data, 0, sizeof(data));
+    px_memset(&data, 0, sizeof(struct data));
 
     return output;
 }

The curious thing here is that sizeof(data) is correct, because it
refers to a variable defined earlier in that function, whose type is an
anonymous struct declared there.  But I don't know what "struct data"
refers to, precisely because that struct is unnamed.  Am I misreading it?


Also:

diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index e1048e47ff..87be62f023 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -601,7 +601,7 @@ pgstathashindex(PG_FUNCTION_ARGS)
                  errmsg("cannot access temporary indexes of other sessions")));
 
     /* Get the information we need from the metapage. */
-    memset(&stats, 0, sizeof(stats));
+    memset(&stats, 0, sizeof(HashIndexStat));
     metabuf = _hash_getbuf(rel, HASH_METAPAGE, HASH_READ, LH_META_PAGE);
     metap = HashPageGetMeta(BufferGetPage(metabuf));
     stats.version = metap->hashm_version;

I think the working theory here is that the original line is correct
now, and it continues to be correct if somebody edits the function and
makes variable 'stats' be of a different type.  But if you change the
sizeof() to use the type name, then there are two places that you need
to edit, and they are not necessarily close together; so it is correct
now and could become a bug in the future.  I don't think we're fully
consistent about this, but I think you're proposing to change it in the
opposite direction that we'd prefer.

For the case where the variable is a pointer, the developer could write
'sizeof(*variable)' instead of being forced to specify the type name,
for example (just a random one):

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index a434cf93ef..e92c03686f 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -438,7 +438,7 @@ BloomFillMetapage(Relation index, Page metaPage)
      */
     BloomInitPage(metaPage, BLOOM_META);
     metadata = BloomPageGetMeta(metaPage);
-    memset(metadata, 0, sizeof(BloomMetaPageData));
+    memset(metadata, 0, sizeof(*metadata));
     metadata->magickNumber = BLOOM_MAGICK_NUMBER;
     metadata->opts = *opts;
     ((PageHeader) metaPage)->pd_lower += sizeof(BloomMetaPageData);

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

18 May 2022, 11:36:29

Em qua., 18 de mai. de 2022 às 05:54, Alvaro Herrera <alvherre@alvh.no-ip.org> escreveu:

This one caught my attention:

diff --git a/contrib/pgcrypto/crypt-blowfish.c b/contrib/pgcrypto/crypt-blowfish.c
index a663852ccf..63fcef562d 100644
--- a/contrib/pgcrypto/crypt-blowfish.c
+++ b/contrib/pgcrypto/crypt-blowfish.c
@@ -750,7 +750,7 @@ _crypt_blowfish_rn(const char *key, const char *setting,
/* Overwrite the most obvious sensitive data we have on the stack. Note
* that this does not guarantee there's no sensitive data left on the
* stack and/or in registers; I'm not aware of portable code that does. */
- px_memset(&data, 0, sizeof(data));
+ px_memset(&data, 0, sizeof(struct data));

return output;
}

The curious thing here is that sizeof(data) is correct, because it
refers to a variable defined earlier in that function, whose type is an
anonymous struct declared there. But I don't know what "struct data"
refers to, precisely because that struct is unnamed. Am I misreading it?

No, you are right.

This is definitely wrong.

Also:

diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index e1048e47ff..87be62f023 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -601,7 +601,7 @@ pgstathashindex(PG_FUNCTION_ARGS)
errmsg("cannot access temporary indexes of other sessions")));

/* Get the information we need from the metapage. */
- memset(&stats, 0, sizeof(stats));
+ memset(&stats, 0, sizeof(HashIndexStat));
metabuf = _hash_getbuf(rel, HASH_METAPAGE, HASH_READ, LH_META_PAGE);
metap = HashPageGetMeta(BufferGetPage(metabuf));
stats.version = metap->hashm_version;

I think the working theory here is that the original line is correct
now, and it continues to be correct if somebody edits the function and
makes variable 'stats' be of a different type. But if you change the
sizeof() to use the type name, then there are two places that you need
to edit, and they are not necessarily close together; so it is correct
now and could become a bug in the future. I don't think we're fully
consistent about this, but I think you're proposing to change it in the
opposite direction that we'd prefer.

Yes. I think that only advantage using the name of structure is

when you read the line of MemSet, you know what kind type

is filled.

For the case where the variable is a pointer, the developer could write
'sizeof(*variable)' instead of being forced to specify the type name,
for example (just a random one):

Could have used this style to make the patch.
But the intention was to correct a possible misinterpretation,

which in this case, showed that I was totally wrong.

Sorry by the noise.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Peter Eisentraut

Date:

18 May 2022, 13:52:15

On 18.05.22 01:18, Justin Pryzby wrote:
> Take the first one as an example.  It says:
> 
>          GenericCosts costs;
>          MemSet(&costs, 0, sizeof(costs));
> 
> You sent a patch to change it to sizeof(GenericCosts).
> 
> But it's not a pointer, so they are the same.

This instance can more easily be written as

     costs = {0};

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

18 May 2022, 14:08:08

Em qua., 18 de mai. de 2022 às 10:52, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:

On 18.05.22 01:18, Justin Pryzby wrote:
> Take the first one as an example. It says:
>
> GenericCosts costs;
> MemSet(&costs, 0, sizeof(costs));
>
> You sent a patch to change it to sizeof(GenericCosts).
>
> But it's not a pointer, so they are the same.

This instance can more easily be written as

costs = {0};

That would initialize the content at compilation and not at runtime, correct?

And we would avoid MemSet/memset altogether.

There are a lot of cases using MemSet (with struct variables) and at Windows 64 bits, long are 4 (four) bytes.

So I believe that MemSet is less efficient on Windows than on Linux.

"The size of the '_vstart' buffer is not a multiple of the element size of the type 'long'."

message from PVS-Studio static analysis tool.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

David Rowley

Date:

18 May 2022, 22:57:02

On Thu, 19 May 2022 at 02:08, Ranier Vilela <ranier.vf@gmail.com> wrote:
> That would initialize the content at compilation and not at runtime, correct?

Your mental model of compilation and run-time might be flawed here.
Here's no such thing as zeroing memory at compile time. There's only
emitting instructions that perform those tasks at run-time.
https://godbolt.org/ might help your understanding.

> There are a lot of cases using MemSet (with struct variables) and at Windows 64 bits, long are 4 (four) bytes.
> So I believe that MemSet is less efficient on Windows than on Linux.
> "The size of the '_vstart' buffer is not a multiple of the element size of the type 'long'."
> message from PVS-Studio static analysis tool.

I've been wondering for a while if we really need to have the MemSet()
macro. I see it was added in 8cb415449 (1997).  I think compilers have
evolved quite a bit in the past 25 years, so it could be time to
revisit that.

Your comment on the sizeof(long) on win64 is certainly true.  I wrote
the attached C program to test the performance difference.

(windows 64-bit)
>cl memset.c /Ox
>memset 200000000
Running 200000000 loops
MemSet: size 8: 1.833000 seconds
MemSet: size 16: 1.841000 seconds
MemSet: size 32: 1.838000 seconds
MemSet: size 64: 1.851000 seconds
MemSet: size 128: 3.228000 seconds
MemSet: size 256: 5.278000 seconds
MemSet: size 512: 3.943000 seconds
memset: size 8: 0.065000 seconds
memset: size 16: 0.131000 seconds
memset: size 32: 0.262000 seconds
memset: size 64: 0.530000 seconds
memset: size 128: 1.169000 seconds
memset: size 256: 2.950000 seconds
memset: size 512: 3.191000 seconds

It seems like there's no cases there where MemSet is faster than
memset.  I was careful to only provide MemSet() with inputs that
result in it not using the memset fallback.  I also provided constants
so that the decision about which method to use was known at compile
time.

It's not clear to me why 512 is faster than 256. I saw the same on a repeat run.

Changing "long" to "long long" it looks like:

>memset 200000000
Running 200000000 loops
MemSet: size 8: 0.066000 seconds
MemSet: size 16: 1.978000 seconds
MemSet: size 32: 1.982000 seconds
MemSet: size 64: 1.973000 seconds
MemSet: size 128: 1.970000 seconds
MemSet: size 256: 3.225000 seconds
MemSet: size 512: 5.366000 seconds
memset: size 8: 0.069000 seconds
memset: size 16: 0.132000 seconds
memset: size 32: 0.265000 seconds
memset: size 64: 0.527000 seconds
memset: size 128: 1.161000 seconds
memset: size 256: 2.976000 seconds
memset: size 512: 3.179000 seconds

The situation is a little different on my Linux machine:

$ gcc memset.c -o memset -O2
$ ./memset 200000000
Running 200000000 loops
MemSet: size 8: 0.000002 seconds
MemSet: size 16: 0.000000 seconds
MemSet: size 32: 0.094041 seconds
MemSet: size 64: 0.184618 seconds
MemSet: size 128: 1.781503 seconds
MemSet: size 256: 2.547910 seconds
MemSet: size 512: 4.005173 seconds
memset: size 8: 0.046156 seconds
memset: size 16: 0.046123 seconds
memset: size 32: 0.092291 seconds
memset: size 64: 0.184509 seconds
memset: size 128: 1.781518 seconds
memset: size 256: 2.577104 seconds
memset: size 512: 4.004757 seconds

It looks like part of the work might be getting optimised away in the
8-16 MemSet() calls.

clang seems to have the opposite for size 8.

$ clang memset.c -o memset -O2
$ ./memset 200000000
Running 200000000 loops
MemSet: size 8: 0.007653 seconds
MemSet: size 16: 0.005771 seconds
MemSet: size 32: 0.011539 seconds
MemSet: size 64: 0.023095 seconds
MemSet: size 128: 0.046130 seconds
MemSet: size 256: 0.092269 seconds
MemSet: size 512: 0.968564 seconds
memset: size 8: 0.000000 seconds
memset: size 16: 0.005776 seconds
memset: size 32: 0.011559 seconds
memset: size 64: 0.023069 seconds
memset: size 128: 0.046129 seconds
memset: size 256: 0.092243 seconds
memset: size 512: 0.968534 seconds

There does not seem to be any significant reduction in the size of the
binary from changing the MemSet macro to directly use memset. It went
from 9865008 bytes down to 9860800 bytes (4208 bytes less).

David

Attachment

memset.c

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Tom Lane

Date:

18 May 2022, 23:20:13

David Rowley <dgrowleyml@gmail.com> writes:
> I've been wondering for a while if we really need to have the MemSet()
> macro. I see it was added in 8cb415449 (1997).  I think compilers have
> evolved quite a bit in the past 25 years, so it could be time to
> revisit that.

Yeah, I've thought for awhile that technology has moved on from that.
Nobody's really taken the trouble to measure it though.  (And no,
results from one compiler on one machine are not terribly convincing.)

The thing that makes this a bit more difficult than it might be is
the special cases we have for known-aligned and so on targets, which
are particularly critical for palloc0 and makeNode etc.  So there's
more than one case to look into.  But I'd argue that those special
cases are actually what we want to worry about the most: zeroing
relatively small, known-aligned node structs is THE use case.

            regards, tom lane

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

18 May 2022, 23:47:14

Em qua., 18 de mai. de 2022 às 19:57, David Rowley <dgrowleyml@gmail.com> escreveu:

On Thu, 19 May 2022 at 02:08, Ranier Vilela <ranier.vf@gmail.com> wrote:
> That would initialize the content at compilation and not at runtime, correct?

Your mental model of compilation and run-time might be flawed here.
Here's no such thing as zeroing memory at compile time. There's only
emitting instructions that perform those tasks at run-time.
https://godbolt.org/ might help your understanding.

> There are a lot of cases using MemSet (with struct variables) and at Windows 64 bits, long are 4 (four) bytes.
> So I believe that MemSet is less efficient on Windows than on Linux.
> "The size of the '_vstart' buffer is not a multiple of the element size of the type 'long'."
> message from PVS-Studio static analysis tool.

I've been wondering for a while if we really need to have the MemSet()
macro. I see it was added in 8cb415449 (1997). I think compilers have
evolved quite a bit in the past 25 years, so it could be time to
revisit that.

All compilers currently have memset optimized.

Your comment on the sizeof(long) on win64 is certainly true. I wrote
the attached C program to test the performance difference.

(windows 64-bit)
>cl memset.c /Ox
>memset 200000000
Running 200000000 loops
MemSet: size 8: 1.833000 seconds
MemSet: size 16: 1.841000 seconds
MemSet: size 32: 1.838000 seconds
MemSet: size 64: 1.851000 seconds
MemSet: size 128: 3.228000 seconds
MemSet: size 256: 5.278000 seconds
MemSet: size 512: 3.943000 seconds
memset: size 8: 0.065000 seconds
memset: size 16: 0.131000 seconds
memset: size 32: 0.262000 seconds
memset: size 64: 0.530000 seconds
memset: size 128: 1.169000 seconds
memset: size 256: 2.950000 seconds
memset: size 512: 3.191000 seconds

It seems like there's no cases there where MemSet is faster than
memset. I was careful to only provide MemSet() with inputs that
result in it not using the memset fallback. I also provided constants
so that the decision about which method to use was known at compile
time.

It's not clear to me why 512 is faster than 256.

Probably broken alignment with 256?

Another warning from PVS-Studio:

[1] "The pointer '_start' is cast to a more strictly aligned pointer type."

src/contrib/postgres_fdw/connection.c (Line 1690)

MemSet(values, 0, sizeof(values));

I saw the same on a repeat run.

Changing "long" to "long long" it looks like:

>memset 200000000
Running 200000000 loops
MemSet: size 8: 0.066000 seconds
MemSet: size 16: 1.978000 seconds
MemSet: size 32: 1.982000 seconds
MemSet: size 64: 1.973000 seconds
MemSet: size 128: 1.970000 seconds
MemSet: size 256: 3.225000 seconds
MemSet: size 512: 5.366000 seconds
memset: size 8: 0.069000 seconds
memset: size 16: 0.132000 seconds
memset: size 32: 0.265000 seconds
memset: size 64: 0.527000 seconds
memset: size 128: 1.161000 seconds
memset: size 256: 2.976000 seconds
memset: size 512: 3.179000 seconds

The situation is a little different on my Linux machine:

$ gcc memset.c -o memset -O2
$ ./memset 200000000
Running 200000000 loops
MemSet: size 8: 0.000002 seconds
MemSet: size 16: 0.000000 seconds
MemSet: size 32: 0.094041 seconds
MemSet: size 64: 0.184618 seconds
MemSet: size 128: 1.781503 seconds
MemSet: size 256: 2.547910 seconds
MemSet: size 512: 4.005173 seconds
memset: size 8: 0.046156 seconds
memset: size 16: 0.046123 seconds
memset: size 32: 0.092291 seconds
memset: size 64: 0.184509 seconds
memset: size 128: 1.781518 seconds
memset: size 256: 2.577104 seconds
memset: size 512: 4.004757 seconds

It looks like part of the work might be getting optimised away in the
8-16 MemSet() calls.

On linux (long) have 8 bytes.

I'm still surprised that MemSet (8/16) is faster.

clang seems to have the opposite for size 8.

$ clang memset.c -o memset -O2
$ ./memset 200000000
Running 200000000 loops
MemSet: size 8: 0.007653 seconds
MemSet: size 16: 0.005771 seconds
MemSet: size 32: 0.011539 seconds
MemSet: size 64: 0.023095 seconds
MemSet: size 128: 0.046130 seconds
MemSet: size 256: 0.092269 seconds
MemSet: size 512: 0.968564 seconds
memset: size 8: 0.000000 seconds
memset: size 16: 0.005776 seconds
memset: size 32: 0.011559 seconds
memset: size 64: 0.023069 seconds
memset: size 128: 0.046129 seconds
memset: size 256: 0.092243 seconds
memset: size 512: 0.968534 seconds

There does not seem to be any significant reduction in the size of the
binary from changing the MemSet macro to directly use memset. It went
from 9865008 bytes down to 9860800 bytes (4208 bytes less).

Anyway I think on Windows 64 bits,

it is very worthwhile to remove the MemSet macro.

regards,

Ranier Vilela

[1] https://pvs-studio.com/en/docs/warnings/v1032/

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

18 May 2022, 23:51:01

Em qua., 18 de mai. de 2022 às 20:20, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

zeroing
relatively small, known-aligned node structs is THE use case.

Currently, especially on 64-bit Windows, MemSet can break alignment.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

19 May 2022, 00:04:23

Em qua., 18 de mai. de 2022 às 19:57, David Rowley <dgrowleyml@gmail.com> escreveu:

On Thu, 19 May 2022 at 02:08, Ranier Vilela <ranier.vf@gmail.com> wrote:
> That would initialize the content at compilation and not at runtime, correct?

Your mental model of compilation and run-time might be flawed here.
Here's no such thing as zeroing memory at compile time. There's only
emitting instructions that perform those tasks at run-time.
https://godbolt.org/ might help your understanding.

> There are a lot of cases using MemSet (with struct variables) and at Windows 64 bits, long are 4 (four) bytes.
> So I believe that MemSet is less efficient on Windows than on Linux.
> "The size of the '_vstart' buffer is not a multiple of the element size of the type 'long'."
> message from PVS-Studio static analysis tool.

I've been wondering for a while if we really need to have the MemSet()
macro. I see it was added in 8cb415449 (1997). I think compilers have
evolved quite a bit in the past 25 years, so it could be time to
revisit that.

Your comment on the sizeof(long) on win64 is certainly true. I wrote
the attached C program to test the performance difference.

(windows 64-bit)
>cl memset.c /Ox
>memset 200000000
Running 200000000 loops
MemSet: size 8: 1.833000 seconds
MemSet: size 16: 1.841000 seconds
MemSet: size 32: 1.838000 seconds
MemSet: size 64: 1.851000 seconds
MemSet: size 128: 3.228000 seconds
MemSet: size 256: 5.278000 seconds
MemSet: size 512: 3.943000 seconds
memset: size 8: 0.065000 seconds
memset: size 16: 0.131000 seconds
memset: size 32: 0.262000 seconds
memset: size 64: 0.530000 seconds
memset: size 128: 1.169000 seconds
memset: size 256: 2.950000 seconds
memset: size 512: 3.191000 seconds

It seems like there's no cases there where MemSet is faster than
memset. I was careful to only provide MemSet() with inputs that
result in it not using the memset fallback. I also provided constants
so that the decision about which method to use was known at compile
time.

It's not clear to me why 512 is faster than 256. I saw the same on a repeat run.

Changing "long" to "long long" it looks like:

>memset 200000000
Running 200000000 loops
MemSet: size 8: 0.066000 seconds
MemSet: size 16: 1.978000 seconds
MemSet: size 32: 1.982000 seconds
MemSet: size 64: 1.973000 seconds
MemSet: size 128: 1.970000 seconds
MemSet: size 256: 3.225000 seconds
MemSet: size 512: 5.366000 seconds
memset: size 8: 0.069000 seconds
memset: size 16: 0.132000 seconds
memset: size 32: 0.265000 seconds
memset: size 64: 0.527000 seconds
memset: size 128: 1.161000 seconds
memset: size 256: 2.976000 seconds
memset: size 512: 3.179000 seconds

The situation is a little different on my Linux machine:

$ gcc memset.c -o memset -O2
$ ./memset 200000000
Running 200000000 loops
MemSet: size 8: 0.000002 seconds
MemSet: size 16: 0.000000 seconds
MemSet: size 32: 0.094041 seconds
MemSet: size 64: 0.184618 seconds
MemSet: size 128: 1.781503 seconds
MemSet: size 256: 2.547910 seconds
MemSet: size 512: 4.005173 seconds
memset: size 8: 0.046156 seconds
memset: size 16: 0.046123 seconds
memset: size 32: 0.092291 seconds
memset: size 64: 0.184509 seconds
memset: size 128: 1.781518 seconds
memset: size 256: 2.577104 seconds
memset: size 512: 4.004757 seconds

It looks like part of the work might be getting optimised away in the
8-16 MemSet() calls.

clang seems to have the opposite for size 8.

$ clang memset.c -o memset -O2
$ ./memset 200000000
Running 200000000 loops
MemSet: size 8: 0.007653 seconds
MemSet: size 16: 0.005771 seconds
MemSet: size 32: 0.011539 seconds
MemSet: size 64: 0.023095 seconds
MemSet: size 128: 0.046130 seconds
MemSet: size 256: 0.092269 seconds
MemSet: size 512: 0.968564 seconds
memset: size 8: 0.000000 seconds
memset: size 16: 0.005776 seconds
memset: size 32: 0.011559 seconds
memset: size 64: 0.023069 seconds
memset: size 128: 0.046129 seconds
memset: size 256: 0.092243 seconds
memset: size 512: 0.968534 seconds

The results from clang, only reinforce the argument in favor of native memset.
There is still room for gcc to improve with 8/16 bytes and for sure at some point they will.

Which will make memset faster on all platforms and compilers.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

19 May 2022, 16:09:46

Taking it a step further.

Created a new patch into commitfest, targeting 16 version.

https://commitfest.postgresql.org/38/3645/

Currently native memset is well optimized on several platforms, including Windows 64 bits [1].

However, even the native memset has problems,
I redid the David's memset.c test:

C:\usr\src\tests\memset>memset2 2000000000
Running 2000000000 loops
MemSet: size 8: 6.635000 seconds
MemSet: size 16: 6.594000 seconds
MemSet: size 32: 6.694000 seconds
MemSet: size 64: 9.002000 seconds
MemSet: size 128: 10.598000 seconds
MemSet: size 256: 25.061000 seconds
MemSet: size 512: 27.365000 seconds
memset: size 8: 0.594000 seconds
memset: size 16: 0.595000 seconds
memset: size 32: 1.189000 seconds
memset: size 64: 2.378000 seconds
memset: size 128: 4.753000 seconds
memset: size 256: 24.391000 seconds
memset: size 512: 27.064000 seconds

Both MemSet/memset perform very poorly with 256/512.

But, I believe it is worth removing the use of MemSet, because the usage is empirical and has been mixed with memset in several places in the code, without any criteria.
Using just memset makes the mental process of using it more simplified and it seems like there aren't any regressions when removing the use of MemSet.

Windows 10 64 bit
msvc 2019 64 bit

RAM 8GB

SSD 256GB

Postgres (15beta1 with original configuration)

1. pgbench -c 50 -T 300 -S -n -U postgres

HEAD:
pgbench (15beta1)
transaction type: <builtin: select only>
scaling factor: 1
query mode: simple
number of clients: 50
number of threads: 1
maximum number of tries: 1
duration: 300 s
number of transactions actually processed: 10448967
number of failed transactions: 0 (0.000%)
latency average = 1.432 ms
initial connection time = 846.186 ms
tps = 34926.861987 (without initial connection time)

PATCHED (without MemSet)
pgbench (15beta1)
transaction type: <builtin: select only>
scaling factor: 1
query mode: simple
number of clients: 50
number of threads: 1
maximum number of tries: 1
duration: 300 s
number of transactions actually processed: 10655332
number of failed transactions: 0 (0.000%)
latency average = 1.404 ms
initial connection time = 866.203 ms
tps = 35621.045750 (without initial connection time)

CREATE TABLE t_test (x numeric);
INSERT INTO t_test SELECT random()
FROM generate_series(1, 5000000);
ANALYZE;
SHOW work_mem;

HEAD:
postgres=# explain analyze SELECT * FROM t_test ORDER BY x;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Gather Merge (cost=397084.73..883229.71 rows=4166666 width=11) (actual time=1328.331..2743.310 rows=5000000 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=396084.71..401293.04 rows=2083333 width=11) (actual time=1278.442..1513.510 rows=1666667 loops=3)
Sort Key: x
Sort Method: external merge Disk: 25704kB
Worker 0: Sort Method: external merge Disk: 23960kB
Worker 1: Sort Method: external merge Disk: 23960kB
-> Parallel Seq Scan on t_test (cost=0.00..47861.33 rows=2083333 width=11) (actual time=0.234..128.607 rows=1666667 loops=3)
Planning Time: 0.064 ms
Execution Time: 2863.381 ms
(11 rows)

PATCHED:
postgres=# explain analyze SELECT * FROM t_test ORDER BY x;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Gather Merge (cost=397084.73..883229.71 rows=4166666 width=11) (actual time=1309.703..2705.027 rows=5000000 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=396084.71..401293.04 rows=2083333 width=11) (actual time=1281.111..1515.928 rows=1666667 loops=3)
Sort Key: x
Sort Method: external merge Disk: 24880kB
Worker 0: Sort Method: external merge Disk: 24776kB
Worker 1: Sort Method: external merge Disk: 23960kB
-> Parallel Seq Scan on t_test (cost=0.00..47861.33 rows=2083333 width=11) (actual time=0.260..130.277 rows=1666667 loops=3)
Planning Time: 0.060 ms
Execution Time: 2825.201 ms
(11 rows)

I leave MemSetAligned and MemSetLoop to another step.

regards,

Ranier Vilela

[1] https://msrc-blog.microsoft.com/2021/01/11/building-faster-amd64-memset-routines/

Attachment

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

David Rowley

Date:

19 May 2022, 23:20:34

On Thu, 19 May 2022 at 11:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The thing that makes this a bit more difficult than it might be is
> the special cases we have for known-aligned and so on targets, which
> are particularly critical for palloc0 and makeNode etc.  So there's
> more than one case to look into.  But I'd argue that those special
> cases are actually what we want to worry about the most: zeroing
> relatively small, known-aligned node structs is THE use case.

I think the makeNode() trickery would be harder to get rid of, or for
that matter, anything where the size/alignment is unknown at compile
time.  I think the more interesting ones that we might be able to get
rid of are the ones where the alignment and size *are* known at
compile time. Also probably anything that passes a compile-time const
that's not 0 will fallback on memset anyway, so might as well be
removed to tidy things up.

It just all seems a bit untidy when you look at functions like
ExecStoreAllNullTuple() which use a mix of memset and MemSet without
any apparent explanation of why. That particular one is likely that
way due to the first size guaranteed to be multiples of sizeof(Datum)
and the latter not.

Naturally, we'd need to run enough benchmarks to prove to ourselves
that we're not causing any slowdowns.  The intention of memset.c was
to try to put something out there that people could test so we could
get an idea if there are any machines/compilers that we might need to
be concerned about.

David

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Peter Eisentraut

Date:

30 June 2022, 22:37:23

On 19.05.22 18:09, Ranier Vilela wrote:
> Taking it a step further.
> Created a new patch into commitfest, targeting 16 version.
> https://commitfest.postgresql.org/38/3645/ 
> <https://commitfest.postgresql.org/38/3645/>

I have committed your 001 patch, which was clearly a (harmless) mistake.

I have also committed a patch that gets rid of MemSet() calls where the 
value is a constant not-0, because that just falls back to memset() anyway.

I'm on board with trying to get rid of MemSet(), but first I need to 
analyze all the performance numbers and arguments that were shown in 
this thread.

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

01 July 2022, 11:10:37

Em qui., 30 de jun. de 2022 às 19:37, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:

On 19.05.22 18:09, Ranier Vilela wrote:
> Taking it a step further.
> Created a new patch into commitfest, targeting 16 version.
> https://commitfest.postgresql.org/38/3645/
> <https://commitfest.postgresql.org/38/3645/>

I have committed your 001 patch, which was clearly a (harmless) mistake.

Thank you.

I have also committed a patch that gets rid of MemSet() calls where the
value is a constant not-0, because that just falls back to memset() anyway.

I'm on board with trying to get rid of MemSet(), but first I need to
analyze all the performance numbers and arguments that were shown in
this thread.

One good argument is that using memset, allows to compiler

analyze and remove completely memset call if he understands

that can do it, which with MemSet is not possible.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

01 July 2022, 15:58:14

Em qui., 30 de jun. de 2022 às 19:37, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:

I have also committed a patch that gets rid of MemSet() calls where the
value is a constant not-0, because that just falls back to memset() anyway.

Peter there are some missing paths in this commit.

Despite having included the attached patch, there is no need to credit me as the author, just as a report.

regards,

Ranier Vilela

Attachment

001-avoid-unecessary-MemSet-calls.patch

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Peter Eisentraut

Date:

07 July 2022, 11:00:15

On 18.05.22 15:52, Peter Eisentraut wrote:
> On 18.05.22 01:18, Justin Pryzby wrote:
>> Take the first one as an example.  It says:
>>
>>          GenericCosts costs;
>>          MemSet(&costs, 0, sizeof(costs));
>>
>> You sent a patch to change it to sizeof(GenericCosts).
>>
>> But it's not a pointer, so they are the same.
> 
> This instance can more easily be written as
> 
>      costs = {0};

The attached patch replaces all MemSet() calls with struct 
initialization where that is easily possible.  (For example, some cases 
have to worry about padding bits, so I left those.)

This reduces the number of MemSet() calls from about 400 to about 200. 
Maybe this can help simplify the investigation of the merits of the 
remaining calls.

Attachment

0001-WIP-Replace-MemSet-calls-with-struct-initialization.patch

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Alvaro Herrera

Date:

07 July 2022, 11:16:07

On 2022-Jul-07, Peter Eisentraut wrote:

> diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
> index 4445a86aee..79b23fa7d7 100644
> --- a/src/bin/pg_basebackup/pg_basebackup.c
> +++ b/src/bin/pg_basebackup/pg_basebackup.c

> @@ -1952,7 +1948,6 @@ BaseBackup(char *compression_algorithm, char *compression_detail,
>      else
>          starttli = latesttli;
>      PQclear(res);
> -    MemSet(xlogend, 0, sizeof(xlogend));
>  
>      if (verbose && includewal != NO_WAL)
>          pg_log_info("write-ahead log start point: %s on timeline %u",

You removed the MemSet here, but there's no corresponding
initialization.

> diff --git a/src/port/snprintf.c b/src/port/snprintf.c
> index abb1c59770..e646b0e642 100644
> --- a/src/port/snprintf.c
> +++ b/src/port/snprintf.c
> @@ -756,12 +756,9 @@ find_arguments(const char *format, va_list args,
>      int            longflag;
>      int            fmtpos;
>      int            i;
> -    int            last_dollar;
> -    PrintfArgType argtypes[PG_NL_ARGMAX + 1];
> -
>      /* Initialize to "no dollar arguments known" */
> -    last_dollar = 0;
> -    MemSet(argtypes, 0, sizeof(argtypes));
> +    int            last_dollar = 0;
> +    PrintfArgType argtypes[PG_NL_ARGMAX + 1] = {0};

pgindent will insert a blank line before the comment, which I personally
find quite ugly (because it splits the block of declarations).

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte"
(Ijon Tichy en Viajes, Stanislaw Lem)

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Peter Eisentraut

Date:

07 July 2022, 11:44:38

On 01.07.22 17:58, Ranier Vilela wrote:
> Em qui., 30 de jun. de 2022 às 19:37, Peter Eisentraut 
> <peter.eisentraut@enterprisedb.com 
> <mailto:peter.eisentraut@enterprisedb.com>> escreveu:
> 
>     I have also committed a patch that gets rid of MemSet() calls where the
>     value is a constant not-0, because that just falls back to memset()
>     anyway.
> 
> Peter there are some missing paths in this commit.

As I wrote in the commit message:

(There are a few MemSet() calls that I didn't change to maintain the 
consistency with their surrounding code.)

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

07 July 2022, 11:53:29

Em qui., 7 de jul. de 2022 às 08:00, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:

On 18.05.22 15:52, Peter Eisentraut wrote:
> On 18.05.22 01:18, Justin Pryzby wrote:
>> Take the first one as an example. It says:
>>
>>          GenericCosts costs;
>>          MemSet(&costs, 0, sizeof(costs));
>>
>> You sent a patch to change it to sizeof(GenericCosts).
>>
>> But it's not a pointer, so they are the same.
>
> This instance can more easily be written as
>
>     costs = {0};

The attached patch replaces all MemSet() calls with struct
initialization where that is easily possible. (For example, some cases
have to worry about padding bits, so I left those.)

Sounds great.

#include <stdio.h>
#include <string.h>

int main(void) {
    bool nulls[4] = {0};
    int i;

    memset(nulls, 0, sizeof(nulls));

    for(i = 0; i < 4; i++)
    {
        nulls[i] = 0;
    }

    return 1;
}

main:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     DWORD PTR [rbp-8], 0 // bool nulls[4] = {0};         lea     rax, [rbp-8]
        mov     edx, 4
        mov     esi, 0
        mov     rdi, rax
        call    memset
        mov     DWORD PTR [rbp-4], 0
        jmp     .L2
.L3:
        mov     eax, DWORD PTR [rbp-4]
        cdqe
        mov     BYTE PTR [rbp-8+rax], 0
        add     DWORD PTR [rbp-4], 1
.L2:
        cmp     DWORD PTR [rbp-4], 3
        jle     .L3
        mov     eax, 1
        leave
        ret

Only one line using {0}.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

07 July 2022, 12:45:34

Em qui., 7 de jul. de 2022 às 08:00, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:

>diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
>index 41b31c5c6f..803d169f57 100644
>--- a/src/backend/access/transam/twophase.c
>+++ b/src/backend/access/transam/twophase.c
>@@ -780,8 +780,8 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
> {
> GlobalTransaction gxact = &status->array[status->currIdx++];
> PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
>- Datum values[5];
>- bool nulls[5];
>+ Datum values[5] = {0};
>+ bool nulls[5] = {0};

values variable no initialization or MemSet needed.

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 02bd919ff6..61e0f4a29c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -106,8 +106,8 @@ pg_backup_stop(PG_FUNCTION_ARGS)
{
#define PG_STOP_BACKUP_V2_COLS 3
TupleDesc tupdesc;
- Datum values[PG_STOP_BACKUP_V2_COLS];
- bool nulls[PG_STOP_BACKUP_V2_COLS];
+ Datum values[PG_STOP_BACKUP_V2_COLS] = {0};
+ bool nulls[PG_STOP_BACKUP_V2_COLS] = {0};

Same, values variable no initialization or MemSet needed.

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 5f1726c095..17ff617fba 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -1188,9 +1188,6 @@ SetDefaultACL(InternalDefaultACL *iacls)
Acl *old_acl;
Acl *new_acl;
HeapTuple newtuple;
- Datum values[Natts_pg_default_acl];
- bool nulls[Natts_pg_default_acl];
- bool replaces[Natts_pg_default_acl];
int noldmembers;
int nnewmembers;
Oid *oldmembers;
@@ -1341,13 +1338,11 @@ SetDefaultACL(InternalDefaultACL *iacls)
}
else
{
+ Datum values[Natts_pg_default_acl] = {0};
+ bool nulls[Natts_pg_default_acl] = {0};

replaces, can be reduced more one level.

line 1365:

else
{

+bool replaces[Natts_pg_default_acl] = {0};

defAclOid = ((Form_pg_default_acl) GETSTRUCT(tuple))->oid;

please, wait a minute, I will produce a new version of your patch, with some changes for your review.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

07 July 2022, 17:01:25

Em qui., 7 de jul. de 2022 às 09:45, Ranier Vilela <ranier.vf@gmail.com> escreveu:

Em qui., 7 de jul. de 2022 às 08:00, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:
>diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
>index 41b31c5c6f..803d169f57 100644
>--- a/src/backend/access/transam/twophase.c
>+++ b/src/backend/access/transam/twophase.c
>@@ -780,8 +780,8 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
> {
> GlobalTransaction gxact = &status->array[status->currIdx++];
> PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
>- Datum values[5];
>- bool nulls[5];
>+ Datum values[5] = {0};
>+ bool nulls[5] = {0};

values variable no initialization or MemSet needed.

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 02bd919ff6..61e0f4a29c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -106,8 +106,8 @@ pg_backup_stop(PG_FUNCTION_ARGS)
{
#define PG_STOP_BACKUP_V2_COLS 3
TupleDesc tupdesc;
- Datum values[PG_STOP_BACKUP_V2_COLS];
- bool nulls[PG_STOP_BACKUP_V2_COLS];
+ Datum values[PG_STOP_BACKUP_V2_COLS] = {0};
+ bool nulls[PG_STOP_BACKUP_V2_COLS] = {0};

Same, values variable no initialization or MemSet needed.

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 5f1726c095..17ff617fba 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -1188,9 +1188,6 @@ SetDefaultACL(InternalDefaultACL *iacls)
Acl *old_acl;
Acl *new_acl;
HeapTuple newtuple;
- Datum values[Natts_pg_default_acl];
- bool nulls[Natts_pg_default_acl];
- bool replaces[Natts_pg_default_acl];
int noldmembers;
int nnewmembers;
Oid *oldmembers;
@@ -1341,13 +1338,11 @@ SetDefaultACL(InternalDefaultACL *iacls)
}
else
{
+ Datum values[Natts_pg_default_acl] = {0};
+ bool nulls[Natts_pg_default_acl] = {0};

replaces, can be reduced more one level.

line 1365:
else
{
+bool replaces[Natts_pg_default_acl] = {0};
defAclOid = ((Form_pg_default_acl) GETSTRUCT(tuple))->oid;

please, wait a minute, I will produce a new version of your patch, with some changes for your review.

Attached the v1 of your patch.

I think that all is safe to switch MemSet by {0}.

regards,

Ranier Vilela

Attachment

v1-0001-WIP-Replace-MemSet-calls-with-struct-initialization.patch

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Peter Eisentraut

Date:

11 July 2022, 13:26:16

On 07.07.22 13:16, Alvaro Herrera wrote:
> On 2022-Jul-07, Peter Eisentraut wrote:
> 
>> diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
>> index 4445a86aee..79b23fa7d7 100644
>> --- a/src/bin/pg_basebackup/pg_basebackup.c
>> +++ b/src/bin/pg_basebackup/pg_basebackup.c
> 
>> @@ -1952,7 +1948,6 @@ BaseBackup(char *compression_algorithm, char *compression_detail,
>>       else
>>           starttli = latesttli;
>>       PQclear(res);
>> -    MemSet(xlogend, 0, sizeof(xlogend));
>>   
>>       if (verbose && includewal != NO_WAL)
>>           pg_log_info("write-ahead log start point: %s on timeline %u",
> 
> You removed the MemSet here, but there's no corresponding
> initialization.

Maybe that was an oversight by me, but it seems to me that that 
initialization was useless anyway, since xlogend is later 
unconditionally overwritten anyway.

>> diff --git a/src/port/snprintf.c b/src/port/snprintf.c
>> index abb1c59770..e646b0e642 100644
>> --- a/src/port/snprintf.c
>> +++ b/src/port/snprintf.c
>> @@ -756,12 +756,9 @@ find_arguments(const char *format, va_list args,
>>       int            longflag;
>>       int            fmtpos;
>>       int            i;
>> -    int            last_dollar;
>> -    PrintfArgType argtypes[PG_NL_ARGMAX + 1];
>> -
>>       /* Initialize to "no dollar arguments known" */
>> -    last_dollar = 0;
>> -    MemSet(argtypes, 0, sizeof(argtypes));
>> +    int            last_dollar = 0;
>> +    PrintfArgType argtypes[PG_NL_ARGMAX + 1] = {0};
> 
> pgindent will insert a blank line before the comment, which I personally
> find quite ugly (because it splits the block of declarations).

Yeah.  I think I can convert that to an end-of-line comment instead.

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

11 July 2022, 19:06:33

Em qui., 7 de jul. de 2022 às 14:01, Ranier Vilela <ranier.vf@gmail.com> escreveu:

Attached the v1 of your patch.
I think that all is safe to switch MemSet by {0}.

Here the rebased patch v2, against latest head.

regards,

Ranier Vilela

Attachment

v2-0001-WIP-Replace-MemSet-calls-with-struct-initialization.patch

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Peter Eisentraut

Date:

16 July 2022, 06:58:10

On 11.07.22 21:06, Ranier Vilela wrote:
> Em qui., 7 de jul. de 2022 às 14:01, Ranier Vilela <ranier.vf@gmail.com 
> <mailto:ranier.vf@gmail.com>> escreveu:
> 
>     Attached the v1 of your patch.
>     I think that all is safe to switch MemSet by {0}.
> 
> Here the rebased patch v2, against latest head.

I have committed my patch with Álvaro's comments addressed.

Your patch appears to add in changes that are either arguably out of 
scope or would need further review (e.g., changing memset() calls, 
changing the scope of some variables, changing places that need to worry 
about padding bits).  Please submit separate patches for those, and we 
can continue the analysis.

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

16 July 2022, 19:54:56

Em sáb, 16 de jul de 2022 2:58 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:

On 11.07.22 21:06, Ranier Vilela wrote:
> Em qui., 7 de jul. de 2022 às 14:01, Ranier Vilela <ranier.vf@gmail.com
> <mailto:ranier.vf@gmail.com>> escreveu:
>
> Attached the v1 of your patch.
> I think that all is safe to switch MemSet by {0}.
>
> Here the rebased patch v2, against latest head.

I have committed my patch with Álvaro's comments addressed

I see.

It's annoing that old compiler (gcc 4.7.2) don't handle this style.

Your patch appears to add in changes that are either arguably out of
scope or would need further review (e.g., changing memset() calls,
changing the scope of some variables, changing places that need to worry
about padding bits). Please submit separate patches for those, and we
can continue the analysis.

Sure.

Regards

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

01 August 2022, 17:08:48

Em sáb., 16 de jul. de 2022 às 16:54, Ranier Vilela <ranier.vf@gmail.com> escreveu:

Em sáb, 16 de jul de 2022 2:58 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:
On 11.07.22 21:06, Ranier Vilela wrote:
> Em qui., 7 de jul. de 2022 às 14:01, Ranier Vilela <ranier.vf@gmail.com
> <mailto:ranier.vf@gmail.com>> escreveu:
>
> Attached the v1 of your patch.
> I think that all is safe to switch MemSet by {0}.
>
> Here the rebased patch v2, against latest head.

I have committed my patch with Álvaro's comments addressed
I see.
It's annoing that old compiler (gcc 4.7.2) don't handle this style.

Your patch appears to add in changes that are either arguably out of
scope or would need further review (e.g., changing memset() calls,
changing the scope of some variables, changing places that need to worry
about padding bits). Please submit separate patches for those, and we
can continue the analysis.
Sure.

Hi, sorry for the delay.

Like how https://github.com/postgres/postgres/commit/9fd45870c1436b477264c0c82eb195df52bc0919

New attempt to remove more MemSet calls, that are safe.

Attached v3 patch.

regards,

Ranier Vilela

Attachment

v3-0001-WIP-Replace-MemSet-calls-with-struct-initialization.patch

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Peter Smith

Date:

02 August 2022, 01:19:25

On Tue, Aug 2, 2022 at 3:09 AM Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Em sáb., 16 de jul. de 2022 às 16:54, Ranier Vilela <ranier.vf@gmail.com> escreveu:
>>
>>
>>
>> Em sáb, 16 de jul de 2022 2:58 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:
>>>
>>> On 11.07.22 21:06, Ranier Vilela wrote:
>>> > Em qui., 7 de jul. de 2022 às 14:01, Ranier Vilela <ranier.vf@gmail.com
>>> > <mailto:ranier.vf@gmail.com>> escreveu:
>>> >
>>> >     Attached the v1 of your patch.
>>> >     I think that all is safe to switch MemSet by {0}.
>>> >
>>> > Here the rebased patch v2, against latest head.
>>>
>>> I have committed my patch with Álvaro's comments addressed
>>
>> I see.
>> It's annoing that old compiler (gcc 4.7.2) don't handle this style.
>>
>>>
>>> Your patch appears to add in changes that are either arguably out of
>>> scope or would need further review (e.g., changing memset() calls,
>>> changing the scope of some variables, changing places that need to worry
>>> about padding bits).  Please submit separate patches for those, and we
>>> can continue the analysis.
>>
>> Sure.
>
> Hi, sorry for the delay.
> Like how https://github.com/postgres/postgres/commit/9fd45870c1436b477264c0c82eb195df52bc0919
> New attempt to remove more MemSet calls, that are safe.
>
> Attached v3 patch.
>
> regards,
> Ranier Vilela

Hi, I have not been closely following this thread, but it's starting
to sound very deja-vu with something I proposed 3 years ago. See [1]
"Make use of C99 designated initialisers for nulls/values arrays".
That started off with lots of support, but then there was a suggestion
that the {0} should be implemented as a macro, and the subsequent
discussions about that macro eventually bikeshedded the patch to
death.

It might be a good idea if you check that old thread so you can avoid
the same pitfalls. I hope you have more luck than I did ;-)

------
[1]
https://www.postgresql.org/message-id/flat/2793d0d2-c65f-5db0-4f89-251188438391%40gmail.com#102ee1b34a8341f28758efc347874b8a

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

02 August 2022, 11:55:59

Em seg., 1 de ago. de 2022 às 22:19, Peter Smith <smithpb2250@gmail.com> escreveu:

On Tue, Aug 2, 2022 at 3:09 AM Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Em sáb., 16 de jul. de 2022 às 16:54, Ranier Vilela <ranier.vf@gmail.com> escreveu:
>>
>>
>>
>> Em sáb, 16 de jul de 2022 2:58 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:
>>>
>>> On 11.07.22 21:06, Ranier Vilela wrote:
>>> > Em qui., 7 de jul. de 2022 às 14:01, Ranier Vilela <ranier.vf@gmail.com
>>> > <mailto:ranier.vf@gmail.com>> escreveu:
>>> >
>>> > Attached the v1 of your patch.
>>> > I think that all is safe to switch MemSet by {0}.
>>> >
>>> > Here the rebased patch v2, against latest head.
>>>
>>> I have committed my patch with Álvaro's comments addressed
>>
>> I see.
>> It's annoing that old compiler (gcc 4.7.2) don't handle this style.
>>
>>>
>>> Your patch appears to add in changes that are either arguably out of
>>> scope or would need further review (e.g., changing memset() calls,
>>> changing the scope of some variables, changing places that need to worry
>>> about padding bits). Please submit separate patches for those, and we
>>> can continue the analysis.
>>
>> Sure.
>
> Hi, sorry for the delay.
> Like how https://github.com/postgres/postgres/commit/9fd45870c1436b477264c0c82eb195df52bc0919
> New attempt to remove more MemSet calls, that are safe.
>
> Attached v3 patch.
>
> regards,
> Ranier Vilela

Hi, I have not been closely following this thread, but it's starting
to sound very deja-vu with something I proposed 3 years ago. See [1]
"Make use of C99 designated initialisers for nulls/values arrays".
That started off with lots of support, but then there was a suggestion
that the {0} should be implemented as a macro, and the subsequent
discussions about that macro eventually bikeshedded the patch to
death.

It might be a good idea if you check that old thread so you can avoid
the same pitfalls. I hope you have more luck than I did ;-)

I see, thanks.

We are using only {0}, just to avoid these pitfalls.

All changes here are safe, because, the tradeoff is

MemSet with 0 to {0}

Any else is ignored.

The rest of the calls with MemSet are alignment and padding dependent, and for now, will not be played.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

mahendrakar s

Date:

02 August 2022, 13:17:11

Hi Ranier,

I'm pretty late to thread but would like to know about your claim in the thread:
`All compilers currently have memset optimized.` I know one case of optimization where variable is not used after the memset.

Are the cases for which the optimization is done consistent across all the compilers?

Thanks,

Mahendrakar.

On Tue, 2 Aug 2022 at 17:26, Ranier Vilela <ranier.vf@gmail.com> wrote:

Em seg., 1 de ago. de 2022 às 22:19, Peter Smith <smithpb2250@gmail.com> escreveu:
On Tue, Aug 2, 2022 at 3:09 AM Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Em sáb., 16 de jul. de 2022 às 16:54, Ranier Vilela <ranier.vf@gmail.com> escreveu:
>>
>>
>>
>> Em sáb, 16 de jul de 2022 2:58 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:
>>>
>>> On 11.07.22 21:06, Ranier Vilela wrote:
>>> > Em qui., 7 de jul. de 2022 às 14:01, Ranier Vilela <ranier.vf@gmail.com
>>> > <mailto:ranier.vf@gmail.com>> escreveu:
>>> >
>>> > Attached the v1 of your patch.
>>> > I think that all is safe to switch MemSet by {0}.
>>> >
>>> > Here the rebased patch v2, against latest head.
>>>
>>> I have committed my patch with Álvaro's comments addressed
>>
>> I see.
>> It's annoing that old compiler (gcc 4.7.2) don't handle this style.
>>
>>>
>>> Your patch appears to add in changes that are either arguably out of
>>> scope or would need further review (e.g., changing memset() calls,
>>> changing the scope of some variables, changing places that need to worry
>>> about padding bits). Please submit separate patches for those, and we
>>> can continue the analysis.
>>
>> Sure.
>
> Hi, sorry for the delay.
> Like how https://github.com/postgres/postgres/commit/9fd45870c1436b477264c0c82eb195df52bc0919
> New attempt to remove more MemSet calls, that are safe.
>
> Attached v3 patch.
>
> regards,
> Ranier Vilela

Hi, I have not been closely following this thread, but it's starting
to sound very deja-vu with something I proposed 3 years ago. See [1]
"Make use of C99 designated initialisers for nulls/values arrays".
That started off with lots of support, but then there was a suggestion
that the {0} should be implemented as a macro, and the subsequent
discussions about that macro eventually bikeshedded the patch to
death.

It might be a good idea if you check that old thread so you can avoid
the same pitfalls. I hope you have more luck than I did ;-)
I see, thanks.
We are using only {0}, just to avoid these pitfalls.
All changes here are safe, because, the tradeoff is

MemSet with 0 to {0}

Any else is ignored.

The rest of the calls with MemSet are alignment and padding dependent, and for now, will not be played.

regards,
Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

02 August 2022, 14:18:12

Em ter., 2 de ago. de 2022 às 10:17, mahendrakar s <mahendrakarforpg@gmail.com> escreveu:

Hi Ranier,

I'm pretty late to thread but would like to know about your claim in the thread:
`All compilers currently have memset optimized.`

What did I mean, modern compilers.

I know one case of optimization where variable is not used after the memset.

Probably, the compiler decided to remove the variable altogether.

The most common is to remove the padding, when he understands that this is possible and safe.

This does not mean that this will happen in all cases.

The point here is, this is only possible when using memset.

Are the cases for which the optimization is done consistent across all the compilers?

Of course not. But it does not matter.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Peter Eisentraut

Date:

11 August 2022, 10:38:19

On 01.08.22 19:08, Ranier Vilela wrote:
> Like how 
> https://github.com/postgres/postgres/commit/9fd45870c1436b477264c0c82eb195df52bc0919 
> <https://github.com/postgres/postgres/commit/9fd45870c1436b477264c0c82eb195df52bc0919>
> New attempt to remove more MemSet calls, that are safe.
> 
> Attached v3 patch.

Note that struct initialization does not set padding bits.  So any 
struct that is used as a hash key or that goes to disk or something 
similar needs to be set with memset/MemSet instead.  Various places in 
the code make explicit comments about that, which your patch deletes, 
which is a mistake.  This patch needs to be adjusted carefully with this 
in mind before it can be considered.

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

11 August 2022, 11:15:16

Em qui., 11 de ago. de 2022 às 07:38, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:

On 01.08.22 19:08, Ranier Vilela wrote:
> Like how
> https://github.com/postgres/postgres/commit/9fd45870c1436b477264c0c82eb195df52bc0919
> <https://github.com/postgres/postgres/commit/9fd45870c1436b477264c0c82eb195df52bc0919>
> New attempt to remove more MemSet calls, that are safe.
>
> Attached v3 patch.

Note that struct initialization does not set padding bits.

According to:

https://interrupt.memfault.com/blog/c-struct-padding-initialization

2. individually set all members to 0:

struct foo a = {  .i = 0,  .b = 0,
};

Suffer from this problem.

3. use { 0 } zero-initializer, not.

So any
struct that is used as a hash key or that goes to disk or something
similar needs to be set with memset/MemSet instead. Various places in
the code make explicit comments about that, which your patch deletes,
which is a mistake. This patch needs to be adjusted carefully with this
in mind before it can be considered.

I think this needs better comprovation?

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Alvaro Herrera

Date:

11 August 2022, 11:49:04

On 2022-Aug-11, Ranier Vilela wrote:

> According to:
> https://interrupt.memfault.com/blog/c-struct-padding-initialization

Did you actually read it?

https://interrupt.memfault.com/blog/c-struct-padding-initialization#structure-zero-initialization

: This looks great! However, it’s not obvious (from looking at those snippets)
: what the value loaded into the padding region will be.
:
: The unfortunate answer is: it depends
:
: The C11 standard, chapter §6.2.6.1/6 says this:
:
: : When a value is stored in an object of structure or union type, including in a
: : member object, the bytes of the object representation that correspond to any
: : padding bytes take unspecified values.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La rebeldía es la virtud original del hombre" (Arthur Schopenhauer)

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

11 August 2022, 11:51:53

Em qui., 11 de ago. de 2022 às 08:48, Alvaro Herrera <alvherre@alvh.no-ip.org> escreveu:

On 2022-Aug-11, Ranier Vilela wrote:

> According to:
> https://interrupt.memfault.com/blog/c-struct-padding-initialization

Did you actually read it?

Yes, today.

https://interrupt.memfault.com/blog/c-struct-padding-initialization#structure-zero-initialization

: This looks great! However, it’s not obvious (from looking at those snippets)
: what the value loaded into the padding region will be.
:
: The unfortunate answer is: it depends
:
: The C11 standard, chapter §6.2.6.1/6 says this:
:
: : When a value is stored in an object of structure or union type, including in a
: : member object, the bytes of the object representation that correspond to any
: : padding bytes take unspecified values.

Did you see the Strategy 3 table, { 0 } ?

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Julien Rouhaud

Date:

11 August 2022, 12:23:03

On Thu, Aug 11, 2022 at 08:51:53AM -0300, Ranier Vilela wrote:
> Em qui., 11 de ago. de 2022 às 08:48, Alvaro Herrera <
> alvherre@alvh.no-ip.org> escreveu:
> 
> > On 2022-Aug-11, Ranier Vilela wrote:
> >
> > > According to:
> > > https://interrupt.memfault.com/blog/c-struct-padding-initialization
> >
> Did you see the Strategy 3 table, { 0 } ?

It explicitly shows that at least Ubuntu clang version 13.0.0-2 with -01
doesn't do anything about the padding bytes (and that's after testing only 2
different compilers).  Even if those compilers didn't show any problem, we
still couldn't rely on an undefined behavior and assume that no other compilers
behave differently.

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

11 August 2022, 16:59:33

Em qui., 11 de ago. de 2022 às 09:23, Julien Rouhaud <rjuju123@gmail.com> escreveu:

On Thu, Aug 11, 2022 at 08:51:53AM -0300, Ranier Vilela wrote:
> Em qui., 11 de ago. de 2022 às 08:48, Alvaro Herrera <
> alvherre@alvh.no-ip.org> escreveu:
>
> > On 2022-Aug-11, Ranier Vilela wrote:
> >
> > > According to:
> > > https://interrupt.memfault.com/blog/c-struct-padding-initialization
> >
> Did you see the Strategy 3 table, { 0 } ?

It explicitly shows that at least Ubuntu clang version 13.0.0-2 with -01
doesn't do anything about the padding bytes (and that's after testing only 2
different compilers). Even if those compilers didn't show any problem, we
still couldn't rely on an undefined behavior and assume that no other compilers
behave differently.

Yeah, although not a problem in the main current compilers clang, gcc and msvc,
it seems that this cannot be changed.

Being an undefined behavior, filling structures with holes, it seems to me that you should always use MemSet or memset.
Since even a current structure without holes could be changed in the future and become a bug.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

David Zhang

Date:

19 August 2022, 22:27:04

Hi Ranier,

Following the comment in commit 9fd45870c1436b477264c0c82eb195df52bc0919,

    (The same could be done with appropriate memset() calls, but this
    patch is part of an effort to phase out MemSet(), so it doesn't touch
    memset() calls.)

Should these obviously possible replacement of the standard library function "memset" be considered as well? For example, something like the attached one which is focusing on the pageinspect extension only.

Best regards,

David

On 2022-08-01 10:08 a.m., Ranier Vilela wrote:

Em sáb., 16 de jul. de 2022 às 16:54, Ranier Vilela <ranier.vf@gmail.com> escreveu:

Em sáb, 16 de jul de 2022 2:58 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> escreveu:
On 11.07.22 21:06, Ranier Vilela wrote:
> Em qui., 7 de jul. de 2022 às 14:01, Ranier Vilela <ranier.vf@gmail.com
> <mailto:ranier.vf@gmail.com>> escreveu:
>
> Attached the v1 of your patch.
> I think that all is safe to switch MemSet by {0}.
>
> Here the rebased patch v2, against latest head.

I have committed my patch with Álvaro's comments addressed
I see.
It's annoing that old compiler (gcc 4.7.2) don't handle this style.

Your patch appears to add in changes that are either arguably out of
scope or would need further review (e.g., changing memset() calls,
changing the scope of some variables, changing places that need to worry
about padding bits). Please submit separate patches for those, and we
can continue the analysis.
Sure.
Hi, sorry for the delay.
Like how https://github.com/postgres/postgres/commit/9fd45870c1436b477264c0c82eb195df52bc0919
New attempt to remove more MemSet calls, that are safe.

Attached v3 patch.

regards,
Ranier Vilela

Attachment

0001-Replace-many-memset-calls-with-struct-initialization.patch

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

20 August 2022, 14:26:53

Em sex., 19 de ago. de 2022 às 19:27, David Zhang <david.zhang@highgo.ca> escreveu:

Hi Ranier,

Hi David,

Following the comment in commit 9fd45870c1436b477264c0c82eb195df52bc0919,

    (The same could be done with appropriate memset() calls, but this
    patch is part of an effort to phase out MemSet(), so it doesn't touch
    memset() calls.)

Should these obviously possible replacement of the standard library function "memset" be considered as well?

Yes, sure.

In modern C compilers like clang above 13, gcc and msvc the initialization with {0},

has no problem, because all bits are correctly initialized to zero.

However with some old compilers, such behavior is not strictly followed, so with structs it is not safe to use.

But especially for arrays, whose use doesn't depend on filling the holes, it's certainly safe and cheap to use,

which is the case here.

For example, something like the attached one which is focusing on the pageinspect extension only.

Surely you did, but it has to be said, it was compiled and tested with at least a make check.
Looks like it's ok, LTGM.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Alvaro Herrera

Date:

24 August 2022, 14:30:04

On 2022-Aug-19, David Zhang wrote:

> Should these obviously possible replacement of the standard library function
> "memset" be considered as well? For example, something like the attached one
> which is focusing on the pageinspect extension only.

If you do this, you're creating a potential backpatching hazard.  This
is OK if we get something in return, so a question to ask is whether
there is any benefit in doing it.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Most hackers will be perfectly comfortable conceptualizing users as entropy
 sources, so let's move on."                               (Nathaniel Smith)

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Peter Eisentraut

Date:

24 August 2022, 18:50:56

On 24.08.22 16:30, Alvaro Herrera wrote:
> On 2022-Aug-19, David Zhang wrote:
>> Should these obviously possible replacement of the standard library function
>> "memset" be considered as well? For example, something like the attached one
>> which is focusing on the pageinspect extension only.
> 
> If you do this, you're creating a potential backpatching hazard.  This
> is OK if we get something in return, so a question to ask is whether
> there is any benefit in doing it.

I don't follow how this is a backpatching hazard.

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Tom Lane

Date:

24 August 2022, 19:00:10

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 24.08.22 16:30, Alvaro Herrera wrote:
>> If you do this, you're creating a potential backpatching hazard.  This
>> is OK if we get something in return, so a question to ask is whether
>> there is any benefit in doing it.

> I don't follow how this is a backpatching hazard.

Call me a trogdolyte, but I don't follow how it's an improvement.
It looks to me like an entirely random change that doesn't get rid
of assumptions about what the bits are, it just replaces one set of
assumptions with a different set.  Moreover, the new set of assumptions
may include "there are no padding bits in here", which is mighty fragile
and hard to verify.  So I frankly do not find this a stylistic improvement.

            regards, tom lane

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Robert Haas

Date:

24 August 2022, 19:19:35

On Wed, Aug 24, 2022 at 3:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Call me a trogdolyte, but I don't follow how it's an improvement.
> It looks to me like an entirely random change that doesn't get rid
> of assumptions about what the bits are, it just replaces one set of
> assumptions with a different set.  Moreover, the new set of assumptions
> may include "there are no padding bits in here", which is mighty fragile
> and hard to verify.  So I frankly do not find this a stylistic improvement.

Ditto.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

24 August 2022, 19:20:15

Em qua., 24 de ago. de 2022 às 16:00, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 24.08.22 16:30, Alvaro Herrera wrote:
>> If you do this, you're creating a potential backpatching hazard. This
>> is OK if we get something in return, so a question to ask is whether
>> there is any benefit in doing it.

> I don't follow how this is a backpatching hazard.

Call me a trogdolyte, but I don't follow how it's an improvement.
It looks to me like an entirely random change that doesn't get rid
of assumptions about what the bits are, it just replaces one set of
assumptions with a different set. Moreover, the new set of assumptions
may include "there are no padding bits in here", which is mighty fragile
and hard to verify. So I frankly do not find this a stylistic improvement.

But, these same arguments apply to Designated Initializers [1].

like:
struct foo a = {
.i = 0,
.b = 0,
};

That is slowly being introduced and IMHO brings the same problems with padding bits.

regards,

Ranier Vilela

[1] https://interrupt.memfault.com/blog/c-struct-padding-initialization

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Robert Haas

Date:

24 August 2022, 19:41:19

On Wed, Aug 24, 2022 at 3:20 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
> But, these same arguments apply to Designated Initializers [1].
>
> like:
> struct foo a = {
>    .i = 0,
>    .b = 0,
> };
>
> That is slowly being introduced and IMHO brings the same problems with padding bits.

Yep. I don't find that an improvement over a MemSet on the struct
either, if we're just using it to fill in zeroes.

If we're using it to fill in non-zero values, though, then there's a
reasonable argument that it offers some notational convenience.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Ranier Vilela

Date:

24 August 2022, 19:49:59

Em qua., 24 de ago. de 2022 às 16:41, Robert Haas <robertmhaas@gmail.com> escreveu:

On Wed, Aug 24, 2022 at 3:20 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
> But, these same arguments apply to Designated Initializers [1].
>
> like:
> struct foo a = {
> .i = 0,
> .b = 0,
> };
>
> That is slowly being introduced and IMHO brings the same problems with padding bits.

Yep. I don't find that an improvement over a MemSet on the struct
either, if we're just using it to fill in zeroes.

If we're using it to fill in non-zero values, though, then there's a
reasonable argument that it offers some notational convenience.

Even in that case, it still hides bugs.
All arguments against {0} apply entirely to this initialization type.
Because the padding bits remain uninitialized.

Note that where all major compilers are correctly initializing padding bits
with {0}, then this misbehavior will become of no practical effect in the future.

regards,

Ranier Vilela

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Alvaro Herrera

Date:

25 August 2022, 08:38:41

On 2022-Aug-24, Peter Eisentraut wrote:

> I don't follow how this is a backpatching hazard.

It changes code.  Any bugfix in the surrounding code would have to fix a
conflict.  That is nonzero effort.  Is it a huge risk?  No, it is very
small risk and a very small cost to fix such a conflict; but my claim is
that this change has zero benefit, therefore we should not incur a
nonzero future effort.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"How amazing is that? I call it a night and come back to find that a bug has
been identified and patched while I sleep."                (Robert Davidson)
               http://archives.postgresql.org/pgsql-sql/2006-03/msg00378.php

Re: Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)

From

Michael Paquier

Date:

25 August 2022, 11:07:26

On Thu, Aug 25, 2022 at 10:38:41AM +0200, Alvaro Herrera wrote:
> It changes code.  Any bugfix in the surrounding code would have to fix a
> conflict.  That is nonzero effort.  Is it a huge risk?  No, it is very
> small risk and a very small cost to fix such a conflict; but my claim is
> that this change has zero benefit, therefore we should not incur a
> nonzero future effort.

Agreed to leave things as they are.  This really comes down to if we
want to make this code more C99-ish or not, and the post-patch result
is logically the same as the pre-patch result.
--
Michael

Attachment

signature.asc