PgStat_HashKey padding issue when passed by reference - Mailing list pgsql-hackers

From Sami Imseih
Subject PgStat_HashKey padding issue when passed by reference
Date
Msg-id CAA5RZ0t9omat+HVSakJXwTMWvhpYFcAZb41RPWKwrKFUgmAFBQ@mail.gmail.com
Whole thread Raw
Responses Re: PgStat_HashKey padding issue when passed by reference
Re: PgStat_HashKey padding issue when passed by reference
Re: PgStat_HashKey padding issue when passed by reference
List pgsql-hackers
Hi,

I was experimenting with adding a new OID field to PgStat_HashKey:

```
typedef struct PgStat_HashKey
{
  PgStat_Kind kind;      /* statistics entry kind */
  Oid     dboid;     /* database ID. InvalidOid for shared objects. */
  Oid     newfield;
  uint64   objid;     /* object ID (table, function, etc.), or
                 * identifier. */
} PgStat_HashKey;
```

It's important to observe that hole padding is added with this new
field:

```
(gdb) ptype /o PgStat_HashKey
type = struct PgStat_HashKey {
/*   0   |    4 */  uint32 kind;
/*   4   |    4 */  Oid dboid;
/*   8   |    4 */  Oid newfield;
/* XXX 4-byte hole   */
/*   16   |    8 */  uint64 objid;

                /* total size (bytes):  24 */
               }
```

With "newfield" added, I started seeing the error "entry ref vanished
before deletion" when creating the cluster (or selecting from
pg_stat_database, etc.).

This error occurs when the local reference to a pgstat entry is deleted
and the entry can't be found in the simplehash pgStatEntryRefHash,
based on the key.

```
  if (!pgstat_entry_ref_hash_delete(pgStatEntryRefHash, key))
    elog(ERROR, "entry ref vanished before deletion");
```

This is surprising because this entry with the key is created earlier
via pgstat_get_entry_ref_cached -> pgstat_entry_ref_hash_insert.

When I brought this up to Bertrand (CC'd), he was not able to reproduce the
ERROR by adding a new field, and we quickly realized that it is related
to the compiler optimization (-O) flags we were using. He was building
with -O0 (no optimization) and I was building with -O2. So something
with optimization was causing the issue.

Starting a problematic cluster with Valgrind:

```
valgrind --leak-check=full --track-origins=yes \
  $PGHOME/bin/postgres -D $PGDATA --single -P
```

shows messages like "Use of uninitialised value of size 8":

```
==26625== Use of uninitialised value of size 8
==26625==  at 0x61D04F: pgstat_get_entry_ref_cached (pgstat_shmem.c:405)
==26625==  by 0x61D04F: pgstat_get_entry_ref (pgstat_shmem.c:488)
==26625==  by 0x6177B9: pgstat_fetch_entry (pgstat.c:977)
==26625==  by 0x5433B2: rebuild_database_list (autovacuum.c:1002)
```

Further debugging led us to the fact that the key is passed-by-value to

```
pgstat_get_entry_ref_cached(PgStat_HashKey key, PgStat_EntryRef **entry_ref_p)
```

and in that routine the entry is created:

```
cache_entry = pgstat_entry_ref_hash_insert(pgStatEntryRefHash, key, &found);
``

The hash for the key is created with fasthash32(key, size, 0) and
compared against another key using memcmp(a, b, sizeof(PgStat_HashKey))
in the "pgstat_hash_hash_key" and "pgstat_cmp_hash_key" routines,
respectively.

With -O2, we observed that (on our testing environment):

pgstat_get_entry_ref_cached() is inlined
the inlined code does not copy the hole padding content when the key
is passed-by-value

So that while the hole initially contains zeroes:

```
  /* clear padding */
  memset(&key, 0, sizeof(struct PgStat_HashKey));
```

We observed that the zeroes in the hole are not copied to
pgstat_get_entry_ref_cached()
when passed-by-value with -O2.

GDB when setting a breakpoint at pgstat_entry_ref_hash_insert (which
is called inside
pgstat_get_entry_ref_cached) shows that the padding has some unexpected
values, see the 12-15th byte "0x00 0x23 0x21 0x21" which based on the
memset done earlier should be 0.

```
(gdb) x/24xb d
0x7fff68cc16f0:  0x02  0x00  0x00  0x00  0x05  0x00  0x00  0x00
0x7fff68cc16f8:  0x00  0x00  0x00  0x00  0x00  0x23  0x21  0x21
0x7fff68cc1700:  0x67  0x0a  0x00  0x00  0x00  0x00  0x00  0x00
(gdb)
```

Now, if we pass the key by reference to pgstat_entry_ref_hash_insert as is
patched, we can see the padding is in fact cleared.

```
(gdb) x/24xb d
0x7ffe1bf6c640:  0x01  0x00  0x00  0x00  0x05  0x00  0x00  0x00
0x7ffe1bf6c648:  0x00  0x00  0x00  0x00  0x00  0x00  0x00  0x00
0x7ffe1bf6c650:  0x00  0x00  0x00  0x00  0x00  0x00  0x00  0x00
```

This looks like a compiler bug, we tested multiple ways to workaround:

1/ pg_noinline pgstat_get_entry_ref_cached
2/ making volatile the PgStat_HashKey key in the
pgstat_get_entry_ref_cached parameter
3/ Passing the key by reference instead of by value
4/ set SH_SCOPE to "external" rather than "static inline" in pgstat_shmem.c

it seems, at least for this case, the best option is to pass the key
by reference, like
in the attached patch.

FWIW, quickly looking at other areas that pass keys around, we can
also otherhash
keys being passed by value here:
```
spcachekey_hash();
spcachekey_equal():
```

but does not look problematic, as only the string field of the struct is hashed.

--
Sami Imseih
Amazon Web Services (AWS)

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PG19-3 PATCH] Don't ignore passfile
Next
From: Fujii Masao
Date:
Subject: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication