On Wed, Feb 05, 2020 at 02:43:49PM +0900, Fujii Masao wrote:
> Hi,
>
> When I compiled PostgreSQL with -DLWLOCK_STATS and tried to check
> the statistics of light-weight locks, I observed that more than one
> statistics entries were output *for the same backend process and
> the same lwlock*. For example, I got the following four statistics
> when I checked how the process with PID 81141 processed ProcArrayLock.
> This is strange, and IMO only one statistics should be output for
> the same backend process and lwlock.
>
> $ grep "PID 81141 lwlock ProcArrayLock" data/log/postgresql-2020-02-05_141842.log
> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 4000 exacq 0 blk 0 spindelay 0 dequeue self 0
> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 2 exacq 0 blk 0 spindelay 0 dequeue self 0
> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 6001 exacq 1 blk 0 spindelay 0 dequeue self 0
> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 5 exacq 1 blk 0 spindelay 0 dequeue self 0
>
> The cause of this issue is that the key variable used for lwlock hash
> table was not fully initialized. The key consists of two fields and
> they are initialized as follows. But the following 4 bytes allocated
> for the alignment was not initialized. So even if the same key was
> specified, hash_search(HASH_ENTER) could not find the existing
> entry for that key and created new one.
>
> key.tranche = lock->tranche;
> key.instance = lock;
>
> Attached is the patch fixing this issue by initializing the key
> variable with zero. In the patched version, I confirmed that only one
> statistics is output for the same process and the same lwlock.
> Also this patch would reduce the volume of lwlock statistics
> very much.
>
> This issue was introduced by commit 3761fe3c20. So the patch needs
> to be back-patch to v10.
Good catch! The patch looks good to me. Just in case I looked at other users
of HASH_BLOBS and AFAICT there's no other cases of key that can contain padding
bytes that aren't memset first.