Re: Reducing the chunk header sizes on all memory context types - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Reducing the chunk header sizes on all memory context types
Msg-id f1d04651-caf9-5c6d-1bf0-c9feac16df10@enterprisedb.com
In response to Re: Reducing the chunk header sizes on all memory context types  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Reducing the chunk header sizes on all memory context types
List pgsql-hackers

On 8/29/22 16:02, Amit Kapila wrote:
> On Mon, Aug 29, 2022 at 7:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> David Rowley <dgrowleyml@gmail.com> writes:
>>> I suspect, going by all 3 failing animals being 32-bit which have a
>>> MAXIMUM_ALIGNOF 8 and SIZEOF_SIZE_T of 4 that this is due to the lack
>>> of padding in the MemoryChunk struct.
>>> AllocChunkData and GenerationChunk had padding to account for
>>> sizeof(Size) being 4 and sizeof(void *) being 8, I didn't add that to
>>> MemoryChunk, so I'll do that now.
>>
>> Doesn't seem to have fixed it.  IMO, the fact that we can get through
>> core regression tests and pg_upgrade is a strong indicator that
>> there's not anything fundamentally wrong with memory context
>> management.  I'm inclined to think the problem is in d2169c9985,
>> instead ... though I can't see anything wrong with it.
>>
> 
> Yeah, I also thought that way but couldn't find a reason. I think if
> David is able to reproduce it on one of his systems then he can try
> locally reverting both the commits one by one.
> 

I can reproduce it on my system (rpi4 running 32-bit raspbian). I can't
grant access very easily at the moment, so I'll continue investigating
and do more debugging; perhaps I can grant access to the system later.

So far all I know is that it doesn't happen on d2169c9985 (so ~5 commits
back), and then it starts failing on c6e0fe1f2a. The extra padding added
by df0f4feef8 makes no difference, because the struct looked like this:

    struct MemoryChunk {
        Size                       requested_size;  /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        uint64                     hdrmask;         /*     8     8 */

        /* size: 16, cachelines: 1, members: 2 */
        /* sum members: 12, holes: 1, sum holes: 4 */
        /* last cacheline: 16 bytes */
    };

and the padding makes it look like this:

    struct MemoryChunk {
        Size                       requested_size;  /*     0     4 */
        char                       padding[4];      /*     4     4 */
        uint64                     hdrmask;         /*     8     8 */

        /* size: 16, cachelines: 1, members: 3 */
        /* last cacheline: 16 bytes */
    };

so it makes no difference.
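
A minimal sketch of why the explicit padding is a no-op, assuming a 4-byte Size and an 8-byte-aligned uint64 as on the failing animals. The names Size32, ChunkNoPad, ChunkPadded and layouts_match are hypothetical stand-ins for illustration, not the actual PostgreSQL definitions, and _Alignas(8) is used only to force the 8-byte alignment on whatever machine compiles this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Size on a platform with SIZEOF_SIZE_T == 4. */
typedef uint32_t Size32;

/* Layout before the padding commit: the compiler inserts a 4-byte hole. */
struct ChunkNoPad
{
    Size32      requested_size;
    _Alignas(8) uint64_t hdrmask;   /* assumed 8-byte alignment */
};

/* Layout after the padding commit: the hole is filled explicitly. */
struct ChunkPadded
{
    Size32      requested_size;
    char        padding[4];
    _Alignas(8) uint64_t hdrmask;   /* assumed 8-byte alignment */
};

/* Compile-time checks: both variants put hdrmask at offset 8. */
static_assert(offsetof(struct ChunkNoPad, hdrmask) == 8,
              "implicit hole places hdrmask at offset 8");
static_assert(offsetof(struct ChunkPadded, hdrmask) == 8,
              "explicit padding places hdrmask at offset 8");

/* Returns 1 if the two structs have the same size and hdrmask offset. */
static int
layouts_match(void)
{
    return sizeof(struct ChunkNoPad) == sizeof(struct ChunkPadded) &&
           offsetof(struct ChunkNoPad, hdrmask) ==
           offsetof(struct ChunkPadded, hdrmask);
}
```

Under these assumptions layouts_match() returns 1 and both structs are 16 bytes, which matches the pahole output above.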

I did look at the pointers in GetMemoryChunkMethodID, and it looks like
this (p1 is the result of MAXALIGN(pointer)):

(gdb) p pointer
$1 = (void *) 0x1ca1d2c
(gdb) p p1
$2 = 0x1ca1d30 ""
(gdb) p p1 - pointer
$3 = 4
(gdb) p (long int) pointer
$4 = 30022956
(gdb) p (long int) p1
$5 = 30022960
(gdb) p 30022956 % 8
$6 = 4

So the input pointer is not actually aligned to MAXIMUM_ALIGNOF (8B),
but only to 4B. That seems a bit strange.
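
The gdb arithmetic above can be reproduced with a small sketch. It assumes MAXIMUM_ALIGNOF is 8 and uses the usual round-up-and-mask idiom; sketch_maxalign and is_maxaligned are hypothetical helpers written for this illustration, not the PostgreSQL macros themselves, and they operate on plain integers so nothing is dereferenced:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed alignment, matching the failing 32-bit animals. */
#define SKETCH_MAXIMUM_ALIGNOF 8

/* Round an address up to the next multiple of SKETCH_MAXIMUM_ALIGNOF. */
static uintptr_t
sketch_maxalign(uintptr_t addr)
{
    return (addr + (SKETCH_MAXIMUM_ALIGNOF - 1)) &
           ~((uintptr_t) (SKETCH_MAXIMUM_ALIGNOF - 1));
}

/* Returns 1 if the address is already a multiple of the alignment. */
static int
is_maxaligned(uintptr_t addr)
{
    return sketch_maxalign(addr) == addr;
}
```

For the pointer from the gdb session, sketch_maxalign(0x1ca1d2c) gives 0x1ca1d30, four bytes higher, and is_maxaligned(0x1ca1d2c) returns 0.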


>> Another possibility is that there's a pre-existing bug in the
>> logical decoding stuff that your changes accidentally exposed.
>>
> 
> Yeah, this is another possibility.

No idea.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


