Re: Reducing the chunk header sizes on all memory context types - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Reducing the chunk header sizes on all memory context types
Date
Msg-id e0ded24c-ce95-c5e1-175e-4fca9c497a9e@enterprisedb.com
Whole thread Raw
In response to Re: Reducing the chunk header sizes on all memory context types  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Reducing the chunk header sizes on all memory context types
List pgsql-hackers

On 9/1/22 02:23, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>> Focusing on the aset, vast majority of allocations (60M out of 64M) is
>> small enough to use power-of-2 logic, and we go from 6.3GB to 8.2GB, so
>> ~30%. Not great, not terrible.
> 
> Not sure why this escaped me before, but I remembered another argument
> for not forcibly adding space for a sentinel: if you don't have room,
> that means the chunk end is up against the header for the next chunk,
> which means that any buffer overrun will clobber that header.  So we'll
> detect the problem anyway if we validate the headers to a reasonable
> extent.
> 
> The hole in this argument is that the very last chunk allocated in a
> block might have no following chunk to validate.  But we could probably
> special-case that somehow, maybe by laying down a sentinel in the free
> space, where it will get overwritten by the next chunk when that does
> get allocated.
> 
> 30% memory bloat seems like a high price to pay if it's adding negligible
> detection ability, which it seems is true if this argument is valid.
> Is there reason to think we can't validate headers enough to catch
> clobbers?
> 

I'm not quite convinced the 30% figure is correct - it might be if you
ignore cases exceeding allocChunkLimit, but that also makes it pretty
bogus (because large allocations represent ~2x as much space).

You're probably right we'll notice the clobber cases due to corruption
of the next chunk header. The annoying thing is having a corrupted
header only tells you there's a corruption somewhere, but it may be hard
to know which part of the code caused it. I was hoping the sentinel
would make it easier, because we mark it as NOACCESS for valgrind. But
now I see we mark the first part of a MemoryChunk too, so maybe that's
enough.

OTOH we have platforms where valgrind is either not supported or no one
runs tests with (e.g. on rpi4 it'd take insane amounts of code). In that
case the sentinel might be helpful, especially considering alignment on
those platforms can cause funny memory issues, as evidenced by this thread.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Reducing the chunk header sizes on all memory context types
Next
From: Alexander Korotkov
Date:
Subject: Re: POC: Better infrastructure for automated testing of concurrency issues