Thread: SlabCheck leaks memory into TopMemoryContext

SlabCheck leaks memory into TopMemoryContext

From
Andres Freund
Date:
Hi Tomas,

I just noticed that having a slab context around in an assertion build
leads to performance degrading and memory usage going up. A bit of
poking revealed that SlabCheck() doesn't free the freechunks it
allocates.

It's on its own obviously trivial to fix.

But there's also this note at the top:
 * NOTE: report errors as WARNING, *not* ERROR or FATAL.  Otherwise you'll
 * find yourself in an infinite loop when trouble occurs, because this
 * routine will be entered again when elog cleanup tries to release memory!

which seems to be violated by doing:

    /* bitmap of free chunks on a block */
    freechunks = palloc(slab->chunksPerBlock * sizeof(bool));

I guess it'd be better to fall back to malloc() here, and handle the
allocation failure gracefully? Or perhaps we can just allocate something
persistently during SlabCreate()?

Greetings,

Andres Freund



Re: SlabCheck leaks memory into TopMemoryContext

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> I just noticed that having a slab context around in an assertion build
> leads to performance degrading and memory usage going up. A bit of
> poking revealed that SlabCheck() doesn't free the freechunks it
> allocates.

> It's on its own obviously trivial to fix.

It seems like having a context check function do new allocations
is something to be avoided in the first place.  It's basically assuming
that the memory management mechanism is sane, which makes the whole thing
fundamentally circular, even if it's relying on some other context to be
sane.  Is there a way to do the checking without extra allocations?

            regards, tom lane



Re: SlabCheck leaks memory into TopMemoryContext

From
Andres Freund
Date:
Hi,

On 2020-01-16 00:09:53 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I just noticed that having a slab context around in an assertion build
> > leads to performance degrading and memory usage going up. A bit of
> > poking revealed that SlabCheck() doesn't free the freechunks it
> > allocates.
>
> > It's on its own obviously trivial to fix.
>
> It seems like having a context check function do new allocations
> is something to be avoided in the first place.

Yea, that's why I was wondering about doing the allocation during
context creation, for the largest size necessary of any context:

    /* bitmap of free chunks on a block */
    freechunks = palloc(slab->chunksPerBlock * sizeof(bool));

or at the very least using malloc(), rather than another context.


> It's basically assuming that the memory management mechanism is sane,
> which makes the whole thing fundamentally circular, even if it's
> relying on some other context to be sane.  Is there a way to do the
> checking without extra allocations?

Probably not easily.

Was wondering if the bitmap could be made more dense, and allocated on
the stack. It could actually using one bit for each tracked chunk,
instead of one byte. Would have to put in a clear lower limit of the
allowed chunk size, in relation to the block size, however, to stay in a
reasonable range.

Greetings,

Andres Freund



Re: SlabCheck leaks memory into TopMemoryContext

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2020-01-16 00:09:53 -0500, Tom Lane wrote:
>> It's basically assuming that the memory management mechanism is sane,
>> which makes the whole thing fundamentally circular, even if it's
>> relying on some other context to be sane.  Is there a way to do the
>> checking without extra allocations?

> Probably not easily.

In the AllocSet code, we don't hesitate to expend extra space per-chunk
for debug support:

typedef struct AllocChunkData
{
...
#ifdef MEMORY_CONTEXT_CHECKING
    Size        requested_size;
#endif
...

I don't see a pressing reason why SlabContext couldn't do something
similar, either per-chunk or per-context, whatever makes sense.

            regards, tom lane



Re: SlabCheck leaks memory into TopMemoryContext

From
Andres Freund
Date:
Hi,

On 2020-01-16 01:25:00 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2020-01-16 00:09:53 -0500, Tom Lane wrote:
> >> It's basically assuming that the memory management mechanism is sane,
> >> which makes the whole thing fundamentally circular, even if it's
> >> relying on some other context to be sane.  Is there a way to do the
> >> checking without extra allocations?
> 
> > Probably not easily.
> 
> In the AllocSet code, we don't hesitate to expend extra space per-chunk
> for debug support:
> 
> typedef struct AllocChunkData
> {
> ...
> #ifdef MEMORY_CONTEXT_CHECKING
>     Size        requested_size;
> #endif
> ...
> 
> I don't see a pressing reason why SlabContext couldn't do something
> similar, either per-chunk or per-context, whatever makes sense.

Well, what I suggested upthread, was to just have two globals like

int slabcheck_freechunks_size;
bool *slabcheck_freechunks;

and realloc that to the larger size in SlabContextCreate() if necessary,
based on the computed chunksPerBlock?  I thought you were asking whether
any additional memory could just be avoided...

Greetings,

Andres Freund



Re: SlabCheck leaks memory into TopMemoryContext

From
Tomas Vondra
Date:
On Wed, Jan 15, 2020 at 10:41:43PM -0800, Andres Freund wrote:
>Hi,
>
>On 2020-01-16 01:25:00 -0500, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>> > On 2020-01-16 00:09:53 -0500, Tom Lane wrote:
>> >> It's basically assuming that the memory management mechanism is sane,
>> >> which makes the whole thing fundamentally circular, even if it's
>> >> relying on some other context to be sane.  Is there a way to do the
>> >> checking without extra allocations?
>>
>> > Probably not easily.
>>
>> In the AllocSet code, we don't hesitate to expend extra space per-chunk
>> for debug support:
>>
>> typedef struct AllocChunkData
>> {
>> ...
>> #ifdef MEMORY_CONTEXT_CHECKING
>>     Size        requested_size;
>> #endif
>> ...
>>
>> I don't see a pressing reason why SlabContext couldn't do something
>> similar, either per-chunk or per-context, whatever makes sense.
>
>Well, what I suggested upthread, was to just have two globals like
>
>int slabcheck_freechunks_size;
>bool *slabcheck_freechunks;
>
>and realloc that to the larger size in SlabContextCreate() if necessary,
>based on the computed chunksPerBlock?  I thought you were asking whether
>any additional memory could just be avoided...
>

I don't see why not to just do what Tom proposed, i.e. allocate this as
part of the memory context in SlabContextCreate(), when memory context
checking is enabled. It seems much more convenient / simpler than the
globals (no repalloc, ...).


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: SlabCheck leaks memory into TopMemoryContext

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> ... I thought you were asking whether
> any additional memory could just be avoided...

Well, I was kind of wondering that, but if it's not practical then
preallocating the space instead will do.

            regards, tom lane



Re: SlabCheck leaks memory into TopMemoryContext

From
Tomas Vondra
Date:
On Thu, Jan 16, 2020 at 10:27:01AM -0500, Tom Lane wrote:
>Andres Freund <andres@anarazel.de> writes:
>> ... I thought you were asking whether
>> any additional memory could just be avoided...
>
>Well, I was kind of wondering that, but if it's not practical then
>preallocating the space instead will do.
>

I don't think it's practical to rework the checks in a way that would
not require allocations. Maybe it's possible, but I think it's not worth
the extra complexity.

The attached fix should do the trick - it pre-allocates the space when
creating the context. There is a bit of complexity because we want to
allocate the space as part of the context header, but nothin too bad. We
might optimize it a bit by using a regular bitmap (instead of just an
array of bools), but I haven't done that.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: SlabCheck leaks memory into TopMemoryContext

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> The attached fix should do the trick - it pre-allocates the space when
> creating the context. There is a bit of complexity because we want to
> allocate the space as part of the context header, but nothin too bad. We
> might optimize it a bit by using a regular bitmap (instead of just an
> array of bools), but I haven't done that.

Hmm ... so if this is an array of bools, why isn't it declared bool*
rather than char* ?  (Pre-existing ugliness, sure, but we might as
well fix it while we're here.  Especially since you used sizeof(bool)
in the space calculation.)

I agree that maxaligning the start point of the array is pointless.

I'd write "free chunks in a block" not "free chunks on a block",
the latter seems rather shaky English.  But that's getting picky.

LGTM otherwise.

            regards, tom lane



Re: SlabCheck leaks memory into TopMemoryContext

From
Andres Freund
Date:
Hi,

On 2020-01-16 17:25:00 +0100, Tomas Vondra wrote:
> On Thu, Jan 16, 2020 at 10:27:01AM -0500, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > ... I thought you were asking whether
> > > any additional memory could just be avoided...
> > 
> > Well, I was kind of wondering that, but if it's not practical then
> > preallocating the space instead will do.
> > 
> 
> I don't think it's practical to rework the checks in a way that would
> not require allocations. Maybe it's possible, but I think it's not worth
> the extra complexity.
> 
> The attached fix should do the trick - it pre-allocates the space when
> creating the context. There is a bit of complexity because we want to
> allocate the space as part of the context header, but nothin too bad. We
> might optimize it a bit by using a regular bitmap (instead of just an
> array of bools), but I haven't done that.

I don't get why it's advantageous to allocate this once for each slab,
rather than having it as a global once for all slabs. But anyway, still
clearly better than the current situation.

- Andres



Re: SlabCheck leaks memory into TopMemoryContext

From
Tomas Vondra
Date:
On Thu, Jan 16, 2020 at 08:48:49AM -0800, Andres Freund wrote:
>Hi,
>
>On 2020-01-16 17:25:00 +0100, Tomas Vondra wrote:
>> On Thu, Jan 16, 2020 at 10:27:01AM -0500, Tom Lane wrote:
>> > Andres Freund <andres@anarazel.de> writes:
>> > > ... I thought you were asking whether
>> > > any additional memory could just be avoided...
>> >
>> > Well, I was kind of wondering that, but if it's not practical then
>> > preallocating the space instead will do.
>> >
>>
>> I don't think it's practical to rework the checks in a way that would
>> not require allocations. Maybe it's possible, but I think it's not worth
>> the extra complexity.
>>
>> The attached fix should do the trick - it pre-allocates the space when
>> creating the context. There is a bit of complexity because we want to
>> allocate the space as part of the context header, but nothin too bad. We
>> might optimize it a bit by using a regular bitmap (instead of just an
>> array of bools), but I haven't done that.
>
>I don't get why it's advantageous to allocate this once for each slab,
>rather than having it as a global once for all slabs. But anyway, still
>clearly better than the current situation.
>

It's largely a matter of personal preference - I agree there are cases
when global variables are the best solution, but I kinda dislike them.
It seems cleaner to just allocate it as part of the slab, not having to
deal with different number of chunks per block between slabs.

Plus we don't have all that many slabs (like 2), and it's only really
used in debug builds anyway. So I'm not all that woried about this
wasting a couple extra kB of memory.

YMMV of course ...


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: SlabCheck leaks memory into TopMemoryContext

From
Tomas Vondra
Date:
On Thu, Jan 16, 2020 at 11:43:34AM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> The attached fix should do the trick - it pre-allocates the space when
>> creating the context. There is a bit of complexity because we want to
>> allocate the space as part of the context header, but nothin too bad. We
>> might optimize it a bit by using a regular bitmap (instead of just an
>> array of bools), but I haven't done that.
>
>Hmm ... so if this is an array of bools, why isn't it declared bool*
>rather than char* ?  (Pre-existing ugliness, sure, but we might as
>well fix it while we're here.  Especially since you used sizeof(bool)
>in the space calculation.)
>

True. Will fix.

>I agree that maxaligning the start point of the array is pointless.
>
>I'd write "free chunks in a block" not "free chunks on a block",
>the latter seems rather shaky English.  But that's getting picky.
>
>LGTM otherwise.
>

OK. Barring objections I'll push and backpatch this later today.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: SlabCheck leaks memory into TopMemoryContext

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Thu, Jan 16, 2020 at 08:48:49AM -0800, Andres Freund wrote:
>> I don't get why it's advantageous to allocate this once for each slab,
>> rather than having it as a global once for all slabs. But anyway, still
>> clearly better than the current situation.

> It's largely a matter of personal preference - I agree there are cases
> when global variables are the best solution, but I kinda dislike them.
> It seems cleaner to just allocate it as part of the slab, not having to
> deal with different number of chunks per block between slabs.
> Plus we don't have all that many slabs (like 2), and it's only really
> used in debug builds anyway. So I'm not all that woried about this
> wasting a couple extra kB of memory.

A positive argument for doing it like this is that the overhead goes away
when the SlabContexts are all deallocated, while a global variable would
presumably stick around indefinitely.  But I concur that in current usage,
there's hardly any point in worrying about the relative benefits.  We
should just keep it simple, and this seems marginally simpler than the
other way.

            regards, tom lane



Re: SlabCheck leaks memory into TopMemoryContext

From
Andres Freund
Date:
Hi,

On 2020-01-16 18:01:53 +0100, Tomas Vondra wrote:
> Plus we don't have all that many slabs (like 2)

FWIW, I have a local patch that adds additional ones, for the relcache
and catcache, that's how I noticed the leak. Because a test pgbench
absolutely *tanked* in performance.

Just for giggles. With leak:
pgbench -n -M prepared -P1 -c20 -j20 -T6000 -S
progress: 1.0 s, 81689.4 tps, lat 0.242 ms stddev 0.087
progress: 2.0 s, 51228.5 tps, lat 0.390 ms stddev 0.107
progress: 3.0 s, 42297.4 tps, lat 0.473 ms stddev 0.141
progress: 4.0 s, 34885.9 tps, lat 0.573 ms stddev 0.171
progress: 5.0 s, 31211.2 tps, lat 0.640 ms stddev 0.182
progress: 6.0 s, 27307.9 tps, lat 0.732 ms stddev 0.216
progress: 7.0 s, 25698.9 tps, lat 0.778 ms stddev 0.228

without:
pgbench -n -M prepared -P1 -c20 -j20 -T6000 -S
progress: 1.0 s, 144119.1 tps, lat 0.137 ms stddev 0.047
progress: 2.0 s, 148092.8 tps, lat 0.135 ms stddev 0.039
progress: 3.0 s, 148757.0 tps, lat 0.134 ms stddev 0.032
progress: 4.0 s, 148553.7 tps, lat 0.134 ms stddev 0.038

I do find the size of the impact quite impressive. It's all due to the
TopMemoryContext's AllocSetCheck() taking longer and longer.


> and it's only really used in debug builds anyway. So I'm not all that
> woried about this wasting a couple extra kB of memory.

IDK, making memory usage look different makes optimizing it harder. Not
a hard rule, obviously, but ...

Greetings,

Andres Freund



Re: SlabCheck leaks memory into TopMemoryContext

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2020-01-16 18:01:53 +0100, Tomas Vondra wrote:
>> and it's only really used in debug builds anyway. So I'm not all that
>> woried about this wasting a couple extra kB of memory.

> IDK, making memory usage look different makes optimizing it harder. Not
> a hard rule, obviously, but ...

Well, if you're that excited about it, make a patch so we can see
how ugly it ends up being.

            regards, tom lane



Re: SlabCheck leaks memory into TopMemoryContext

From
Tomas Vondra
Date:
On Thu, Jan 16, 2020 at 12:33:03PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Thu, Jan 16, 2020 at 08:48:49AM -0800, Andres Freund wrote:
>>> I don't get why it's advantageous to allocate this once for each slab,
>>> rather than having it as a global once for all slabs. But anyway, still
>>> clearly better than the current situation.
>
>> It's largely a matter of personal preference - I agree there are cases
>> when global variables are the best solution, but I kinda dislike them.
>> It seems cleaner to just allocate it as part of the slab, not having to
>> deal with different number of chunks per block between slabs.
>> Plus we don't have all that many slabs (like 2), and it's only really
>> used in debug builds anyway. So I'm not all that woried about this
>> wasting a couple extra kB of memory.
>
>A positive argument for doing it like this is that the overhead goes away
>when the SlabContexts are all deallocated, while a global variable would
>presumably stick around indefinitely.  But I concur that in current usage,
>there's hardly any point in worrying about the relative benefits.  We
>should just keep it simple, and this seems marginally simpler than the
>other way.
>

I think the one possible argument against this approach might be that it
adds a field to the struct, so if you have an extension using a Slab
context, it'll break if you don't rebuild it. But that only matters if
we want to backpatch it (which I think is not the plan) and with memory
context checking enabled (which does not apply to regular packages).


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: SlabCheck leaks memory into TopMemoryContext

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> I think the one possible argument against this approach might be that it
> adds a field to the struct, so if you have an extension using a Slab
> context, it'll break if you don't rebuild it. But that only matters if
> we want to backpatch it (which I think is not the plan) and with memory
> context checking enabled (which does not apply to regular packages).

Huh?  That struct is private in slab.c, no?  Any outside code relying
on its contents deserves to break.

I do think we ought to back-patch this, given the horrible results
Andres showed.

            regards, tom lane



Re: SlabCheck leaks memory into TopMemoryContext

From
Tomas Vondra
Date:
On Thu, Jan 16, 2020 at 03:15:41PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> I think the one possible argument against this approach might be that it
>> adds a field to the struct, so if you have an extension using a Slab
>> context, it'll break if you don't rebuild it. But that only matters if
>> we want to backpatch it (which I think is not the plan) and with memory
>> context checking enabled (which does not apply to regular packages).
>
>Huh?  That struct is private in slab.c, no?  Any outside code relying
>on its contents deserves to break.
>

Ah, right. Silly me.

>I do think we ought to back-patch this, given the horrible results
>Andres showed.
>

OK.

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: SlabCheck leaks memory into TopMemoryContext

From
Tomas Vondra
Date:
On Thu, Jan 16, 2020 at 06:04:32PM +0100, Tomas Vondra wrote:
>On Thu, Jan 16, 2020 at 11:43:34AM -0500, Tom Lane wrote:
>>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>>The attached fix should do the trick - it pre-allocates the space when
>>>creating the context. There is a bit of complexity because we want to
>>>allocate the space as part of the context header, but nothin too bad. We
>>>might optimize it a bit by using a regular bitmap (instead of just an
>>>array of bools), but I haven't done that.
>>
>>Hmm ... so if this is an array of bools, why isn't it declared bool*
>>rather than char* ?  (Pre-existing ugliness, sure, but we might as
>>well fix it while we're here.  Especially since you used sizeof(bool)
>>in the space calculation.)
>>
>
>True. Will fix.
>
>>I agree that maxaligning the start point of the array is pointless.
>>
>>I'd write "free chunks in a block" not "free chunks on a block",
>>the latter seems rather shaky English.  But that's getting picky.
>>
>>LGTM otherwise.
>>
>
>OK. Barring objections I'll push and backpatch this later today.
>

I've pushed and backpatched this all the back back to 10.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: SlabCheck leaks memory into TopMemoryContext

From
Tomas Vondra
Date:
On Thu, Jan 16, 2020 at 01:41:39PM -0500, Tom Lane wrote:
>Andres Freund <andres@anarazel.de> writes:
>> On 2020-01-16 18:01:53 +0100, Tomas Vondra wrote:
>>> and it's only really used in debug builds anyway. So I'm not all that
>>> woried about this wasting a couple extra kB of memory.
>
>> IDK, making memory usage look different makes optimizing it harder. Not
>> a hard rule, obviously, but ...
>
>Well, if you're that excited about it, make a patch so we can see
>how ugly it ends up being.
>

I think the question is how much memory would using globals actually
save, compared to including the bitmap in SlabContext.

The bitmap size depends on block/chunk size - I don't know what
parameters Andres uses for the additional contexts, but for the two
places already using Slab we have 8kB blocks with 80B and 240B chunks,
so ~102 and ~34 chunks in a block. So it's not a huge amount, and we
could easily reduce this to 1/8 by switching to a proper bitmap.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services