Re: 9.5: Better memory accounting, towards memory-bounded HashAgg - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: 9.5: Better memory accounting, towards memory-bounded HashAgg
Msg-id 546F965B.1090202@fuzzy.cz
In response to Re: 9.5: Better memory accounting, towards memory-bounded HashAgg  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 21.11.2014 00:03, Andres Freund wrote:
> On 2014-11-17 21:03:07 +0100, Tomas Vondra wrote:
>> On 17.11.2014 19:46, Andres Freund wrote:
>>
>>> The MemoryContextData struct is embedded into AllocSetContext.
>>
>> Oh, right. That makes it slightly more complicated, though, because
>> AllocSetContext adds 6 x 8B fields plus an in-line array of 
>> freelist pointers. Which is 11x8 bytes. So in total 56+56+88=200B, 
>> without the additional field. There might be some difference 
>> because of alignment, but I still don't see how that one
>> additional field might impact cachelines?
> 
> It's actually 196 bytes:

Ummmm, I think the pahole output shows 192, not 196? Otherwise it
wouldn't be exactly 3 cachelines anyway.

But yeah - my math-foo was weak for a moment, because 6x8 != 56. Which
is the 8B difference :-/

> struct AllocSetContext {
>         MemoryContextData          header;               /*     0    56 */
>         AllocBlock                 blocks;               /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         AllocChunk                 freelist[11];         /*    64    88 */
>         /* --- cacheline 2 boundary (128 bytes) was 24 bytes ago --- */
>         Size                       initBlockSize;        /*   152     8 */
>         Size                       maxBlockSize;         /*   160     8 */
>         Size                       nextBlockSize;        /*   168     8 */
>         Size                       allocChunkLimit;      /*   176     8 */
>         AllocBlock                 keeper;               /*   184     8 */
>         /* --- cacheline 3 boundary (192 bytes) --- */
> 
>         /* size: 192, cachelines: 3, members: 8 */
> };
> 
> And thus one additional field tips it over the edge.
> 
> "pahole" is a very useful utility.

Indeed.
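
The arithmetic can also be double-checked without pahole. What follows is
only a minimal sketch, assuming an LP64 platform (8-byte pointers and
Size) and 64-byte cachelines. The names FakeHeader, FakeAllocSet,
cachelines_for and check_layout are hypothetical stand-ins that mirror the
field sizes in the pahole output above; they are not the real PostgreSQL
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CACHELINE      64   /* assumed cacheline size in bytes */
#define NUM_FREELISTS  11   /* freelist[11] in the pahole output */

/* Hypothetical stand-in for MemoryContextData: 56 bytes on LP64. */
typedef struct FakeHeader
{
    int     type;           /* 4 bytes, followed by 4 bytes of padding */
    void   *methods;
    void   *parent;
    void   *firstchild;
    void   *nextchild;
    char   *name;
    char    isReset;        /* 1 byte, followed by 7 bytes of padding */
} FakeHeader;

/* Hypothetical stand-in for AllocSetContext, same member order. */
typedef struct FakeAllocSet
{
    FakeHeader  header;
    void       *blocks;
    void       *freelist[NUM_FREELISTS];
    size_t      initBlockSize;
    size_t      maxBlockSize;
    size_t      nextBlockSize;
    size_t      allocChunkLimit;
    void       *keeper;
} FakeAllocSet;

/* Cachelines needed for "size" bytes, assuming a cacheline-aligned start. */
size_t
cachelines_for(size_t size)
{
    return (size + CACHELINE - 1) / CACHELINE;
}

/* Verify the numbers discussed in the thread and print them. */
void
check_layout(void)
{
    assert(sizeof(FakeHeader) == 56);
    assert(offsetof(FakeAllocSet, freelist) == 64);
    assert(sizeof(FakeAllocSet) == 192);            /* 56 + 8 + 88 + 5*8 */
    assert(cachelines_for(sizeof(FakeAllocSet)) == 3);
    /* One more 8-byte field gives 200 bytes, i.e. a fourth cacheline. */
    assert(cachelines_for(sizeof(FakeAllocSet) + 8) == 4);

    printf("size: %zu, cachelines: %zu\n",
           sizeof(FakeAllocSet), cachelines_for(sizeof(FakeAllocSet)));
}
```

Calling check_layout() from a small main() on a 64-bit build should pass
all assertions; on a 32-bit build the sizes differ and the assertions
would fail, which is expected given the stated assumption.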

>> But if we separated the freelist, that might actually make it
>> faster, at least for calls that don't touch the freelist at all,
>> no? Because most of the palloc() calls will be handled from the
>> current block.
> 
> I seriously doubt it. The additional indirection + additional
> branches are likely to make it worse.

That's possible, although I tried it on "my version" of the accounting
patch, and it showed slight improvement (lower overhead) on Robert's
reindex benchmark.

The question is how that would work with a regular workload, because
moving the freelist out of the structure makes it smaller (2 cachelines
instead of 3), and it should only affect workloads that actually touch
the freelists (i.e. by calling free, or realloc, or whatever).
Although palloc() checks the freelist too ...
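
To put numbers on the "2 cachelines instead of 3" claim, here is a rough
sketch comparing the two layouts once an accounting field is added. It
assumes LP64 and 64-byte cachelines. The struct names InlineSet and
SplitSet, the field name mem_allocated and the helpers cachelines_for and
check_split_layout are all hypothetical, and the 56-byte header is
represented by an opaque byte array rather than the real
MemoryContextData.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE      64   /* assumed cacheline size in bytes */
#define NUM_FREELISTS  11
#define HEADER_BYTES   56   /* MemoryContextData size per the pahole output */

/* Freelist kept in-line, plus one extra accounting field. */
typedef struct InlineSet
{
    unsigned char header[HEADER_BYTES];
    void       *blocks;
    void       *freelist[NUM_FREELISTS];
    size_t      initBlockSize;
    size_t      maxBlockSize;
    size_t      nextBlockSize;
    size_t      allocChunkLimit;
    void       *keeper;
    size_t      mem_allocated;  /* hypothetical accounting field */
} InlineSet;

/* Freelist moved out of line: only a pointer to the array stays here. */
typedef struct SplitSet
{
    unsigned char header[HEADER_BYTES];
    void       *blocks;
    void      **freelist;       /* points to NUM_FREELISTS entries elsewhere */
    size_t      initBlockSize;
    size_t      maxBlockSize;
    size_t      nextBlockSize;
    size_t      allocChunkLimit;
    void       *keeper;
    size_t      mem_allocated;  /* hypothetical accounting field */
} SplitSet;

/* Cachelines needed for "size" bytes, assuming a cacheline-aligned start. */
size_t
cachelines_for(size_t size)
{
    return (size + CACHELINE - 1) / CACHELINE;
}

void
check_split_layout(void)
{
    /* In-line: 192 + 8 = 200 bytes, spilling into a fourth cacheline. */
    assert(sizeof(InlineSet) == 200);
    assert(cachelines_for(sizeof(InlineSet)) == 4);

    /* Split: 56 + 8 + 8 + 4*8 + 8 + 8 = 120 bytes, two cachelines. */
    assert(sizeof(SplitSet) == 120);
    assert(cachelines_for(sizeof(SplitSet)) == 2);
}
```

The smaller struct is paid for with one extra pointer dereference on
every freelist access, which is exactly the indirection cost Andres is
worried about; the sketch says nothing about which effect wins.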

Also, those pieces may be allocated together (next to each other), which
might keep locality.
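
One way to get that adjacency would be a single allocation that holds the
context struct with the freelist array directly behind it. This is only
an illustrative sketch using plain malloc, not how PostgreSQL creates
contexts; SplitSet, splitset_create and the 56-byte opaque header are
hypothetical names and assumptions.

```c
#include <stddef.h>
#include <stdlib.h>

#define NUM_FREELISTS  11
#define HEADER_BYTES   56   /* assumed MemoryContextData size on LP64 */

/* Context with the freelist moved out of line (hypothetical layout). */
typedef struct SplitSet
{
    unsigned char header[HEADER_BYTES];
    void       *blocks;
    void      **freelist;       /* points just past this struct */
    size_t      initBlockSize;
    size_t      maxBlockSize;
    size_t      nextBlockSize;
    size_t      allocChunkLimit;
    void       *keeper;
} SplitSet;

/*
 * Allocate the context and its freelist array in one chunk, so the two
 * pieces are adjacent in memory. Returns NULL on allocation failure.
 * The caller releases both with a single free().
 */
SplitSet *
splitset_create(void)
{
    size_t      ctx_size = sizeof(SplitSet);
    size_t      fl_size = NUM_FREELISTS * sizeof(void *);
    char       *raw = calloc(1, ctx_size + fl_size);
    SplitSet   *set;
    int         i;

    if (raw == NULL)
        return NULL;

    set = (SplitSet *) raw;

    /*
     * sizeof(SplitSet) is a multiple of the struct's alignment, which is
     * at least that of a pointer, so the array behind it is aligned.
     */
    set->freelist = (void **) (raw + ctx_size);

    /* Start with all freelists empty. */
    for (i = 0; i < NUM_FREELISTS; i++)
        set->freelist[i] = NULL;

    return set;
}
```

With this arrangement, code that never touches the freelists reads only
the leading struct, while freelist users still find the array on the
cachelines right after it.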

But I haven't tested any of this, and my knowledge of this low-level
stuff is poor, so I might be completely wrong.

regards
Tomas


