Re: bad estimation together with large work_mem generates terrible slow hash joins - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: bad estimation together with large work_mem generates terrible slow hash joins
Date
Msg-id 54109EB6.8020408@vmware.com
Whole thread Raw
In response to Re: bad estimation together with large work_mem generates terrible slow hash joins  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: bad estimation together with large work_mem generates terrible slow hash joins  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-hackers
On 09/10/2014 09:31 PM, Robert Haas wrote:
> On Wed, Sep 10, 2014 at 2:25 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> The dense-alloc-v5.patch looks good to me. I have committed that with minor
>> cleanup (more comments below). I have not looked at the second patch.
>
> Gah.  I was in the middle of doing this.  Sigh.

Sorry.

>>> * the chunks size is 32kB (instead of 16kB), and we're using 1/4
>>>     threshold for 'oversized' items
>>>
>>>     We need the threshold to be >=8kB, to trigger the special case
>>>     within AllocSet. The 1/4 rule is consistent with ALLOC_CHUNK_FRACTION.
>>
>> Should we care about the fact that if there are only a few tuples, we will
>> nevertheless waste 32kB of memory for the chunk? I guess not, but I thought
>> I'd mention it. The smallest allowed value for work_mem is 64kB.
>
> I think we should change the threshold here to 1/8th.  The worst case
> memory wastage as-is ~32k/5 > 6k.

Not sure how you arrived at that number. The worst case is that each 32k 
chunk is filled up to 24k+1 bytes, i.e. 8k-1 bytes is wasted in each 
chunk. That's 25% wastage. That's not too bad IMHO - the worst case 
waste of AllocSet is 50%.

But if you want to twiddle it, feel free. Note that there's some 
interplay with allocset code, like Tomas mentioned. If the threshold is 
< 8k, palloc() will round up request sizes smaller than 8k anyway. That 
wastes some memory, nothing more serious than that, but it seems like a 
good idea to avoid it.

- Heikki




pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: removing volatile qualifiers from lwlock.c
Next
From: Tomas Vondra
Date:
Subject: Re: bad estimation together with large work_mem generates terrible slow hash joins