Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption - Mailing list pgsql-hackers

From Atri Sharma
Subject Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption
Date
Msg-id 6AC801DD-68B2-41F1-B12B-E2ADEA090466@gmail.com
In response to Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-hackers


> On 07-Oct-2013, at 4:11, Tomas Vondra <tv@fuzzy.cz> wrote:
>
>> On 6.10.2013 20:37, Tomáš Janoušek wrote:
>> Hi,
>>
>>> On Sat, Oct 05, 2013 at 08:22:54PM +0200, Tomas Vondra wrote:
>>> I'm on 64-bit architecture and the example works with int32, which means
>>> the sizes should be about this:
>>>
>>>    hash_element_t => 20B
>>>    hash_bucket_t  => 4B + (20B * items in the bucket [in steps of 5])
>>>    hash_table_t   => 4B + space for buckets
>>>
>>> In the example above, there's ~20k unique values in each group. The
>>> threshold is 20 items per bucket on average, so that's 1024 buckets, and
>>> the buckets are almost full.
>>>
>>> So for single group, the hash table size is about
>>>
>>>   4B + 1024 * (4B + 20 * 20B) = 413700B = ~ 400 kB
>>>
>>> There are 4000 groups, so the total estimate is ~1.6GB in total.
>>>
>>> However when executed (9.2, 9.3 and HEAD behave exactly the same), the
>>> query consumes almost ~5GB of RAM (excluding shared buffers).
>>
>> I think the missing thing is the memory allocator bookkeeping overhead.
>> You're assuming that hash_element_t.value takes 8B for the pointer and 4B for
>> the value itself, but using malloc it takes at least another 20 bytes, and
>> from a quick glance at backend/utils/mmgr/aset.c it seems that palloc is
>> certainly not without its overhead either.
>>
>> Also, each additional level of pointers adds execution overhead and increases
>> the likelihood of cache misses.  I'd suggest a few improvements, if I may:
>>
>> 1. Drop hash_element_t altogether, store length in hash_bucket_t and alloc
>>   hash_bucket_t.items of size nitems * length bytes.  I doubt that storing
>>   the hash values has a benefit worth the storage and code complexity
>>   overhead (you're storing fixed-size ints, not large blobs that are
>>   expensive to compare and hash).
>
> Good idea - I'll move the length to the hash table.
>
> You're right that keeping the hash for int32/64 values is probably
> useless; however, I plan to eventually extend the code to support
> larger values (possibly even varlena types, i.e. text/bytea). So I'll
> keep the stored hashes for now, and note dropping them as a possible
> optimization.
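Suggestion 1 could look something like the sketch below: the bucket stores raw item bytes inline (nitems * length), with no per-element struct and no stored hash. Names and the growth policy are illustrative, not taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flattened bucket: items live inline in one allocation
 * of capacity * item_len bytes, so there is no hash_element_t and no
 * per-element pointer or stored hash value. */
typedef struct
{
    int     nitems;
    int     capacity;       /* in items */
    char   *items;          /* nitems * item_len bytes */
} flat_bucket_t;

/* Add a value to the bucket; returns 1 if it was newly added, 0 if it
 * was already present.  Comparison is a plain memcmp, which is cheap
 * for fixed-size ints. */
static int
bucket_add(flat_bucket_t *b, const void *value, int item_len)
{
    for (int i = 0; i < b->nitems; i++)
        if (memcmp(b->items + (size_t) i * item_len, value, item_len) == 0)
            return 0;

    if (b->nitems == b->capacity)
    {
        /* grow in steps, as the original code does for its buckets */
        b->capacity = b->capacity ? b->capacity * 2 : 5;
        b->items = realloc(b->items, (size_t) b->capacity * item_len);
    }
    memcpy(b->items + (size_t) b->nitems * item_len, value, item_len);
    b->nitems++;
    return 1;
}
```

One allocation per bucket instead of one per element also removes most of the allocator bookkeeping overhead discussed above.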
>
>> 2. Consider using a simpler/faster hash function, like FNV[1] or Jenkins[2].
>>   For fun, try not hashing those ints at all and see how that performs (that,
>>   I think, is what you get from HashSet<int> in Java/C#).
>
> I've used crc32 mostly because it's easily available in the code (i.e.
> I'm lazy), but I've done some quick research / primitive benchmarking
> too. For example hashing 2e9 integers takes this much time:
>
> FNV-1   = 11.9
> FNV-1a  = 11.9
> jenkins = 38.8
> crc32   = 32.0
>
> So it's not really "slow" and the uniformity seems to be rather good.
>
> I'll try FNV in the future, however I don't think that's the main issue
> right now.
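For reference, 32-bit FNV-1a (the variant benchmarked above) is only a few lines; the constants are the standard FNV offset basis and prime.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* 32-bit FNV-1a: xor each byte into the hash, then multiply by the
 * FNV prime.  Constants are the standard published values. */
static uint32_t
fnv1a_32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    hash = 2166136261u;     /* FNV offset basis */

    for (size_t i = 0; i < len; i++)
    {
        hash ^= p[i];
        hash *= 16777619u;              /* FNV prime */
    }
    return hash;
}

/* hashing an int32 key, as the aggregate would for int32 columns */
static uint32_t
hash_int32(int32_t key)
{
    return fnv1a_32(&key, sizeof(key));
}
```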
>
>> 3. Consider dropping buckets in favor of open addressing (linear probing,
>>   quadratic, whatever).  This avoids another level of pointer indirection.
>
> OK, this sounds really interesting. It should be fairly straightforward
> for fixed-length data types (in that case I can get rid of the pointers
> altogether).
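For fixed-length types, the pointer-free version could be sketched roughly as below: one flat array of slots with linear probing. This is a hypothetical illustration, not the patch; for brevity it reserves 0 as the "empty" marker (a real table needs a separate occupancy flag) and omits growing/rehashing when the table fills up. The multiplier is Knuth's multiplicative-hashing constant, used here only as a placeholder hash.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Open-addressed table for nonzero int32 keys: no buckets, no
 * per-element pointers, just one calloc'd array of slots. */
typedef struct
{
    uint32_t    nslots;     /* must be a power of two */
    uint32_t    nitems;
    int32_t    *slots;      /* 0 = empty slot */
} oa_table_t;

/* Insert with linear probing; returns 1 if the key was new. */
static int
oa_insert(oa_table_t *t, int32_t key)
{
    uint32_t    i = ((uint32_t) key * 2654435761u) & (t->nslots - 1);

    while (t->slots[i] != 0)
    {
        if (t->slots[i] == key)
            return 0;                       /* already present */
        i = (i + 1) & (t->nslots - 1);      /* probe next slot */
    }
    t->slots[i] = key;
    t->nitems++;
    return 1;
}
```

After all inserts, `nitems` is the distinct count for the group; the whole table is a single allocation, so both the pointer indirection and the per-element allocator overhead go away.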
>
Consider the trade-offs associated with open addressing: it can quickly lead to growth of the main table, and chaining is a much cleaner way of resolving collisions, IMHO.

>> It's been a few years since I've done stuff this low level, so I won't go into
>> suggesting a different data structure -- I have honestly no idea what's the
>> best way to count the number of distinct integers in a list.
>
> Thanks for the hints!
>
> Tomas


