From: Jeff Janes
Subject: Re: [HACKERS] Poor memory context performance in large hash joins
Msg-id: CAMkU=1xdSg0g8JuwUq=zro=PG508_0qKaKFLQbEcGciT4hAYLw@mail.gmail.com
In response to: Re: [HACKERS] Poor memory context performance in large hash joins (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

On Fri, Feb 24, 2017 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> On Thu, Feb 23, 2017 at 2:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Maybe it's time to convert that to a doubly-linked list.
>
>> I don't think that would help.  You would need some heuristic to guess
>> whether the chunk you are looking for is near the front, or near the end.
>
> Uh, what?  In a doubly-linked list, you can remove an element in O(1)
> time, you don't need any searching.  It basically becomes
>       item->prev->next = item->next;
>       item->next->prev = item->prev;
> modulo possible special cases for the head and tail elements.

Currently it is walking the chain to identify which block holds the chunk in the first place, not just to get the pointer to the previous block.  But I see that that could be fixed by pointer arithmetic once there is a reason to fix it, and there is no such reason as long as the chain has to be walked to find the prev pointer anyway.  Like this, perhaps:

/* an oversized chunk is the only chunk in its block, so the block header
   immediately precedes the chunk */
targetblock = (AllocBlock) (((char *) chunk) - ALLOC_BLOCKHDRSZ);
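To spell that out: if AllocBlockData also carried a prev pointer, the whole large-chunk free path could skip the list walk entirely.  A sketch only (the prev field and the helper function below are hypothetical; aset, next, freeptr, endptr, and set->blocks are the existing names in aset.c):

typedef struct AllocBlockData
{
    AllocSet    aset;       /* aset that owns this block */
    AllocBlock  prev;       /* hypothetical back-link */
    AllocBlock  next;       /* next block in aset's blocks list */
    char       *freeptr;    /* start of free space in this block */
    char       *endptr;     /* end of space in this block */
} AllocBlockData;

static void
AllocSetFreeOversizedChunk(AllocSet set, AllocChunk chunk)
{
    /* find the block by pointer arithmetic, as above */
    AllocBlock  block = (AllocBlock) (((char *) chunk) - ALLOC_BLOCKHDRSZ);

    /* O(1) unlink, modulo the head special case */
    if (block->prev != NULL)
        block->prev->next = block->next;
    else
        set->blocks = block->next;  /* block was the list head */
    if (block->next != NULL)
        block->next->prev = block->prev;

    free(block);
}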
 
>>> Although if the
>>> hash code is producing a whole lot of requests that are only a bit bigger
>>> than the separate-block threshold, I'd say It's Doing It Wrong.  It should
>>> learn to aggregate them into larger requests.
>
>> Right now it is using compiled-in 32KB chunks.  Should it use something
>> like max(32kb,work_mem/128) instead?
>
> I'd say it should double the size of the request each time.  That's what
> we do in most places.

I thought the design goal there was that the space in old chunks could get re-used for the new chunks in a reasonably fine-grained way.  If the chunk sizes double each time, the last chunk contains just over half of all the data (each new chunk is a bit bigger than all the previous ones combined), and that fine-grained re-use couldn't happen.
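If the requests did double, presumably they would still want a cap, something like this sketch (next_request_size and max_request are names I am making up, not anything in the tree):

#define HASH_CHUNK_SIZE (32 * 1024)     /* the current compiled-in request */

/* double the request each time, but cap it; max_request might be
   derived from work_mem, e.g. work_mem / 128 as suggested above */
static Size
next_request_size(Size current, Size max_request)
{
    if (current >= max_request / 2)
        return max_request;
    return current * 2;
}

With a 4MB cap, for example, the requests run 32KB, 64KB, ..., 2MB, 4MB, 4MB, ..., and the re-use concern above applies until the cap is reached.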

Cheers,

Jeff
