Re: [HACKERS] Avoiding OOM in a hash join with many duplicate inner keys - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Avoiding OOM in a hash join with many duplicate inner keys
Date
Msg-id 12185.1488932980@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] Avoiding OOM in a hash join with many duplicate inner keys  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] Avoiding OOM in a hash join with many duplicate inner keys  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> I have been wondering about a couple of different worst case execution
> strategies that would be better than throwing our hands up and
> potentially exploding memory once we detect that further partitioning
> is not going to help, if we still manage to reach that case despite
> adding stats-based defences like this due to statistics being missing,
> bad or confused by joins below us.

Yeah, it would definitely be nice if we could constrain the runtime
space consumption better.

> 1.  We could probe the fraction of the hash table that we have managed
> to load into work_mem so far and then rewind the outer batch and do it
> again for the next work_mem-sized fraction of the inner batch and so
> on.  For outer joins we'd need to scan for unmatched tuples after each
> hash table refill.

I do not understand how that works for a left join?  You'd need to track
whether a given outer tuple has been matched in any one of the fractions
of the inner batch, so that when you're done with the batch you could know
which outer tuples need to be emitted null-extended.  Right now we only
need to track that state for the current outer tuple, but in a rescan
situation we'd have to remember it for each outer tuple in the batch.

Perhaps it could be done by treating the outer batch file as read/write
and incorporating a state flag in each tuple; or to reduce write volume,
maintaining a separate outer batch file parallel to the main one with just
a bool or even just a bit per outer tuple.  Seems messy though.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: [HACKERS] delta relations in AFTER triggers
Next
From: Amit Langote
Date:
Subject: Re: [HACKERS] Example Custom Scan Provider Implementation?