Re: Broken resetting of subplan hash tables - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Broken resetting of subplan hash tables
Date
Msg-id 17213.1583012663@sss.pgh.pa.us
Whole thread Raw
In response to Re: Broken resetting of subplan hash tables  (Andres Freund <andres@anarazel.de>)
Responses Re: Broken resetting of subplan hash tables
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2020-02-29 14:02:59 -0500, Tom Lane wrote:
>> TBH, I think that this tuple table API is seriously misdesigned;
>> it is confusing and very error-prone to have the callers need to
>> reset the tuple context separately from calling ResetTupleHashTable.

> Do you have an alternative proposal?

I'd be inclined to let the tuple hashtable make its own tuple-storage
context and reset that for itself.  Is it really worth the complexity
and bug hazards to share such a context with other uses?

> We could change it so more of the metadata for execGrouping.c is
> computed outside of BuildTupleHashTableExt(), and continue to destroy
> the entire hashtable. But we'd still have to reallocate the hashtable,
> the slot, etc.  So having a reset interface seems like the right thing.

Agreed, the reset interface is a good idea.  I'm just not happy that
in addition to resetting, you have to remember to reset some
vaguely-related context (and heaven help you if you reset that context
but not the hashtable).

>> Also, the callers all look like their resets are intended to destroy
>> the whole hashtable not just its contents (as indeed they were doing,
>> before the faulty commit).  But I didn't attempt to fix that today.

> Hm? nodeAgg.c, nodeSetOp.c, nodeRecursiveUnion.c don't at all look like
> that to me? Why would they want to drop the hashtable metadata when
> resetting? What am I missing?

They may not look like it to you; but Andreas misread that, and so did
I at first --- not least because that *is* how it used to work, and
there are still comments suggesting that that's how it works, eg this
in ExecInitRecursiveUnion:

     * If hashing, we need a per-tuple memory context for comparisons, and a
     * longer-lived context to store the hash table.  The table can't just be
     * kept in the per-query context because we want to be able to throw it
     * away when rescanning.

"throw it away" sure looks like it means the entire hashtable, not just
its tuple contents.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Allowing ALTER TYPE to change storage strategy
Next
From: Ivan Panchenko
Date:
Subject: bool_plperl transform