Re: Improve hash join's handling of tuples with null join keys - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Improve hash join's handling of tuples with null join keys
Msg-id 3073503.1746493111@sss.pgh.pa.us
In response to Re: Improve hash join's handling of tuples with null join keys  (Tomas Vondra <tomas@vondra.me>)
List pgsql-hackers
Tomas Vondra <tomas@vondra.me> writes:
> My personal experience is that the growEnabled heuristics is overly
> sensitive, and probably does not trigger very often.

Yeah, it would be good to make it not quite all-or-nothing.

> But more importantly, wasn't the issue discussed in [1] about parallel
> hash joins?

I'm not clear on that either; it seemed that the OP was able to
trigger it in some non-parallel cases too.  But we don't have a
reproducer so I can't say for sure.  Building a reproducer would
be a useful exercise for testing this.  There might well be some
parallel-specific misbehavior that would be worth ameliorating
independently of this work, for inputs with many non-null duplicate
keys.

>> This passes check-world, and I've extended a couple of existing test
>> cases to ensure that the new code paths are exercised.  I've not done
>> any real performance testing, though.

> Are you planning to? If not, I can try to collect some numbers, but I
> can't promise that before pgconf.dev.

If you have time after the conference, please feel free.

> BTW do you consider this to be a bugfix for PG18? Or would it have to
> wait for PG19 at this point?

I suspect it has been like this forever --- certainly for as long
as we've had PHJ (parallel hash join), and probably longer.  So I'm
seeing it as new work for v19, not something we'd attempt to back-patch.

            regards, tom lane


