On 5/16/23 00:15, Jehan-Guillaume de Rorthais wrote:
> Hi,
>
> Thanks for your review!
>
> On Sat, 13 May 2023 23:47:53 +0200
> Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
>> Thanks for the patches. A couple mostly minor comments, to complement
>> Melanie's review:
>>
>> 0001
>>
>> I'm not really sure about calling this "hybrid hash-join". What does it
>> even mean? The new comments simply describe the existing batching, and
>> how we increment number of batches, etc.
>
> Unless I'm wrong, I believed it comes from the "Hybrid hash join algorithm" as
> described here (+ see pdf in this page ref):
>
> https://en.wikipedia.org/wiki/Hash_join#Hybrid_hash_join
>
> I added the ref in the v7 documentation to avoid futur confusion, is it ok?
>
Ah, I see. I'd just leave out the "hybrid" entirely. We've lived without
it until now, we know what the implementation does ...
>> 0002
>>
>> I wouldn't call the new ExecHashJoinSaveTuple parameter spillcxt but
>> just something generic (e.g. cxt). Yes, we're passing spillCxt, but from
>> the function's POV it's just a pointer.
>
> changed in v7.
>
>> I also wouldn't move the ExecHashJoinSaveTuple comment inside - it just
>> needs to be reworded that we're expecting the context to be with the
>> right lifespan. The function comment is the right place to document such
>> expectations, people are less likely to read the function body.
>
> moved and reworded in v7.
>
>> The modified comment in hashjoin.h has a some alignment issues.
>
> I see no alignment issue. I suppose what bother you might be the spaces
> before spillCxt and batchCxt to show they are childs of hashCxt? Should I
> remove them?
>
It didn't occur to me this is intentional to show the contexts are
children of hashCxt, so maybe it's not a good way to document that. I'd
remove it, and if you think it's something worth mentioning, I'd add an
explicit comment.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company