Re: pgsql: Add some regression tests that exercise hash join code. - Mailing list pgsql-committers

From Thomas Munro
Subject Re: pgsql: Add some regression tests that exercise hash join code.
Date
Msg-id CAEepm=2D8aSZc-Z3rWetp38S8P+aeyv3eRjOdxzoZsJDpH8xTg@mail.gmail.com
Whole thread Raw
In response to Re: pgsql: Add some regression tests that exercise hash join code.  (Andres Freund <andres@anarazel.de>)
Responses Re: pgsql: Add some regression tests that exercise hash join code.  (Andres Freund <andres@anarazel.de>)
List pgsql-committers
On Tue, Dec 5, 2017 at 9:21 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-12-04 12:28:56 +1300, Thomas Munro wrote:
>> From 43eeea0bc35204440d262224b56efca958b33428 Mon Sep 17 00:00:00 2001
>> From: Thomas Munro <thomas.munro@enterprisedb.com>
>> Date: Mon, 4 Dec 2017 11:52:11 +1300
>> Subject: [PATCH] Fix EXPLAIN ANALYZE of hash join when the leader doesn't
>>  execute.
>>
>> If a hash join appears in a parallel query, there may be no hash table
>> available for explain.c to inspect even though a hash table may have been
>> built in other processes.  This could happen either because
>> parallel_leader_participation was set to off or because the leader happened to
>> hit the end of the outer relation immediately (even though the complete
>> relation is not empty) and decided not to build the hash table.
>>
>> Commit bf11e7ee introduced a way for workers to exchange instrumentation via
>> the DSM segment for Sort nodes even though they are not parallel-aware.  This
>> commit does the same for Hash nodes, so that explain.c has a way to find
>> instrumentation data from an arbitrary participant that actually built the
>> hash table.
>
> Makes sense. Seems ok to only fix this inv11, even if present earlier.

+1

I doubt any v10 users will notice.  v11 makes it more noticeable
because it has the new parallel_leader_participation GUC and these new
regression tests which hit the racy case surprisingly often when
running on somebody's toaster in the buildfarm.

> I can't think of a way that, before the parallel HJ patch, results in a
> difference between the size of the hashtable between workers. Short of
> some participant not having a hashtable at all, that is. Wwe also don't
> currently display lookup information etc. So just more or less randomly
> choosing one worker's seems ok.

Right.  Finding the first arbitrary backend that actually built the
hash table and using its instrumentation data works OK for now because
they each do either exactly the same hash build or nothing.  Even with
the later PHJ patch the batch/buckets sizing info will always be the
same across all workers that built the hash table, so it just needs
some special handling to choose what peak size to report (or some more
elaborate information on space usage, to be bikeshedded later).

>> +             case T_HashState:
>> +                     /* even when not parallel-aware */
>
> "..., for stats" or such maybe?

How about ", for EXPLAIN ANALYZE"?  Done that way.

>> +                     ExecHashEstimate((HashState *) planstate, e->pcxt);
>> +                     break;
>>               case T_SortState:
>>                       /* even when not parallel-aware */
>
> I'd just update that comment as well.

Done.

Looking for all the places where Robert has that comment, I realised
that I probably also needed to supply ExecHashReInitializeDSM() to zap
the instrumentation array for each loop.  Done.

>> +/* ----------------
>> + *    Values displayed by EXPLAIN ANALYZE
>> + * ----------------
>> + */
>> +typedef struct HashInstrumentation
>> +{
>> +     int                     nbuckets;
>> +     int                     nbuckets_original;
>> +     int                     nbatch;
>> +     int                     nbatch_original;
>> +     size_t          space_peak;
>> +} HashInstrumentation;
>
> Maybe I'm being pedantic here, but I think it'd not hurt to explain what
> these mean. It's obviously pretty clear what nbuckets/batch mean, but
> the difference between original and not is far less clea.r

Fixed.

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment

pgsql-committers by date:

Previous
From: Tom Lane
Date:
Subject: pgsql: Simplify do_pg_start_backup's API by opening pg_tblspcinternall
Next
From: Peter Eisentraut
Date:
Subject: pgsql: Fix warnings from cpluspluscheck