Re: pgsql: Add some regression tests that exercise hash join code. - Mailing list pgsql-committers
From: Thomas Munro
Subject: Re: pgsql: Add some regression tests that exercise hash join code.
Msg-id: CAEepm=2D8aSZc-Z3rWetp38S8P+aeyv3eRjOdxzoZsJDpH8xTg@mail.gmail.com
In response to: Re: pgsql: Add some regression tests that exercise hash join code. (Andres Freund <andres@anarazel.de>)
Responses: Re: pgsql: Add some regression tests that exercise hash join code.
List: pgsql-committers
On Tue, Dec 5, 2017 at 9:21 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-12-04 12:28:56 +1300, Thomas Munro wrote:
>> From 43eeea0bc35204440d262224b56efca958b33428 Mon Sep 17 00:00:00 2001
>> From: Thomas Munro <thomas.munro@enterprisedb.com>
>> Date: Mon, 4 Dec 2017 11:52:11 +1300
>> Subject: [PATCH] Fix EXPLAIN ANALYZE of hash join when the leader doesn't
>>  execute.
>>
>> If a hash join appears in a parallel query, there may be no hash table
>> available for explain.c to inspect even though a hash table may have been
>> built in other processes.  This could happen either because
>> parallel_leader_participation was set to off or because the leader happened
>> to hit the end of the outer relation immediately (even though the complete
>> relation is not empty) and decided not to build the hash table.
>>
>> Commit bf11e7ee introduced a way for workers to exchange instrumentation
>> via the DSM segment for Sort nodes even though they are not parallel-aware.
>> This commit does the same for Hash nodes, so that explain.c has a way to
>> find instrumentation data from an arbitrary participant that actually built
>> the hash table.
>
> Makes sense. Seems ok to only fix this in v11, even if present earlier.

+1.  I doubt any v10 users will notice.  v11 makes it more noticeable
because it has the new parallel_leader_participation GUC and these new
regression tests, which hit the racy case surprisingly often when running
on somebody's toaster in the buildfarm.

> I can't think of a way that, before the parallel HJ patch, results in a
> difference between the size of the hashtable between workers.  Short of
> some participant not having a hashtable at all, that is.  We also don't
> currently display lookup information etc.  So just more or less randomly
> choosing one worker's seems ok.

Right.
Finding the first arbitrary backend that actually built the hash table and
using its instrumentation data works OK for now, because they each do
either exactly the same hash build or nothing.  Even with the later PHJ
patch, the batch/buckets sizing info will always be the same across all
workers that built the hash table, so it just needs some special handling
to choose what peak size to report (or some more elaborate information on
space usage, to be bikeshedded later).

>> +		case T_HashState:
>> +			/* even when not parallel-aware */
>
> "..., for stats" or such maybe?

How about ", for EXPLAIN ANALYZE"?  Done that way.

>> +			ExecHashEstimate((HashState *) planstate, e->pcxt);
>> +			break;
>> 		case T_SortState:
>> 			/* even when not parallel-aware */
>
> I'd just update that comment as well.

Done.  Looking for all the places where Robert has that comment, I
realised that I probably also needed to supply ExecHashReInitializeDSM()
to zap the instrumentation array for each loop.  Done.

>> +/* ----------------
>> + *	 Values displayed by EXPLAIN ANALYZE
>> + * ----------------
>> + */
>> +typedef struct HashInstrumentation
>> +{
>> +	int			nbuckets;
>> +	int			nbuckets_original;
>> +	int			nbatch;
>> +	int			nbatch_original;
>> +	size_t		space_peak;
>> +} HashInstrumentation;
>
> Maybe I'm being pedantic here, but I think it'd not hurt to explain what
> these mean.  It's obviously pretty clear what nbuckets/batch mean, but
> the difference between original and not is far less clear.

Fixed.

--
Thomas Munro
http://www.enterprisedb.com
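For readers following the thread, here is a stand-alone sketch of the "first
arbitrary backend that actually built the hash table" selection discussed
above.  This is not the committed PostgreSQL code: the struct fields mirror
the HashInstrumentation hunk quoted in the mail, but the helper
`pick_hash_instrumentation` and its nbatch-based heuristic are hypothetical,
standing in for what explain.c does with the per-worker instrumentation
array in the DSM segment.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Mock of the HashInstrumentation struct from the patch.  Field meanings
 * per the thread: "original" values are the initial (planned) sizing, the
 * plain values are the final sizes after any resizing during execution.
 */
typedef struct HashInstrumentation
{
    int    nbuckets;           /* number of buckets at end of execution */
    int    nbuckets_original;  /* planned number of buckets */
    int    nbatch;             /* number of batches at end of execution */
    int    nbatch_original;    /* planned number of batches */
    size_t space_peak;         /* peak memory usage, in bytes */
} HashInstrumentation;

/*
 * Return the first participant that actually built a hash table, or NULL
 * if nobody did.  Zero-initialized entries represent participants (such as
 * a non-participating leader) that never built one; any build produces at
 * least one batch, so nbatch > 0 serves as the "built it" marker here.
 * Per the discussion, all builders report identical sizing pre-PHJ, so
 * picking the first is as good as picking any.
 */
static const HashInstrumentation *
pick_hash_instrumentation(const HashInstrumentation *participants, int n)
{
    for (int i = 0; i < n; i++)
        if (participants[i].nbatch > 0)
            return &participants[i];
    return NULL;
}
```

Usage-wise, the caller (explain.c in the real patch) would scan the shared
instrumentation array once at EXPLAIN time and display the chosen entry's
buckets/batches/peak values.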