Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
Date
Msg-id 603c8f070812152051j5510a229j6f268526677ed906@mail.gmail.com
Whole thread Raw
In response to Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets  ("Lawrence, Ramon" <ramon.lawrence@ubc.ca>)
Responses Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
List pgsql-hackers
I have to admit that I haven't fully grokked what this patch is about
just yet, so what follows is mostly a coding style review at this
point.  It would help a lot if you could add some comments to the new
functions that are being added to explain the purpose of each at a
very high level.  There's clearly been a lot of thought put into some
parts of this logic, so it would be worth explaining the reasoning
behind that logic.

This patch applies clearly against CVS HEAD, but does not compile
(please fix the warning, too).

nodeHash.c:88: warning: no previous prototype for 'freezeNextMCVPartiton'
nodeHash.c: In function 'freezeNextMCVPartiton':
nodeHash.c:148: error: 'struct HashJoinTableData' has no member named 'inTupIOs'

I commented out the offending line.  It errored out again here:

nodeHashjoin.c: In function 'getMostCommonValues':
nodeHashjoin.c:136: error: wrong type argument to unary plus

After removing the stray + sign, it compiled, but failed the
"rangefuncs" regression test.  If you're going to keep the
enable_hashjoin_usestatmvcs() GUC around, you need to patch
rangefuncs.out so that the regression tests pass.  I think, however,
that there was some discussion of removing that before the patch is
committed; if so, please do that instead.  Keeping the GUC would also
require patching the documentation, which the current patch does not
do.

getMostCommonValues() isn't a good name for a non-static function
because there's nothing to tip the reader off to the fact that it has
something to do with hash joins; compare with the other function names
defined in the same header file.  On the flip side, that function has
only one call site, so it should probably be made static and not
declared in the header file at all.  Some of the other new functions
need similar treatment.  I am also a little suspicious of this bit of
code:
       relid = getrelid(((SeqScan *) ((SeqScanState *)
outerPlanState(hjstate))->ps.plan)->scanrelid,
estate->es_range_table);       clause = (FuncExprState *) lfirst(list_head(hjstate->hashclauses));       argstate =
(ExprState*) lfirst(list_head(clause->args));       variable = (Var *) argstate->expr;
 

I'm not very familiar with the hash join code, but it seems like there
are a lot of assumptions being made there about what things are
pointing to what other things.  Is this this actually safe?  And if it
is, perhaps a comment explaining why?

getMostCommonValues() also appears to be creating and maintaining a
counter called collisionsWhileHashing, but nothing is ever done with
the counter.  On a similar note, the variables relattnum, atttype, and
atttypmod don't appear to be necessary; 2 out of 3 of them are only
used once, so maybe inlining the reference and dropping the variable
would make more sense.  Also, the if (HeapTupleIsValid(statsTuple))
block encompasses the whole rest of the function, maybe if
(!HeapTupleIsValid(statsTuple)) return?

I don't understand why
hashtable->mostCommonTuplePartition[bucket].tuples and .frozen need to
be initialized to 0.  It looks to me like those are in a zero-filled
array that was just allocated, so it shouldn't be necessary to re-zero
them, unless I'm missing something.

freezeNextMCVPartiton is mis-spelled consistently throughout (the last
three letters should be "ion").  I also don't think it makes sense to
enclose everything but the first two lines of that function in an
else-block.

There is some initialization code in ExecHashJoin() that looks like it
belongs in ExecHashTableCreate.

It appears to me that the interface to isAMostCommonValue() could be
simplified by just making it return the partition number.  It could
perhaps be renamed something like ExecHashGetMCVPartition().

Does ExecHashTableDestroy() need to explicitly pfree
hashtable->mostCommonTuplePartition and
hashtable->flushOrderedMostCommonTuplePartition?  I would think those
would be allocated out of hashCxt - if they aren't, they probably
should be.

Department of minor nitpicks: (1) C++-style comments are not
permitted, (2) function names need to be capitalized like_this() or
LikeThis() but not likeThis(), (3) when defining a function, the
return type should be placed on the line preceding the actual function
name, so that the function name is at the beginning of the line, (4)
curly braces should be avoided around a block containing only one
statement, (5) excessive blank lines should be avoided (for example,
the one in costsize.c is clearly unnecessary, and there's at least one
place where you add two consecutive blank lines), and (6) I believe
the accepted way to write an empty loop is an indented semi-colon on
the next line, rather than {} on the same line as the while.

I will try to do some more substantive testing of this as well.

...Robert


pgsql-hackers by date:

Previous
From: "Fujii Masao"
Date:
Subject: Re: 8.4 replication questions
Next
From: "Fujii Masao"
Date:
Subject: Coding TODO for 8.4: Synch Rep