On Wed, Nov 05, 2008 at 04:06:11PM -0800, Bryce Cutt wrote:
> The error is causes by me Asserting against the wrong variable. I
> never noticed this as I apparently did not have assertions turned on
> on my development machine. That is fixed now and with the new patch
> version I have attached all assertions are passing with your query and
> my test queries. I added another assertion to that section of the
> code so that it is a bit more vigorous in confirming the hash table
> partition is correct. It does not change the operation of the code.
>
> There are two partition counts. One holds the maximum number of
> buckets in the hash table and the other counts the number of actual
> buckets created for hash values. I was incorrectly testing against
> the second one because that was valid before I started using a hash
> table to store the buckets.
>
> The enable_hashjoin_usestatmcvs flag was valuable for my own research
> and tests and likely useful for your review but Tom is correct that it
> can be removed in the final version.
>
> - Bryce Cutt
>
Well, this version seems to work as advertised. Skewed data sets tend to
hash join more quickly with this turned on, and data sets with
deliberately bad statistics don't perform much differently than with the
feature turned off. The patch applies cleanly to CVS HEAD.
I don't consider myself qualified to do a decent code review. However I
noticed that the comments are all done with // instead of /* ... */.
That should probably be changed.
To those familiar with code review: is there more I should do to review
this?
- Josh / eggyknap