On Thu, Jul 11, 2019 at 06:06:33PM -0700, Jeff Davis wrote:
>On Thu, 2019-07-11 at 17:55 +0200, Tomas Vondra wrote:
>> Makes sense. I haven't thought about how the hybrid approach would be
>> implemented very much, so I can't quite judge how complicated would it
>> be to extend "approach 1" later. But if you think it's a sensible first
>> step, I trust you. And I certainly agree we need something to compare
>> the other approaches against.
>
>Is this a duplicate of your previous email?
>
Yes. I don't know how I managed to send it again. Sorry.

>I'm slightly confused but I will use the opportunity to put out another
>WIP patch. The patch could use a few rounds of cleanup and quality
>work, but the functionality is there and the performance seems
>reasonable.
>
>I rebased on master and fixed a few bugs, and most importantly, added
>tests.
>
>It seems to be working with grouping sets fine. It will take a little
>longer to get good performance numbers, but even for group size of one,
>I'm seeing HashAgg get close to Sort+Group in some cases.
>
Nice! That's very nice progress!

>You are right that the missed lookups appear to be costly, at least
>when the data all fits in system memory. I think it's the cache misses,
>because sometimes reducing work_mem improves performance. I'll try
>tuning the number of buckets for the hash table and see if that helps.
>If not, then the performance still seems pretty good to me.
>
>Of course, HashAgg can beat sort for larger group sizes, but I'll try
>to gather some more data on the cross-over point.
>
Yes, makes sense. I think it's acceptable as long as we account for this
during costing (when we know in advance that we'll need it) or treat it as
an emergency measure.
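
To be concrete about what "missed lookups" means here, a toy sketch in
Python. This is not the patch's code: the function name, the cap on the
number of groups (standing in for work_mem), and the partitioning scheme
are all made up for illustration. Once the table is full, any tuple whose
group is not already in memory is a miss and gets set aside, to be
aggregated in a later pass.

```python
from collections import defaultdict


def bounded_hash_agg(rows, max_groups, num_partitions=4, _level=0):
    """Toy COUNT(*) ... GROUP BY key with a cap on in-memory groups.

    Hypothetical illustration only. Returns (counts, missed), where
    missed is the number of lookups that found no group while the
    table was already full.
    """
    if max_groups < 1:
        raise ValueError("max_groups must be at least 1")

    table = {}                   # in-memory groups: key -> count
    spilled = defaultdict(list)  # partition number -> tuples set aside
    missed = 0

    for key in rows:
        if key in table:
            table[key] += 1      # hit: advance the existing group
        elif len(table) < max_groups:
            table[key] = 1       # still room: create a new group
        else:
            # Missed lookup on a full table: spill the tuple. Mixing the
            # recursion level into the hash changes the partitioning
            # from one pass to the next.
            missed += 1
            spilled[hash((key, _level)) % num_partitions].append(key)

    # A spilled key is never in `table`, and partitions are disjoint by
    # key, so the per-partition results can simply be merged.
    results = dict(table)
    for part in spilled.values():
        sub, sub_missed = bounded_hash_agg(
            part, max_groups, num_partitions, _level + 1)
        results.update(sub)
        missed += sub_missed
    return results, missed
```

With a group size of one and a table that holds only a fraction of the
groups, every tuple that arrives after the table fills is a miss, which is
the case the costing would have to account for.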

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services