Re: [HACKERS] Hash support for grouping sets - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: [HACKERS] Hash support for grouping sets
Date
Msg-id 3D7C328C-ADA8-4026-A613-710D25B317AD@gmail.com
Whole thread Raw
In response to Re: [HACKERS] Hash support for grouping sets  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: [HACKERS] Hash support for grouping sets  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-hackers
> On Mar 23, 2017, at 11:22 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>
>>>>>> "Mark" == Mark Dilger <hornschnorter@gmail.com> writes:
>
> Mark> You define DiscreteKnapsack to take integer weights and double
> Mark> values, and perform the usual Dynamic Programming algorithm to
> Mark> solve.  But the only place you call this, you pass in NULL for
> Mark> the values, indicating that all the values are equal.  I'm
> Mark> confused why in this degenerate case you bother with the DP at
> Mark> all.  Can't you just load the knapsack from lightest to heaviest
> Mark> after sorting, reducing the runtime complexity to O(nlogn)?  It's
> Mark> been a long day.  Sorry if I am missing something obvious.
>
> That's actually a very good question.
>
> (Though in passing I should point out that the runtime cost is going to
> be negligible in all practical cases. Even an 8-way cube (256 grouping
> sets) has only 70 rollups, and a 4-way cube has only 6; the limit of
> 4096 grouping sets was only chosen because it was a nice round number
> that was larger than comparable limits in other database products. Other
> stuff in the grouping sets code has worse time bounds; the
> bipartite-match code used to minimize the number of rollups is
> potentially O(n^2.5) in the number of grouping sets.)
>
> Thinking about it, it's not at all obvious what we should be optimizing
> for here in the absence of accurate sort costs.

Is there a performance test case where this patch should shine
brightest?  I'd like to load a schema with lots of data, and run
a grouping sets query, both before and after applying the patch,
to see what the performance advantage is.

mark


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] Hash support for grouping sets
Next
From: Pavan Deolasee
Date:
Subject: Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)