> On Mar 23, 2017, at 11:22 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>
>>>>>> "Mark" == Mark Dilger <hornschnorter@gmail.com> writes:
>
> Mark> You define DiscreteKnapsack to take integer weights and double
> Mark> values, and perform the usual Dynamic Programming algorithm to
> Mark> solve. But the only place you call this, you pass in NULL for
> Mark> the values, indicating that all the values are equal. I'm
> Mark> confused why in this degenerate case you bother with the DP at
> Mark> all. Can't you just load the knapsack from lightest to heaviest
> Mark> after sorting, reducing the runtime complexity to O(n log n)? It's
> Mark> been a long day. Sorry if I am missing something obvious.
>
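To spell out what I had in mind (a standalone sketch with made-up
weights, not the patch's actual DiscreteKnapsack code): when every item
has the same value, maximizing total value is just maximizing the number
of items taken, so loading lightest-first after a sort is optimal by a
simple exchange argument.

#include <stdio.h>
#include <stdlib.h>

static int
cmp_weight(const void *a, const void *b)
{
    int     wa = *(const int *) a;
    int     wb = *(const int *) b;

    return (wa > wb) - (wa < wb);
}

int
main(void)
{
    int     weights[] = {4, 1, 3, 2, 5};    /* made-up item weights */
    int     nitems = 5;
    int     capacity = 7;                   /* knapsack weight limit */
    int     taken = 0;

    /* O(n log n): sort items by ascending weight */
    qsort(weights, nitems, sizeof(int), cmp_weight);

    /* greedily take the lightest remaining item while it still fits */
    for (int i = 0; i < nitems && weights[i] <= capacity; i++)
    {
        capacity -= weights[i];
        taken++;
    }

    /* takes weights 1, 2, 3 (total 6 <= 7), so prints 3 */
    printf("items taken: %d\n", taken);
    return 0;
}
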
> That's actually a very good question.
>
> (Though in passing I should point out that the runtime cost is going to
> be negligible in all practical cases. Even an 8-way cube (256 grouping
> sets) has only 70 rollups, and a 4-way cube has only 6; the limit of
> 4096 grouping sets was only chosen because it was a nice round number
> that was larger than comparable limits in other database products. Other
> stuff in the grouping sets code has worse time bounds; the
> bipartite-match code used to minimize the number of rollups is
> potentially O(n^2.5) in the number of grouping sets.)
>
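(As an aside, if I have the math right, those rollup counts are the
width of the subset lattice: by Dilworth's theorem the minimum number of
chains, i.e. rollups, needed to cover all the grouping sets equals the
size of the largest antichain, which for an n-way cube is C(n, n/2),
giving C(8,4) = 70 and C(4,2) = 6.)
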
> Thinking about it, it's not at all obvious what we should be optimizing
> for here in the absence of accurate sort costs.

Is there a performance test case where this patch should shine
brightest? I'd like to load a schema with lots of data and run a
grouping sets query, both before and after applying the patch, to
see what the performance advantage is.
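
For concreteness, here's the sort of thing I was thinking of trying;
the table name, column names, moduli, and row count below are all made
up, just aiming for several low-cardinality columns under a CUBE so the
planner has a nontrivial set of rollups to order:

-- hypothetical test schema
CREATE TABLE gstest AS
  SELECT i % 10 AS a,
         i % 7  AS b,
         i % 13 AS c,
         i % 4  AS d,
         i      AS v
    FROM generate_series(1, 5000000) AS s(i);

ANALYZE gstest;

-- 4-way cube: 16 grouping sets, 6 rollups to arrange
EXPLAIN ANALYZE
SELECT a, b, c, d, count(*), sum(v)
  FROM gstest
 GROUP BY CUBE (a, b, c, d);
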
mark