Re: Final Patch for GROUPING SETS - unrecognized node type: 347 - Mailing list pgsql-hackers
From | Tomas Vondra
---|---
Subject | Re: Final Patch for GROUPING SETS - unrecognized node type: 347
Date |
Msg-id | 540B68DE.404@fuzzy.cz
In response to | Re: Final Patch for GROUPING SETS - unrecognized node type: 347 (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses | Re: Final Patch for GROUPING SETS - unrecognized node type: 347
List | pgsql-hackers
On 31.8.2014 22:52, Andrew Gierth wrote:
> Recut patches:
>
> gsp1.patch - phase 1 code patch (full syntax, limited functionality)
> gsp2.patch - phase 2 code patch (adds full functionality using the
>              new chained aggregate mechanism)
> gsp-doc.patch - docs
> gsp-contrib.patch - quote "cube" in contrib/cube and contrib/earthdistance,
>                     intended primarily for testing pending a decision on
>                     renaming contrib/cube or unreserving keywords
> gsp-u.patch - proposed method to unreserve CUBE and ROLLUP
>
> (the contrib patch is not necessary if the -u patch is used; the
> contrib/pg_stat_statements fixes are in the phase1 patch)

Hi,

I looked at the patch today. The good news is it seems to apply cleanly on HEAD (with some small offsets, but no conflicts). The code generally seems OK to me, although the patch is quite massive. I've also done a considerable amount of testing and have been unable to cause failures.

I have significant doubts about the whole design, though, especially the decision not to use HashAggregate, and the whole chaining idea. I haven't noticed any discussion about this (at least in this thread), and the chaining idea was not mentioned until 21/8, so I'd appreciate some reasoning behind this choice.

I assume the "no HashAggregate" decision was made out of fear of underestimates and the related OOM issues. I don't see how this is different from the general HashAggregate case, though. Or is there another reason for it?

Now, the chaining only makes this worse, because it effectively forces a separate sort of the whole table for each grouping set.

We're doing a lot of analytics on large tables, where large means tens of GBs and hundreds of millions of rows. What we do at the moment is basically the usual ROLAP approach: create a cube with aggregated data, which is usually much smaller than the source table, and then compute the rollups for the interesting slices in a second step.

I was hoping that maybe we could eventually replace this with the GROUP BY CUBE functionality provided by this patch, but these design decisions make it pretty much impossible. I believe most other users processing non-trivial amounts of data (pretty much everyone with just a few million rows) will be in a similar position :-(

What I envisioned when considering hacking on this a few months back was extending the aggregate API with a "merge state" function, doing the aggregation just like today and merging the groups (for each cell) at the end. Yeah, we don't have this infrastructure, but maybe it'd be a better way than the current chaining approach. And it has been repeatedly mentioned as necessary for parallel aggregation (and even came up in the memory-bounded hashagg batching discussion).

I'm ready to spend some time on this if it makes grouping sets useful for us.

regards
Tomas
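To make the two-step ROLAP workflow described above concrete, here is a minimal SQL sketch. The `sales` table, its columns, and the `sales_cube` name are illustrative assumptions only, not anything taken from the patch or this thread; the final query uses the GROUP BY CUBE syntax the patch introduces.

```sql
-- Hypothetical schema: sales(region, product, sale_date, amount).

-- Step 1: pre-aggregate the large table into a much smaller "cube".
CREATE TABLE sales_cube AS
SELECT region, product, sum(amount) AS amount
FROM sales
GROUP BY region, product;

-- Step 2: compute the rollups of interest from the small table,
-- e.g. per-region totals and the grand total.
SELECT region, sum(amount) FROM sales_cube GROUP BY region;
SELECT sum(amount) FROM sales_cube;

-- What the patch would ideally make possible in a single query
-- over the source table:
SELECT region, product, sum(amount)
FROM sales
GROUP BY CUBE (region, product);
```

The point of the two-step form is that the expensive aggregation touches the big table only once; the rollups then run against the pre-aggregated result, whereas the chained-sort design re-sorts the full input for each grouping set.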