On 11/12/14, 1:54 AM, David Rowley wrote:
> On Tue, Nov 11, 2014 at 9:29 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
>
> > This plan type is widely used in reporting queries, so will hit the
> > mainline of BI applications and many Mat View creations.
> > This will allow SELECT count(*) FROM foo to go faster also.
>
> We'd also need to add some infrastructure to merge aggregate states together for this to work properly. This means
> that it could also work for avg() and stddev() etc. For max() and min() the merge functions would likely just be the
> same as the transition functions.
Sanity check: what % of a large aggregate query fed by a seqscan is actually spent in the aggregate functions? Even if you
look strictly at CPU cost, isn't there more code involved in getting data to the aggregate function than in the aggregation
itself, except maybe for numeric?
In other words, I suspect that just having a dirt-simple parallel SeqScan could be a win for CPU. It should certainly
be a win IO-wise; in my experience we're not very good at maxing out IO systems.
(I was curious and came up with the list below for just the page-level stuff (ignoring IO). I don't see much code
involved in per-tuple work, but I also never came across detoasting code, so I suspect I'm missing something...)
ExecScanFetch, heapgettup_pagemode, ReadBuffer, BufferAlloc, heap_page_prune_opt, LWLockAcquire... then you can finally
do per-tuple work: HeapTupleSatisfiesVisibility.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com