Window functions review - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Window functions review
Date
Msg-id 491AE529.3030106@enterprisedb.com
Whole thread Raw
Responses Re: Window functions review  ("Hitoshi Harada" <umi.tanuki@gmail.com>)
List pgsql-hackers
I've been slicing and dicing this patch for the last few days. There's a 
lot of code in there, but here's some initial comments:

The code to initialize, advance, and finalize an aggregate should be 
shared between Agg and Window nodes.

I'm a bit disappointed that we need so much code to support the window 
aggregates. I figured we'd just have a special window function that 
calls the initialize/advance/finalize functions for the right aggregate, 
working with the WindowObject object like other window functions. But we 
have in fact very separate code path for aggregates. I feel we should 
implement the aggregates as just a thin wrapper window function that I 
just described. Or, perhaps the aggregate functionality should be moved 
to Agg node, although if we do that, we'd probably have to change that 
again when we implement the window frames. Or perhaps it should be 
completely new node type, though you'd really want to share the 
tuplestore with the Window node if there's any window functions in the 
same query, using the same window specification.

I don't like that the window functions are passed Var and Const nodes as 
arguments. A user-defined function shouldn't see those internal data 
structures, even though they're just passed as opaque arguments to 
WinRowGetArg. You could pass WinRowGetArg just argument numbers, and 
have it fetch the Expr nodes.

The incomplete support for window frames should be removed. It was good 
that you did it, so that we have a sketch of how it'd work and got some 
confidence that we won't have to rip out and rewrite all of this in the 
future to support the window frames. But if it's not going to go into 
this release, we should take it out.

The buffering strategies are very coarse. For aggregates, as long as we 
don't support window frames, row buffering is enough. Even when 
partition-buffering is needed, we shouldn't need to fetch the whole 
partition into the tuplestore before we start processing. lead/lag for 
example can start returning values earlier. That's just an optimization, 
though, so maybe we can live with that for now.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Block-level CRC checks
Next
From: Tom Lane
Date:
Subject: Re: Block-level CRC checks