Thread: more support for various frame types of window functions
I'm not sure if it can be finished until the start of the next CF, but I've been working on $subject. This work intends to extend current limited frame types of window functions such like below; - ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING - ORDER BY x RANGE BETWEEN CURRENT ROW AND f FOLLOWING - ORDER BY x RANGE BETWEEN p PRECEDING AND f FOLLOWING where "p" and "f" are values that indicate preceding/following frame boundary offsets from current row (or peer). With this feature, you can calculate something like "moving average" by SQL. Frame types that won't be introduced in this work includes: - EXCLUDE clause The hardest point is that aggregates must be re-initialized as rows exit from current frame, which doesn't occur in 8.4 design. One of the solution for this is to let aggregates have "negative trans functions" (NTF), and some comments in nodeWindowAgg.c say about it, which current aggregate system doesn't have. But my work doesn't introduce this mechanism because 1) "negative trans function" doesn't do anything in normal aggregate 2) forcing that to everyone who writes his/her own aggregate is quite hard and incompatible to older releases 3) so, we must at least support aggregates that don't have NTF even if it will be introduced in the future That means moving average is initialized again on frame-off situation as the frame moves down. I know that may kill it's performance but reasons above result in my proposing design. If you have better ideas please feel free to tell me, and any comments welcomed. Regards, -- Hitoshi Harada
Hitoshi Harada wrote: > That means moving average is initialized again on frame-off situation > as the frame moves down. I know that may kill it's performance but > reasons above result in my proposing design. Yeah, we need the reinitialization support to handle the generic case. One idea is to take a copy of the state datum after each row. Then, instead of initializing the aggregate from scratch, you can "roll back" to an earlier copied state. It doesn't always help, but might be a part of the solution. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Mon, Nov 09, 2009 at 06:39:54PM +0900, Hitoshi Harada wrote: > I'm not sure if it can be finished until the start of the next CF, > but I've been working on $subject. This work intends to extend > current limited frame types of window functions such like below; This is very, very exciting. Is there a public repository people can check out? In particular, I'm curious about how to handle ROWS vs. RANGE, e.g.: avg(t) OVER (... ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS smooth_five_points vs. avg(t) OVER (... RANGE BETWEEN INTERVAL '2 day' PRECEDING AND INTERVAL '2 day' FOLLOWING) AS five_day_average > - ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING > - ORDER BY x RANGE BETWEEN CURRENT ROW AND f FOLLOWING > - ORDER BY x RANGE BETWEEN p PRECEDING AND f FOLLOWING > > where "p" and "f" are values that indicate preceding/following frame > boundary offsets from current row (or peer). With this feature, you > can calculate something like "moving average" by SQL. > > Frame types that won't be introduced in this work includes: > > - EXCLUDE clause > > The hardest point is that aggregates must be re-initialized as rows > exit from current frame, which doesn't occur in 8.4 design. One of the > solution for this is to let aggregates have "negative trans functions" > (NTF), and some comments in nodeWindowAgg.c say about it, which > current aggregate system doesn't have. But my work doesn't introduce > this mechanism because > > 1) "negative trans function" doesn't do anything in normal aggregate > 2) forcing that to everyone who writes his/her own aggregate is quite > hard and incompatible to older releases > 3) so, we must at least support aggregates that don't have NTF even if > it will be introduced in the future > > That means moving average is initialized again on frame-off > situation as the frame moves down. I know that may kill it's > performance but reasons above result in my proposing design. > > If you have better ideas please feel free to tell me, and any > comments welcomed. First, it's wonderful to hear you're working on this. :) Second, it's tradition on the PostgreSQL project to start with a slow(ish) and correct implementation, then make it faster. NTFs or other speed boosts, while nice to have, would not be needed for a first production implementation. After all, the alternatives without the native ones are usually slow, buggy, unstable, or combinations of all three. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
>>>>> "Heikki" == Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: Heikki> Yeah, we need the reinitialization support to handle the generic case. Heikki> One idea is to take a copy of the state datum after each row. Some existing aggregates, notably array_agg, rely on the assumption that this never happens. -- Andrew (irc:RhodiumToad)
2009/11/9 Andrew Gierth <andrew@tao11.riddles.org.uk>: >>>>>> "Heikki" == Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > > Heikki> Yeah, we need the reinitialization support to handle the generic case. > > Heikki> One idea is to take a copy of the state datum after each row. > > Some existing aggregates, notably array_agg, rely on the assumption > that this never happens. I'm carefully coding to keep the existing behavior that when frame start is UNBOUNDED PRECEDING aggregate state is cached. I don't see how the copied state will help in other frame types, but is it rather than that? We assume the frame starting and ending boundary never goes back. Regards, -- Hitoshi Harada
2009/11/9 David Fetter <david@fetter.org>: > On Mon, Nov 09, 2009 at 06:39:54PM +0900, Hitoshi Harada wrote: >> I'm not sure if it can be finished until the start of the next CF, >> but I've been working on $subject. This work intends to extend >> current limited frame types of window functions such like below; > > This is very, very exciting. Is there a public repository people can > check out? Not so far as always. The step is quite small so I don't believe we need developing repository but I'll create it when needed. > In particular, I'm curious about how to handle ROWS vs. > RANGE, e.g.: > > avg(t) OVER (... > ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS smooth_five_points > > vs. > > avg(t) OVER (... > RANGE BETWEEN > INTERVAL '2 day' PRECEDING AND > INTERVAL '2 day' FOLLOWING) AS five_day_average > I've not finished reading spec completely, but in the first frame starts at exactly 2 rows before current row and ends at exactly 2 rows after current row. The latter is a bit more complicated but it means the frame starts at the beginning of peers whose value in ORDER BY clause is current row value - 2 days and so on. >> - ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING >> - ORDER BY x RANGE BETWEEN CURRENT ROW AND f FOLLOWING >> - ORDER BY x RANGE BETWEEN p PRECEDING AND f FOLLOWING >> >> where "p" and "f" are values that indicate preceding/following frame >> boundary offsets from current row (or peer). With this feature, you >> can calculate something like "moving average" by SQL. >> >> Frame types that won't be introduced in this work includes: >> >> - EXCLUDE clause >> >> The hardest point is that aggregates must be re-initialized as rows >> exit from current frame, which doesn't occur in 8.4 design. One of the >> solution for this is to let aggregates have "negative trans functions" >> (NTF), and some comments in nodeWindowAgg.c say about it, which >> current aggregate system doesn't have. But my work doesn't introduce >> this mechanism because >> >> 1) "negative trans function" doesn't do anything in normal aggregate >> 2) forcing that to everyone who writes his/her own aggregate is quite >> hard and incompatible to older releases >> 3) so, we must at least support aggregates that don't have NTF even if >> it will be introduced in the future >> >> That means moving average is initialized again on frame-off >> situation as the frame moves down. I know that may kill it's >> performance but reasons above result in my proposing design. >> >> If you have better ideas please feel free to tell me, and any >> comments welcomed. > > First, it's wonderful to hear you're working on this. :) Thanks. I hope it will be done until 8.5 release. > > Second, it's tradition on the PostgreSQL project to start with a > slow(ish) and correct implementation, then make it faster. NTFs or > other speed boosts, while nice to have, would not be needed for a > first production implementation. After all, the alternatives without > the native ones are usually slow, buggy, unstable, or combinations of > all three. Yep, NTF is the next stage after we implement various frame types correctly. I should be careful not to go wrong way where NTF cannot be applied in the future. Regards, -- Hitoshi Harada
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > One idea is to take a copy of the state datum after each row. Then, > instead of initializing the aggregate from scratch, you can "roll back" > to an earlier copied state. It doesn't always help, but might be a part > of the solution. That requires that you know how to copy the aggregate's state. You do not. (In some cases the aggregate function has extra state besides the nominal transition datum...) regards, tom lane
On Mon, Nov 09, 2009 at 11:20:39PM +0900, Hitoshi Harada wrote: > 2009/11/9 David Fetter <david@fetter.org>: > > On Mon, Nov 09, 2009 at 06:39:54PM +0900, Hitoshi Harada wrote: > >> I'm not sure if it can be finished until the start of the next CF, > >> but I've been working on $subject. This work intends to extend > >> current limited frame types of window functions such like below; > > > > This is very, very exciting. Is there a public repository people can > > check out? > Not so far as always. The step is quite small so I don't believe we > need developing repository but I'll create it when needed. Thanks :) > > In particular, I'm curious about how to handle ROWS vs. > > RANGE, e.g.: > > > > avg(t) OVER (... > > ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS smooth_five_points > > > > vs. > > > > avg(t) OVER (... > > RANGE BETWEEN > > INTERVAL '2 day' PRECEDING AND > > INTERVAL '2 day' FOLLOWING) AS five_day_average > > I've not finished reading spec completely, but in the first frame > starts at exactly 2 rows before current row and ends at exactly 2 > rows after current row. The latter is a bit more complicated but it > means the frame starts at the beginning of peers whose value in > ORDER BY clause is current row value - 2 days and so on. That's pretty much it. The spec may have some other things to say about corner cases, NULLs, etc. > > First, it's wonderful to hear you're working on this. :) > Thanks. I hope it will be done until 8.5 release. Will you be at the JPUG 10th anniversary? Might code be there in time for that? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
2009/11/10 David Fetter <david@fetter.org>: > On Mon, Nov 09, 2009 at 11:20:39PM +0900, Hitoshi Harada wrote: >> 2009/11/9 David Fetter <david@fetter.org>: >> > On Mon, Nov 09, 2009 at 06:39:54PM +0900, Hitoshi Harada wrote: >> >> I'm not sure if it can be finished until the start of the next CF, >> >> but I've been working on $subject. This work intends to extend >> >> current limited frame types of window functions such like below; >> > >> > This is very, very exciting. Is there a public repository people can >> > check out? >> Not so far as always. The step is quite small so I don't believe we >> need developing repository but I'll create it when needed. > > Thanks :) > >> > In particular, I'm curious about how to handle ROWS vs. >> > RANGE, e.g.: >> > >> > avg(t) OVER (... >> > ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS smooth_five_points >> > >> > vs. >> > >> > avg(t) OVER (... >> > RANGE BETWEEN >> > INTERVAL '2 day' PRECEDING AND >> > INTERVAL '2 day' FOLLOWING) AS five_day_average >> >> I've not finished reading spec completely, but in the first frame >> starts at exactly 2 rows before current row and ends at exactly 2 >> rows after current row. The latter is a bit more complicated but it >> means the frame starts at the beginning of peers whose value in >> ORDER BY clause is current row value - 2 days and so on. > > That's pretty much it. The spec may have some other things to say > about corner cases, NULLs, etc. OK, I'll check it. >> > First, it's wonderful to hear you're working on this. :) >> Thanks. I hope it will be done until 8.5 release. > > Will you be at the JPUG 10th anniversary? Sure, I'll be there for both two days and give lightning talk on Saturday evening (but in Japanese, sorry!) > > Might code be there in time for that? I hope so... -- Hitoshi Harada
On Mon, Nov 9, 2009 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: >> One idea is to take a copy of the state datum after each row. Then, >> instead of initializing the aggregate from scratch, you can "roll back" >> to an earlier copied state. It doesn't always help, but might be a part >> of the solution. > > That requires that you know how to copy the aggregate's state. You do > not. (In some cases the aggregate function has extra state besides the > nominal transition datum...) That's relatively unusual and usually a bad idea, imho. We could add a flag to indicate whether that's the case and only do the optimization if it's not set. It would rarely be set. We already faced this problem with aggregates over windows with range unbounded preceding to the current row. I lost track of the current way of handling the state for those when they're repeatedly finalized. Actually I had assumed the approach would be different. Instead of reinitializing the aggregate I had pictured keeping a rolling window of aggregate states. Ie, if you're doing a rolling average of the last five values you keep five aggregate states which started 1 row ago, 2 rows ago, ... up to 5 rows ago. You finalize the last one and roll the others down using the state transition function. My instinct was that that would be more efficient but working through it in detail it looks to me like it would do precisely the same work and have difficulties if the window doesn't fit in work_mem. So perhaps it's a bad idea, but I wanted to mention it anyways. -- greg
On Mon, Nov 09, 2009 at 10:01:53AM -0500, Tom Lane wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > > One idea is to take a copy of the state datum after each row. Then, > > instead of initializing the aggregate from scratch, you can "roll back" > > to an earlier copied state. It doesn't always help, but might be a part > > of the solution. > > That requires that you know how to copy the aggregate's state. You do > not. (In some cases the aggregate function has extra state besides the > nominal transition datum...) Other ways of doing aggregates/folds involve exploiting a "tree" like structure (mostly for reasons of parallelism, but these don't apply to PG). For example, instead of having the triple (init,accum,final) as we do at the moment, you could have a "merge" function that would take two intermediate states and combine them. For example, COUNT and SUM would be the add function, MIN and MAX would be least and greatest functions respectively, and so on. If this method exists then you can do the fancy stuff as suggested, if it doesn't exist then fall back to less optimal methods. -- Sam http://samason.me.uk/
Greg Stark <gsstark@mit.edu> writes: > On Mon, Nov 9, 2009 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> That requires that you know how to copy the aggregate's state. �You do >> not. �(In some cases the aggregate function has extra state besides the >> nominal transition datum...) > That's relatively unusual and usually a bad idea, imho. We could add a > flag to indicate whether that's the case and only do the optimization > if it's not set. It would rarely be set. We already burned that bridge, unfortunately --- core and contrib both contain examples of aggregates that do things this way, and I'll bet there are third-party examples by now. If you have a large working state for an aggregate, it's just too tempting to prevent nodeAgg.c from copying the darn thing all the time. It's fairly clear that efficient support for this will require extra help from the aggregate author. Ideally we would define it so that the author could implement it either with negative transitions or with multiple copies of the agg state, whichever seems most efficient. The cases that are hard are exactly where the agg state is large, so I doubt it's a good idea to legislate that multiple copies is how to do it. In any case, I don't think we can put retroactive restrictions on what the aggregates do in the normal case, and we can definitely not assume that existing aggregates will politely tell us what they are doing. regards, tom lane