Thread: Department of Redundancy Department: makeNode(FuncCall) division
Folks, Per suggestions and lots of help from Andrew Gierth, please find attached a patch to clean up the call sites for FuncCall nodes, which I'd like to expand centrally rather than in each of the 37 (or 38, but I only redid 37) places where it's called. The remaining one is in src/backend/nodes/copyfuncs.c, which has to be modified for any changes in the that struct anyhow. The immediate purpose is two-fold: to reduce some redundancies, which I believe is worth doing in and of itself, and to prepare for adding FILTER on aggregates from the spec, and possibly other things in the <aggregate function> part of the spec. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
David Fetter <david@fetter.org> writes: > Per suggestions and lots of help from Andrew Gierth, please find > attached a patch to clean up the call sites for FuncCall nodes, which > I'd like to expand centrally rather than in each of the 37 (or 38, but > I only redid 37) places where it's called. The remaining one is in > src/backend/nodes/copyfuncs.c, which has to be modified for any > changes in the that struct anyhow. TBH, I don't think this is an improvement. The problem with adding a new field to any struct is that you have to run around and examine (and, usually, modify) every place that manufactures that type of struct. With a makeFuncCall defined like this, you'd still have to do that; it would just become a lot easier to forget to do so. If the subroutine were defined like most other makeXXX subroutines, ie you have to supply *all* the fields, that argument would go away, but the notational advantage is then dubious. The bigger-picture point is that you're proposing to make the coding conventions for building FuncCalls different from what they are for any other grammar node. I don't think that's a great idea; it will mostly foster confusion. regards, tom lane
On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > Per suggestions and lots of help from Andrew Gierth, please find > > attached a patch to clean up the call sites for FuncCall nodes, which > > I'd like to expand centrally rather than in each of the 37 (or 38, but > > I only redid 37) places where it's called. The remaining one is in > > src/backend/nodes/copyfuncs.c, which has to be modified for any > > changes in the that struct anyhow. > > TBH, I don't think this is an improvement. > > The problem with adding a new field to any struct is that you have to > run around and examine (and, usually, modify) every place that > manufactures that type of struct. With a makeFuncCall defined like > this, you'd still have to do that; it would just become a lot easier > to forget to do so. So you're saying that I've insufficiently put the choke point there? It seems to me that needing to go back to each and every jot and tittle of the code to add a new default parameter in each place is a recipe for mishaps, where in this one, the new correct defaults will just appear in all the places they need to. > If the subroutine were defined like most other makeXXX subroutines, > ie you have to supply *all* the fields, that argument would go away, > but the notational advantage is then dubious. > > The bigger-picture point is that you're proposing to make the coding > conventions for building FuncCalls different from what they are for > any other grammar node. I don't think that's a great idea; it will > mostly foster confusion. I find this a good bit less confusing than the litany of repeated parameter settings, the vast majority of which in most cases are along the lines of NULL/NIL/etc. This way, anything set that's not the default stands out. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > Per suggestions and lots of help from Andrew Gierth, please find > > attached a patch to clean up the call sites for FuncCall nodes, which > > I'd like to expand centrally rather than in each of the 37 (or 38, but > > I only redid 37) places where it's called. The remaining one is in > > src/backend/nodes/copyfuncs.c, which has to be modified for any > > changes in the that struct anyhow. > > TBH, I don't think this is an improvement. > > The problem with adding a new field to any struct is that you have to > run around and examine (and, usually, modify) every place that > manufactures that type of struct. With a makeFuncCall defined like > this, you'd still have to do that; it would just become a lot easier > to forget to do so. > > If the subroutine were defined like most other makeXXX subroutines, > ie you have to supply *all* the fields, that argument would go away, > but the notational advantage is then dubious. > > The bigger-picture point is that you're proposing to make the coding > conventions for building FuncCalls different from what they are for > any other grammar node. I don't think that's a great idea; it will > mostly foster confusion. The major difference between FuncCalls and others is that `while most raw-parsetree nodes are constructed only in their own syntax productions, FuncCall is constructed in many places unrelated to actual function call syntax. This really will make things a good bit easier on our successors. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote: > Folks, > > Per suggestions and lots of help from Andrew Gierth, please find > attached a patch to clean up the call sites for FuncCall nodes, which > I'd like to expand centrally rather than in each of the 37 (or 38, but > I only redid 37) places where it's called. The remaining one is in > src/backend/nodes/copyfuncs.c, which has to be modified for any > changes in the that struct anyhow. > > The immediate purpose is two-fold: to reduce some redundancies, which > I believe is worth doing in and of itself, and to prepare for adding > FILTER on aggregates from the spec, and possibly other things in > the <aggregate function> part of the spec. > > Cheers, > David. Folks, Please find attached two versions of a patch which provides optional FILTER clause for aggregates (T612, "Advanced OLAP operations"). The first is intended to be applied on top of the previous patch, the second without it. The first is, I believe, clearer in what it's doing. Rather than simply mechanically visiting every place a function call might be constructed, it visits a central one to change the default, then goes only to the places where it's relevant. The patches are both early WIP as they contain no docs or regression tests yet. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Wed, Feb 13, 2013 at 06:45:31AM -0800, David Fetter wrote: > On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote: > > Folks, > > > > Per suggestions and lots of help from Andrew Gierth, please find > > attached a patch to clean up the call sites for FuncCall nodes, which > > I'd like to expand centrally rather than in each of the 37 (or 38, but > > I only redid 37) places where it's called. The remaining one is in > > src/backend/nodes/copyfuncs.c, which has to be modified for any > > changes in the that struct anyhow. > > > > The immediate purpose is two-fold: to reduce some redundancies, which > > I believe is worth doing in and of itself, and to prepare for adding > > FILTER on aggregates from the spec, and possibly other things in > > the <aggregate function> part of the spec. > > > > Cheers, > > David. > > Folks, > > Please find attached two versions of a patch which provides optional > FILTER clause for aggregates (T612, "Advanced OLAP operations"). > > The first is intended to be applied on top of the previous patch, the > second without it. I'll find a brown paper back to wear over my head at some point, but meanwhile, here's a cleaned-up version of the patch that doesn't use makeFuncArgs, now without merge artifacts and with the ability to actually compile. It's still WIP in the sense previously mentioned. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Wed, Feb 13, 2013 at 06:45:31AM -0800, David Fetter wrote: > On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote: > > Folks, > > > > Per suggestions and lots of help from Andrew Gierth, please find > > attached a patch to clean up the call sites for FuncCall nodes, which > > I'd like to expand centrally rather than in each of the 37 (or 38, but > > I only redid 37) places where it's called. The remaining one is in > > src/backend/nodes/copyfuncs.c, which has to be modified for any > > changes in the that struct anyhow. > > > > The immediate purpose is two-fold: to reduce some redundancies, which > > I believe is worth doing in and of itself, and to prepare for adding > > FILTER on aggregates from the spec, and possibly other things in > > the <aggregate function> part of the spec. > > > > Cheers, > > David. > > Folks, > > Please find attached two versions of a patch which provides optional > FILTER clause for aggregates (T612, "Advanced OLAP operations"). > > The first is intended to be applied on top of the previous patch, the > second without it. The first is, I believe, clearer in what it's > doing. Rather than simply mechanically visiting every place a > function call might be constructed, it visits a central one to change > the default, then goes only to the places where it's relevant. > > The patches are both early WIP as they contain no docs or regression > tests yet. Docs and regression tests added, makeFuncArgs approached dropped for now, will re-visit later. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Tue, Feb 26, 2013 at 01:09:30PM -0800, David Fetter wrote: > On Wed, Feb 13, 2013 at 06:45:31AM -0800, David Fetter wrote: > > On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote: > > > Folks, > > > > > > Per suggestions and lots of help from Andrew Gierth, please find > > > attached a patch to clean up the call sites for FuncCall nodes, which > > > I'd like to expand centrally rather than in each of the 37 (or 38, but > > > I only redid 37) places where it's called. The remaining one is in > > > src/backend/nodes/copyfuncs.c, which has to be modified for any > > > changes in the that struct anyhow. > > > > > > The immediate purpose is two-fold: to reduce some redundancies, which > > > I believe is worth doing in and of itself, and to prepare for adding > > > FILTER on aggregates from the spec, and possibly other things in > > > the <aggregate function> part of the spec. > > > > > > Cheers, > > > David. > > > > Folks, > > > > Please find attached two versions of a patch which provides optional > > FILTER clause for aggregates (T612, "Advanced OLAP operations"). > > > > The first is intended to be applied on top of the previous patch, the > > second without it. The first is, I believe, clearer in what it's > > doing. Rather than simply mechanically visiting every place a > > function call might be constructed, it visits a central one to change > > the default, then goes only to the places where it's relevant. > > > > The patches are both early WIP as they contain no docs or regression > > tests yet. > > Docs and regression tests added, makeFuncArgs approached dropped for > now, will re-visit later. Regression tests added to reflect bug fixes in COLLATE. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Sun, Apr 28, 2013 at 01:29:41PM -0700, David Fetter wrote: > On Tue, Feb 26, 2013 at 01:09:30PM -0800, David Fetter wrote: > > On Wed, Feb 13, 2013 at 06:45:31AM -0800, David Fetter wrote: > > > On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote: > > > > Folks, > > > > > > > > Per suggestions and lots of help from Andrew Gierth, please find > > > > attached a patch to clean up the call sites for FuncCall nodes, which > > > > I'd like to expand centrally rather than in each of the 37 (or 38, but > > > > I only redid 37) places where it's called. The remaining one is in > > > > src/backend/nodes/copyfuncs.c, which has to be modified for any > > > > changes in the that struct anyhow. > > > > > > > > The immediate purpose is two-fold: to reduce some redundancies, which > > > > I believe is worth doing in and of itself, and to prepare for adding > > > > FILTER on aggregates from the spec, and possibly other things in > > > > the <aggregate function> part of the spec. > > > > > > > > Cheers, > > > > David. > > > > > > Folks, > > > > > > Please find attached two versions of a patch which provides optional > > > FILTER clause for aggregates (T612, "Advanced OLAP operations"). > > > > > > The first is intended to be applied on top of the previous patch, the > > > second without it. The first is, I believe, clearer in what it's > > > doing. Rather than simply mechanically visiting every place a > > > function call might be constructed, it visits a central one to change > > > the default, then goes only to the places where it's relevant. > > > > > > The patches are both early WIP as they contain no docs or regression > > > tests yet. > > > > Docs and regression tests added, makeFuncArgs approached dropped for > > now, will re-visit later. > > Regression tests added to reflect bug fixes in COLLATE. Rebased vs. master. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
On Mon, Feb 11, 2013 at 10:48:38AM -0800, David Fetter wrote: > On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote: > > David Fetter <david@fetter.org> writes: > > > Per suggestions and lots of help from Andrew Gierth, please find > > > attached a patch to clean up the call sites for FuncCall nodes, which > > > I'd like to expand centrally rather than in each of the 37 (or 38, but > > > I only redid 37) places where it's called. The remaining one is in > > > src/backend/nodes/copyfuncs.c, which has to be modified for any > > > changes in the that struct anyhow. > > > > TBH, I don't think this is an improvement. > > > > The problem with adding a new field to any struct is that you have to > > run around and examine (and, usually, modify) every place that > > manufactures that type of struct. With a makeFuncCall defined like > > this, you'd still have to do that; it would just become a lot easier > > to forget to do so. > > > > If the subroutine were defined like most other makeXXX subroutines, > > ie you have to supply *all* the fields, that argument would go away, > > but the notational advantage is then dubious. > > > > The bigger-picture point is that you're proposing to make the coding > > conventions for building FuncCalls different from what they are for > > any other grammar node. I don't think that's a great idea; it will > > mostly foster confusion. > > The major difference between FuncCalls and others is that `while most > raw-parsetree nodes are constructed only in their own syntax > productions, FuncCall is constructed in many places unrelated to > actual function call syntax. > > This really will make things a good bit easier on our successors. Here's a rebased patch with comments illustrating how best to employ. In my previous message, I characterized the difference between FuncCalls and other raw-parsetree nodes. Is there some flaw in that logic? If there isn't, is there some reason not to treat them in a less redundant, more uniform manner as this patch does? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
* David Fetter (david@fetter.org) wrote: > On Mon, Feb 11, 2013 at 10:48:38AM -0800, David Fetter wrote: > > On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote: > > > David Fetter <david@fetter.org> writes: > > > > Per suggestions and lots of help from Andrew Gierth, please find > > > > attached a patch to clean up the call sites for FuncCall nodes, which > > > > I'd like to expand centrally rather than in each of the 37 (or 38, but > > > > I only redid 37) places where it's called. The remaining one is in > > > > src/backend/nodes/copyfuncs.c, which has to be modified for any > > > > changes in the that struct anyhow. > > > > > > TBH, I don't think this is an improvement. > > > > > > The problem with adding a new field to any struct is that you have to > > > run around and examine (and, usually, modify) every place that > > > manufactures that type of struct. With a makeFuncCall defined like > > > this, you'd still have to do that; it would just become a lot easier > > > to forget to do so. I don't really see how finding all callers of makeFuncCall is particularly harder than finding the callers of makeNode(Func). If there were cases where we still wanted to use makeNode(Func), perhaps that would be annoying since you'd have to look for both- but, iiuc, this patch changes all of the callers to use makeFuncCall and it seems reasonable for all callers to do so in the future as well. It looks to me like the advantage of this patch is exactly that you *don't* have to run around and modify things which are completely unrelated to the new feature being added (eg: FILTER). Add the new field, set up the default/no-op case in makeFuncCall() and then only change those places that actually need to worry about your new field. > > > If the subroutine were defined like most other makeXXX subroutines, > > > ie you have to supply *all* the fields, that argument would go away, > > > but the notational advantage is then dubious. Having to supply all the fields certainly wouldn't make things any better. Providing the base set of fields which are required to be set for any FuncCall node does make sense though, which is what the patch does. The rest of the fields are all special cases for which a default value works perfectly fine (when the field isn't involved in the specific case being handled). > > > The bigger-picture point is that you're proposing to make the coding > > > conventions for building FuncCalls different from what they are for > > > any other grammar node. I don't think that's a great idea; it will > > > mostly foster confusion. > > > > The major difference between FuncCalls and others is that `while most > > raw-parsetree nodes are constructed only in their own syntax > > productions, FuncCall is constructed in many places unrelated to > > actual function call syntax. Yeah, FuncCall's are already rather special and they're built all over the place. That's my 2c on it anyhow. I don't see it as some kind of major milestone but it looks like improvement to me and likely to make things a bit easier on patch authors and reviewers who otherwise have to ponder a bunch of repeated 'x->q = false;' statements in areas which are completely unrelated to the new feature itself. Thanks, Stephen
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 17 June 2013 06:36, David Fetter <david@fetter.org> wrote: >> > > Please find attached two versions of a patch which provides optional >> > > FILTER clause for aggregates (T612, "Advanced OLAP operations"). >> > > >> > > The first is intended to be applied on top of the previous patch, the >> > > second without it. The first is, I believe, clearer in what it's >> > > doing. Rather than simply mechanically visiting every place a >> > > function call might be constructed, it visits a central one to change >> > > the default, then goes only to the places where it's relevant. >> > > >> > > The patches are both early WIP as they contain no docs or regression >> > > tests yet. >> > >> > Docs and regression tests added, makeFuncArgs approached dropped for >> > now, will re-visit later. >> >> Regression tests added to reflect bug fixes in COLLATE. > > Rebased vs. master. > Hi, I've been looking at this patch, which adds support for the SQL standard feature of applying a filter to rows used in an aggregate. The syntax matches the spec: agg_fn ( <args> [ ORDER BY <sort_clause> ] ) [ FILTER ( WHERE <qual> ) ] and this patch makes FILTER a new reserved keyword, usable as a function or type, but not in other contexts, e.g., as a table name or alias. I'm not sure if that might be a problem for some people, but I can't see any way round it, since otherwise there would be no way for the parser to distinguish a filter clause from an alias expression. The feature appears to work per-spec, and the code and doc changes look reasonable. However, it looks a little light on regression tests, and so I think some necessary changes have been overlooked. In my testing of sub-queries in the FILTER clause (an extension to the spec), I was able to produce the following error: CREATE TABLE t1(a1 int); CREATE TABLE t2(a2 int); INSERT INTO t1 SELECT * FROM generate_series(1,10); INSERT INTO t2 SELECT * FROM generate_series(3,6); SELECT array_agg(a1) FILTER (WHERE a1 IN (SELECT a2 FROM t2)) FROM t1; ERROR: plan should not reference subplan's variable which looks to be related to subselect.c's handling of sub-queries in aggregates. Regards, Dean
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: > On 17 June 2013 06:36, David Fetter <david@fetter.org> wrote: > >> > > Please find attached two versions of a patch which provides optional > >> > > FILTER clause for aggregates (T612, "Advanced OLAP operations"). > >> > > > >> > > The first is intended to be applied on top of the previous patch, the > >> > > second without it. The first is, I believe, clearer in what it's > >> > > doing. Rather than simply mechanically visiting every place a > >> > > function call might be constructed, it visits a central one to change > >> > > the default, then goes only to the places where it's relevant. > >> > > > >> > > The patches are both early WIP as they contain no docs or regression > >> > > tests yet. > >> > > >> > Docs and regression tests added, makeFuncArgs approached dropped for > >> > now, will re-visit later. > >> > >> Regression tests added to reflect bug fixes in COLLATE. > > > > Rebased vs. master. > > Hi, > > I've been looking at this patch, which adds support for the SQL > standard feature of applying a filter to rows used in an aggregate. > The syntax matches the spec: > > agg_fn ( <args> [ ORDER BY <sort_clause> ] ) [ FILTER ( WHERE <qual> ) ] > > and this patch makes FILTER a new reserved keyword, usable as a > function or type, but not in other contexts, e.g., as a table name or > alias. > > I'm not sure if that might be a problem for some people, but I can't > see any way round it, since otherwise there would be no way for the > parser to distinguish a filter clause from an alias expression. > > The feature appears to work per-spec, and the code and doc changes > look reasonable. However, it looks a little light on regression tests, What tests do you think should be there that aren't? > and so I think some necessary changes have been overlooked. > > In my testing of sub-queries in the FILTER clause (an extension to the > spec), I was able to produce the following error: Per the spec, B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference. I'd be happy to see about adding one despite this, though. That they didn't figure out how doesn't mean we shouldn't eventually, ideally in time for 9.4. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Alvaro Herrera
Date:
David Fetter escribió: > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: > > In my testing of sub-queries in the FILTER clause (an extension to the > > spec), I was able to produce the following error: > > Per the spec, > > B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference. If this is not allowed, I think there should be a clearer error message. What Dean showed is an internal (not user-facing) error message. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote: > David Fetter escribió: > > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: > > > > In my testing of sub-queries in the FILTER clause (an extension > > > to the spec), I was able to produce the following error: > > > > Per the spec, > > > > B) A <filter clause> shall not contain a <query expression>, a > > <window function>, or an outer reference. > > If this is not allowed, I think there should be a clearer error > message. What Dean showed is an internal (not user-facing) error > message. Folding to lower isn't allowed by the spec either, and then there's the matter of arrays... Before going to lots of trouble to figure out how to throw an error that says, "only the spec and no further," I'd like to investigate how to make this work. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote: > David Fetter escribió: > > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: > > > > In my testing of sub-queries in the FILTER clause (an extension to the > > > spec), I was able to produce the following error: > > > > Per the spec, > > > > B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference. > > If this is not allowed, I think there should be a clearer error message. > What Dean showed is an internal (not user-facing) error message. Please find attached a patch which allows subqueries in the FILTER clause and adds regression testing for same. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 21 June 2013 05:01, David Fetter <david@fetter.org> wrote: > What tests do you think should be there that aren't? > I think I expected to see more tests related to some of the specific code changes, such as CREATE TABLE t AS SELECT * FROM generate_series(1,10) t(x); -- Should fail (filter can't be used for non-aggregates) SELECT abs(x) FILTER (WHERE x > 5) FROM t; -- Should fail (filter clause can't contain aggregates) SELECT array_agg(x) FILTER (WHERE x > AVG(x)) t; -- OK (aggregate in filter sub-query) SELECT array_agg(x) FILTER (WHERE x > (SELECT AVG(x) FROM t)) FROM t; -- OK (trivial sub-query) SELECT array_agg(x) FILTER (WHERE (SELECT x > 5)) FROM t; -- OK SELECT array_agg(x) FILTER (WHERE (SELECT x > (SELECT AVG(x) FROM t))) FROM t; -- OK SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT t.x > AVG(t2.x) FROM t as t2))) FROM t; -- Should fail SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT 5 > AVG(t.x) FROM t as t2))) FROM t; -- OK (filtered aggregate in window context) SELECT x, AVG(x) OVER(ORDER BY x), AVG(x) FILTER (WHERE x > 5) OVER(ORDER BY x) FROM t; -- Should fail (non-aggregate window function with filter) SELECT x, rank() OVER(ORDER BY x), rank() FILTER (WHERE x > 5) OVER(ORDER BY x) FROM t; -- View definition test CREATE VIEW v AS SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT t.x > AVG(t2.x) FROM t as t2))) FROM t; \d+ v etc... You could probably dream up better examples --- I just felt that the current tests don't give much coverage of the code changes. Regards, Dean
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 21 June 2013 06:16, David Fetter <david@fetter.org> wrote: > On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote: >> David Fetter escribió: >> > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: >> >> > > In my testing of sub-queries in the FILTER clause (an extension to the >> > > spec), I was able to produce the following error: >> > >> > Per the spec, >> > >> > B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference. >> >> If this is not allowed, I think there should be a clearer error message. >> What Dean showed is an internal (not user-facing) error message. > > Please find attached a patch which allows subqueries in the FILTER > clause and adds regression testing for same. > Nice! I should have pointed out that it was already working for some sub-queries, just not all. It's good to see that it was basically just a one-line fix, because I think it would have been a shame to not support sub-queries. I hope to take another look at it over the weekend. Regards, Dean
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 21 June 2013 10:02, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > On 21 June 2013 06:16, David Fetter <david@fetter.org> wrote: >> On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote: >>> David Fetter escribió: >>> > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote: >>> >>> > > In my testing of sub-queries in the FILTER clause (an extension to the >>> > > spec), I was able to produce the following error: >>> > >>> > Per the spec, >>> > >>> > B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference. >>> >>> If this is not allowed, I think there should be a clearer error message. >>> What Dean showed is an internal (not user-facing) error message. >> >> Please find attached a patch which allows subqueries in the FILTER >> clause and adds regression testing for same. >> > > Nice! > > I should have pointed out that it was already working for some > sub-queries, just not all. It's good to see that it was basically just > a one-line fix, because I think it would have been a shame to not > support sub-queries. > > I hope to take another look at it over the weekend. > I'm still not happy that this patch is making FILTER a new reserved keyword, because I think it is a common enough English word (and an obscure enough SQL keyword) that people may well have used it for table names or aliases, and so their code will break. There is another thread discussing a similar issue with lead/lag ignore nulls: http://www.postgresql.org/message-id/CAOdE5WQnfR737OkxNXuLWnwpL7=OUssC-63ijP2=2RnRTw8emQ@mail.gmail.com Playing around with the idea proposed there, it seems that it is possible to make FILTER (and also OVER) unreserved keywords, by splitting out the production of aggregate/window functions from normal functions in gram.y. Aggregate/window functions are not allowed in index_elem or func_table constructs, but they may appear in c_expr's. That resolves the shift/reduce conflicts that would otherwise occur from subsequent alias clauses, allowing FILTER and OVER to be unreserved. There is a drawback --- certain error messages become less specific, for example: "SELECT * FROM rank() OVER(ORDER BY random());" is now a syntax error, rather than the more useful error saying that window functions aren't allowed in the FROM clause. That doesn't seem like such a common case though. What do you think? Regards, Dean
Attachment
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Kevin Grittner
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > I'm still not happy that this patch is making FILTER a new reserved > keyword, because I think it is a common enough English word (and an > obscure enough SQL keyword) that people may well have used it for > table names or aliases, and so their code will break. Well, if it is a useful capability with a standard syntax, I think we should go with the standard syntax even if it might break application code that uses, as unquoted identifiers, words reserved by the SQL standard. Of course, non-reserved is better than reserved if we can manage it. > Playing around with the idea proposed there, it seems that it is > possible to make FILTER (and also OVER) unreserved keywords, by > splitting out the production of aggregate/window functions from normal > functions in gram.y. Aggregate/window functions are not allowed in > index_elem or func_table constructs, but they may appear in c_expr's. > That resolves the shift/reduce conflicts that would otherwise occur > from subsequent alias clauses, allowing FILTER and OVER to be > unreserved. > > There is a drawback --- certain error messages become less specific, > for example: "SELECT * FROM rank() OVER(ORDER BY random());" is now a > syntax error, rather than the more useful error saying that window > functions aren't allowed in the FROM clause. That doesn't seem like > such a common case though. > > What do you think? I think it is OK if that gets a syntax error. If that's the "worst case" I like this approach. This also helps keep down the size of the generated parse tables, doesn't it? -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Sun, Jun 23, 2013 at 07:44:26AM -0700, Kevin Grittner wrote: > Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > > I'm still not happy that this patch is making FILTER a new reserved > > keyword, because I think it is a common enough English word (and an > > obscure enough SQL keyword) that people may well have used it for > > table names or aliases, and so their code will break. > > Well, if it is a useful capability with a standard syntax, I think > we should go with the standard syntax even if it might break > application code that uses, as unquoted identifiers, words reserved > by the SQL standard. Of course, non-reserved is better than > reserved if we can manage it. I'm not entirely sure that I agree with this. The SQL standard doesn't go adding keywords willy-nilly, or at least hasn't over a fairly long stretch of time. I can get precise numbers on this if needed. So far, only Tom and Greg have weighed in, Greg's response being here: http://www.postgresql.org/message-id/CAM-w4HOd3N_ozMygs==Lm5+hU8yQKKaYutGjiNp6z2hAzDrSTA@mail.gmail.com > > Playing around with the idea proposed there, it seems that it is > > possible to make FILTER (and also OVER) unreserved keywords, by > > splitting out the production of aggregate/window functions from normal > > functions in gram.y. Aggregate/window functions are not allowed in > > index_elem or func_table constructs, but they may appear in c_expr's. > > That resolves the shift/reduce conflicts that would otherwise occur > > from subsequent alias clauses, allowing FILTER and OVER to be > > unreserved. > > > > There is a drawback --- certain error messages become less specific, > > for example: "SELECT * FROM rank() OVER(ORDER BY random());" is now a > > syntax error, rather than the more useful error saying that window > > functions aren't allowed in the FROM clause. That doesn't seem like > > such a common case though. > > > > What do you think? > > I think it is OK if that gets a syntax error. If that's the "worst > case" I like this approach. I think reducing the usefulness of error messages is something we need to think extremely hard about before we do. Is there maybe a way to keep the error messages even if by some magic we manage to unreserve the words? > This also helps keep down the size of the generated parse tables, > doesn't it? Could well. I suspect we may need to rethink the whole way we do grammar at some point, but that's for a later discussion when I (or someone else) has something choate to say about it. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Tom Lane
Date:
David Fetter <david@fetter.org> writes: > On Sun, Jun 23, 2013 at 07:44:26AM -0700, Kevin Grittner wrote: >> I think it is OK if that gets a syntax error.� If that's the "worst >> case" I like this approach. > I think reducing the usefulness of error messages is something we need > to think extremely hard about before we do. Is there maybe a way to > keep the error messages even if by some magic we manage to unreserve > the words? Of the alternatives discussed so far, I don't really like anything better than adding another special case to base_yylex(). Robert opined in the other thread about RESPECT NULLS that the potential failure cases with that approach are harder to reason about, which is true, but that doesn't mean that we should accept failures we know of because there might be failures we don't know of. One thing that struck me while thinking about this is that it seems like we've been going about it wrong in base_yylex() in any case. For example, because we fold WITH followed by TIME into a special token WITH_TIME, we will fail on a statement beginning WITH time AS ... even though "time" is a legal ColId. But suppose that instead of merging the two tokens into one, we just changed the first token into something special; that is, base_yylex would return a special token WITH_FOLLOWED_BY_TIME and then TIME. We could then fix the above problem by allowing either WITH or WITH_FOLLOWED_BY_TIME as the leading keyword of a statement; and similarly for the few other places where WITH can be followed by an arbitrary identifier. Going on the same principle, we could probably let FILTER be an unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a type_func_name_keyword. (I've not tried this though.) This idea doesn't help much for OVER because one of the alternatives for over_clause is "OVER ColId", and I doubt we want to have base_yylex know all the alternatives for ColId. I also had no great success with the NULLS FIRST/LAST case: AFAICT the substitute token for NULLS still has to be fully reserved, meaning that something like "select nulls last" still doesn't work without quoting. We could maybe fix that with enough denormalization of the index_elem productions, but it'd be ugly. > Could well. I suspect we may need to rethink the whole way we do > grammar at some point, but that's for a later discussion when I (or > someone else) has something choate to say about it. It'd sure be interesting to know what the SQL committee's target parsing algorithm is. I find it hard to believe they're uninformed enough to not know that these random syntaxes they keep inventing are hard to deal with in LALR(1). Or maybe they really don't give a damn about breaking applications every time they invent a new reserved word? regards, tom lane
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Greg Stark
Date:
On Mon, Jun 24, 2013 at 3:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Or maybe they really don't give a damn about breaking > applications every time they invent a new reserved word? I think this is the obvious conclusion. In the standard the reserved words are pretty explicitly reserved and not legal column names, no? I think their model is that applications work with a certain version of SQL and they're not expected to work with a new version without extensive updating. -- greg
Re: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Tom Lane
Date:
Greg Stark <stark@mit.edu> writes: > I think their model is that applications work with a certain version > of SQL and they're not expected to work with a new version without > extensive updating. Hm. We could invent a "sql_version" parameter and tweak the lexer to return keywords added in spec versions later than that as IDENT. However, I fear such a parameter would be a major PITA from the user's standpoint, just like most of our backwards-compatibility GUCs have proven to be. regards, tom lane
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Kevin Grittner
Date:
Greg Stark <stark@mit.edu> wrote: > On Mon, Jun 24, 2013 at 3:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Or maybe they really don't give a damn about breaking >> applications every time they invent a new reserved word? > > I think this is the obvious conclusion. In the standard the reserved > words are pretty explicitly reserved and not legal column names, no? > > I think their model is that applications work with a certain version > of SQL and they're not expected to work with a new version without > extensive updating. IMO it is insanity to write an application of any significant complexity without a data abstraction layer sitting on a data access layer, and it is silly to write such layers which are intended to interface to SQL in a portable way without quoting *all* identifiers sent to the server. As a developer, new reserved words never bothered me in the slightest. At Wisconsin Courts the most heavily used table has been called "Case" since 1989, and the table to hold a row for each paper document printed to pay someone is called "Check". No need to change the names just because SQL started using those words for new language features after we had the tables. And there is no reason to assume that any particular word won't become reserved in the future. I think the most likely explanation is not that they don't mind breaking applications, but that they don't understand that there are significant numbers of people who choose to write code in a fashion which leaves them vulnerable to breakage when new reserved words are added. Being closer to the wide variety of users we know that there are many such people out there, and we try to look after them as best we can; which is entirely appropriate. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi David,
I hope this is the latest patch to review, right ?On Mon, Jun 17, 2013 at 11:13 AM, David Fetter <david@fetter.org> wrote:
Here's a rebased patch with comments illustrating how best to employ.On Mon, Feb 11, 2013 at 10:48:38AM -0800, David Fetter wrote:
> On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote:
> > David Fetter <david@fetter.org> writes:
> > > Per suggestions and lots of help from Andrew Gierth, please find
> > > attached a patch to clean up the call sites for FuncCall nodes, which
> > > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > > I only redid 37) places where it's called. The remaining one is in
> > > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > > changes in the that struct anyhow.
> >
> > TBH, I don't think this is an improvement.
> >
> > The problem with adding a new field to any struct is that you have to
> > run around and examine (and, usually, modify) every place that
> > manufactures that type of struct. With a makeFuncCall defined like
> > this, you'd still have to do that; it would just become a lot easier
> > to forget to do so.
> >
> > If the subroutine were defined like most other makeXXX subroutines,
> > ie you have to supply *all* the fields, that argument would go away,
> > but the notational advantage is then dubious.
> >
> > The bigger-picture point is that you're proposing to make the coding
> > conventions for building FuncCalls different from what they are for
> > any other grammar node. I don't think that's a great idea; it will
> > mostly foster confusion.
>
> The major difference between FuncCalls and others is that `while most
> raw-parsetree nodes are constructed only in their own syntax
> productions, FuncCall is constructed in many places unrelated to
> actual function call syntax.
>
> This really will make things a good bit easier on our successors.
In my previous message, I characterized the difference between
FuncCalls and other raw-parsetree nodes. Is there some flaw in that
logic? If there isn't, is there some reason not to treat them in a
less redundant, more uniform manner as this patch does?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
On Tue, Jun 25, 2013 at 11:11 AM, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
Will post my review soon.I have gone through the discussion on this thread and I agree with Stephen Frost that it don't add much improvements as such but definitely it is going to be easy for contributors in this area as they don't need to go all over to add any extra parameter they need to add. This patch simplifies it well enough.I am going to review it.Hi David,I hope this is the latest patch to review, right ?
Assuming makeFuncArgs_002.diff is the latest patch, here are my review points.
About this patch and feature:
===
This patch tries to reduce redundancy when we need FuncCall expression. With
this patch it will become easier to add new parameter if needed. We just need
to put it's default value at centralized location (I mean in this new function)
so that all other places need not require and changes. And this new parameter
is handled by the new feature who demanded it keep other untouched.
Review comments on patch:
===
1. Can't apply with "git apply" command but able to do it with patch -p1. So no
issues
2. Adds one whitespace error, hopefully it will get corrected once it goes
through pgindent.
3. There is absolutely NO user visibility and thus I guess we don't need any
test case. Also no documentation are needed.
4. Also make check went smooth and thus assumes that there is no issue as such.
Even I couldn't find any issue with my testing other than regression suite.
5. I had a code walk-through over patch and it looks good to me. However,
following line change seems unrelated (Line 186 in your patch)
! $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, (Node *) n, @2);
!
Changes required from author:
===
It will be good if you remove unrelated changes from the patch and possibly all
white-space errors.
Thanks
About this patch and feature:
===
This patch tries to reduce redundancy when we need FuncCall expression. With
this patch it will become easier to add new parameter if needed. We just need
to put it's default value at centralized location (I mean in this new function)
so that all other places need not require and changes. And this new parameter
is handled by the new feature who demanded it keep other untouched.
Review comments on patch:
===
1. Can't apply with "git apply" command but able to do it with patch -p1. So no
issues
2. Adds one whitespace error, hopefully it will get corrected once it goes
through pgindent.
3. There is absolutely NO user visibility and thus I guess we don't need any
test case. Also no documentation are needed.
4. Also make check went smooth and thus assumes that there is no issue as such.
Even I couldn't find any issue with my testing other than regression suite.
5. I had a code walk-through over patch and it looks good to me. However,
following line change seems unrelated (Line 186 in your patch)
! $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, (Node *) n, @2);
!
Changes required from author:
===
It will be good if you remove unrelated changes from the patch and possibly all
white-space errors.
Thanks
Thanks.On Mon, Jun 17, 2013 at 11:13 AM, David Fetter <david@fetter.org> wrote:Here's a rebased patch with comments illustrating how best to employ.On Mon, Feb 11, 2013 at 10:48:38AM -0800, David Fetter wrote:
> On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote:
> > David Fetter <david@fetter.org> writes:
> > > Per suggestions and lots of help from Andrew Gierth, please find
> > > attached a patch to clean up the call sites for FuncCall nodes, which
> > > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > > I only redid 37) places where it's called. The remaining one is in
> > > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > > changes in the that struct anyhow.
> >
> > TBH, I don't think this is an improvement.
> >
> > The problem with adding a new field to any struct is that you have to
> > run around and examine (and, usually, modify) every place that
> > manufactures that type of struct. With a makeFuncCall defined like
> > this, you'd still have to do that; it would just become a lot easier
> > to forget to do so.
> >
> > If the subroutine were defined like most other makeXXX subroutines,
> > ie you have to supply *all* the fields, that argument would go away,
> > but the notational advantage is then dubious.
> >
> > The bigger-picture point is that you're proposing to make the coding
> > conventions for building FuncCalls different from what they are for
> > any other grammar node. I don't think that's a great idea; it will
> > mostly foster confusion.
>
> The major difference between FuncCalls and others is that `while most
> raw-parsetree nodes are constructed only in their own syntax
> productions, FuncCall is constructed in many places unrelated to
> actual function call syntax.
>
> This really will make things a good bit easier on our successors.
In my previous message, I characterized the difference between
FuncCalls and other raw-parsetree nodes. Is there some flaw in that
logic? If there isn't, is there some reason not to treat them in a
less redundant, more uniform manner as this patch does?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 24 June 2013 03:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: > David Fetter <david@fetter.org> writes: >> On Sun, Jun 23, 2013 at 07:44:26AM -0700, Kevin Grittner wrote: >>> I think it is OK if that gets a syntax error. If that's the "worst >>> case" I like this approach. > >> I think reducing the usefulness of error messages is something we need >> to think extremely hard about before we do. Is there maybe a way to >> keep the error messages even if by some magic we manage to unreserve >> the words? > The flip side is that the error message you get if you don't realise a word is now reserved is not exactly useful: "Syntax error at or near xxx". I don't know of any way to improve that though. > Of the alternatives discussed so far, I don't really like anything > better than adding another special case to base_yylex(). Robert opined > in the other thread about RESPECT NULLS that the potential failure cases > with that approach are harder to reason about, which is true, but that > doesn't mean that we should accept failures we know of because there > might be failures we don't know of. > > One thing that struck me while thinking about this is that it seems > like we've been going about it wrong in base_yylex() in any case. > For example, because we fold WITH followed by TIME into a special token > WITH_TIME, we will fail on a statement beginning > > WITH time AS ... > > even though "time" is a legal ColId. But suppose that instead of > merging the two tokens into one, we just changed the first token into > something special; that is, base_yylex would return a special token > WITH_FOLLOWED_BY_TIME and then TIME. We could then fix the above > problem by allowing either WITH or WITH_FOLLOWED_BY_TIME as the leading > keyword of a statement; and similarly for the few other places where > WITH can be followed by an arbitrary identifier. > > Going on the same principle, we could probably let FILTER be an > unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a > type_func_name_keyword. (I've not tried this though.) > I've not tried either, but wouldn't that mean that "SELECT * FROM list_filters() filter" would be legal, whereas "SELECT * FROM list_filters() filter(id, val)" would be a syntax error? If so, I don't think that would be an improvement. Regards, Dean
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > On 24 June 2013 03:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Going on the same principle, we could probably let FILTER be an >> unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a >> type_func_name_keyword. (I've not tried this though.) > I've not tried either, but wouldn't that mean that "SELECT * FROM > list_filters() filter" would be legal, whereas "SELECT * FROM > list_filters() filter(id, val)" would be a syntax error? If so, I > don't think that would be an improvement. Hm, good point. The SQL committee really managed to choose some unfortunate syntax here, didn't they. I know it's heresy in these parts, but maybe we should consider adopting a non-spec syntax for this feature? In particular, it's really un-obvious why the FILTER clause shouldn't be inside rather than outside the aggregate's parens, like ORDER BY. regards, tom lane
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Pavel Stehule
Date:
2013/6/25 Tom Lane <tgl@sss.pgh.pa.us>: > Dean Rasheed <dean.a.rasheed@gmail.com> writes: >> On 24 June 2013 03:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Going on the same principle, we could probably let FILTER be an >>> unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a >>> type_func_name_keyword. (I've not tried this though.) > >> I've not tried either, but wouldn't that mean that "SELECT * FROM >> list_filters() filter" would be legal, whereas "SELECT * FROM >> list_filters() filter(id, val)" would be a syntax error? If so, I >> don't think that would be an improvement. > > Hm, good point. The SQL committee really managed to choose some > unfortunate syntax here, didn't they. > > I know it's heresy in these parts, but maybe we should consider > adopting a non-spec syntax for this feature? In particular, it's > really un-obvious why the FILTER clause shouldn't be inside rather > than outside the aggregate's parens, like ORDER BY. the gram can be more free and final decision should be done in later stages ??? This technique was enough when I wrote prototype for CUBE ROLLUP without CUBE ROLLUP reserwed keywords. Regards Pavel > > regards, tom lane > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Robert Haas
Date:
On Sun, Jun 23, 2013 at 10:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > David Fetter <david@fetter.org> writes: >> On Sun, Jun 23, 2013 at 07:44:26AM -0700, Kevin Grittner wrote: >>> I think it is OK if that gets a syntax error. If that's the "worst >>> case" I like this approach. > >> I think reducing the usefulness of error messages is something we need >> to think extremely hard about before we do. Is there maybe a way to >> keep the error messages even if by some magic we manage to unreserve >> the words? > > Of the alternatives discussed so far, I don't really like anything > better than adding another special case to base_yylex(). Robert opined > in the other thread about RESPECT NULLS that the potential failure cases > with that approach are harder to reason about, which is true, but that > doesn't mean that we should accept failures we know of because there > might be failures we don't know of. Sure, that's true; but the proposal on the other thread is just to disallow invalid syntax early enough that it benefits the parser. The error message is different, but I don't think it's a BAD error message. > One thing that struck me while thinking about this is that it seems > like we've been going about it wrong in base_yylex() in any case. > For example, because we fold WITH followed by TIME into a special token > WITH_TIME, we will fail on a statement beginning > > WITH time AS ... > > even though "time" is a legal ColId. But suppose that instead of > merging the two tokens into one, we just changed the first token into > something special; that is, base_yylex would return a special token > WITH_FOLLOWED_BY_TIME and then TIME. We could then fix the above > problem by allowing either WITH or WITH_FOLLOWED_BY_TIME as the leading > keyword of a statement; and similarly for the few other places where > WITH can be followed by an arbitrary identifier. > > Going on the same principle, we could probably let FILTER be an > unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a > type_func_name_keyword. (I've not tried this though.) I think this whole direction is going to collapse under its own weight VERY quickly. The problems you're describing are essentially shift/reduce conflicts that are invisible because they're hidden behind lexer magic. Part of the value of using a parser generator is that it TELLS you when you've added ambiguous syntax. But it doesn't know about lexer hacks, so stuff will just silently break. I think this type of lexer hacks works reasonably well keyword-like things that are used in just one place in the grammar. As soon as you get up to two, the wheels come off - as with RESPECT NULLS vs. NULLS FIRST. > This idea doesn't help much for OVER because one of the alternatives for > over_clause is "OVER ColId", and I doubt we want to have base_yylex know > all the alternatives for ColId. I also had no great success with the > NULLS FIRST/LAST case: AFAICT the substitute token for NULLS still has > to be fully reserved, meaning that something like "select nulls last" > still doesn't work without quoting. We could maybe fix that with enough > denormalization of the index_elem productions, but it'd be ugly. I don't think that particular example is very compelling - there's a general rule that column aliases can't be keywords of any type. That's not wonderful, and EnterpriseDB has had bug reports filed about it, but the real-world impact is pretty minimal, certainly compared to what we used to do which is not allow column aliases AT ALL. > It'd sure be interesting to know what the SQL committee's target parsing > algorithm is. I find it hard to believe they're uninformed enough to > not know that these random syntaxes they keep inventing are hard to deal > with in LALR(1). Or maybe they really don't give a damn about breaking > applications every time they invent a new reserved word? Does the SQL committee contemplate that SELECT * FROM somefunc() filter (id, val) should act as a table alias and that SELECT * FROM somefunc() filter (where x > 1) is an aggregate filter? This all gets much easier to understand if one of those constructs isn't allowed in that particular context. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Josh Berkus
Date:
> I know it's heresy in these parts, but maybe we should consider > adopting a non-spec syntax for this feature? In particular, it's > really un-obvious why the FILTER clause shouldn't be inside rather > than outside the aggregate's parens, like ORDER BY. Well, what other DBMSes support this feature? Will being non-spec introduce migration pain? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 26 June 2013 01:01, Josh Berkus <josh@agliodbs.com> wrote: > >> I know it's heresy in these parts, but maybe we should consider >> adopting a non-spec syntax for this feature? In particular, it's >> really un-obvious why the FILTER clause shouldn't be inside rather >> than outside the aggregate's parens, like ORDER BY. > > Well, what other DBMSes support this feature? Will being non-spec > introduce migration pain? > I can't find any, but that doesn't mean they don't exist, or aren't being worked on. To recap, the options currently on offer are: 1). Make FILTER a new partially reserved keyword, accepting that that might break some users' application code. 2). Make FILTER unreserved, accepting that that will lead to syntax errors rather than more specific error messages if the user tries to use an aggregate/window function with FILTER or OVER in the FROM clause of a query, or as an index expression. 3). Adopt a non-standard syntax for this feature, accepting that that might conflict with other databases, and that we can never then claim to have implemented T612, "Advanced OLAP operations". 4). Some other parser hack that will offer a better compromise? My preference is for (2) as the lesser of several evils --- it's a fairly narrow case where the quality of the error message is reduced. Regards, Dean
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Pavel Stehule
Date:
2013/6/26 Dean Rasheed <dean.a.rasheed@gmail.com>: > On 26 June 2013 01:01, Josh Berkus <josh@agliodbs.com> wrote: >> >>> I know it's heresy in these parts, but maybe we should consider >>> adopting a non-spec syntax for this feature? In particular, it's >>> really un-obvious why the FILTER clause shouldn't be inside rather >>> than outside the aggregate's parens, like ORDER BY. >> >> Well, what other DBMSes support this feature? Will being non-spec >> introduce migration pain? >> > > I can't find any, but that doesn't mean they don't exist, or aren't > being worked on. > > To recap, the options currently on offer are: > > 1). Make FILTER a new partially reserved keyword, accepting that that > might break some users' application code. > > 2). Make FILTER unreserved, accepting that that will lead to syntax > errors rather than more specific error messages if the user tries to > use an aggregate/window function with FILTER or OVER in the FROM > clause of a query, or as an index expression. > > 3). Adopt a non-standard syntax for this feature, accepting that that > might conflict with other databases, and that we can never then claim > to have implemented T612, "Advanced OLAP operations". > > 4). Some other parser hack that will offer a better compromise? > > > My preference is for (2) as the lesser of several evils --- it's a > fairly narrow case where the quality of the error message is reduced. > @2 looks well Pavel > Regards, > Dean > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Andrew Gierth
Date:
Dean Rasheed said: > To recap, the options currently on offer are: > > 1). Make FILTER a new partially reserved keyword, accepting that that > might break some users' application code. > > 2). Make FILTER unreserved, accepting that that will lead to syntax > errors rather than more specific error messages if the user tries to > use an aggregate/window function with FILTER or OVER in the FROM > clause of a query, or as an index expression. > > 3). Adopt a non-standard syntax for this feature, accepting that that > might conflict with other databases, and that we can never then claim > to have implemented T612, "Advanced OLAP operations". > > 4). Some other parser hack that will offer a better compromise? > > My preference is for (2) as the lesser of several evils --- it's a > fairly narrow case where the quality of the error message is reduced. Possibly significant in this context is that there is a proof-of-concept patch in development for another part of T612, namely inverse distribution functions (e.g. percentile_disc and percentile_cont) which should be available by the next CF, and which will require a similar decision with respect to the keyword WITHIN (to support doing: select percentile_cont(0.5) within group (order by x) from...; which returns the median value of x). This syntax is much more widely supported than FILTER, including by both Oracle and MSSQL, so a non-standard alternative is much less attractive. A solution which suits both (i.e. either option 1 or option 2) would make a whole lot more sense than trying to handle them differently. -- Andrew (irc:RhodiumToad)
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Possibly significant in this context is that there is a proof-of-concept > patch in development for another part of T612, namely inverse > distribution functions (e.g. percentile_disc and percentile_cont) which > should be available by the next CF, and which will require a similar > decision with respect to the keyword WITHIN (to support doing: > select percentile_cont(0.5) within group (order by x) from ...; > which returns the median value of x). Agreed, separating out the function-call-with-trailing-declaration syntaxes so they aren't considered in FROM and index_elem seems like the best compromise. If we do that for window function OVER clauses as well, can we make OVER less reserved? regards, tom lane
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Andrew Gierth
Date:
Tom Lane said: > Agreed, separating out the function-call-with-trailing-declaration > syntaxes so they aren't considered in FROM and index_elem seems like > the best compromise. > > If we do that for window function OVER clauses as well, can we make > OVER less reserved? Yes. At least, I tried it with both OVER and FILTER unreserved and there were no grammar conflicts (and I didn't have to do anything fancy to avoid them), and it passed regression with the exception of the changed error message for window functions in the from-clause. So is this the final decision on how to proceed? It seems good to me, and I can work with David to get it done. -- Andrew (irc:RhodiumToad)
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Pavel Stehule
Date:
Hello 2013/6/27 Andrew Gierth <andrew@tao11.riddles.org.uk>: > Tom Lane said: >> Agreed, separating out the function-call-with-trailing-declaration >> syntaxes so they aren't considered in FROM and index_elem seems like >> the best compromise. >> >> If we do that for window function OVER clauses as well, can we make >> OVER less reserved? > > Yes. > > At least, I tried it with both OVER and FILTER unreserved and there > were no grammar conflicts (and I didn't have to do anything fancy to > avoid them), and it passed regression with the exception of the > changed error message for window functions in the from-clause. > > So is this the final decision on how to proceed? It seems good to me, > and I can work with David to get it done. > Isn't dangerous do OVER unreserved keyword?? Regards Pavel > -- > Andrew (irc:RhodiumToad) > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Thu, Jun 27, 2013 at 08:41:59AM +0000, Andrew Gierth wrote: > Tom Lane said: > > Agreed, separating out the function-call-with-trailing-declaration > > syntaxes so they aren't considered in FROM and index_elem seems > > like the best compromise. > > > > If we do that for window function OVER clauses as well, can we > > make OVER less reserved? > > Yes. > > At least, I tried it with both OVER and FILTER unreserved and there > were no grammar conflicts (and I didn't have to do anything fancy to > avoid them), and it passed regression with the exception of the > changed error message for window functions in the from-clause. > > So is this the final decision on how to proceed? It seems good to > me, and I can work with David to get it done. If this is really the direction people want to go, I'm in. Is there some code I can look at? I still submit that having our reserved word ducks in a row in advance is a saner way to go about this, and will work up a patch for that as I have time. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Tom Lane said: >> Agreed, separating out the function-call-with-trailing-declaration >> syntaxes so they aren't considered in FROM and index_elem seems like >> the best compromise. >> >> If we do that for window function OVER clauses as well, can we make >> OVER less reserved? > Yes. > At least, I tried it with both OVER and FILTER unreserved and there > were no grammar conflicts (and I didn't have to do anything fancy to > avoid them), and it passed regression with the exception of the > changed error message for window functions in the from-clause. > So is this the final decision on how to proceed? It seems good to me, > and I can work with David to get it done. Yeah, please submit a separate patch that just refactors the existing grammar as above; that'll simplify reviewing. regards, tom lane
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes: >> Tom Lane said: >>> If we do that for window function OVER clauses as well, can we make >>> OVER less reserved? > Isn't dangerous do OVER unreserved keyword?? How so? The worst-case scenario is that we find we have to make it more reserved again in some future release, as a consequence of some new randomness from the SQL committee. That will just return us to the status quo, in which anybody who uses OVER as a table/column name has been broken since about 8.4. Since we still hear of people using releases as old as 7.2.x, I'm sure there are a few out there who would still be helped if we could de-reserve OVER again. (Not to mention people migrating from other systems in which it's not a keyword.) In any case, the general project policy has been to never make keywords any more reserved than we absolutely have to. If we didn't care about this, we wouldn't be bothering with four separate categories of keywords. regards, tom lane
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Peter Eisentraut
Date:
On 6/23/13 10:50 PM, Tom Lane wrote: > It'd sure be interesting to know what the SQL committee's target parsing > algorithm is. It's whatever Oracle and IBM implement. > Or maybe they really don't give a damn about breaking > applications every time they invent a new reserved word? Well, yes, I think that policy was built into the language at the very beginning.
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 27 June 2013 15:05, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >> Tom Lane said: >>> Agreed, separating out the function-call-with-trailing-declaration >>> syntaxes so they aren't considered in FROM and index_elem seems like >>> the best compromise. >>> >>> If we do that for window function OVER clauses as well, can we make >>> OVER less reserved? > >> Yes. > >> At least, I tried it with both OVER and FILTER unreserved and there >> were no grammar conflicts (and I didn't have to do anything fancy to >> avoid them), and it passed regression with the exception of the >> changed error message for window functions in the from-clause. > >> So is this the final decision on how to proceed? It seems good to me, >> and I can work with David to get it done. > > Yeah, please submit a separate patch that just refactors the existing > grammar as above; that'll simplify reviewing. > In that case, I'll re-review the latest FILTER patch over the weekend on the understanding that the reserved/unreserved keyword issue will be resolved in separate patch. Regards, Dean
On Tue, Jun 25, 2013 at 02:54:55PM +0530, Jeevan Chalke wrote: > On Tue, Jun 25, 2013 at 11:11 AM, Jeevan Chalke < > jeevan.chalke@enterprisedb.com> wrote: > > > Hi David, > > > > I hope this is the latest patch to review, right ? > > > > I am going to review it. > > > > I have gone through the discussion on this thread and I agree with Stephen > > Frost that it don't add much improvements as such but definitely it is > > going to be easy for contributors in this area as they don't need to go all > > over to add any extra parameter they need to add. This patch simplifies it > > well enough. > > > > Will post my review soon. > > > > > Assuming *makeFuncArgs_002.diff* is the latest patch, here are my review > points. > > About this patch and feature: > === > This patch tries to reduce redundancy when we need FuncCall expression. With > this patch it will become easier to add new parameter if needed. We just > need > to put it's default value at centralized location (I mean in this new > function) > so that all other places need not require and changes. And this new > parameter > is handled by the new feature who demanded it keep other untouched. > > Review comments on patch: > === > 1. Can't apply with "git apply" command but able to do it with patch -p1. > So no > issues > 2. Adds one whitespace error, hopefully it will get corrected once it goes > through pgindent. > 3. There is absolutely NO user visibility and thus I guess we don't need any > test case. Also no documentation are needed. > 4. Also make check went smooth and thus assumes that there is no issue as > such. > Even I couldn't find any issue with my testing other than regression suite. > 5. I had a code walk-through over patch and it looks good to me. However, > following line change seems unrelated (Line 186 in your patch) > > ! $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, > (Node *) n, @2); > ! > > Changes required from author: > === > It will be good if you remove unrelated changes from the patch and possibly > all > white-space errors. > > Thanks Thanks for the review! Please find attached the latest patch. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
David Fetter <david@fetter.org> writes: > Please find attached the latest patch. I remain of the opinion that this is simply a bad idea. It is unlike our habits for constructing other types of nodes, and makes it harder not easier to find all the places that need to be updated when adding another field to FuncCall. regards, tom lane
On Fri, Jun 28, 2013 at 10:31:16AM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > Please find attached the latest patch. > > I remain of the opinion that this is simply a bad idea. It is unlike > our habits for constructing other types of nodes, and makes it harder > not easier to find all the places that need to be updated when adding > another field to FuncCall. With utmost respect, this is just not true. There's exactly one place that needs updating after adding another field to FuncCall in the general case where the default value of the field doesn't affect most setters of FuncCall, i.e. where the default default is the right thing for current setters. In specific cases where such a field might need to be set to something other than its default value, finding calls to makeFuncCall is just as easy, and with some tools like cscope, even easier. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Fri, Jun 28, 2013 at 10:31:16AM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > Please find attached the latest patch. > > I remain of the opinion that this is simply a bad idea. It is unlike > our habits for constructing other types of nodes, and makes it harder > not easier to find all the places that need to be updated when adding > another field to FuncCall. We have precedents in makeRangeVar() and makeDefElem(). For me, this change would make it slightly easier to visit affected code sites after a change. I could cscope for callers of makeFuncCall() instead of doing "git grep 'makeNode(FuncCall)'". The advantage could go either way depending on your tooling, though. By having each call site only mention the seldom-used fields for which it does something special, the distinctive aspects of the call site stand out better. That's a nice advantage. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
On Fri, Jun 28, 2013 at 10:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > David Fetter <david@fetter.org> writes: >> Please find attached the latest patch. > > I remain of the opinion that this is simply a bad idea. It is unlike > our habits for constructing other types of nodes, and makes it harder > not easier to find all the places that need to be updated when adding > another field to FuncCall. I think it's a nice code cleanup. I don't understand your objection. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 6/28/13 11:30 AM, Robert Haas wrote: > On Fri, Jun 28, 2013 at 10:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> David Fetter <david@fetter.org> writes: >>> Please find attached the latest patch. >> >> I remain of the opinion that this is simply a bad idea. It is unlike >> our habits for constructing other types of nodes, and makes it harder >> not easier to find all the places that need to be updated when adding >> another field to FuncCall. > > I think it's a nice code cleanup. I don't understand your objection. Yeah, I was reading the patch thinking, yes, finally someone cleans that up.
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 21 June 2013 06:16, David Fetter <david@fetter.org> wrote: > Please find attached a patch which allows subqueries in the FILTER > clause and adds regression testing for same. > This needs re-basing/merging following Robert's recent commit to make OVER unreserved. Regards, Dean
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Fri, Jun 28, 2013 at 09:22:52PM +0100, Dean Rasheed wrote: > On 21 June 2013 06:16, David Fetter <david@fetter.org> wrote: > > Please find attached a patch which allows subqueries in the FILTER > > clause and adds regression testing for same. > > > > This needs re-basing/merging following Robert's recent commit to make > OVER unreserved. Please find attached. Thanks, Andrew Gierth! In this one, FILTER is no longer a reserved word. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
On Fri, Jun 28, 2013 at 01:28:35PM -0400, Peter Eisentraut wrote: > On 6/28/13 11:30 AM, Robert Haas wrote: > > On Fri, Jun 28, 2013 at 10:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> David Fetter <david@fetter.org> writes: > >>> Please find attached the latest patch. > >> > >> I remain of the opinion that this is simply a bad idea. It is unlike > >> our habits for constructing other types of nodes, and makes it harder > >> not easier to find all the places that need to be updated when adding > >> another field to FuncCall. > > > > I think it's a nice code cleanup. I don't understand your objection. > > Yeah, I was reading the patch thinking, yes, finally someone cleans that up. Please find enclosed a patch reflecting the changes that de-reserved OVER as a keyword. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
On Mon, Jul 1, 2013 at 6:16 AM, David Fetter <david@fetter.org> wrote:
Please find enclosed a patch reflecting the changes that de-reservedOn Fri, Jun 28, 2013 at 01:28:35PM -0400, Peter Eisentraut wrote:
> On 6/28/13 11:30 AM, Robert Haas wrote:
> > On Fri, Jun 28, 2013 at 10:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> David Fetter <david@fetter.org> writes:
> >>> Please find attached the latest patch.
> >>
> >> I remain of the opinion that this is simply a bad idea. It is unlike
> >> our habits for constructing other types of nodes, and makes it harder
> >> not easier to find all the places that need to be updated when adding
> >> another field to FuncCall.
> >
> > I think it's a nice code cleanup. I don't understand your objection.
>
> Yeah, I was reading the patch thinking, yes, finally someone cleans that up.
OVER as a keyword.
I have re-validated this new patch and it looks good to go in now.
I saw that it's already marked ready for committer.
Thanks David
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 1 July 2013 01:44, David Fetter <david@fetter.org> wrote: > On Fri, Jun 28, 2013 at 09:22:52PM +0100, Dean Rasheed wrote: >> On 21 June 2013 06:16, David Fetter <david@fetter.org> wrote: >> > Please find attached a patch which allows subqueries in the FILTER >> > clause and adds regression testing for same. >> > >> >> This needs re-basing/merging following Robert's recent commit to make >> OVER unreserved. > > Please find attached. Thanks, Andrew Gierth! In this one, FILTER is > no longer a reserved word. > Looking at this patch again, it appears to be in pretty good shape. - Applies cleanly to head. - Compiles with no warnings. - Includes regression test cases and doc updates. - Compatible with the relevant part of T612, "Advanced OLAP operations". - Includes pg_dump support. - Code changes all look reasonable, and I can't find any corner cases that have been missed. - Appears to work as expected. I tested everything I could think of and couldn't break it. AFAICT all the bases have been covered. As mentioned upthread, I would have preferred a few more regression test cases, and a couple of the tests don't appear to return anything interesting, but I'll leave that for the committer to decide whether they're sufficient for regression tests. I have a few suggestions to improve the docs: 1). In syntax.sgml: "The aggregate_name can also be suffixed with FILTER as described below". It's not really a suffix to the aggregate name, since it follows the function arguments and optional order by clause. Perhaps it would be more consistent with the surrounding text to say something like <replaceable>expression</replaceable> is any value expression that does not itself contain an aggregate expressionor a window function call, and ! <replaceable>order_by_clause</replaceable> and ! <replaceable>filter_clause</replaceable> are optional ! <literal>ORDER BY</> and <literal>FILTER</> clauses as described below. 2). In syntax.sgml: "... or when a FILTER clause is present, each row matching same". In the context of that paragraph this suggests that the filter clause only applies to the first form, since that paragraph is a description of the 4 forms of the aggregate function. I don't think it's worth mentioning FILTER in this paragraph at all --- it's adequately described below that. 3). In syntax.sgml: "Adding a FILTER clause to an aggregate specifies which values of the expression being aggregated to evaluate". How about something a little more specific, along the lines of If <literal>FILTER</> is specified, then only input rows for which the <replaceable>filter_clause</replaceable> evaluatesto true are fed to the aggregate function; input rows for which the <replaceable>filter_clause</replaceable>evaluates to false or the null value are discarded. For example... 4). In select.sgml: "In the absence of a FILTER clause, aggregate functions...". It doesn't seem right to refer to the FILTER clause at the top level here because it's not part of the SELECT syntax being described on this page. Also I think this should include a cross-reference to the aggregate function syntax section, perhaps something like: Aggregate functions, if any are used, are computed across all rows making up each group, producing a separate valuefor each group (whereas without <literal>GROUP BY</literal>, an aggregate produces a single value computed acrossall the selected rows). + The set of rows fed to the aggregate function can be further filtered + by attaching a <literal>FILTER</literal> clause to the aggregate + function call, see <xref ...> for more information. When <literal>GROUP BY</literal> is present, it is not valid for the <command>SELECT</command> list expressions to refer to ungrouped columns except within aggregate functions orif the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more thanone possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (ora subset thereof) are the primary key of the table containing the ungrouped column. Regards, Dean
On Mon, Jul 1, 2013 at 3:19 AM, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote: > I have re-validated this new patch and it looks good to go in now. > > I saw that it's already marked ready for committer. I don't normally like to commit things over another committer's objections, but this has +1 votes from four other committers (Stephen, Noah, Peter E, myself) so I think that's enough reason to move forward. So committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Mon, Jul 01, 2013 at 05:30:38PM +0100, Dean Rasheed wrote: > On 1 July 2013 01:44, David Fetter <david@fetter.org> wrote: > > On Fri, Jun 28, 2013 at 09:22:52PM +0100, Dean Rasheed wrote: > >> On 21 June 2013 06:16, David Fetter <david@fetter.org> wrote: > >> > Please find attached a patch which allows subqueries in the FILTER > >> > clause and adds regression testing for same. > >> > > >> > >> This needs re-basing/merging following Robert's recent commit to make > >> OVER unreserved. > > > > Please find attached. Thanks, Andrew Gierth! In this one, FILTER is > > no longer a reserved word. > > > > Looking at this patch again, it appears to be in pretty good shape. > > - Applies cleanly to head. > - Compiles with no warnings. > - Includes regression test cases and doc updates. > - Compatible with the relevant part of T612, "Advanced OLAP operations". > - Includes pg_dump support. > - Code changes all look reasonable, and I can't find any corner cases > that have been missed. > - Appears to work as expected. I tested everything I could think of > and couldn't break it. > > AFAICT all the bases have been covered. As mentioned upthread, I would > have preferred a few more regression test cases, and a couple of the > tests don't appear to return anything interesting, but I'll leave that > for the committer to decide whether they're sufficient for regression > tests. > > I have a few suggestions to improve the docs: > > 1). In syntax.sgml: "The aggregate_name can also be suffixed with > FILTER as described below". It's not really a suffix to the aggregate > name, since it follows the function arguments and optional order by > clause. Perhaps it would be more consistent with the surrounding text > to say something like > > <replaceable>expression</replaceable> is > any value expression that does not itself contain an aggregate > expression or a window function call, and > ! <replaceable>order_by_clause</replaceable> and > ! <replaceable>filter_clause</replaceable> are optional > ! <literal>ORDER BY</> and <literal>FILTER</> clauses as described below. > > 2). In syntax.sgml: "... or when a FILTER clause is present, each row > matching same". In the context of that paragraph this suggests that > the filter clause only applies to the first form, since that paragraph > is a description of the 4 forms of the aggregate function. I don't > think it's worth mentioning FILTER in this paragraph at all --- it's > adequately described below that. > > 3). In syntax.sgml: "Adding a FILTER clause to an aggregate specifies > which values of the expression being aggregated to evaluate". How > about something a little more specific, along the lines of > > If <literal>FILTER</> is specified, then only input rows for which > the <replaceable>filter_clause</replaceable> evaluates to true are > fed to the aggregate function; input rows for which the > <replaceable>filter_clause</replaceable> evaluates to false or the > null value are discarded. For example... > > 4). In select.sgml: "In the absence of a FILTER clause, aggregate > functions...". It doesn't seem right to refer to the FILTER clause at > the top level here because it's not part of the SELECT syntax being > described on this page. Also I think this should include a > cross-reference to the aggregate function syntax section, perhaps > something like: > > Aggregate functions, if any are used, are computed across all rows > making up each group, producing a separate value for each group > (whereas without <literal>GROUP BY</literal>, an aggregate > produces a single value computed across all the selected rows). > + The set of rows fed to the aggregate function can be further filtered > + by attaching a <literal>FILTER</literal> clause to the aggregate > + function call, see <xref ...> for more information. > When <literal>GROUP BY</literal> is present, it is not valid for > the <command>SELECT</command> list expressions to refer to > ungrouped columns except within aggregate functions or if the > ungrouped column is functionally dependent on the grouped columns, > since there would otherwise be more than one possible value to > return for an ungrouped column. A functional dependency exists if > the grouped columns (or a subset thereof) are the primary key of > the table containing the ungrouped column. Please find attached changes based on the above. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Dean Rasheed
Date:
On 5 July 2013 18:23, David Fetter <david@fetter.org> wrote: > Please find attached changes based on the above. > This looks good. The grammar changes are smaller and neater now on top of the makeFuncCall() patch. Overall I think this patch offers useful additional functionality, in compliance with the SQL spec, which should be handy to simplify complex grouping queries. There's a minor typo in syntax.sgml: "for each input row, each row matching same." --- fix attached. I think this is ready for committer. Regards, Dean
Attachment
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote: > On 5 July 2013 18:23, David Fetter <david@fetter.org> wrote: > > Please find attached changes based on the above. > > > > This looks good. The grammar changes are smaller and neater now on top > of the makeFuncCall() patch. > > Overall I think this patch offers useful additional functionality, in > compliance with the SQL spec, which should be handy to simplify > complex grouping queries. > > There's a minor typo in syntax.sgml: "for each input row, each row > matching same." --- fix attached. That is actually correct grammar, but may not be the easiest to translate. > I think this is ready for committer. Thanks :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Noah Misch
Date:
On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote: > On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote: > > Overall I think this patch offers useful additional functionality, in > > compliance with the SQL spec, which should be handy to simplify > > complex grouping queries. As I understand this feature, it is syntactic sugar for the typical case of an aggregate with a strict transition function. For example, "min(x) FILTER (WHERE y > 0)" is rigorously equivalent to "min(CASE y > 0 THEN x END)". Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard queries it is *only* syntactic sugar. In PostgreSQL, it lets you do novel things with, for example, array_agg(). Is that accurate? > > I think this is ready for committer. The patch was thorough. I updated applicable comments, revisited some cosmetic choices, and made these functional changes: 1. The pg_stat_statements "query jumble" should incorporate the filter. 2. The patch did not update costing. I made it add the cost of the filter expression the same way we add the cost of the argument expressions. This makes "min(x) FILTER (WHERE y > 0)" match "min(case y > 0 THEN x end)" in terms of cost, which is a fair start. At some point, we could do better by reducing the argument cost by the filter selectivity. A few choices/standard interpretations may deserve discussion. The patch makes filter clauses contribute to the subquery level chosen to be the "aggregation query". This is observable through the behavior of these two standard-conforming queries: select (select count(outer_c) from (values (1)) t0(inner_c)) from (values (2),(3)) t1(outer_c); -- outer query is aggregation query select (select count(outer_c) filter (where inner_c <> 0) from (values (1)) t0(inner_c)) from (values (2),(3)) t1(outer_c); -- inner query is aggregation query I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this. Since that still isn't crystal-clear to me, I mention it in case anyone has a different reading. Distinct from that, albeit in a similar vein, SQL does not permit outer references in a filter clause. This patch permits them; I think that qualifies as a reasonable PostgreSQL extension. > --- a/doc/src/sgml/keywords.sgml > +++ b/doc/src/sgml/keywords.sgml > @@ -3200,7 +3200,7 @@ > </row> > <row> > <entry><token>OVER</token></entry> > - <entry>reserved (can be function or type)</entry> > + <entry>non-reserved</entry> I committed this one-line correction separately. > --- a/src/backend/optimizer/plan/planagg.c > +++ b/src/backend/optimizer/plan/planagg.c > @@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context) > ListCell *l; > > Assert(aggref->agglevelsup == 0); > - if (list_length(aggref->args) != 1 || aggref->aggorder != NIL) > + if (list_length(aggref->args) != 1 || aggref->aggorder != NIL || aggref->agg_filter != NULL) > return true; /* it couldn't be MIN/MAX */ > /* note: we do not care if DISTINCT is mentioned ... */ I twitched upon reading this, because neither ORDER BY nor FILTER preclude the aggregate being MIN or MAX. Perhaps Andrew can explain why he put aggorder there back in 2009. All I can figure is that writing max(c ORDER BY x) is so unlikely that we'd too often waste the next syscache lookup. But the same argument would apply to DISTINCT. With FILTER, the rationale is entirely different. The aggregate could well be MIN/MAX; we just haven't implemented the necessary support elsewhere in this file. See attached patch revisions. The first patch edits find_minmax_aggs_walker() per my comments just now. The second is an update of your FILTER patch with the changes to which I alluded above; it applies atop the first patch. Would you verify that I didn't ruin anything? Barring problems, will commit. Are you the sole named author of this patch? That's what the CF page says, but that wasn't fully clear to me from the list discussions. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Attachment
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote: > On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote: > > On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote: > > > Overall I think this patch offers useful additional functionality, in > > > compliance with the SQL spec, which should be handy to simplify > > > complex grouping queries. > > As I understand this feature, it is syntactic sugar for the typical case of an > aggregate with a strict transition function. For example, "min(x) FILTER > (WHERE y > 0)" is rigorously equivalent to "min(CASE y > 0 THEN x END)". > Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard > queries it is *only* syntactic sugar. In PostgreSQL, it lets you do novel > things with, for example, array_agg(). Is that accurate? > > > > I think this is ready for committer. > > The patch was thorough. I updated applicable comments, revisited some > cosmetic choices, and made these functional changes: > > 1. The pg_stat_statements "query jumble" should incorporate the filter. > > 2. The patch did not update costing. I made it add the cost of the filter > expression the same way we add the cost of the argument expressions. This > makes "min(x) FILTER (WHERE y > 0)" match "min(case y > 0 THEN x end)" in > terms of cost, which is a fair start. At some point, we could do better by > reducing the argument cost by the filter selectivity. Thanks! > A few choices/standard interpretations may deserve discussion. The patch > makes filter clauses contribute to the subquery level chosen to be the > "aggregation query". This is observable through the behavior of these two > standard-conforming queries: > > select (select count(outer_c) > from (values (1)) t0(inner_c)) > from (values (2),(3)) t1(outer_c); -- outer query is aggregation query > select (select count(outer_c) filter (where inner_c <> 0) > from (values (1)) t0(inner_c)) > from (values (2),(3)) t1(outer_c); -- inner query is aggregation query > > I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this. Since that > still isn't crystal-clear to me, I mention it in case anyone has a different > reading. > > Distinct from that, albeit in a similar vein, SQL does not permit outer > references in a filter clause. This patch permits them; I think that > qualifies as a reasonable PostgreSQL extension. Thanks again. > > --- a/doc/src/sgml/keywords.sgml > > +++ b/doc/src/sgml/keywords.sgml > > > @@ -3200,7 +3200,7 @@ > > </row> > > <row> > > <entry><token>OVER</token></entry> > > - <entry>reserved (can be function or type)</entry> > > + <entry>non-reserved</entry> > > I committed this one-line correction separately. > > > --- a/src/backend/optimizer/plan/planagg.c > > +++ b/src/backend/optimizer/plan/planagg.c > > @@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context) > > ListCell *l; > > > > Assert(aggref->agglevelsup == 0); > > - if (list_length(aggref->args) != 1 || aggref->aggorder != NIL) > > + if (list_length(aggref->args) != 1 || aggref->aggorder != NIL || aggref->agg_filter != NULL) > > return true; /* it couldn't be MIN/MAX */ > > /* note: we do not care if DISTINCT is mentioned ... */ > > I twitched upon reading this, because neither ORDER BY nor FILTER preclude the > aggregate being MIN or MAX. Perhaps Andrew can explain why he put aggorder > there back in 2009. All I can figure is that writing max(c ORDER BY x) is so > unlikely that we'd too often waste the next syscache lookup. But the same > argument would apply to DISTINCT. With FILTER, the rationale is entirely > different. The aggregate could well be MIN/MAX; we just haven't implemented > the necessary support elsewhere in this file. Excellent reasoning, and good catch. > See attached patch revisions. The first patch edits > find_minmax_aggs_walker() per my comments just now. The second is > an update of your FILTER patch with the changes to which I alluded > above; it applies atop the first patch. Would you verify that I > didn't ruin anything? Barring problems, will commit. > > Are you the sole named author of this patch? That's what the CF > page says, but that wasn't fully clear to me from the list > discussions. While Andrew's help was invaluable and pervasive, I wrote the patch, and all flaws in it are my fault. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
David Fetter
Date:
On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote: > On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote: > > On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote: > > > Overall I think this patch offers useful additional functionality, in > > > compliance with the SQL spec, which should be handy to simplify > > > complex grouping queries. > > As I understand this feature, it is syntactic sugar for the typical case of an > aggregate with a strict transition function. For example, "min(x) FILTER > (WHERE y > 0)" is rigorously equivalent to "min(CASE y > 0 THEN x END)". > Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard > queries it is *only* syntactic sugar. In PostgreSQL, it lets you do novel > things with, for example, array_agg(). Is that accurate? > > > > I think this is ready for committer. > > The patch was thorough. I updated applicable comments, revisited some > cosmetic choices, and made these functional changes: > > 1. The pg_stat_statements "query jumble" should incorporate the filter. > > 2. The patch did not update costing. I made it add the cost of the filter > expression the same way we add the cost of the argument expressions. This > makes "min(x) FILTER (WHERE y > 0)" match "min(case y > 0 THEN x end)" in > terms of cost, which is a fair start. At some point, we could do better by > reducing the argument cost by the filter selectivity. > > > A few choices/standard interpretations may deserve discussion. The patch > makes filter clauses contribute to the subquery level chosen to be the > "aggregation query". This is observable through the behavior of these two > standard-conforming queries: > > select (select count(outer_c) > from (values (1)) t0(inner_c)) > from (values (2),(3)) t1(outer_c); -- outer query is aggregation query > select (select count(outer_c) filter (where inner_c <> 0) > from (values (1)) t0(inner_c)) > from (values (2),(3)) t1(outer_c); -- inner query is aggregation query > > I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this. Since that > still isn't crystal-clear to me, I mention it in case anyone has a different > reading. > > Distinct from that, albeit in a similar vein, SQL does not permit outer > references in a filter clause. This patch permits them; I think that > qualifies as a reasonable PostgreSQL extension. > > > --- a/doc/src/sgml/keywords.sgml > > +++ b/doc/src/sgml/keywords.sgml > > > @@ -3200,7 +3200,7 @@ > > </row> > > <row> > > <entry><token>OVER</token></entry> > > - <entry>reserved (can be function or type)</entry> > > + <entry>non-reserved</entry> > > I committed this one-line correction separately. > > > --- a/src/backend/optimizer/plan/planagg.c > > +++ b/src/backend/optimizer/plan/planagg.c > > @@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context) > > ListCell *l; > > > > Assert(aggref->agglevelsup == 0); > > - if (list_length(aggref->args) != 1 || aggref->aggorder != NIL) > > + if (list_length(aggref->args) != 1 || aggref->aggorder != NIL || aggref->agg_filter != NULL) > > return true; /* it couldn't be MIN/MAX */ > > /* note: we do not care if DISTINCT is mentioned ... */ > > I twitched upon reading this, because neither ORDER BY nor FILTER preclude the > aggregate being MIN or MAX. Perhaps Andrew can explain why he put aggorder > there back in 2009. All I can figure is that writing max(c ORDER BY x) is so > unlikely that we'd too often waste the next syscache lookup. But the same > argument would apply to DISTINCT. With FILTER, the rationale is entirely > different. The aggregate could well be MIN/MAX; we just haven't implemented > the necessary support elsewhere in this file. > > > See attached patch revisions. The first patch edits find_minmax_aggs_walker() > per my comments just now. The second is an update of your FILTER patch with > the changes to which I alluded above; it applies atop the first patch. Would > you verify that I didn't ruin anything? Barring problems, will commit. > > Are you the sole named author of this patch? That's what the CF page says, > but that wasn't fully clear to me from the list discussions. > > Thanks, > nm Tested your changes. They pass regression, etc. :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Andrew Gierth
Date:
Noah Misch said: > I twitched upon reading this, because neither ORDER BY nor FILTER preclude > the aggregate being MIN or MAX. Perhaps Andrew can explain why he put > aggorder there back in 2009. The bottom line is that I intentionally avoided assuming that an agg with an aggsortop could only be min() or max() and that having an order by clause would always be harmless in such cases. I can't think of an actual use case where it would matter, but I've seen people define some pretty strange aggs recently. So I mildly object to simply throwing away the ORDER BY clause in such cases. -- Andrew (irc:RhodiumToad)
Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Noah Misch
Date:
On Mon, Jul 15, 2013 at 06:56:17PM +0000, Andrew Gierth wrote: > Noah Misch said: > > > I twitched upon reading this, because neither ORDER BY nor FILTER preclude > > the aggregate being MIN or MAX. Perhaps Andrew can explain why he put > > aggorder there back in 2009. > > The bottom line is that I intentionally avoided assuming that an agg with an > aggsortop could only be min() or max() and that having an order by clause > would always be harmless in such cases. I can't think of an actual use case > where it would matter, but I've seen people define some pretty strange aggs > recently. > > So I mildly object to simply throwing away the ORDER BY clause in such cases. I can't think of another use for aggsortop as defined today. However, on further reflection, min(x ORDER BY y) is not identical to min(x) when the B-tree operator class of the aggsortop can find non-identical datums to be equal. This affects at least min(numeric) and min(interval). min(x) chooses an unspecified x among those equal to the smallest x, while min(x ORDER BY y) can be used to narrow the choice. I will update the comments along those lines and not change semantics after all. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Re: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
From
Noah Misch
Date:
On Mon, Jul 15, 2013 at 11:43:04AM -0700, David Fetter wrote: > On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote: > > See attached patch revisions. The first patch edits find_minmax_aggs_walker() > > per my comments just now. The second is an update of your FILTER patch with > > the changes to which I alluded above; it applies atop the first patch. Would > > you verify that I didn't ruin anything? Barring problems, will commit. > Tested your changes. They pass regression, etc. :) Committed. -- Noah Misch EnterpriseDB http://www.enterprisedb.com