Re: Aggregates containing outer references don't work per spec - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Aggregates containing outer references don't work per spec
Date
Msg-id 29261.1054784807@sss.pgh.pa.us
Whole thread Raw
In response to Aggregates containing outer references don't work per spec  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Aggregates containing outer references don't work per  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
I wrote:
> Now I finally understand why the spec has all that strange verbiage
> about outer references in set-function arguments. This is the case
> they're talking about.  (I don't much like their restriction to a single
> outer reference ... seems like it would be appropriate to allow multiple
> references as long as they're all from the same outer query level.)

I've been thinking about that some more.  Currently, if a subquery
contains an aggregate expression likeSUM(localvar + outervar)
(where localvar is a variable of the subquery and outervar is a variable
of the parent query), we accept this, and execute it in what seems to me
a perfectly reasonable way: the aggregate is evaluated over all the
appropriate rows of the subquery, taking the outer variable as a
constant for any one evaluation of the subquery.

The spec appears to forbid this case, but I really don't see why.

Obviously the aggregate argument could be more complex, with outer
references from several different levels of outer query, but that
doesn't change anything --- all the outer-reference variables are
constants from the perspective of the subquery, whether they come
from one level up or many levels up.

What we now realize is that the spec says SUM(outervar) ought to be
considered an aggregate evaluated at the outervar's query level, and
then imported *as a whole* as an effective constant for the subquery.

I claim that there should be nothing wrong with interpretingSUM(outervar1 + outervar2)
as an aggregate of the closest outer variable's level, with further-up
variables taken as constants with respect to that level.  This isn't
really a different case, it's the same as "SUM(localvar + outervar)"
from the perspective of that closest outer level.  We are just allowing
the expression to be referenced from within sub-subqueries.  The
constraint that the aggregate must appear in the SELECT targetlist or
HAVING clause applies to the query level that the aggregate belongs to,
but not to lower levels.

In short, I see no reason why the spec should restrict the aggregate's
argument to contain only a single outer reference, or even references
from just a single outer query level.  The behavior is perfectly well
defined without that constraint.  We can accommodate both our historical
behavior and the spec-mandated cases if we interpret multilevel cases
per this sketch.

Anyone see a flaw in this reasoning?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Joe Conway
Date:
Subject: Re: [GENERAL] Anonymous CVS access
Next
From: Bruce Momjian
Date:
Subject: Re: default locale considered harmful? (was Re: [GENERAL]