Re: RFE: Column aliases in WHERE clauses - Mailing list pgsql-general

From Rafal Pietrak
Subject Re: RFE: Column aliases in WHERE clauses
Date
Msg-id 1347956976.21898.19.camel@localhost.localdomain
Whole thread Raw
In response to Re: RFE: Column aliases in WHERE clauses  (Mike Christensen <mike@kitchenpc.com>)
List pgsql-general
On Mon, 2012-09-17 at 16:44 -0700, Mike Christensen wrote:
> On Mon, Sep 17, 2012 at 4:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Mike Christensen <mike@kitchenpc.com> writes:

[-----------]

> > It's easier to understand why this is if you realize that SQL has a very
> > clear model of a "pipeline" of query execution.  Conceptually, what
> > happens is:
> >
> > 1. Form the cartesian product of the tables listed in FROM (ie, all
> > combinations of rows).
> >
> > 2. Apply the WHERE condition to each row from 1, and drop rows that
> > don't pass it.
> >
> > 3. If there's a GROUP BY, merge the surviving rows into groups.
> >
> > 4. If there's aggregate functions, compute those over the rows in
> > each group.
> >
> > 5. If there's a HAVING, filter the grouped rows according to that.
> >
> > 6. Evaluate the SELECT expressions for each remaining row.
> >
> > 7. If there's an ORDER BY, evaluate those expressions and sort the
> > remaining rows accordingly.
> >
> > (Obviously, implementations try to improve on this - you don't want
> > to actually form the cartesian product - but that's the conceptual
> > model.)
> >
> > The traditional shortcut of doing "ORDER BY select-column-reference"
> > is okay according to this world view, because the SELECT expressions

[--------------]

> > are already available when ORDER BY needs them.  However, it's not
> > sensible to refer to SELECT outputs in WHERE, HAVING, or GROUP BY,
> > because those steps precede the evaluation of the SELECT expressions.
> >
> > This isn't just academic nit-picking either, because the SELECT
> > expressions might not be valid for rows that don't pass WHERE etc.
> > Consider
> >         SELECT 1/x AS inverse FROM data WHERE x <> 0;
> > The implementation *must* apply WHERE before computing the SELECT
> > expressions, or it'll get zero-divide failures that should not happen.

[-----------------]

>
> Excellent information, Tom!  I've been somewhat curious on this
> behavior for some time now, and it's great to get a detailed answer..
>

Yes.

But it puzzles me, if it *conceptually* would be a signifficant
misstake, when what Tom calls "select-column-reference" (I understand
as: the colunm name introduced on the select-list), would actually be
regarded by the SQL parser as "macro-definition". Just to place the
*string*, defined at select-list-level by "AS <name>", to wherever it's
used in the WHERE/ORDER/GROUP or HAVING clauses at earlier then
select-list-evaluation processing stage.

Actual Tom's example(1):
    SELECT 1/x AS inverse FROM data WHERE x <> 0;
extended to (2):
    SELECT 1/x AS inverse FROM data WHERE x <> 0 AND 1/x > 20;
could be written by user as (3):
    SELECT 1/x AS inverse FROM data WHERE x <> 0 AND inverse > 20;
but token/replaced to its form (2) before WHERE evaluation.


-R



pgsql-general by date:

Previous
From: "Herouth Maoz"
Date:
Subject: Re: Index creation takes more time?
Next
From: Craig Ringer
Date:
Subject: Data recovery after inadvertent update?