Re: SQL feature requests - Mailing list pgsql-hackers

From Ben Tilly
Subject Re: SQL feature requests
Date
Msg-id acc274b30708231025u30c6e3a8i2fae2555b10fb617@mail.gmail.com
Whole thread Raw
In response to Re: SQL feature requests  (Michael Glaesemann <grzm@seespotcode.net>)
Responses Re: SQL feature requests  (Michael Glaesemann <grzm@seespotcode.net>)
List pgsql-hackers
On 8/22/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
>
> On Aug 22, 2007, at 20:36 , Ben Tilly wrote:
[...]
> > I can well believe that the standard says that you must accept
> > subqueries with aliases.  But does it say you must reject subqueries
> > without aliases?  I strongly doubt that.
>
> If I'm reading my draft copy of the SQL:2003 spec right (and there's
> a good chance that I'm not, as it's not the easiest document for me
> to parse), aliases *are* required.

Again, it lays out very carefully the syntax that must be valid.  It
does not say that extensions to that syntax are not valid.  Every
database implements such extensions.

[...]
> > I have no clue what EnterpriseDB does.
>
> In case it wasn't clear, the reason I bring it up is that
> EnterpriseDB, while working from a PostgreSQL base, strives for
> Oracle compatibility.

I got the reference.  But I don't know what EnterpriseDB does - I've
never used it.

[...]
> >> AIUI, Integers are only allowed because the SQL standard explicitly
> >> allows you to refer to columns by the order they appear in the SELECT
> >> list. Otherwise the GROUP BY items need to be column names.
> >
> > Need to be?
> >
> > The SQL-92 standard is clear that you must accept a list of column
> > names.  It is also clear that a column name must be be of the form
> > field or table.field.
>
> The 2003 draft (same as above) seems to agree with the SQL92 standard:
>
[ large snippet of the draft elided ]
>
> There'd have to be a pretty strong reason to extend this, more than
> just a convenience, I should think.

It is already extended in postgres.  For pretty good reasons.

> >   In no way, shape or form does that allow having
> > terms like trim(foo.bar) in a group by.
> >
> > But every reasonable database that I know - including postgres -
> > allows that.
>
> Can you give an example of something like this working in PostgreSQL?
> I get an error when I try to use a text value in a GROUP BY clause.
> (Or are you referring specifically to the CASE expression corner case?)

This works in every reasonable database that I have tried it in:
 select trim(foo.bar), count(*) from foo group by trim(foo.bar)

And yes, I have tried it in postgres.

[...]
> > Postgres explicitly disallows a constant character expression.  But it
> > allows the constant case expression that I gave.  It would be nice for
> > me to not have to remember that very obscure and convoluted case.
>
> I agree, and would move that it should be disallowed if there isn't a
> reason for it to be maintained, for exactly the reason you give:
> there shouldn't be such convoluted, and obscure corner case.

There is an excellent reason to generally allow complex expressions in
group by statements, and that reason is that many useful and
reasonable queries won't work if you don't.  Such as the one I gave
above.

[...]
> > I don't know what the SQL spec says, but I know (having talked to
> > other developers) that many people would find it very nice.
>
> Since I had the spec open, I tried to look at this as well, though I
> must admit I found it very tough going.
>
> I think this is the key section:
>
> > 10.9 <aggregate function>
> >
> > ...
> >
> > Syntax Rules
> >
> > ...
> >
> > 4) The argument source of an <aggregate function> is
> > Case:
> > a) If AF is immediately contained in a <set function
> > specification>, then a table or group of a grouped table as
> > specified in Subclause 7.10, "<having clause>", and Subclause 7.12,
> > "<query specification>".
> >
> > b) Otherwise, the collection of rows in the current row's window
> > frame defined by the window structure  descriptor identified by the
> > <window function> that simply contains AF, as defined in Subclause
> > 7.11, "<window clause>".
>
> Now the <query specification> stuff is pretty dense, and the <window
> clause> is currently impenetrable for me, so I just looked at the
> <having clause> and <query specification> sections referenced above.

I'm not surprised that the window clause section is impenetrable to
you.  Window clauses are part of the definition of analytic functions,
which postgres does NOT yet implement.  However they are on the todo
list.

Speaking personally, analytic functions are the single feature from
Oracle that I've most missed when moving to postgres.  I would be
happy to explain what they are and how they should work either on the
list or off to anyone who is interested in implementing them.  However
they are a topic for another thread, and probably for another week.
(I'm out next week, and am loathe to open that can of worms just yet.)

However I will note as a practical matter that implementing analytic
functions will increase how often people will want to have anonymous
subqueries in the from statement.  (Because for a certain set of
problems a useful technique is to do a group by query of an analytic
query.)

[...]
> > A grouped table is a set of groups derived during the evaluation of
> > a <group by clause>. A group G is a collection of rows in which,
> > for every grouping column GC, if the value of GC in some row is not
> > distinct from GV, then the value of GC in every row is GV;
> > moreover, if R1 is a row in group G1 of grouped table GT and R2 is
> > a row in GT such that for every grouping column GC the value of GC
> > in R1 is not distinct from the value of GC in R2, then R2 is in G1.
> > Every row in GT is in exactly one group. A group may be considered
> > as a table. Set functions operate on groups.
>
> So we're already talking about an explicit GROUP BY clause, and it
> doesn't seem to leave much wiggle room grouping columns that aren't
> explicitly listed. I'll readily admit I very easily could be reading
> this wrong or have missed a section of the spec that discusses this
> in more detail. (And that goes for any of my interpretations of the
> spec!)

Again, the spec specifies what must be accepted.  It does not ban
extensions.  Postgres already implements the useful extension of
accepting complicated expressions in a group by.  It could implement
more.

> One place you and I differ is that I think you'd be in favor of many
> more extensions in cases where the SQL spec doesn't explicitly say
> "Thou shalt not do X". I'm a bit more conservative here: in my
> opinion, if the SQL speaks to a particular issue (for example, a
> WHERE clause), then you should pretty much keep to the letter of what
> it says. Extensions are more for areas where the standard is silent.

We definitely differ.  To my eyes an extension that is well defined
and clearly useful which is currently illegal deserves consideration.
Utility is not the only concern, of course, you have to think about
how well it fits with other features, its likelyhood of breaking in
the future, etc.

Inferring the group by statement that is likely to be meant the vast
majority of the time is clearly useful.  It would get rid of a lot of
useless redundancy in a lot of people's queries.  It may be that it is
deemed too much work or too risky to implement.

> For example, full text search will be added to PostgreSQL in 8.3.
> AFAIK, the SQL spec doesn't talk about full text search at all, so
> this is an area where extensions are a natural fit. Others, such as
> PostgreSQL's DISTINCT ON, is a much more difficult issue, and I'd
> suspect if it were proposed as a new feature now it would be
> rejected. However, it's been a part of PostgreSQL for a long time, so
> for historical reasons it remains. Likewise, I doubt "text" as a
> synonym for "varchar" would be accepted.

If we had this conversation in 15 years, I'm willing to bet that full
text search would be in the spec and you'd be able to complain that
something about how it was implemented was not quite standard but
would have to remain for historical reasons...

> Anyway, enough excursions into SpecLand for me. I'd welcome anyone
> else taking a crack at it.

Cheers,
Ben


pgsql-hackers by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: SQL feature requests
Next
From: Andrew Dunstan
Date:
Subject: Re: SQL feature requests