Re: BUG #4465: GROUP BY is not to SQL standard - Mailing list pgsql-bugs

From Craig Ringer
Subject Re: BUG #4465: GROUP BY is not to SQL standard
Date
Msg-id 48F6B8E1.7070200@postnewspapers.com.au
Whole thread Raw
In response to Re: BUG #4465: GROUP BY is not to SQL standard  ("Tony Marston" <tony@marston-home.demon.co.uk>)
List pgsql-bugs
[Note: I'm *not* an expert in the SQL standard, but I might be able to
help clarify a misunderstanding or two in the discussion so far].

Tony Marston wrote:
> Support for functional dependencies is not a feature that can be turned off
> in any database engine. Dependencies, whether they are functional,
> transitive, multi-valued or join dependencies, are inherent in the database
> design.

Well, my understanding is that they're a core part of relational theory,
anyway. This doesn't necessarily make them "inherent in the database
design" - in that AFAIK no computer database implementation fully
implements the correct and complete relational calculus.

In a sense, all relational database implementations can be considered
"incorrect" - they can all produce results that wouldn't be permitted by
a strict relational implementation. On the upside, they actually produce
the results this century, and correctness can still be assured by manual
locking in critical situations.

If there really is a strict and pure relational implementation, I'd be
curious to know about it. Does it work? Does it work in the real world?

> It is therefore nonsense to say that support for functional
> dependencies is optional.

Again, that's true for the theory, but practical implementations may not
achieve theoretical perfection. SQL is a standard describing
requirements and optional features fo a practial implementation of a
relational database engine, not a description of relational theory.

> Saying that the SQL standard and Relational Theory are unconnected is
> complete misdirection. The SQL standard is surely there to define how
> Relational Theory can/should be implemented.

You have used the word "surely" - in this case, as conjecture. People
here have already told you that the SQL standard does NOT describe a
strict implementation of relational theory, and that an SQL-confirming
implementation may lack features like predicate locking, identification
of functional dependencies in result sets, etc.

> A database cannot be classed as
> "relational" if it does not support standard SQL, therefore the two must go
> hand in hand.

Er, no. Support for standard SQL does not imply a strict implementation
of relational theory (ie being "classed as relational"), as the SQL
standard does not actually describe a strict implementation of
relational theory. I don't know enough about the standard to be sure,
but I wouldn't be too surprised if a fully conformant SQL implementation
could *not* strictly implement relational theory.

> The simple fact is that it is only in the 1992 standard that it states that
> ALL columns in the SELECT clause must be identified in the GROUP BY clause.
> In all subsequent standards it has been permissible to omit any column from
> the GROUP BY clause if it is functionally dependent on any other column in
> the GROUP BY clause. All you can do is point to paragraphs which are NOT
> contained in the definition of the GROUP BY clause and say "this is our get
> out". Now you wonder why I refer to your arguments as "weasel words".

What's been said in response to your initial post all looks pretty
reasonable to me, and I see attempts to explain the situation rather
than "weasel".

Standards aren't generally the perfect ideals we might want them to be,
and the SQL standard is full of optional features, practical
compromises, and so on. As Peter Eisentraut explained in reply to your
initial post, PostgreSQL does not support one of the optional features
in the SQL-99/SQL-2003 standard, namely T301, and retains SQL-92
behaviour in this area. That's perfectly acceptable according to the
standard.

The definition of functional dependencies in T301 that you are arguing
about is a definition in the SQL standard that PostgreSQL attempts to
follow, not a definition made up by the PostgreSQL developers. If you
don't like the definition, you'll need to take that up with the SQL
standards body. If it contradicts relational theory, then that's either
a practical compromise or an error made by the SQL standards body.

It would be nice if PostgreSQL did support that feature. However, it can
be compliant with the SQL standard without it, as it's an optional
feature. (In fact I don't know if PostgreSQL *is* fully compliant with
any of the SQL standards, but that's another issue).

Arguably no SQL database should be described as "relational" - but in
practice, SQL describes a good and workable approximation of relational
theory that works in the real world and is useful to describe as
"relational" to distinguish it from other, completely different,
database designs.

--
Craig Ringer

pgsql-bugs by date:

Previous
From: Rodriguez Fernando
Date:
Subject: Re: Postgres database problem
Next
From: Arthur Knight Hammer
Date:
Subject: 8.2.10 pg_ctl restart will stop but not start server