Thread: Thus spoke SQL3 (on OO)

Thus spoke SQL3 (on OO)

From
Peter Eisentraut
Date:
Okay, here's the ugly and dirty truth. Before you complain, please keep in
mind that I didn't write the standard, although I tend to give it the
benefit of the doubt after ignoring the truly silly stuff. :)


* How to create inheritance hierarchies

** CREATE TABLE syntax

Ignoring the parts that do no not pertain to inheritance, the CREATE
TABLE syntax in SQL3 looks like this:

CREATE TABLE <table name> { <table element list> | <subtable clause> }

where <table element list> is the usual `(colname type, colname type,
...)' variety and <subtable clause> looks like this:

UNDER <supertable>, <supertable>, ...

or optionally like

UNDER <supertable> WITH (<old colname> AS <new colname>, <old
colname2> AS <new colname2>, ...), ...

But notice how the syntax { <table element list> | <subtable clause> }
would force you to either create new columns in the table or inherit
all columns from one or more supertables. That evidently cannot be
right. Reinforcing this believe is that the standard at several places
talks about "inherited columns" vs "originally defined columns", which
would of course not be possible under this scheme. Let's therefore
assume that the syntax should really be more something like this:

CREATE TABLE <table name> <table element list> <subtable clause>
| CREATE TABLE <table name> <table element list>
| CREATE TABLE <table name> <subtable clause>

This is really not any different from the current INHERITS syntax;
perhaps in a fit of purity someone is willing to complement it
accordingly. One key element here that is new is the column renaming
option.

** Column naming and ordering

The ordering of the columns has apparently been a problem even for the
authors of the standard. The rules for CREATE TABLE merely say that
the columns are ordered according the order in which the supertables
are listed in the the UNDER clause. It does not say anything about
whether the originally defined columns come before or after the
inherited ones.

This does not make the issue of adding columns easier. The rules say:

"If [the table being altered] is a supertable, then an <add column
definition>, without further Access Rule checking, is effectively
performed for each of its subtables, thereby adding the column as an
inherited column in these subtables."

This would make some sort of sense if the originally defined columns
came before the inherited ones, but the way it stands it doesn't help
a lot.

The resolution of names is done as follows. First, a list of all
inherited columns is created. If one column is "replicated", that is,
more than one supertable inherited it from the same super-supertable,
all but the first occurrence is dropped. Then the column renaming
clauses are applied. The resulting list must not contain a duplicate
column name.

This scheme is quite different from the current PostgreSQL
implementation, which merges columns of equal name and datatype. It
has been mentioned before during OO discussions that the association
of inherited columns by name alone should probably be dropped and
replaced by pg_attribute.oid references. This would seem like a good
thing to do because it would allow us to detect replicated columns
reliably and give a chance to the column renaming option.

** OID, Identity, et al.

"An object identifier OID is a value generated when an object is
created, to give that object an immutable identity. It is unique in
the known universe of objects that are instances of abstract data
types, and is conceptually separate from the value, or state, of the
instance."

Since the way I understand it a table definition also defines an
abstract data type in some way or other, and rows are instantiations
of that data type, this definition of OID matches ours pretty good.

"The OID value is materialized as a character string with an
implementation-defined length and character set SQL_TEXT."

... or maybe not. :-)

What exactly IDENTITY is is still a bit unclear to me but it is
definitely not the proposed identification of the table a row came
from. The implicit column IDENTITY contains a `row identifier'.

"The value of a row identifier for a given base table row is equal to
itself and it not equal to the value of a row identifier for any other
base table row within the database."

(Note: `base table' is the opposite of `derived table' (a view), and
is unrelated to whether a table is a sub- or supertable.)

There is no literal for row identifiers and they do not interact with
other data types. The only manifestation is through the API, where
they are mapped to unique row "handles" in a language specific
fashion.

Not every table has row identifiers, you must ask for them when
creating the table. This all relates to inheritance because

"A row identifier is implicitly defined for [the table to be created].
For every table ST named in the <subtable clause>, a row identifier is
implicitly defined for ST."

Row identifiers were ANSI-only at the time of the draft, ISO simply
says that any supertable must have a primary key. I can't quite put my
finger on either of these requirements, though.

In any case I'd advise against overloading IDENTITY in the manner that
was proposed.

* Cloning

One thing that often comes up in `various ways of looking at
inheritance' threads is the idea of cloning the definition of a given
table as part of a newly created table. There's a syntax for that as
well in SQL3:

CREATE TABLE name (   colname type,   colname type,   LIKE other_table,   colname type,   ...
);

This effectively pastes whatever you wrote when you created the
"other_table" at the place of the LIKE. After the create table is
done, the new and the "old" table are completely unrelated. Of course
if you want to clone the data as well in one shot you could use CREATE
TABLE AS. In any case, this has really very little to do with the
inheritance we're discussing here, other than that it `feels' the
same.

* Operating on data

** SELECT

To make a long story short: subtables are automatically examined,
unless ONLY is specified.

To make the story longer, phrasing and viewing it like this is really
quite incorrect. Instead:

"Any row of a subtable must correspond to one and only one row of each
direct supertable. Any row of a supertable corresponds to at most one
row of a direct subtable."

The key word I see here is `correspondence', namely that a given row
is always a member of both the sub- and the supertable (probably
having more columns in the subtable obviously) and doesn't belong to
either of them more then to the other. In other words, the row is
conceptually shared. Then what the ONLY table reference really does is
select all rows of a supertable that do not have any corresponding row
in any subtable. (This is the wording the standard chose.)
Implementation-wise this would be the easier thing to do (which is
probably why it's done this way now), but conceptually it is really
the unnatural situation because it's similar to an `except' query.

** Insert, Update, Delete

These commands have no special notion of inheritance. Since all rows
are effectively shared between sub- and supertables you cannot update
them in one of them "only" without some copy-on-write concept. Of
course some rows in a supertable may have no corresponding rows in any
subtable, but that's nothing the row knows about or cares about. In
sophisticated inheritance hierarchies, rows and parts of rows may be
shared in very involved ways, so I foresee some issues with Update
Only.

(This sounds stranger than it really is: There is no requirement that
the `corresponding row' is physically stored at the supertable, it is
only required that it effectively exists there as well, which is
satisfied if SELECT retrieves it by default. In some sense the
apparently `favoured' storage model here is that all the inherited
attributes and their values are stored in the supertable heap and only
the originally defined attributes of subtables are in the subtable
heap. This method favours the SELECT semantics on supertables because
it doesn't have to worry about subclasses at all. But queries on
subtables effectively become joins.)


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: Thus spoke SQL3 (on OO)

From
Chris
Date:
Peter Eisentraut wrote:

Excellent research Peter....


> Let's therefore
> assume that the syntax should really be more something like this:
> 
> CREATE TABLE <table name> <table element list> <subtable clause>
> | CREATE TABLE <table name> <table element list>
> | CREATE TABLE <table name> <subtable clause>

I agree.

> ** Column naming and ordering
<snip> 
> This would make some sort of sense if the originally defined columns
> came before the inherited ones, but the way it stands it doesn't help
> a lot.

Yes, super-class first seems reasonable.

> The resolution of names is done as follows. First, a list of all
> inherited columns is created. If one column is "replicated", that is,
> more than one supertable inherited it from the same super-supertable,
> all but the first occurrence is dropped. Then the column renaming
> clauses are applied. The resulting list must not contain a duplicate
> column name.
> 
> This scheme is quite different from the current PostgreSQL
> implementation, which merges columns of equal name and datatype.

In the absense of renaming, I don't think this amounts to anything
different to PostgreSQL.

> It
> has been mentioned before during OO discussions that the association
> of inherited columns by name alone should probably be dropped and
> replaced by pg_attribute.oid references. This would seem like a good
> thing to do because it would allow us to detect replicated columns
> reliably and give a chance to the column renaming option.

That would mean I guess sharing pg_attributes between different classes
in a hierarchy, which I assume doesn't happen now. But sounds good.

> "The OID value is materialized as a character string with an
> implementation-defined length and character set SQL_TEXT."
> 
> ... or maybe not. :-)

How does character set affect this? It is different to data type isn't
it?

> What exactly IDENTITY is is still a bit unclear to me but it is
> definitely not the proposed identification of the table a row came
> from. 

How do you know?

> * Cloning
> CREATE TABLE name (
>     colname type,
>     colname type,
>     LIKE other_table,
>     colname type,
>     ...
> );

Hmm. Fairly useless feature IMO.

> ** SELECT
> 
> To make a long story short: subtables are automatically examined,
> unless ONLY is specified.
> 
> To make the story longer, phrasing and viewing it like this is really
> quite incorrect. Instead:
> 
> "Any row of a subtable must correspond to one and only one row of each
> direct supertable. Any row of a supertable corresponds to at most one
> row of a direct subtable."

They've chosen this model to describe how things work. Unless there is
some subtlety I'm missing the model can equally be described by the
"subtables are automatically examined" model. Maybe they thought it was
easier to describe in those terms (I don't, I think it's lame), but it
shouldn't affect implementation. In particular I think implementing it
this way would be about as silly a thing I've ever heard. pgsql has it
right.

> The key word I see here is `correspondence', namely that a given row
> is always a member of both the sub- and the supertable (probably
> having more columns in the subtable obviously) and doesn't belong to
> either of them more then to the other. In other words, the row is
> conceptually shared. Then what the ONLY table reference really does is
> select all rows of a supertable that do not have any corresponding row
> in any subtable. (This is the wording the standard chose.)
> Implementation-wise this would be the easier thing to do (which is
> probably why it's done this way now), 

Umm no, it is not the way it is done now (in pgsql). When ONLY is
specified (or rather when "*" is not specified in current pgsql syntax),
it just queries the base class table. It doesn't check "rows of a super
table that do not have any corresponding row in any subtable". The
subtable just doesn't come into it.

If it were implemented that way, then a complex inheritance hierarchy
could result in a join across a dozen tables. Avoiding that is well
worth having to examine several tables in a query. (Most of the time
anyway). Put another way, a UNION is much cheaper than a join.

> ** Insert, Update, Delete
> 
> These commands have no special notion of inheritance. Since all rows
> are effectively shared between sub- and supertables you cannot update
> them in one of them "only" without some copy-on-write concept.

ONLY in the case of update and insert would refer to any row which is
does not correspond to any row in a sub-table. Boy I hate talking in
terms of this model, because it's really lame.

I suspect even in this lame model, it doesn't imply copy-on-write. It
just means delete cascades to sub-class tables, delete only doesn't need
to and update ignores sub-classes and update only updates only when it
doesn't correspond to sub-class tuples.

Back to the real world however and it means delete ONLY doesn't go off
searching subclasses and update ONLY doesn't go off searching
sub-classes.

Of course insert isn't affected for the same reason C++ constructors are
not polymorphic.

> Of
> course some rows in a supertable may have no corresponding rows in any
> subtable, but that's nothing the row knows about or cares about. In
> sophisticated inheritance hierarchies, rows and parts of rows may be
> shared in very involved ways, so I foresee some issues with Update
> Only.

Which is why it is totally insane to implement it that way. The way
postgres implements it now is very simple and works.

> (This sounds stranger than it really is: There is no requirement that
> the `corresponding row' is physically stored at the supertable, it is
> only required that it effectively exists there as well, which is
> satisfied if SELECT retrieves it by default.
> In some sense the
> apparently `favoured' storage model here is that all the inherited
> attributes and their values are stored in the supertable heap and only
> the originally defined attributes of subtables are in the subtable
> heap. This method favours the SELECT semantics on supertables because
> it doesn't have to worry about subclasses at all. But queries on
> subtables effectively become joins.)

Boy. I think we should  look to how other people have implemented object
models rather than how SQL3 describes the concept. This sounds like a
nightmare.


Re: Thus spoke SQL3 (on OO)

From
"Robert B. Easter"
Date:
On Sat, 20 May 2000, Peter Eisentraut wrote:
> 
> UNDER <supertable>, <supertable>, ...

The standard is very confusing,  I am probably wrong, but I didn't see the
syntax for allowing more than one <supertable clause> after the UNDER keyword.

Did Oracle approve ANSI-ISO-9075?  I didn't notice them listed anywhere.  9075
is the SQL3 standard right?  (or else I'm reading the wrong stuff!!)  :)
-- 
Robert B. Easter
reaster@comptechnews.com


Re: Thus spoke SQL3 (on OO)

From
"Robert B. Easter"
Date:
On Sat, 20 May 2000, Chris wrote:
> Boy. I think we should  look to how other people have implemented object
> models rather than how SQL3 describes the concept. This sounds like a
> nightmare.

SQL3 does not appear to really have an object model.  Rather, it appears to be
a hierarchial model on top of the relational model.  Section 4.16.2 (in 9075-2)
"Referenceable tables, subtables, and supertables" talks about "a leaf table",
if that clarifies anything.

Here is a quote (9075-2 4.16):
        Let T(a) be a maximal supertable and T be a subtable of T(a). The        set of all subtables of T(a) (which
includesT(a) itself) is called        the subtable family of T or (equivalently) of T(a). Every subtable        family
hasexactly one maximal supertable.
 
        A leaf table is a table that does not have any proper subtables.


Definitions (my interpretations anyway):
* Every table is a subtable and supertable of itself.

* A proper subtable (of a supertable) is a table that was CREATEd with an UNDER
clause that references its supertable - i.e, a proper subtable is just a
subtable that is not the supertable itself.

* A proper supertable (of a subtable) is a table that was specified in an
UNDER clause during creation of a subtable.

* A maximal supertable is a table that is not a subtable of any other table.


So, it says that every subtable family (or member of) has exactly one maximal
(root) supertable. This seems to make clear that multiple inheritance is not
allowed.  The picture of this hierarchy is inverted trees with the roots
at maximal supertables with subtables branching down, EXTENDing the
supertable.

Another quote (9075-2 4.16):
        Users must have the UNDER privilege on a table before they can use        the table in a subtable definition. A
tablecan have more than one        proper subtable. Similarly, a table can have more than one proper
supertable.


Ok, it can have more than one (proper) supertable.  This means that a chain
of inheritance is allowed: maximal supertable -> subtable1 -> (sub)subtable2
etc, where (sub)subtable2 has two supertables: maximal supertable and subtable1.

Only one table can be specified in the UNDER clause, which prevents the
following possibility:

(1)    subtable_a UNDER maximal_supertable
(2)    subtable_b UNDER maximal_supertable
(3)    subtable_abc UNDER subtable_a, subtable_b

(3) is not allowed, but if it where, then subtable_abc would still have had only one
maximal supertable.  If allowed, it would have inherited maximal supertable
twice.

Another quote (9075-2 4.16):
        The secondary effects of table updating operations on T on proper        supertables and subtables of T are as
follows:
        -  When row R is deleted from T, for every table ST that is a           proper supertable or proper subtable of
T,the corresponding           superrow or subrow SR of R in ST is deleted from ST.
 
        -  When row R is replaced in T, for every table ST that is a proper           supertable or a proper subtable
ofT the corresponding superrow           or subrow SR of R in ST is replaced in ST.
 
        -  When row R is inserted into T, for every proper supertable ST of           T the corresponding superrow SR
ofR is inserted into ST.
 


These effects describe a sharing of properties (columns) among the super and
subtables.  A row in a supertable may be part of a single row in 0 or 1 of its
subtables (if I got it right) - a 1:1 relationship if any.  The subtable and
supertable are linked together in the tree hierarchy and are not independent
after creation.  The subtable extends additional attributes onto the supertable.

Summing this up a little now, SQL3's UNDER clause appears to allow an EXTENDS
type of inheritance, which is like a hierarchial (tree) model.  It does not have
a general-pupose object-oriented capability.  It does not provide for the
CLONE and ASSIMILATE types of inheritance that I decribed in an earlier message
to this list.  As other messages have stated, UNDER is not too different than
what INHERITS currently does.  Actually, INHERITS allows multiple inheritance
too, so it does more right now (I guess).

Since INHERITS, as it is implemented now, is like SQL3's UNDER, maybe it should
NOT allow multiple inheritance and should strive to become UNDER if SQL3 is a
good idea.

If the other object-oriented methods, like CLONES and ASSIMILATES (or whatever
you want to call them), is ever wanted in PostgreSQL, then looks like some other
standard(s) will have to be drawn from.  I have not looked at the ODMG 3.0
(standard) yet.  But maybe it has the missing capabilities.  Is ODMG 3.0 an
international standard?  I'd like to just download it and read it, but looks
like you have to buy it for $39.95.

I hope my comments are useful. :)

-- 
Robert B. Easter
reaster@comptechnews.com


Re: Thus spoke SQL3 (on OO)

From
Chris
Date:
"Robert B. Easter" wrote:

> SQL3 does not appear to really have an object model.  Rather, it 
> appears to be a hierarchial model on top of the relational model. 

It seems like it amounts to the same thing to me. A bit like me saying
"A circle is the set of points equi-distant from a point", and someone
else arguing "No, a circle is the graph of points represented by the
formula x^2 + y^2 = n". At the end of the day they amount to the same
thing I think.

The other thing is that some SQL3 statements seem to revert to the
object model - 
"If [the table being altered] is a supertable, then an <add column
definition>, without further Access Rule checking, is effectively
performed for each of its subtables, thereby adding the column as an
inherited column in these subtables."


> So, it says that every subtable family (or member of) has exactly one 
> maximal (root) supertable. This seems to make clear that multiple 
> inheritance is not allowed.  The picture of this hierarchy is inverted 
> trees with the roots at maximal supertables with subtables branching 
> down, EXTENDing the supertable.

"Effectively, components of all direct supertype representations are
copied to the subtype's representation with same name and data type. To
avoid name clashes, a subtype can rename selected components of the
representation inherited from its direct supertypes"

Notice it says "all direct supertype", which says to me you can multiple
_direct_ supertypes. Also note the "name clashes". How can you have name
clashes without multiple inheritance?

> Only one table can be specified in the UNDER clause, which prevents the
> following possibility:
> 
> (1)     subtable_a UNDER maximal_supertable
> (2)     subtable_b UNDER maximal_supertable
> (3)     subtable_abc UNDER subtable_a, subtable_b
> 
> (3) is not allowed, but if it where, then subtable_abc would still have 
> had only one maximal supertable.  If allowed, it would have inherited 
> maximal supertable twice.

If allowed, it doesn't mean it would inherit "maximal_supertable" twice.
It would have inherited it once through two routes. Like virtual
inheritance in C++. It some cases it could mean though that there is not
one maximal supertable though. If A inherits from B and C, you can't say
which is the maximal supertable. Don't know what those guys were
smoking, but whatever it is I want some.

> These effects describe a sharing of properties (columns) among the super and
> subtables.  A row in a supertable may be part of a single row in 0 or 1 of its
> subtables (if I got it right) - a 1:1 relationship if any.  The subtable and
> supertable are linked together in the tree hierarchy and are not independent
> after creation.  The subtable extends additional attributes onto the supertable.
i.e. The object model expressed in a convoluted way?


> Since INHERITS, as it is implemented now, is like SQL3's UNDER, maybe 
> it should NOT allow multiple inheritance and should strive to become 
> UNDER if SQL3 is a good idea.

Renaming it UNDER might be ok. Breaking multiple inheritance would be
pretty silly, even if this is what SQL3 says (which I doubt).

> If the other object-oriented methods, like CLONES and ASSIMILATES (or whatever
> you want to call them), is ever wanted in PostgreSQL, then looks like some other
> standard(s) will have to be drawn from.  I have not looked at the ODMG 3.0
> (standard) yet.  But maybe it has the missing capabilities.  Is ODMG 3.0 an
> international standard?  I'd like to just download it and read it, but 
> looks like you have to buy it for $39.95.

The best way to get an overview of ODMG is probably do go to the poet
database web site and download their documentation. I say poet because
they are one of the few with an OQL implementation. But ODMG is not so
much focused on query language and you could go to other ODBMS web sites
like Versant and look at documentation for the interfaces.


Re: Thus spoke SQL3 (on OO)

From
"Robert B. Easter"
Date:
On Sun, 21 May 2000, Chris wrote:
> "Robert B. Easter" wrote:
> 
> > SQL3 does not appear to really have an object model.  Rather, it 
> > appears to be a hierarchial model on top of the relational model. 
> 
> It seems like it amounts to the same thing to me. A bit like me saying
> "A circle is the set of points equi-distant from a point", and someone
> else arguing "No, a circle is the graph of points represented by the
> formula x^2 + y^2 = n". At the end of the day they amount to the same
> thing I think.

I guess the major difference is that the hierarchial-model does not support
multiple inheritance.  Again it is basically a tree going from one parent
branch to many children branches where the children basically ARE the parent
table, just adding some more column leaves so to speak (adding nodes to a
tree, its still all one tree). Object-oriented allows both one parent to yield
many children and many parents to combine to yield a single child.  The
instances are not dependent on each other,  just the declarations are so that
you can delete a parent instance and it has no effect on a child since they are
not part of a data tree together.  A tree generally never allows two branches to
merge into a single branch.  Something like that.  OO lets you do more
without the instances having to be part of a dependent data tree.

I was thinking that maybe this hierarchial model over relational-model in SQL3
(as I see it) was designed that way to allow easier transitions of legacy
hierarchy databases to the new SQL3 relational systems.

I'm no OO expert, and again I may have this ALL wrong!  But, I don't see the
OO features of C++ being comparable to OO in databases all the time.  C++
generally uses only the CLONES type of inheritance, is procedural, and
allows a derived class to be passed anywhere the parent might normally be
passed.

What follows is my attempt to compare C++ OO and database OO:   In C++, a
function programmed to take a parent class as input, is only programmed to
use/access the attributes that parent has.  If you pass a derived class, the
function still only uses the attributes that the parent has too.  I think it is
abusing C++ if a function that takes a parent arg but is also aware of derived
classes in advance and does a test to see what is being passed.  The idea of C++
inheritance was that you could make parent, and functions that use parent, then
later someday derive a child from parent that you never thought you'd need. 
Then, the functions that work on parent still do their thing even on the child. 
A function that is programmed for a parent class and that has advance knowledge
of some derived class is not good OOP.  Passing different row types to the
client from one select, forces that client to be like the C++ function
programmed in advance to deal with some derived class too.  The SQL declaration
"SELECT * FROM parent" is like the C++ function declaration "void
useparent(parent *p)". Both only know about parent objects in the definition
(what uses the data obtained).  The definitions of what to do with the data
lies inside the client that issues the SQL declaration, and within the C++
function definition, respectively.  Both should only be expected to understand
how to process what they declare.  Sending back unpredictable numbers of and
types of columns might break the procedural/definition part that is outside
SQL's declaritive domain which is to precisely declare what data to get.  I
think that single-type rows should still be returned from selects.  Its the
relational way too, and the database still is a relational database.  Returning
the additional child columns just seems to be a waste of processing and
bandwidth when selecting parent.  C++ will just send a pointer, so there is no
penalty, but in the database, there is a speed penalty for sending those other
columns that the parent doesn't have.

Another, more difficult point about why not to send variable row types, has to
do with the ISA relationship in inheritance.  Child ISA parent.  Parent is not
necessarily a child.  A child can be used anywhere a parent can be used.  A
parent cannot be used anywhere a child can.  By sending the differing row
types, the procedure that processes the rows might end up expecting child rows
more than parent rows, even though you are using SELECT * FROM parent. The
different rows are processed differently.   If programmers become accoustomed
to obtaining the child-type rows by selecting parent, they might eventually
mistake a parent to be ISA child when one is received.  Say that mostly
child-type rows are returned by SELECT * FROM parent,  only an occasional
parent-type actually comes through.  A procedure might sample the input (or a
human might) and decide that the procedure should be different since it looks
like only child-type rows are returning.  Then, when a parent-type appears, it
is processed wrong.  Maybe this argument is weak.  Comments?  :-)

Robert B. Easter


Re: Thus spoke SQL3 (on OO)

From
Chris Bitmead
Date:
"Robert B. Easter" wrote:

> I guess the major difference is that the hierarchial-model does not 
> support multiple inheritance.  

I don't agree. From SQL3...

"To avoid name clashes, a subtype can rename selected components of the
representation inherited from its direct supertypes."

and if that doesn't clinch it...

"Let the term replicated column mean a column appearing in more than one
direct supertable of T that is inherited by at least one of those direct
supertables from the same column of a single higher-level supertable."

That sounds like multiple repeated inheritance to me.

> Passing different row types to the
> client from one select, forces that client to be like the C++ function
> programmed in advance to deal with some derived class too. 

You are assuming that the client application will be responsible for
dealing with these differences. What really happens is that a query is
more like a List<Baseclass> in C++. As long as you only call methods
contained in Baseclass on each element of the List, you are ok.

But those "virtual" methods you call need real objects to work with.
They need ALL the attributes in other words. The piece of language
infrastructure that behind the scenes instantiates all the C++ objects
as they fall out of the database can't create abstract Baseclass
objects. It needs all the attributes to instantiate Subclass objects, so
that the application code needn't know about different classes.

I suggest you download an evaluation copy of an ODBMS and have a play,
it will probably become clear.

> By sending the differing row
> types, the procedure that processes the rows might end up expecting child rows
> more than parent rows, even though you are using SELECT * FROM parent. The
> different rows are processed differently.   If programmers become accoustomed
> to obtaining the child-type rows by selecting parent, they might eventually
> mistake a parent to be ISA child when one is received.  

Whether a programmer is likely to make such a mistake depends more on
the programming language used. In an ODBMS situation almost all object
retrievals are not via an explicit query, but rather by object
navigation. You might have

class Purchase {   Link<Customer> buyer;   List<StockItem> items;
}

Result<Purchase> r = Query<Purchase>::select("select * from purchase");
Iterator<Purchase> i = r.iterator();
while (i.hasNext()) {  Purchase *p = i.next();  Customer *c = p.buyer;  Iterator<StockItem> = p.items.iterator();  //
etc.
}

Any one of Purchase, Customer or StockItem might really be some
sub-class of those classes for all the application knows. But the behind
the scenes infrastructure needs to have all the attributes so that the
application code need not know.


Re: Thus spoke SQL3 (on OO)

From
"Robert B. Easter"
Date:
On Sun, 21 May 2000, Chris Bitmead wrote:
> "Robert B. Easter" wrote:
> 
> > I guess the major difference is that the hierarchial-model does not 
> > support multiple inheritance.  
> 
> I don't agree. From SQL3...
> 
> "To avoid name clashes, a subtype can rename selected components of the
> representation inherited from its direct supertypes."
> 
> and if that doesn't clinch it...
> 
> "Let the term replicated column mean a column appearing in more than one
> direct supertable of T that is inherited by at least one of those direct
> supertables from the same column of a single higher-level supertable."
> 
> That sounds like multiple repeated inheritance to me.
> 

What is the date on the copy of the SQL/Foundation you are reading?  My copy is
dated September 23, 1999 ISO/IEC 9075-2 SQL3_ISO.  I tried searching for the
quotes above and could not find them.  Do I have the correct version?


-- 
Robert B. Easter
reaster@comptechnews.com


Re: Thus spoke SQL3 (on OO)

From
Chris Bitmead
Date:
It is from 
ftp://gatekeeper.dec.com/pub/standards/sql
and dated 1994. Is there something more recent?

> What is the date on the copy of the SQL/Foundation you are reading?  My copy is
> dated September 23, 1999 ISO/IEC 9075-2 SQL3_ISO.  I tried searching for the
> quotes above and could not find them.  Do I have the correct version?


Re: Thus spoke SQL3 (on OO)

From
Hannu Krosing
Date:
Chris wrote:
> 
> 
> > What exactly IDENTITY is is still a bit unclear to me but it is
> > definitely not the proposed identification of the table a row came
> > from.
> 
> How do you know?
> 
> > * Cloning
> > CREATE TABLE name (
> >     colname type,
> >     colname type,
> >     LIKE other_table,
> >     colname type,
> >     ...
> > );
> 
> Hmm. Fairly useless feature IMO.

The main use would be for those users who are using INHERITS with
current
PostgreSQL and need to port from it.

---------
Hannu


Re: Thus spoke SQL3 (on OO)

From
Hannu Krosing
Date:
Chris Bitmead wrote:
> 
> "Robert B. Easter" wrote:
> 
> > I guess the major difference is that the hierarchial-model does not
> > support multiple inheritance.
> 
> I don't agree. From SQL3...
> 
> "To avoid name clashes, a subtype can rename selected components of the
> representation inherited from its direct supertypes."
> 
> and if that doesn't clinch it...

Chris, what is your position on having a single primary key for all 
inherited columns ?

It seems right for single inheritance (tree-like), but generally 
impossible for multiple inheritance, unless we will allow multiple
"primary" keys (which we could allow anyhow, as they seem useful even in 
several non-OO situations). For purity we could set the syntax to be 
ALTERNATE KEY or ALTERNATE PRIMARY KEY, but they would really be 
still primary keys ;) 

> > Passing different row types to the
> > client from one select, forces that client to be like the C++ function
> > programmed in advance to deal with some derived class too.
> 
> You are assuming that the client application will be responsible for
> dealing with these differences. What really happens is that a query is
> more like a List<Baseclass> in C++. As long as you only call methods
> contained in Baseclass on each element of the List, you are ok.

For more dynamic client languages you could even first ask each object 
to enumerate methods it knows about and then perhaps make a separate
menu 
(combobox) from them for client to choose from for each instance.

------------
Hannu


Re: Thus spoke SQL3 (on OO)

From
"Robert B. Easter"
Date:
On Sun, 21 May 2000, Chris Bitmead wrote:
> It is from 
> ftp://gatekeeper.dec.com/pub/standards/sql
> and dated 1994. Is there something more recent?

I believe so!  1994 is an old draft.  From what I understand, SQL3 is an
official ISO standard as of sometime back in 1999.  It may be that the
official standard cut out the things you quoted.

Try downloading the stuff at:
ftp://jerry.ece.umassd.edu/isowg3/x3h2/Standards/

> 
> > What is the date on the copy of the SQL/Foundation you are reading?  My copy is
> > dated September 23, 1999 ISO/IEC 9075-2 SQL3_ISO.  I tried searching for the
> > quotes above and could not find them.  Do I have the correct version?
-- 
Robert B. Easter
reaster@comptechnews.com


Re: Thus spoke SQL3 (on OO)

From
Chris Bitmead
Date:
"Robert B. Easter" wrote:
> 
> On Sun, 21 May 2000, Chris Bitmead wrote:
> > It is from
> > ftp://gatekeeper.dec.com/pub/standards/sql
> > and dated 1994. Is there something more recent?
> 
> I believe so!  1994 is an old draft.  From what I understand, SQL3 is an
> official ISO standard as of sometime back in 1999.  It may be that the
> official standard cut out the things you quoted.
> 
> Try downloading the stuff at:
> ftp://jerry.ece.umassd.edu/isowg3/x3h2/Standards/

Oh I see you are right. The latest draft has removed multiple
inheritance. I wonder why, the 1994 draft for multiple inheritance
actually looked ok. Maybe they couldn't agree on details and wanted to
get it out the door? Pretty sad decicision if you ask me. Or maybe when
they went back in 1999, they couldn't figure out their own 1994 document
any more :-).


Re: Thus spoke SQL3 (on OO)

From
"Robert B. Easter"
Date:
On Sun, 21 May 2000, Chris Bitmead wrote:
> Hannu Krosing wrote:
> 
> > Chris, what is your position on having a single primary key for all
> > inherited columns ?
> 
> What is the significance of a primary key compared to any old unique
> key?

For referential integrity, the REFERENCES or FOREIGN KEY clauses specify a
table name.  That table is expected to have one and only one PRIMARY KEY.  You
can't select which unque column you want to reference within a table - it must
be the one and only PRIMARY KEY.

I hope this is answering the question. :)

Multiple inheritance and referential integrity are a complex mix.  It becomes
hard for the database to maintain data integrity and uphold the relational
model that is based on functional dependency where one set of atomic values
are determined by a key or composite key containing attributes that are not
functionally dependent on each other.  With multiple inheritance, it is easy to
end up with two separate keys determining the same data, which is a conflict if
they are not a composite key.  Something like that.

The EXTENDS type of inheritance is single-inheritance compatible.  This is what
I think SQL3 is allowing.  It allows you to make a hierarchy tree out of
tables.  Only one primary key can possibly be inherited.  A subtable is
forbidden from specifying its own primary key - it must inherit one.

The CLONES and ASSIMILATES stuff that I mentioned before, would require some
restrictions to ensure they don't break the relational model data intergrity
enforcement infrastructure (primary keys/foreign keys etc).  For example, to
multiple inherit, it would maybe be required that the inherited table have an
inherited primary key consisting of the composite of all inherited keys.  If it
inherits no primary key, then it is free to specify one for itself.  But
remember, that CLONE just branches off from its parents,
who are not connected to child so that eases things a litte for that case. 
ASSIMILATES is more complicated, and is not possible to compare it even with
anything you can do in a programming language since a child can exist after a
parent class has been dropped.  I'd have to think about CLONES and ASSIMILATES
more since they multiple inherit.


I looks like SQL3 has taken care of the EXTENDS type for us.

Attached is a diagram of the way UNDER appears to work in SQL3.  The second
gif is a rough idea of how I think INHERITS/CLONES might work.  The 3rd pic is
about the assimilate inheritance idea, which is half baked but maybe somwhat
interesting.

-- 
Robert B. Easter
reaster@comptechnews.com

Re: Thus spoke SQL3 (on OO)

From
Hannu Krosing
Date:
Chris Bitmead wrote:
> 
> Hannu Krosing wrote:
> 
> > Chris, what is your position on having a single primary key for all
> > inherited columns ?
> 
> What is the significance of a primary key compared to any old unique
> key?

I don't know ;) Some theorists seem to think it important, and PG allows 
only one PK per table.

I just meant that primary key (as well as any other uniqe key) should be 
inherited from parent table

------------------
Hannu


Re: Thus spoke SQL3 (on OO)

From
Chris Bitmead
Date:
Hannu Krosing wrote:
> 
> Chris Bitmead wrote:
> >
> > Hannu Krosing wrote:
> >
> > > Chris, what is your position on having a single primary key for all
> > > inherited columns ?
> >
> > What is the significance of a primary key compared to any old unique
> > key?
> 
> I don't know ;) Some theorists seem to think it important, and PG allows
> only one PK per table.
> 
> I just meant that primary key (as well as any other uniqe key) should be
> inherited from parent table

What object theory would say is that oid uniquely identifies an object.
Other unique keys should usually be inherited.


Re: Thus spoke SQL3 (on OO)

From
Hannu Krosing
Date:
Chris Bitmead wrote:
> 
> Hannu Krosing wrote:
> >
> > Chris Bitmead wrote:
> > >
> > > Hannu Krosing wrote:
> > >
> > > > Chris, what is your position on having a single primary key for all
> > > > inherited columns ?
> > >
> > > What is the significance of a primary key compared to any old unique
> > > key?
> >
> > I don't know ;) Some theorists seem to think it important, and PG allows
> > only one PK per table.
> >
> > I just meant that primary key (as well as any other uniqe key) should be
> > inherited from parent table
> 
> What object theory would say is that oid uniquely identifies an object.
> Other unique keys should usually be inherited.

it would be hard to define RI by just saying that some field references "an
OID",
often you want to be able do define something more specific.

It would be too much for most users to require that all primary and foreign
keys 
must be of type OID.

It about flexibility, much much like the situation with SERIAL vs.
INT DEFAULT NEXTVAL('SOME_SEQUENCE')

------------
Hannu


Re: Thus spoke SQL3 (on OO)

From
Chris Bitmead
Date:
Hannu Krosing wrote:

> it would be hard to define RI by just saying that some field references "an
> OID",
> often you want to be able do define something more specific.
> 
> It would be too much for most users to require that all primary and foreign
> keys
> must be of type OID.

Since it would be object and relational, you could do either. But all
pure object databases _always_ rely on oid to define relationships, and
that is likely to be all an ODMG inteface would support. Unless we want
to break new ground anyway.


Re: Thus spoke SQL3 (on OO)

From
Hannu Krosing
Date:
Chris Bitmead wrote:
> 
> Hannu Krosing wrote:
> 
> > it would be hard to define RI by just saying that some field references "an
> > OID",
> > often you want to be able do define something more specific.
> >
> > It would be too much for most users to require that all primary and foreign
> > keys must be of type OID.
> 
> Since it would be object and relational, you could do either. But all
> pure object databases _always_ rely on oid to define relationships, and
> that is likely to be all an ODMG inteface would support.

Is the ODMG interface available on the net, or is the plan to do a Poet clone
?

> Unless we want to break new ground anyway.

We would need some syntax to distinguish between REFERENCES (primary key) and
REFERENCES (oid).

Of course we would also need fast lookups by oid and oid->object lookup
tables(s)/function(s) but that's another part of the story.

--------------
Hannu


Re: Thus spoke SQL3 (on OO)

From
Chris Bitmead
Date:
Hannu Krosing wrote:

> Is the ODMG interface available on the net, or is the plan to do a Poet > clone

Looking at database vendor specs will get us a fair way. Or you can
shell the $35 for a spec.

> > Unless we want to break new ground anyway.
> 
> We would need some syntax to distinguish between REFERENCES (primary 
> key) and REFERENCES (oid).

The trouble is the client cache code is generally all set up to cache by
oid. If you want to start referencing objects by various criteria, the
client cache becomes a lot more complex. More inefficient too because
you would have to set up hash tables on multiple criteria and jump
between them.

It's not such a big deal really. When you do an OO model you don't need
to think about your own primary key.

> Of course we would also need fast lookups by oid and oid->object lookup
> tables(s)/function(s) but that's another part of the story.

An index on oid will be a start.

-- 
Chris Bitmead
mailto:chris@bitmead.com
http://www.techphoto.org - Photography News, Stuff that Matters


Re: Thus spoke SQL3 (on OO)

From
Karl DeBisschop
Date:
Robert B. Easter wrote:

>On Sun, 21 May 2000, Chris Bitmead wrote:
>> It is from
>> ftp://gatekeeper.dec.com/pub/standards/sql
>> and dated 1994. Is there something more recent?
>
>I believe so!  1994 is an old draft.  From what I understand, SQL3 is
an
>official ISO standard as of sometime back in 1999.  It may be that the
>official standard cut out the things you quoted.
>
>Try downloading the stuff at:
>ftp://jerry.ece.umassd.edu/isowg3/x3h2/Standards/
>
>>
>> > What is the date on the copy of the SQL/Foundation you are
reading?  My copy is
>> > dated September 23, 1999 ISO/IEC 9075-2 SQL3_ISO.  I tried
searching for the
>> > quotes above and could not find them.  Do I have the correct
version?

I have that the ISO statndard was adopted in July 1999.  My copy is the
ANSI document, which I'm told is unchanged from ISO, and the adoption
date is listed as 8 December 1999. (Since I'm told both are identical,
and I can get the ANSI PDF for $20, versus $310 for the ISO version,
that was an easy choice)

Of course, now that the standard has been adopted, it is properly
referred to a SQL99.

Karl DeBisschop
www.infoplease.com




Re: Thus spoke SQL3 (on OO)

From
Marten Feldtmann
Date:
> Hannu Krosing wrote:
> 
> It's not such a big deal really. When you do an OO model you don't need
> to think about your own primary key.
> 
Hmm, I see here more and more postings, that do say, the OID (or the
result of a SEQUENCE) is usable for a key to identify an object stored
within a database.
Though it's true, that SEQUENCE can be used to create unique
identifiers, the function is simply a hack - nothing more for greater
OO software systems and worse than software solutions, which provide
more power and lower traffic.
The identification of an object has to be based on a unique key and
it does not matter of which type it is.
The foreign key is of course not useful for the oo-model, but for the
programmer, which produces the object-relational wrapper this is VERY
urgent !
And here again: if you use SEQUENCE for the OID you use a special
feature of the database ... and that is bad.
Marten






Re: Thus spoke SQL3 (on OO)

From
Hannu Krosing
Date:
Marten Feldtmann wrote:
> 
> > Hannu Krosing wrote:
> >
> > It's not such a big deal really. When you do an OO model you don't need
> > to think about your own primary key.
> >

I don't remember saying that, must have been someone else.

But it is true, you don't need anything but OID if you don't want to 
distinguish your objects yourself but only need them to be distinct 
for your program, i.e. yo have two cheques absolutely similar, except
that 
there are two of them ;)

----------
Hannu