Thread: Proposed Changes to PostgreSQL

Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Hi,

I've been spending a lot of time lately with gdb and tracing the
back-end seeing if I can understand it enough to make some changes.
I'm starting to actually understand a lot of stuff, so in order
to have some possibility of having my changes accepted, I want to
discuss
them here first. Based on that, I'm going to hopefully make an attempt
at implementation. I have a patch for one of these changes already
if I get the go ahead.

THESE CHANGES DON'T AFFECT YOU IF YOU DON'T USE INHERITANCE.

Speak now about these changes or please, forever hold your peace. Of
course you can comment later if I screw up implementation.

The proposed changes are....

1) An imaginary field in every tuple that tells you the class it came
from.
This is useful when you select from table* and want to know which
relation the object actually came from. It wouldn't be stored on disk,
and like oid it wouldn't be displayed when you do SELECT *. The field
would be called classname. So you could have...
SELECT p.classname, p.name FROM person p;
person   | Fred
student  | Bill
employee | Jim
person   | Chris

If you want to know the exact behaviour it is as if every table in the
database had done to it...
ALTER TABLE foo ADD COLUMN classname TEXT;
UPDATE foo SET classname='foo';

Of course this is not how it would be implemented. It is just
reference for how it will appear to work. BTW, this idea was also
in the original berkeley design notes.

2) Changing the sense of the default for getting inherited tuples.
Currently you only get inherited tuples if you specify "tablename*".
This would be changed so that you get all sub-class tuples too by
default unless you specify "ONLY tablename". There are several
rationale for this. Firstly this is what Illustra/Informix have
implemented. Secondly, I believe it is more logical from an OO
perspective as well as giving a more useful default. If a politician
IS a person and I say SELECT * from person, then logically I should
see all the politicians because they are people too (so they claim
:). Thirdly, there are a whole range of SQL statements that should
probably be disallowed without including sub-classes. e.g. an ALTER
TABLE ADD COLUMN that does not include sub-classes is almost certainly
undesirable. It seems ashame to have to resort to non-standard SQL
with the "*" syntax in this case when it is really your only
choice. Basicly, wanting ONLY a classname is a far more unusual
choice, and leaving off the "*" is a common error. Fourthly, it seems
out of character for the SQL language to have this single character
operator. The SQL style is to use wordy descriptions of the operators
meaning. "ONLY" fits well here because it describes its own meaning
perfectly whereas to the unitiated, "*" is harder to guess at. While
this change is an incompatibility I hope for those few people using
inheritance they can accept the need to move forward without
over-burden of backwards compatibility.

3) The ability to return different types of rows from a SELECT. This
is to allow implementation of ODBMS functionality where a query could
be required to instantiate objects of differing types with differing
attributes.

I would propose that that anytime you do a SELECT * from a base table
that you would get back the full rows from those sub tables. Since the
current PQ interface which doesn't support this notion would remain
unchanged this wouldn't affect current users.

It's probably also desirable to have a syntax for getting just the
columns of the base table when this is desired. Say perhaps SELECT %
from table. This would be a performance hack for users of libpq and a
functionality difference for users of psql.

The reason I think the "*" syntax should take on the new functionality
is because it would be more consistent with what the OQL (object query
language) standard specifies, and also because it seems the more
useful default. Also there is no compatibility reason not to do it.

In addition it would be legal to specify columns that only exist in
sub-classes. For example, if we had

CREATE TABLE person (name TEXT);
CREATE TABLE student (studentid TEXT, faculty TEXT) INHERITS (person);

it would be legal to say...
> SELECT * FROM person;
NAME
----
Fred
Bill

NAME | STUDENTID | FACULTY
--------------------------
Jim  | 23455     | Science
Chris| 45666     | Arts

> SELECT *, studentid FROM person;
NAME
----
Fred
Bill

NAME | STUDENTID
----------------
Jim  | 23455
Chris| 45666

> SELECT *, studentid FROM ONLY person;
ERROR: person does not contain studentid.

> SELECT % FROM person;
NAME
----
Fred
Bill
Jim
Chris

As you can see, it is desirable that psql be modified to be able to
print these differing tuple types. Presumably new column headings will
be printed when a tuple is differing to the previous one. Likely it
will be often desirable to do a
SELECT * FROM person p ORDER BY p.classname;
in order to have all the tuples of a particular type grouped together.

In addition some extenions will be done to the PQ interface to support
these differing return types. The current PQ interface will be left
unchanged and backwards compatible for retrieving rows of a single
type.

Also there should be an settable option that specifies that "*" should
also return the normally ignored columns of oid and classname. This is
so that OO programs that embed SQL into them also get back the oid and
classname which are required for the behind the scenes implementation
of an ODMG client. Something like...

SET SHOW_OID TRUE;
SHOW_CLASSNAME TRUE;

SELECT * FROM person;

OID  CLASSNAME NAME
-------------------
2344 person    Fred
3445 person    Bill

OID  CLASSNAME NAME | STUDENTID | FACULTY
-----------------------------------------
2355 student   Jim  | 23455     | Science
5655 student   Chris| 45666     | Arts

Re: [GENERAL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> THESE CHANGES DON'T AFFECT YOU IF YOU DON'T USE INHERITANCE.
>
> Speak now about these changes or please, forever hold your peace. Of
> course you can comment later if I screw up implementation.
>
> The proposed changes are....
>
> 1) An imaginary field in every tuple that tells you the class it came
> from.
> This is useful when you select from table* and want to know which
> relation the object actually came from. It wouldn't be stored on disk,
> and like oid it wouldn't be displayed when you do SELECT *. The field
> would be called classname. So you could have...
> SELECT p.classname, p.name FROM person p;
> person   | Fred
> student  | Bill
> employee | Jim
> person   | Chris

So the field is created on the fly to show what table it came from.
Seems like a good idea, though implementing another usually-invisible
column will be tough.  However, because it is not really a column like
the oid is a column, it should be ok.   Of course, internally it is
relid.


> 2) Changing the sense of the default for getting inherited tuples.
> Currently you only get inherited tuples if you specify "tablename*".
> This would be changed so that you get all sub-class tuples too by
> default unless you specify "ONLY tablename". There are several
> rationale for this. Firstly this is what Illustra/Informix have
> implemented. Secondly, I believe it is more logical from an OO
> perspective as well as giving a more useful default. If a politician
> IS a person and I say SELECT * from person, then logically I should
> see all the politicians because they are people too (so they claim
> :). Thirdly, there are a whole range of SQL statements that should
> probably be disallowed without including sub-classes. e.g. an ALTER
> TABLE ADD COLUMN that does not include sub-classes is almost certainly
> undesirable. It seems ashame to have to resort to non-standard SQL
> with the "*" syntax in this case when it is really your only
> choice. Basicly, wanting ONLY a classname is a far more unusual
> choice, and leaving off the "*" is a common error. Fourthly, it seems
> out of character for the SQL language to have this single character
> operator. The SQL style is to use wordy descriptions of the operators
> meaning. "ONLY" fits well here because it describes its own meaning
> perfectly whereas to the unitiated, "*" is harder to guess at. While
> this change is an incompatibility I hope for those few people using
> inheritance they can accept the need to move forward without
> over-burden of backwards compatibility.

Sounds fine to me.  Just realize you are taking on a long-overdue but
big job here.

>
> 3) The ability to return different types of rows from a SELECT. This
> is to allow implementation of ODBMS functionality where a query could
> be required to instantiate objects of differing types with differing
> attributes.

This bothers me.  We return relational data, showing the same number of
columns and types for every query.  I don't think we want to change
that, even for OO.  How are you going to return that info the the client
side?


--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [GENERAL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Bruce Momjian wrote:

> So the field is created on the fly to show what table it came from.
> Seems like a good idea, though implementing another usually-invisible
> column will be tough.

What problems do you forsee?

>  However, because it is not really a column like
> the oid is a column, it should be ok.   Of course, internally it is
> relid.
>
> > 2) Changing the sense of the default for getting inherited tuples.
> > Currently you only get inherited tuples if you specify "tablename*".
>
> Sounds fine to me.  Just realize you are taking on a long-overdue but
> big job here.

I already have a patch for this one. The change is a few pretty simple
changes
to gram.y.

> > 3) The ability to return different types of rows from a SELECT. This
> > is to allow implementation of ODBMS functionality where a query could
> > be required to instantiate objects of differing types with differing
> > attributes.
>
> This bothers me.  We return relational data, showing the same number of
> columns and types for every query.  I don't think we want to change
> that, even for OO.

What aspects bother you? This is the fundamental important thing about
object databases.

It's also something that I'm always wanting to do when generating web
pages.
I have web links like http://foo.com/page?id=123. I want to retrieve
the webpage object (which is an inheritance hierarchy) of id=123 which
may
represent a web page of different types. Then process appropriately for
different objects. i.e. typical OO polymorphism.

> How are you going to return that info the the client side?

Well the backend <-> frontend protocol that used to be able to return
tuples of different types would be put back in.

Also the berkerly postgres docs had other scenarios where different
tuples
could be returned. One is you could have a field of type postquel called
say
EMP.hobbies which had a value of "retrieve HOBBIES.all where...", and
then "retrieve
EMP.hobbies would return tuples of different types of hobbies.

Re: [GENERAL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> Bruce Momjian wrote:
>
> > So the field is created on the fly to show what table it came from.
> > Seems like a good idea, though implementing another usually-invisible
> > column will be tough.
>
> What problems do you forsee?

Well, it is usually pretty strange to carry around a column that doesn't
exist through all the code and finally contruct it at the end.  I would
suspect something in the rewrite system could do that pretty easily,
though.  That is the direction I would go with that.

>
> >  However, because it is not really a column like
> > the oid is a column, it should be ok.   Of course, internally it is
> > relid.
> >
> > > 2) Changing the sense of the default for getting inherited tuples.
> > > Currently you only get inherited tuples if you specify "tablename*".
> >
> > Sounds fine to me.  Just realize you are taking on a long-overdue but
> > big job here.
>
> I already have a patch for this one. The change is a few pretty simple
> changes
> to gram.y.

OK, you will have to canvas the general list to make sure this does not
break things for people, though our inheritance system needs an overhaul
badly.

>
> > > 3) The ability to return different types of rows from a SELECT. This
> > > is to allow implementation of ODBMS functionality where a query could
> > > be required to instantiate objects of differing types with differing
> > > attributes.
> >
> > This bothers me.  We return relational data, showing the same number of
> > columns and types for every query.  I don't think we want to change
> > that, even for OO.
>
> What aspects bother you? This is the fundamental important thing about
> object databases.

I fear it is totally against the way our API works.  How does someone
see how many columns in the returned row?

> > How are you going to return that info the the client side?
>
> Well the backend <-> frontend protocol that used to be able to return
> tuples of different types would be put back in.
>
> Also the berkerly postgres docs had other scenarios where different
> tuples
> could be returned. One is you could have a field of type postquel called
> say
> EMP.hobbies which had a value of "retrieve HOBBIES.all where...", and
> then "retrieve
> EMP.hobbies would return tuples of different types of hobbies.

Yikes.  Strange.  Can we just return nulls for the empty fields?

How many new API calls are required?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [GENERAL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Bruce Momjian wrote:

> > I already have a patch for this one. The change is a few pretty simple
> > changes
> > to gram.y.
>
> OK, you will have to canvas the general list to make sure this does not
> break things for people, though our inheritance system needs an overhaul
> badly.

This is already CCed to the general list.

> I fear it is totally against the way our API works.  How does someone
> see how many columns in the returned row?

A new API PQnfieldsv(PQresult, tupnum) or some such.

> Yikes.  Strange.

Strange for C code perhaps. Very useful for constructing real objects in
OO application code framework.

> Can we just return nulls for the empty fields?

Well, I think we should probably distinguish between a field that is
null,
and a field that simply doesn't exist.

> How many new API calls are required?

Perhaps just the one. (above).

Re: [GENERAL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
I can live with this.  Thanks.


> Bruce Momjian wrote:
>
> > > I already have a patch for this one. The change is a few pretty simple
> > > changes
> > > to gram.y.
> >
> > OK, you will have to canvas the general list to make sure this does not
> > break things for people, though our inheritance system needs an overhaul
> > badly.
>
> This is already CCed to the general list.
>
> > I fear it is totally against the way our API works.  How does someone
> > see how many columns in the returned row?
>
> A new API PQnfieldsv(PQresult, tupnum) or some such.
>
> > Yikes.  Strange.
>
> Strange for C code perhaps. Very useful for constructing real objects in
> OO application code framework.
>
> > Can we just return nulls for the empty fields?
>
> Well, I think we should probably distinguish between a field that is
> null,
> and a field that simply doesn't exist.
>
> > How many new API calls are required?
>
> Perhaps just the one. (above).
>


--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [SQL] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Mathijs Brands
Date:
On Wed, Feb 02, 2000 at 09:57:48PM -0500, Bruce Momjian allegedly wrote:
> > > > 3) The ability to return different types of rows from a SELECT. This
> > > > is to allow implementation of ODBMS functionality where a query could
> > > > be required to instantiate objects of differing types with differing
> > > > attributes.
> > >
> > > This bothers me.  We return relational data, showing the same number of
> > > columns and types for every query.  I don't think we want to change
> > > that, even for OO.
> >
> > What aspects bother you? This is the fundamental important thing about
> > object databases.
>
> I fear it is totally against the way our API works.  How does someone
> see how many columns in the returned row?

This would probably break applications written in PHP and Perl (and
possibly others) that have their queryresults returned to them in a
numerically indexed array (index by offset). If this behaviour could
be turned off, than it shouldn't be a problem.

Mathijs

Re: [SQL] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Mathijs Brands wrote:
>
> On Wed, Feb 02, 2000 at 09:57:48PM -0500, Bruce Momjian allegedly wrote:
> > > > > 3) The ability to return different types of rows from a SELECT. This
> > > > > is to allow implementation of ODBMS functionality where a query could
> > > > > be required to instantiate objects of differing types with differing
> > > > > attributes.
> > > >
> > > > This bothers me.  We return relational data, showing the same number of
> > > > columns and types for every query.  I don't think we want to change
> > > > that, even for OO.
> > >
> > > What aspects bother you? This is the fundamental important thing about
> > > object databases.
> >
> > I fear it is totally against the way our API works.  How does someone
> > see how many columns in the returned row?
>
> This would probably break applications written in PHP and Perl (and
> possibly others) that have their queryresults returned to them in a
> numerically indexed array (index by offset). If this behaviour could
> be turned off, than it shouldn't be a problem.

It wouldn't affect them because the current APIs would continue to
return
the same base-level columns. You would only get access to the extra
columns
with a new API.

Re: [HACKERS] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Bruce Momjian wrote:
>
> > Bruce Momjian wrote:
> >
> > > So the field is created on the fly to show what table it came from.
> > > Seems like a good idea, though implementing another usually-invisible
> > > column will be tough.
> >
> > What problems do you forsee?
>
> Well, it is usually pretty strange to carry around a column that doesn't
> exist through all the code and finally contruct it at the end.  I would
> suspect something in the rewrite system could do that pretty easily,
> though.  That is the direction I would go with that.
>

Oracle has a ROWNR (IIRC) pseudo-column that is added in th every end of
query and is a convienient way to put numbers on report rows (among other
things).

------------
Hannu

Re: [HACKERS] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Bruce Momjian wrote:
>
>
> Yikes.  Strange.  Can we just return nulls for the empty fields?

I think more natural way would be to define a new type (NAF - NotAFiled),
like we have NAN for floats (do we ?, at least IEEE has)

-----------------
Hannu

Re: [GENERAL] Proposed Changes to PostgreSQL

From
Yury Don
Date:
Hello Chris,

Once, Thursday, February 03, 2000, 6:30:26 AM, you wrote:

CB> 1) An imaginary field in every tuple that tells you the class it came
CB> from.
CB> This is useful when you select from table* and want to know which
CB> relation the object actually came from. It wouldn't be stored on disk,
CB> and like oid it wouldn't be displayed when you do SELECT *. The field
CB> would be called classname. So you could have...
CB> SELECT p.classname, p.name FROM person p;
CB> person   | Fred
CB> student  | Bill
CB> employee | Jim
CB> person   | Chris

I am voting for this by both hands. Now we forced to use an additional
column classname in every table and rule to fill this column.

CB> 2) Changing the sense of the default for getting inherited tuples.
CB> Currently you only get inherited tuples if you specify "tablename*".
CB> This would be changed so that you get all sub-class tuples too by
CB> default unless you specify "ONLY tablename". There are several
CB> rationale for this. Firstly this is what Illustra/Informix have
CB> implemented. Secondly, I believe it is more logical from an OO
CB> perspective as well as giving a more useful default. If a politician
CB> IS a person and I say SELECT * from person, then logically I should
CB> see all the politicians because they are people too (so they claim
CB> :). Thirdly, there are a whole range of SQL statements that should
CB> probably be disallowed without including sub-classes. e.g. an ALTER
CB> TABLE ADD COLUMN that does not include sub-classes is almost certainly
CB> undesirable. It seems ashame to have to resort to non-standard SQL
CB> with the "*" syntax in this case when it is really your only
CB> choice. Basicly, wanting ONLY a classname is a far more unusual
CB> choice, and leaving off the "*" is a common error. Fourthly, it seems
CB> out of character for the SQL language to have this single character
CB> operator. The SQL style is to use wordy descriptions of the operators
CB> meaning. "ONLY" fits well here because it describes its own meaning
CB> perfectly whereas to the unitiated, "*" is harder to guess at. While
CB> this change is an incompatibility I hope for those few people using
CB> inheritance they can accept the need to move forward without
CB> over-burden of backwards compatibility.

Sounds very logically.

--
Best regards,
 Yury  ICQ 11831432
 mailto:yura@vpcit.ru



Re: [HACKERS] Proposed Changes to PostgreSQL

From
Peter Eisentraut
Date:
While I think that these kinds of changes are a No Go because they'd break
a lot of applications (including mine), IF (big if) you really want to
make major changes to the inheritance scheme, I got a few ideas.

First let me say that I like the conceptual simplicity of relational
databases. Some or all of the ideas thrown around here break with
simplicity and consistency, by suggesting, e.g., that some commands be
allowed only on entire inheritance structures, while others be allowed on
individual tables, and attached to it a discussion which ones those should
be. That doesn't strike me as too promising.

A lot of people use inheritance to create "consistent schemas", that is,
they empty create base tables, such as "address" which are inherited by
tables such as customer, vendor, office, etc. That is probably not what
inheritance is for, perhaps it should be some sort of a macro-like
concept, such as create table vendor (name text,
copy_schema_from(address), more fields), expanded by the parser. This is
pretty much what it does now, only this scheme wouldn't have to actually
store the (useless) inheritance link.

Anyway, an idea I had would be to reimplement inheritance based on joins,
since this is what the "pure relational" solution would be anyway. When I
create a table B that is based on A, all the system does is create the
table B as usual and store a note "I inherit from A". Any row you insert
into B also creates a row in A, and the row in B contains an oid pointer
to it. Thus a select on B performs a join on A.oid and B.row_in_A_pointer.
A select on A just returns all the rows in A, no extras needed. A delete
on B deletes the row in B and in A. A delete in A would cascade to B. Both
of this can be gotten for free with foreign keys. Adding a column to A
just adds the column to A, all other tables get the new column magically
and in the right order. Same with dropping columns, etc.

In short, this approach solves all inheritance problems at once and does
so without adding any extra kludges besides the "I inherited from" field,
which is static, plus the necessary transformations necessary in the
parser. The drawback is of course that a select from an inherited table
would always incur a join, perhaps some optimizing could be done in this
direction. But the bottom line is that the compatibility issue looms big.

    -Peter


On Thu, 3 Feb 2000, Chris Bitmead wrote:

> Hi,
>
> I've been spending a lot of time lately with gdb and tracing the
> back-end seeing if I can understand it enough to make some changes.
> I'm starting to actually understand a lot of stuff, so in order
> to have some possibility of having my changes accepted, I want to
> discuss
> them here first. Based on that, I'm going to hopefully make an attempt
> at implementation. I have a patch for one of these changes already
> if I get the go ahead.
>
> THESE CHANGES DON'T AFFECT YOU IF YOU DON'T USE INHERITANCE.
>
> Speak now about these changes or please, forever hold your peace. Of
> course you can comment later if I screw up implementation.
>
> The proposed changes are....
>
> 1) An imaginary field in every tuple that tells you the class it came
> from.
> This is useful when you select from table* and want to know which
> relation the object actually came from. It wouldn't be stored on disk,
> and like oid it wouldn't be displayed when you do SELECT *. The field
> would be called classname. So you could have...
> SELECT p.classname, p.name FROM person p;
> person   | Fred
> student  | Bill
> employee | Jim
> person   | Chris
>
> If you want to know the exact behaviour it is as if every table in the
> database had done to it...
> ALTER TABLE foo ADD COLUMN classname TEXT;
> UPDATE foo SET classname='foo';
>
> Of course this is not how it would be implemented. It is just
> reference for how it will appear to work. BTW, this idea was also
> in the original berkeley design notes.
>
> 2) Changing the sense of the default for getting inherited tuples.
> Currently you only get inherited tuples if you specify "tablename*".
> This would be changed so that you get all sub-class tuples too by
> default unless you specify "ONLY tablename". There are several
> rationale for this. Firstly this is what Illustra/Informix have
> implemented. Secondly, I believe it is more logical from an OO
> perspective as well as giving a more useful default. If a politician
> IS a person and I say SELECT * from person, then logically I should
> see all the politicians because they are people too (so they claim
> :). Thirdly, there are a whole range of SQL statements that should
> probably be disallowed without including sub-classes. e.g. an ALTER
> TABLE ADD COLUMN that does not include sub-classes is almost certainly
> undesirable. It seems ashame to have to resort to non-standard SQL
> with the "*" syntax in this case when it is really your only
> choice. Basicly, wanting ONLY a classname is a far more unusual
> choice, and leaving off the "*" is a common error. Fourthly, it seems
> out of character for the SQL language to have this single character
> operator. The SQL style is to use wordy descriptions of the operators
> meaning. "ONLY" fits well here because it describes its own meaning
> perfectly whereas to the unitiated, "*" is harder to guess at. While
> this change is an incompatibility I hope for those few people using
> inheritance they can accept the need to move forward without
> over-burden of backwards compatibility.
>
> 3) The ability to return different types of rows from a SELECT. This
> is to allow implementation of ODBMS functionality where a query could
> be required to instantiate objects of differing types with differing
> attributes.
>
> I would propose that that anytime you do a SELECT * from a base table
> that you would get back the full rows from those sub tables. Since the
> current PQ interface which doesn't support this notion would remain
> unchanged this wouldn't affect current users.
>
> It's probably also desirable to have a syntax for getting just the
> columns of the base table when this is desired. Say perhaps SELECT %
> from table. This would be a performance hack for users of libpq and a
> functionality difference for users of psql.
>
> The reason I think the "*" syntax should take on the new functionality
> is because it would be more consistent with what the OQL (object query
> language) standard specifies, and also because it seems the more
> useful default. Also there is no compatibility reason not to do it.
>
> In addition it would be legal to specify columns that only exist in
> sub-classes. For example, if we had
>
> CREATE TABLE person (name TEXT);
> CREATE TABLE student (studentid TEXT, faculty TEXT) INHERITS (person);
>
> it would be legal to say...
> > SELECT * FROM person;
> NAME
> ----
> Fred
> Bill
>
> NAME | STUDENTID | FACULTY
> --------------------------
> Jim  | 23455     | Science
> Chris| 45666     | Arts
>
> > SELECT *, studentid FROM person;
> NAME
> ----
> Fred
> Bill
>
> NAME | STUDENTID
> ----------------
> Jim  | 23455
> Chris| 45666
>
> > SELECT *, studentid FROM ONLY person;
> ERROR: person does not contain studentid.
>
> > SELECT % FROM person;
> NAME
> ----
> Fred
> Bill
> Jim
> Chris
>
> As you can see, it is desirable that psql be modified to be able to
> print these differing tuple types. Presumably new column headings will
> be printed when a tuple is differing to the previous one. Likely it
> will be often desirable to do a
> SELECT * FROM person p ORDER BY p.classname;
> in order to have all the tuples of a particular type grouped together.
>
> In addition some extenions will be done to the PQ interface to support
> these differing return types. The current PQ interface will be left
> unchanged and backwards compatible for retrieving rows of a single
> type.
>
> Also there should be an settable option that specifies that "*" should
> also return the normally ignored columns of oid and classname. This is
> so that OO programs that embed SQL into them also get back the oid and
> classname which are required for the behind the scenes implementation
> of an ODMG client. Something like...
>
> SET SHOW_OID TRUE;
> SHOW_CLASSNAME TRUE;
>
> SELECT * FROM person;
>
> OID  CLASSNAME NAME
> -------------------
> 2344 person    Fred
> 3445 person    Bill
>
> OID  CLASSNAME NAME | STUDENTID | FACULTY
> -----------------------------------------
> 2355 student   Jim  | 23455     | Science
> 5655 student   Chris| 45666     | Arts
>
> ************
>
>

--
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden


Re: [HACKERS] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Peter Eisentraut wrote:

> A lot of people use inheritance to create "consistent schemas", that is,
> they empty create base tables, such as "address" which are inherited by
> tables such as customer, vendor, office, etc.

This is a really bad idea. You could never have both a postal address
AND
a home address for example. I thought the original postgres supported
this
by having
CREATE TABLE ADDRESS (...)
CREATE TABLE PERSON(add ADDRESS).

Anyway, this is what Oracle and others can do these days, and this is
the right
thing.

> Anyway, an idea I had would be to reimplement inheritance based on joins,
> since this is what the "pure relational" solution would be anyway. When I
> create a table B that is based on A, all the system does is create the
> table B as usual and store a note "I inherit from A". Any row you insert
> into B also creates a row in A, and the row in B contains an oid pointer
> to it.

This is a really stu^H^H^H bad idea. I have hierarchies 5 levels deep
with
multiple inheritance, and I
don't want to do a 10 way join just to retrieve an object.

This is why RDBMS's performance sucks so incredibly badly on some
applications.
an ODBMS can perform 100x as fast in these cases just because of what
you
are proposing.

> Thus a select on B performs a join on A.oid and B.row_in_A_pointer.
> A select on A just returns all the rows in A, no extras needed. A delete
> on B deletes the row in B and in A. A delete in A would cascade to B. Both
> of this can be gotten for free with foreign keys. Adding a column to A
> just adds the column to A, all other tables get the new column magically
> and in the right order. Same with dropping columns, etc.
>
> In short, this approach solves all inheritance problems at once and does
> so without adding any extra kludges besides the "I inherited from" field,
> which is static, plus the necessary transformations necessary in the
> parser. The drawback is of course that a select from an inherited table
> would always incur a join, perhaps some optimizing could be done in this
> direction. But the bottom line is that the compatibility issue looms big.
>
>         -Peter
>
> On Thu, 3 Feb 2000, Chris Bitmead wrote:
>
> > Hi,
> >
> > I've been spending a lot of time lately with gdb and tracing the
> > back-end seeing if I can understand it enough to make some changes.
> > I'm starting to actually understand a lot of stuff, so in order
> > to have some possibility of having my changes accepted, I want to
> > discuss
> > them here first. Based on that, I'm going to hopefully make an attempt
> > at implementation. I have a patch for one of these changes already
> > if I get the go ahead.
> >
> > THESE CHANGES DON'T AFFECT YOU IF YOU DON'T USE INHERITANCE.
> >
> > Speak now about these changes or please, forever hold your peace. Of
> > course you can comment later if I screw up implementation.
> >
> > The proposed changes are....
> >
> > 1) An imaginary field in every tuple that tells you the class it came
> > from.
> > This is useful when you select from table* and want to know which
> > relation the object actually came from. It wouldn't be stored on disk,
> > and like oid it wouldn't be displayed when you do SELECT *. The field
> > would be called classname. So you could have...
> > SELECT p.classname, p.name FROM person p;
> > person   | Fred
> > student  | Bill
> > employee | Jim
> > person   | Chris
> >
> > If you want to know the exact behaviour it is as if every table in the
> > database had done to it...
> > ALTER TABLE foo ADD COLUMN classname TEXT;
> > UPDATE foo SET classname='foo';
> >
> > Of course this is not how it would be implemented. It is just
> > reference for how it will appear to work. BTW, this idea was also
> > in the original berkeley design notes.
> >
> > 2) Changing the sense of the default for getting inherited tuples.
> > Currently you only get inherited tuples if you specify "tablename*".
> > This would be changed so that you get all sub-class tuples too by
> > default unless you specify "ONLY tablename". There are several
> > rationale for this. Firstly this is what Illustra/Informix have
> > implemented. Secondly, I believe it is more logical from an OO
> > perspective as well as giving a more useful default. If a politician
> > IS a person and I say SELECT * from person, then logically I should
> > see all the politicians because they are people too (so they claim
> > :). Thirdly, there are a whole range of SQL statements that should
> > probably be disallowed without including sub-classes. e.g. an ALTER
> > TABLE ADD COLUMN that does not include sub-classes is almost certainly
> > undesirable. It seems ashame to have to resort to non-standard SQL
> > with the "*" syntax in this case when it is really your only
> > choice. Basicly, wanting ONLY a classname is a far more unusual
> > choice, and leaving off the "*" is a common error. Fourthly, it seems
> > out of character for the SQL language to have this single character
> > operator. The SQL style is to use wordy descriptions of the operators
> > meaning. "ONLY" fits well here because it describes its own meaning
> > perfectly whereas to the unitiated, "*" is harder to guess at. While
> > this change is an incompatibility I hope for those few people using
> > inheritance they can accept the need to move forward without
> > over-burden of backwards compatibility.
> >
> > 3) The ability to return different types of rows from a SELECT. This
> > is to allow implementation of ODBMS functionality where a query could
> > be required to instantiate objects of differing types with differing
> > attributes.
> >
> > I would propose that that anytime you do a SELECT * from a base table
> > that you would get back the full rows from those sub tables. Since the
> > current PQ interface which doesn't support this notion would remain
> > unchanged this wouldn't affect current users.
> >
> > It's probably also desirable to have a syntax for getting just the
> > columns of the base table when this is desired. Say perhaps SELECT %
> > from table. This would be a performance hack for users of libpq and a
> > functionality difference for users of psql.
> >
> > The reason I think the "*" syntax should take on the new functionality
> > is because it would be more consistent with what the OQL (object query
> > language) standard specifies, and also because it seems the more
> > useful default. Also there is no compatibility reason not to do it.
> >
> > In addition it would be legal to specify columns that only exist in
> > sub-classes. For example, if we had
> >
> > CREATE TABLE person (name TEXT);
> > CREATE TABLE student (studentid TEXT, faculty TEXT) INHERITS (person);
> >
> > it would be legal to say...
> > > SELECT * FROM person;
> > NAME
> > ----
> > Fred
> > Bill
> >
> > NAME | STUDENTID | FACULTY
> > --------------------------
> > Jim  | 23455     | Science
> > Chris| 45666     | Arts
> >
> > > SELECT *, studentid FROM person;
> > NAME
> > ----
> > Fred
> > Bill
> >
> > NAME | STUDENTID
> > ----------------
> > Jim  | 23455
> > Chris| 45666
> >
> > > SELECT *, studentid FROM ONLY person;
> > ERROR: person does not contain studentid.
> >
> > > SELECT % FROM person;
> > NAME
> > ----
> > Fred
> > Bill
> > Jim
> > Chris
> >
> > As you can see, it is desirable that psql be modified to be able to
> > print these differing tuple types. Presumably new column headings will
> > be printed when a tuple is differing to the previous one. Likely it
> > will be often desirable to do a
> > SELECT * FROM person p ORDER BY p.classname;
> > in order to have all the tuples of a particular type grouped together.
> >
> > In addition some extenions will be done to the PQ interface to support
> > these differing return types. The current PQ interface will be left
> > unchanged and backwards compatible for retrieving rows of a single
> > type.
> >
> > Also there should be an settable option that specifies that "*" should
> > also return the normally ignored columns of oid and classname. This is
> > so that OO programs that embed SQL into them also get back the oid and
> > classname which are required for the behind the scenes implementation
> > of an ODMG client. Something like...
> >
> > SET SHOW_OID TRUE;
> > SHOW_CLASSNAME TRUE;
> >
> > SELECT * FROM person;
> >
> > OID  CLASSNAME NAME
> > -------------------
> > 2344 person    Fred
> > 3445 person    Bill
> >
> > OID  CLASSNAME NAME | STUDENTID | FACULTY
> > -----------------------------------------
> > 2355 student   Jim  | 23455     | Science
> > 5655 student   Chris| 45666     | Arts
> >
> > ************
> >
> >
>
> --
> Peter Eisentraut                  Sernanders vaeg 10:115
> peter_e@gmx.net                   75262 Uppsala
> http://yi.org/peter-e/            Sweden

Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Marten Feldtmann
Date:
> Peter Eisentraut wrote:
>
>
> This is a really stu^H^H^H bad idea. I have hierarchies 5 levels deep
> with
> multiple inheritance, and I
> don't want to do a 10 way join just to retrieve an object.
>
> This is why RDBMS's performance sucks so incredibly badly on some
> applications.
> an ODBMS can perform 100x as fast in these cases just because of what
> you
> are proposing.
>

 Hmm, and yes one may find problems where the pure relational system
is 100x faster than your ODBMS.

 After doing a project with VERSANT and VisualWorks (election projection
system for the first television sender here in Germany) I like the
idea of OODBMS, but I've also noticed, that they are not the solution
to all problems.

 Clever database desing leeds to good performance on both systems, but
one should consider, that the designs of the database layout will be
different. There are cases, where a pure relational system is very
fast and an ODBMS never get it, but there are the examples you
mentioned.

 Joins per se are not that bad .. it depends on when and how they
are used and how good the analyzer of the database is and how good
he uses the indices to get the job done.

 One very good point is the query language of the rdbms systems. On
the odbms side no standard is really available, which can be seen as
the sql of the odbms.

 Marten



Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Chris
Date:
Marten Feldtmann wrote:

>  Hmm, and yes one may find problems where the pure
> relational system is 100x faster than your ODBMS.
>
>  After doing a project with VERSANT and VisualWorks
> (election projection system for the first television
> sender here in Germany) I like the idea of OODBMS,
> but I've also noticed, that they are  not the
> solution to all problems.

Give me a clear application spec and VERSANT, and I will ALWAYS flog
Oracle into the dust. But...

Where SQL comes into it's own is _conveniently_ doing queries that I
never thought of when I first designed my app. Of course many ODBMSes
have SQL or similar too.

>  Joins per se are not that bad .. it depends on when
> and how they are used and how good the analyzer of
> the database is and how good he uses the indices to
> get the job done.

Take the simple SUPPLIER, PART and SUPPLIER_PART situation. The very
fact that you've got an extra table here means you've got to touch many
more disk pages and transfer more data. An RDBMS just can't win when the
ODBMS data model is designed right.

>  One very good point is the query language of the
> rdbms systems. On the odbms side no standard is
> really available, which can be seen as the sql of
> the odbms.

There is a standard called OQL which is very similar to SQL. It's just
rather poorly supported.

--
Chris Bitmead
mailto:chris@bitmead.com

Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Chris wrote:
>
> >  One very good point is the query language of the
> > rdbms systems. On the odbms side no standard is
> > really available, which can be seen as the sql of
> > the odbms.
>
> There is a standard called OQL which is very similar to SQL. It's just
> rather poorly supported.
>

I think the operative word here is "available". I know that SQL specs
are'nt freely available either, but due to SQL being already widely
supported one can get the general idea from many freely available sources,
like the bunch of freely downloadable DB's currently available for linux.
Most of them have some docs included.

It is still quite a job to reconstruct SQL92 from them ;)

I know now description (except a BNF syntax available from some ODBMS website)
that I could use to get some idea about OQL.

----------------------
Hannu

Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Hannu Krosing wrote:
>
>
> It is still quite a job to reconstruct SQL92 from them ;)
>
> I know now description (except a BNF syntax available from some ODBMS website)

SHould be "I know no description ..."

> that I could use to get some idea about OQL.
>
> ----------------------
> Hannu
>
> ************

Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Date:
1) free is freedom, not free bear ;-) also, there are some sites
    has oql draft -- like sql draft. sorry, can not remember.
2) good books, like " C++ object databases" (David Jordan) has
    a lot material.
3) a lot of OODBM evaluation copy there. Not totally complied, but
   together with 1 and 2, still can see what is going on.

so, no excuse for not knowing oodbm/oql :-) -- I'm waiting for
trying them on pg .

On Sat, 5 Feb 2000, Hannu Krosing wrote:

> Chris wrote:
> >
> > >  One very good point is the query language of the
> > > rdbms systems. On the odbms side no standard is
> > > really available, which can be seen as the sql of
> > > the odbms.
> >
> > There is a standard called OQL which is very similar to SQL. It's just
> > rather poorly supported.
> >
>
> I think the operative word here is "available". I know that SQL specs
> are'nt freely available either, but due to SQL being already widely
> supported one can get the general idea from many freely available sources,
> like the bunch of freely downloadable DB's currently available for linux.
> Most of them have some docs included.
>
> It is still quite a job to reconstruct SQL92 from them ;)
>
> I know now description (except a BNF syntax available from some ODBMS website)
> that I could use to get some idea about OQL.
>
> ----------------------
> Hannu
>
> ************
>


Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Marten Feldtmann
Date:
> 2) good books, like " C++ object databases" (David Jordan) has
>     a lot material.


As an example:

 Cattel, "The Object Database Standard  ODMG 2.0"

 Morgan Kaufmann, ISBN 1 - 55860 - 463 -4


 Marten Feldtmann