Thread: Proposed Changes to PostgreSQL

Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Hi,

I've been spending a lot of time lately with gdb and tracing the
back-end seeing if I can understand it enough to make some changes.
I'm starting to actually understand a lot of stuff, so in order
to have some possibility of having my changes accepted, I want to
discuss
them here first. Based on that, I'm going to hopefully make an attempt
at implementation. I have a patch for one of these changes already
if I get the go ahead.

THESE CHANGES DON'T AFFECT YOU IF YOU DON'T USE INHERITANCE.

Speak now about these changes or please, forever hold your peace. Of
course you can comment later if I screw up implementation.

The proposed changes are....

1) An imaginary field in every tuple that tells you the class it came
from.
This is useful when you select from table* and want to know which
relation the object actually came from. It wouldn't be stored on disk,
and like oid it wouldn't be displayed when you do SELECT *. The field
would be called classname. So you could have...
SELECT p.classname, p.name FROM person p;
person   | Fred
student  | Bill
employee | Jim
person   | Chris

If you want to know the exact behaviour it is as if every table in the
database had done to it...
ALTER TABLE foo ADD COLUMN classname TEXT;
UPDATE foo SET classname='foo';

Of course this is not how it would be implemented. It is just
reference for how it will appear to work. BTW, this idea was also
in the original berkeley design notes.

2) Changing the sense of the default for getting inherited tuples.
Currently you only get inherited tuples if you specify "tablename*".
This would be changed so that you get all sub-class tuples too by
default unless you specify "ONLY tablename". There are several
rationale for this. Firstly this is what Illustra/Informix have
implemented. Secondly, I believe it is more logical from an OO
perspective as well as giving a more useful default. If a politician
IS a person and I say SELECT * from person, then logically I should
see all the politicians because they are people too (so they claim
:). Thirdly, there are a whole range of SQL statements that should
probably be disallowed without including sub-classes. e.g. an ALTER
TABLE ADD COLUMN that does not include sub-classes is almost certainly
undesirable. It seems ashame to have to resort to non-standard SQL
with the "*" syntax in this case when it is really your only
choice. Basicly, wanting ONLY a classname is a far more unusual
choice, and leaving off the "*" is a common error. Fourthly, it seems
out of character for the SQL language to have this single character
operator. The SQL style is to use wordy descriptions of the operators
meaning. "ONLY" fits well here because it describes its own meaning
perfectly whereas to the unitiated, "*" is harder to guess at. While
this change is an incompatibility I hope for those few people using
inheritance they can accept the need to move forward without
over-burden of backwards compatibility.

3) The ability to return different types of rows from a SELECT. This
is to allow implementation of ODBMS functionality where a query could
be required to instantiate objects of differing types with differing
attributes.

I would propose that that anytime you do a SELECT * from a base table
that you would get back the full rows from those sub tables. Since the
current PQ interface which doesn't support this notion would remain
unchanged this wouldn't affect current users.

It's probably also desirable to have a syntax for getting just the
columns of the base table when this is desired. Say perhaps SELECT %
from table. This would be a performance hack for users of libpq and a
functionality difference for users of psql.

The reason I think the "*" syntax should take on the new functionality
is because it would be more consistent with what the OQL (object query
language) standard specifies, and also because it seems the more
useful default. Also there is no compatibility reason not to do it.

In addition it would be legal to specify columns that only exist in
sub-classes. For example, if we had

CREATE TABLE person (name TEXT);
CREATE TABLE student (studentid TEXT, faculty TEXT) INHERITS (person);

it would be legal to say...
> SELECT * FROM person;
NAME
----
Fred
Bill

NAME | STUDENTID | FACULTY
--------------------------
Jim  | 23455     | Science
Chris| 45666     | Arts

> SELECT *, studentid FROM person;
NAME
----
Fred
Bill

NAME | STUDENTID
----------------
Jim  | 23455
Chris| 45666

> SELECT *, studentid FROM ONLY person;
ERROR: person does not contain studentid.

> SELECT % FROM person;
NAME
----
Fred
Bill
Jim
Chris

As you can see, it is desirable that psql be modified to be able to
print these differing tuple types. Presumably new column headings will
be printed when a tuple is differing to the previous one. Likely it
will be often desirable to do a
SELECT * FROM person p ORDER BY p.classname;
in order to have all the tuples of a particular type grouped together.

In addition some extenions will be done to the PQ interface to support
these differing return types. The current PQ interface will be left
unchanged and backwards compatible for retrieving rows of a single
type.

Also there should be an settable option that specifies that "*" should
also return the normally ignored columns of oid and classname. This is
so that OO programs that embed SQL into them also get back the oid and
classname which are required for the behind the scenes implementation
of an ODMG client. Something like...

SET SHOW_OID TRUE;
SHOW_CLASSNAME TRUE;

SELECT * FROM person;

OID  CLASSNAME NAME
-------------------
2344 person    Fred
3445 person    Bill

OID  CLASSNAME NAME | STUDENTID | FACULTY
-----------------------------------------
2355 student   Jim  | 23455     | Science
5655 student   Chris| 45666     | Arts

Re: [GENERAL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> THESE CHANGES DON'T AFFECT YOU IF YOU DON'T USE INHERITANCE.
>
> Speak now about these changes or please, forever hold your peace. Of
> course you can comment later if I screw up implementation.
>
> The proposed changes are....
>
> 1) An imaginary field in every tuple that tells you the class it came
> from.
> This is useful when you select from table* and want to know which
> relation the object actually came from. It wouldn't be stored on disk,
> and like oid it wouldn't be displayed when you do SELECT *. The field
> would be called classname. So you could have...
> SELECT p.classname, p.name FROM person p;
> person   | Fred
> student  | Bill
> employee | Jim
> person   | Chris

So the field is created on the fly to show what table it came from.
Seems like a good idea, though implementing another usually-invisible
column will be tough.  However, because it is not really a column like
the oid is a column, it should be ok.   Of course, internally it is
relid.


> 2) Changing the sense of the default for getting inherited tuples.
> Currently you only get inherited tuples if you specify "tablename*".
> This would be changed so that you get all sub-class tuples too by
> default unless you specify "ONLY tablename". There are several
> rationale for this. Firstly this is what Illustra/Informix have
> implemented. Secondly, I believe it is more logical from an OO
> perspective as well as giving a more useful default. If a politician
> IS a person and I say SELECT * from person, then logically I should
> see all the politicians because they are people too (so they claim
> :). Thirdly, there are a whole range of SQL statements that should
> probably be disallowed without including sub-classes. e.g. an ALTER
> TABLE ADD COLUMN that does not include sub-classes is almost certainly
> undesirable. It seems ashame to have to resort to non-standard SQL
> with the "*" syntax in this case when it is really your only
> choice. Basicly, wanting ONLY a classname is a far more unusual
> choice, and leaving off the "*" is a common error. Fourthly, it seems
> out of character for the SQL language to have this single character
> operator. The SQL style is to use wordy descriptions of the operators
> meaning. "ONLY" fits well here because it describes its own meaning
> perfectly whereas to the unitiated, "*" is harder to guess at. While
> this change is an incompatibility I hope for those few people using
> inheritance they can accept the need to move forward without
> over-burden of backwards compatibility.

Sounds fine to me.  Just realize you are taking on a long-overdue but
big job here.

>
> 3) The ability to return different types of rows from a SELECT. This
> is to allow implementation of ODBMS functionality where a query could
> be required to instantiate objects of differing types with differing
> attributes.

This bothers me.  We return relational data, showing the same number of
columns and types for every query.  I don't think we want to change
that, even for OO.  How are you going to return that info the the client
side?


--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [GENERAL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Bruce Momjian wrote:

> So the field is created on the fly to show what table it came from.
> Seems like a good idea, though implementing another usually-invisible
> column will be tough.

What problems do you forsee?

>  However, because it is not really a column like
> the oid is a column, it should be ok.   Of course, internally it is
> relid.
>
> > 2) Changing the sense of the default for getting inherited tuples.
> > Currently you only get inherited tuples if you specify "tablename*".
>
> Sounds fine to me.  Just realize you are taking on a long-overdue but
> big job here.

I already have a patch for this one. The change is a few pretty simple
changes
to gram.y.

> > 3) The ability to return different types of rows from a SELECT. This
> > is to allow implementation of ODBMS functionality where a query could
> > be required to instantiate objects of differing types with differing
> > attributes.
>
> This bothers me.  We return relational data, showing the same number of
> columns and types for every query.  I don't think we want to change
> that, even for OO.

What aspects bother you? This is the fundamental important thing about
object databases.

It's also something that I'm always wanting to do when generating web
pages.
I have web links like http://foo.com/page?id=123. I want to retrieve
the webpage object (which is an inheritance hierarchy) of id=123 which
may
represent a web page of different types. Then process appropriately for
different objects. i.e. typical OO polymorphism.

> How are you going to return that info the the client side?

Well the backend <-> frontend protocol that used to be able to return
tuples of different types would be put back in.

Also the berkerly postgres docs had other scenarios where different
tuples
could be returned. One is you could have a field of type postquel called
say
EMP.hobbies which had a value of "retrieve HOBBIES.all where...", and
then "retrieve
EMP.hobbies would return tuples of different types of hobbies.

Re: [SQL] Proposed Changes to PostgreSQL

From
Tom Lane
Date:
[ I trimmed the cc list a bit ]

Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> The proposed changes are....

> 1) An imaginary field in every tuple that tells you the class it came
> from.
> This is useful when you select from table* and want to know which
> relation the object actually came from. It wouldn't be stored on disk,
> and like oid it wouldn't be displayed when you do SELECT *. The field
> would be called classname. So you could have...
> SELECT p.classname, p.name FROM person p;

This is a good idea, but it seems to me that it'd fit into the system
traditions better if the pseudo-field gave the OID of the source
relation.  If you wanted the actual name of the relation, you'd need
to join against pg_class.  You could argue it either way I suppose;
a name would be more convenient for simple interactive uses, but an
OID would probably be more convenient and efficient for applications
using this feature.  I tend to lean towards the programmatic convenience
side --- far more SQL queries are issued by programs than humans.

> 2) Changing the sense of the default for getting inherited tuples.
> Currently you only get inherited tuples if you specify "tablename*".
> This would be changed so that you get all sub-class tuples too by
> default unless you specify "ONLY tablename". There are several
> rationale for this. Firstly this is what Illustra/Informix have
> implemented. Secondly, I believe it is more logical from an OO
> perspective as well as giving a more useful default.

Well, mumble.  That would be the cleanest choice if we were designing
in a green field, but we aren't.  You're talking about breaking every
single extant Postgres application that uses inheritance, and possibly
some that don't use it except as a shorthand for making their schemas
more compact.  (That's not a hypothetical case; I have DBs that use
schema inheritance but never do SELECT FROM table*.)  I think that's
a mighty high price to pay for achieving a little more logical
cleanliness.

There is also a nontrivial performance penalty that would be paid
for reversing this default, because then every ordinary SQL query
would suffer the overhead of looking to see whether there are
child tables for each table named in the query.  That *really*
doesn't strike me as a good idea.

If Illustra were popular enough to have defined an industry standard
about inheritance, I might think we should follow their lead --- but
who else has followed their lead?

In short, I vote for leaving well enough alone.  It's not so badly
wrong as to be intolerable, and the pain of changing looks high.

> Thirdly, there are a whole range of SQL statements that should
> probably be disallowed without including sub-classes. e.g. an ALTER
> TABLE ADD COLUMN that does not include sub-classes is almost certainly
> undesirable.

This is true.  We could either silently add *, or reject it ("hey bozo,
have you forgotten that this table has subclasses?").  The reject
option would be more conservative, just in case the admin *has*
forgotten that the table has subclasses --- as a crude analogy,
Unix "rm" doesn't assume "-r" by default ;-).  I agree that allowing
an ALTER to make a parent table inconsistent with its children is
very bad news and should be prevented.  (Dropping an inherited column
is another example of something we shouldn't allow.)

> I would propose that that anytime you do a SELECT * from a base table
> that you would get back the full rows from those sub tables.

Frankly: ugh.  This doesn't square with *my* ideas of object
inheritance.  When you are dealing with something that ISA person,
you do not really want to hear about any additional properties it may
have; you are dealing with it as a person and not at any finer grain of
detail.  That goes double for dealing with whole collections of persons.
If you want to examine a particular member of the collection and
dynamically downcast it to some more-specific type, the proposed
classname/classoid feature will give you the ability to do that;
but I think it's a mistake to assume that this should happen by default.

> Since the current PQ interface which doesn't support this notion would
> remain unchanged this wouldn't affect current users.

How would you implement this without actually breaking the current
PQ interface?

> It's probably also desirable to have a syntax for getting just the
> columns of the base table when this is desired. Say perhaps SELECT %
> from table. This would be a performance hack for users of libpq and a
> functionality difference for users of psql.

Again, I think you've got the default backwards.  I remind you also
of something we've been beating on Peter about: psql is an application
scripting tool, so you don't get to redefine its behavior at whim,
anymore than you can change libpq's API at whim.


> In addition it would be legal to specify columns that only exist in
> sub-classes. For example,
> it would be legal to say...
>> SELECT *, studentid FROM person;

Yipes.  I really, really, really DON'T like that one.  At the level
of table person, studentid is unequivocally an invalid column name.
If you do this, you couldn't even guarantee that different subtables
that had studentid columns would have compatible datatypes for those
columns.


> SELECT * FROM person;

> OID  CLASSNAME NAME
> -------------------
> 2344 person    Fred
> 3445 person    Bill

> OID  CLASSNAME NAME | STUDENTID | FACULTY
> -----------------------------------------
> 2355 student   Jim  | 23455     | Science
> 5655 student   Chris| 45666     | Arts

This is not too hard for a person to make sense of, but I think that
it'd be mighty unwieldy for a program to deal with.  What would the
libpq-like interface look like, and what would a typical client
routine look like?
        regards, tom lane


Re: [GENERAL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> Bruce Momjian wrote:
>
> > So the field is created on the fly to show what table it came from.
> > Seems like a good idea, though implementing another usually-invisible
> > column will be tough.
>
> What problems do you forsee?

Well, it is usually pretty strange to carry around a column that doesn't
exist through all the code and finally contruct it at the end.  I would
suspect something in the rewrite system could do that pretty easily,
though.  That is the direction I would go with that.

>
> >  However, because it is not really a column like
> > the oid is a column, it should be ok.   Of course, internally it is
> > relid.
> >
> > > 2) Changing the sense of the default for getting inherited tuples.
> > > Currently you only get inherited tuples if you specify "tablename*".
> >
> > Sounds fine to me.  Just realize you are taking on a long-overdue but
> > big job here.
>
> I already have a patch for this one. The change is a few pretty simple
> changes
> to gram.y.

OK, you will have to canvas the general list to make sure this does not
break things for people, though our inheritance system needs an overhaul
badly.

>
> > > 3) The ability to return different types of rows from a SELECT. This
> > > is to allow implementation of ODBMS functionality where a query could
> > > be required to instantiate objects of differing types with differing
> > > attributes.
> >
> > This bothers me.  We return relational data, showing the same number of
> > columns and types for every query.  I don't think we want to change
> > that, even for OO.
>
> What aspects bother you? This is the fundamental important thing about
> object databases.

I fear it is totally against the way our API works.  How does someone
see how many columns in the returned row?

> > How are you going to return that info the the client side?
>
> Well the backend <-> frontend protocol that used to be able to return
> tuples of different types would be put back in.
>
> Also the berkerly postgres docs had other scenarios where different
> tuples
> could be returned. One is you could have a field of type postquel called
> say
> EMP.hobbies which had a value of "retrieve HOBBIES.all where...", and
> then "retrieve
> EMP.hobbies would return tuples of different types of hobbies.

Yikes.  Strange.  Can we just return nulls for the empty fields?

How many new API calls are required?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [SQL] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>>>> So the field is created on the fly to show what table it came from.
>>>> Seems like a good idea, though implementing another usually-invisible
>>>> column will be tough.
>> 
>> What problems do you forsee?

> Well, it is usually pretty strange to carry around a column that doesn't
> exist through all the code and finally contruct it at the end.  I would
> suspect something in the rewrite system could do that pretty easily,
> though.  That is the direction I would go with that.

Yeah.  In fact, since the field is not required except on specific
user request (explicit SELECT, or if you like Chris' SET SHOW_CLASSNAME
idea, that'd still get translated into a SELECT target item at some
pretty early stage), I don't see any need for it to get added to the
HeapTupleHeader fields.  That makes the implementation a *lot* cleaner
because you wouldn't need in-memory HeapTupleHeader to be different from
on-disk headers.  I'm visualizing this as a parameterless function (or
maybe a new primitive expression node type) that gets evaluated during
ExecProject's construction of the output tuple for a a bottom-level
seqscan or indexscan plan node.  The only trick is to persuade the
planner to push it down to the bottom level; normally anything that
isn't a Var gets evaluated at the top of the plan tree.

>>>> This bothers me.  We return relational data, showing the same number of
>>>> columns and types for every query.  I don't think we want to change
>>>> that, even for OO. 

My thought also.  If we had a *real* object orientation, then a returned
column would have an abstract data type that might correspond to an
object supertype.  Of course that just pushes the problem down a level:
how does the application know what methods the returned object has?
How can it even invoke those methods --- whatever code might exist
for them would live on the server, presumably, not get shipped around
in query results.
        regards, tom lane


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
The Hermit Hacker
Date:
On Wed, 2 Feb 2000, Tom Lane wrote:

> > 2) Changing the sense of the default for getting inherited tuples.
> > Currently you only get inherited tuples if you specify "tablename*".
> > This would be changed so that you get all sub-class tuples too by
> > default unless you specify "ONLY tablename". There are several
> > rationale for this. Firstly this is what Illustra/Informix have
> > implemented. Secondly, I believe it is more logical from an OO
> > perspective as well as giving a more useful default.
> 
> Well, mumble.  That would be the cleanest choice if we were designing
> in a green field, but we aren't.  You're talking about breaking every
> single extant Postgres application that uses inheritance, and possibly
> some that don't use it except as a shorthand for making their schemas
> more compact.  (That's not a hypothetical case; I have DBs that use
> schema inheritance but never do SELECT FROM table*.)  I think that's
> a mighty high price to pay for achieving a little more logical
> cleanliness.
> 
> There is also a nontrivial performance penalty that would be paid
> for reversing this default, because then every ordinary SQL query
> would suffer the overhead of looking to see whether there are
> child tables for each table named in the query.  That *really*
> doesn't strike me as a good idea.
> 
> If Illustra were popular enough to have defined an industry standard
> about inheritance, I might think we should follow their lead --- but
> who else has followed their lead?
> 
> In short, I vote for leaving well enough alone.  It's not so badly
> wrong as to be intolerable, and the pain of changing looks high.

Could this be implemented/patched in using #ifdef's, so that you could
configure using --old-style-inheritance so that those that require it
still have it, giving applications a chance to catch up?  

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Tom Lane wrote:

> > 1) An imaginary field in every tuple that tells you the class it came
> This is a good idea, but it seems to me that it'd fit into the system
> traditions better if the pseudo-field gave the OID of the source
> relation.  

This was my initial thought too, but then it occured to me that SQL
doesn't normally deal in oids. For example you don't do a DROP TABLE
oid;

OTOH, oids are probably programmatically useful for things like ODBMSs.

What do you think about having both? I know you can go from one to the 
other by joining with pg_class, but that's too inconvenient, and I can't
make up my mind which is the better "system tradition" either.

I'm not overly fussed on this point though.

> Well, mumble.  That would be the cleanest choice if we were designing
> in a green field, but we aren't.  You're talking about breaking every
> single extant Postgres application that uses inheritance, and possibly
> some that don't use it except as a shorthand for making their schemas
> more compact.  (That's not a hypothetical case; I have DBs that use
> schema inheritance but never do SELECT FROM table*.)  I think that's
> a mighty high price to pay for achieving a little more logical
> cleanliness.

Ok, well compatibility is always a contentious thing. But in your case
you are mis-using the inheritance feature.

The question is, are you willing to do the (simple) changes to your
code to cater for the common good? I'm wanting to make postgresql into a
REAL odbms, and this is a stumbling point that will eventually affect
100x
as many users as it does now (I hope :).

We can also leave the old gram.y for people who want to retain
compatibility
for longer.

> There is also a nontrivial performance penalty that would be paid
> for reversing this default, because then every ordinary SQL query
> would suffer the overhead of looking to see whether there are
> child tables for each table named in the query.  That *really*
> doesn't strike me as a good idea.

I can't comment on what the current performance penalty would be, but 
I'm sure this can be optimised to be a completely trivial overhead.
> If Illustra were popular enough to have defined an industry standard
> about inheritance, I might think we should follow their lead --- but
> who else has followed their lead?

Well Informix of course, which is not small potatoes.
> > I would propose that that anytime you do a SELECT * from a base table
> > that you would get back the full rows from those sub tables.
> 
> Frankly: ugh.  This doesn't square with *my* ideas of object
> inheritance.  When you are dealing with something that ISA person,
> you do not really want to hear about any additional properties it may
> have; you are dealing with it as a person and not at any finer grain of
> detail.  That goes double for dealing with whole collections of persons.
> If you want to examine a particular member of the collection and
> dynamically downcast it to some more-specific type, the proposed
> classname/classoid feature will give you the ability to do that;
> but I think it's a mistake to assume that this should happen by default.

This would be the case if the database were the whole world. But it is
not,
it is a repository for applications written in other languages. How can
you
"dynamically downcast to a more specific type" if the database hasn't
returned
the columns of the more specific type? How can I instantiate a C++
object of
type "Student" if the database has only returned to me the data members
of type
"Person"?

> > Since the current PQ interface which doesn't support this notion would
> > remain unchanged this wouldn't affect current users.
> 
> How would you implement this without actually breaking the current
> PQ interface?

By adding new functions for use when you need to access the extra
columns.

> > It's probably also desirable to have a syntax for getting just the
> > columns of the base table when this is desired. Say perhaps SELECT %
> > from table. This would be a performance hack for users of libpq and a
> > functionality difference for users of psql.
> 
> Again, I think you've got the default backwards.  I remind you also
> of something we've been beating on Peter about: psql is an application
> scripting tool, so you don't get to redefine its behavior at whim,
> anymore than you can change libpq's API at whim.

I am less adamant about the default in this scenario than in the "ONLY
table"
scenario. I'm a bit concerned about the fact that this would break
compatibility with OQL standards, but I can live with this.
> > In addition it would be legal to specify columns that only exist in
> > sub-classes. For example,
> > it would be legal to say...
> >> SELECT *, studentid FROM person;
> 
> Yipes.  I really, really, really DON'T like that one.  At the level
> of table person, studentid is unequivocally an invalid column name.

The reason for this is you need some kind of compromise between seeing
every single column (which overwhelms you in psql) and not seeing any
sub-type columns at all.

> If you do this, you couldn't even guarantee that different subtables
> that had studentid columns would have compatible datatypes for those
> columns.

I think you can because postgres won't let you create sub-types with
column of the same name with incompatible data types. In fact it is
this very fact about postgres that makes this feature feasible. 

> > SELECT * FROM person;
> 
> > OID  CLASSNAME NAME
> > -------------------
> > 2344 person    Fred
> > 3445 person    Bill
> 
> > OID  CLASSNAME NAME | STUDENTID | FACULTY
> > -----------------------------------------
> > 2355 student   Jim  | 23455     | Science
> > 5655 student   Chris| 45666     | Arts
> 
> This is not too hard for a person to make sense of, but I think that
> it'd be mighty unwieldy for a program to deal with.  What would the
> libpq-like interface look like, and what would a typical client
> routine look like?

The PQ interface would have a new function something like
PQnfieldsv(PQresult,tuplenum),
so it returns a different number for each tuple.

But the real benefit is not writing "unwieldy" code in C, but ODBMS
style code where you 
can go...
List<Shape> l = query("SELECT * FROM shape");
Shape *s;
for (l.begin(); s = l.get(); l.next())     s.display();

Because if the dbms returns ALL the columns, a C++ runtime system can
properly instantiate subtypes and use polymorphism.


Re: [GENERAL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Bruce Momjian wrote:

> > I already have a patch for this one. The change is a few pretty simple
> > changes
> > to gram.y.
>
> OK, you will have to canvas the general list to make sure this does not
> break things for people, though our inheritance system needs an overhaul
> badly.

This is already CCed to the general list.

> I fear it is totally against the way our API works.  How does someone
> see how many columns in the returned row?

A new API PQnfieldsv(PQresult, tupnum) or some such.

> Yikes.  Strange.

Strange for C code perhaps. Very useful for constructing real objects in
OO application code framework.

> Can we just return nulls for the empty fields?

Well, I think we should probably distinguish between a field that is
null,
and a field that simply doesn't exist.

> How many new API calls are required?

Perhaps just the one. (above).

Re: [HACKERS] Re: [SQL] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> Yeah.  In fact, since the field is not required except on specific
> user request (explicit SELECT, or if you like Chris' SET SHOW_CLASSNAME
> idea, that'd still get translated into a SELECT target item at some
> pretty early stage), I don't see any need for it to get added to the
> HeapTupleHeader fields.  That makes the implementation a *lot* cleaner
> because you wouldn't need in-memory HeapTupleHeader to be different from
> on-disk headers.  I'm visualizing this as a parameterless function (or
> maybe a new primitive expression node type) that gets evaluated during
> ExecProject's construction of the output tuple for a a bottom-level
> seqscan or indexscan plan node.  The only trick is to persuade the
> planner to push it down to the bottom level; normally anything that
> isn't a Var gets evaluated at the top of the plan tree.

Yes, I agree this is a good way to do it.

> >>>> This bothers me.  We return relational data, showing the same number of
> >>>> columns and types for every query.  I don't think we want to change
> >>>> that, even for OO. 
> 
> My thought also.  If we had a *real* object orientation, then a returned
> column would have an abstract data type that might correspond to an
> object supertype.  Of course that just pushes the problem down a level:
> how does the application know what methods the returned object has?
> How can it even invoke those methods --- whatever code might exist
> for them would live on the server, presumably, not get shipped around
> in query results.

Agreed.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [SQL] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Tom Lane wrote:
> >>>> This bothers me.  We return relational data, showing the same number of
> >>>> columns and types for every query.  I don't think we want to change
> >>>> that, even for OO.
> 
> My thought also.  If we had a *real* object orientation, then a returned
> column would have an abstract data type that might correspond to an
> object supertype.  Of course that just pushes the problem down a level:
> how does the application know what methods the returned object has?
> How can it even invoke those methods --- whatever code might exist
> for them would live on the server, presumably, not get shipped around
> in query results.

In (most) ODBMSes, the code for a class does NOT live in the database
server. (How
would you store a C++ binary in a database?).

What happens is when a query returns an object, some magic behind the
scenes
checks the type of the returned object (thus the need for the
"classname" column
or similar.) The magic behind the scenes then instantiates a C++ object
of
the correct class and populates all the data members from the query
results.

The application code is then free to make polymorphic calls on the
object
because ALL the fields are populated, not just those of the base class.


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
The Hermit Hacker wrote:

> Could this be implemented/patched in using #ifdef's, so that you could
> configure using --old-style-inheritance so that those that require it
> still have it, giving applications a chance to catch up?

Sounds like an excellent idea, although I'm not sure how to ifdef a .y
bison file.


Re: [GENERAL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
I can live with this.  Thanks.


> Bruce Momjian wrote:
>
> > > I already have a patch for this one. The change is a few pretty simple
> > > changes
> > > to gram.y.
> >
> > OK, you will have to canvas the general list to make sure this does not
> > break things for people, though our inheritance system needs an overhaul
> > badly.
>
> This is already CCed to the general list.
>
> > I fear it is totally against the way our API works.  How does someone
> > see how many columns in the returned row?
>
> A new API PQnfieldsv(PQresult, tupnum) or some such.
>
> > Yikes.  Strange.
>
> Strange for C code perhaps. Very useful for constructing real objects in
> OO application code framework.
>
> > Can we just return nulls for the empty fields?
>
> Well, I think we should probably distinguish between a field that is
> null,
> and a field that simply doesn't exist.
>
> > How many new API calls are required?
>
> Perhaps just the one. (above).
>


--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [SQL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> Tom Lane wrote:
> 
> > > 1) An imaginary field in every tuple that tells you the class it came
> > This is a good idea, but it seems to me that it'd fit into the system
> > traditions better if the pseudo-field gave the OID of the source
> > relation.  
> 
> This was my initial thought too, but then it occured to me that SQL
> doesn't normally deal in oids. For example you don't do a DROP TABLE
> oid;
> 
> OTOH, oids are probably programmatically useful for things like ODBMSs.
> 
> What do you think about having both? I know you can go from one to the 
> other by joining with pg_class, but that's too inconvenient, and I can't
> make up my mind which is the better "system tradition" either.

Sure, let them have both.  Why not, or you could force them to join to
pg_class for the name.  That would work too.

> Ok, well compatibility is always a contentious thing. But in your case
> you are mis-using the inheritance feature.
> 
> The question is, are you willing to do the (simple) changes to your
> code to cater for the common good? I'm wanting to make postgresql into a
> REAL odbms, and this is a stumbling point that will eventually affect
> 100x
> as many users as it does now (I hope :).
> 
> We can also leave the old gram.y for people who want to retain
> compatibility
> for longer.

I would canvas the list to find out how many people object, and if there
are few, you may be able to get away with something in config.h.in that
they can change if they want the old behavour.

> > > Since the current PQ interface which doesn't support this notion would
> > > remain unchanged this wouldn't affect current users.
> > 
> > How would you implement this without actually breaking the current
> > PQ interface?
> 
> By adding new functions for use when you need to access the extra
> columns.

Whatever it is, the API has to be lean and clean.

I saw your PQnfieldsv, and that looks fine to me.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Tom Lane wrote:

> Again, I think you've got the default backwards.  I remind you also
> of something we've been beating on Peter about: psql is an application
> scripting tool, so you don't get to redefine its behavior at whim,
> anymore than you can change libpq's API at whim.

If this is the only objection, we could make the old behaviour available
by a SET command, as well as a command-line switch, as well as a 
./configure option.

I hope we can get the best design here possible without over-emphasis
on compatibility.


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> Tom Lane wrote:
> 
> > Again, I think you've got the default backwards.  I remind you also
> > of something we've been beating on Peter about: psql is an application
> > scripting tool, so you don't get to redefine its behavior at whim,
> > anymore than you can change libpq's API at whim.
> 
> If this is the only objection, we could make the old behaviour available
> by a SET command, as well as a command-line switch, as well as a 
> ./configure option.
> 
> I hope we can get the best design here possible without over-emphasis
> on compatibility.

SET command is probably the best.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Don Baccus
Date:
At 09:55 PM 2/2/00 -0500, Tom Lane wrote:

>There is also a nontrivial performance penalty that would be paid
>for reversing this default, because then every ordinary SQL query
>would suffer the overhead of looking to see whether there are
>child tables for each table named in the query.  That *really*
>doesn't strike me as a good idea.

Thank you for pointing this out, because my first reaction to
the proposal was "what's the overhead for SQL users"?

Given the stated goals of becoming a fast, efficient, reliable
SQL engine, this has to be a crucial consideration.

On the other hand, as someone who once made his living off his 
designed and implemented optimizing multi-language, multi-platform
compiler technology...is it entirely out of the question to 
consider more greatly abstracting the language (gram.y/analyze.c)
and backend (optimizer and executor) interfaces so more than one
front-end could exist (even if only in experimental and research
environments)?  Along with front-end specific versions of libpq?

These front-ends wouldn't necessarily need to be supported by
the mainstream PG development group, except to support a defined
and sufficiently abstract interface to the optimization/planning and
executing guts of the system so that folks could mess around to
their heart's content.  And bear the burden of doing so if they
pick up users :)

Just a thought...

>> I would propose that that anytime you do a SELECT * from a base table
>> that you would get back the full rows from those sub tables.
>
>Frankly: ugh.  This doesn't square with *my* ideas of object
>inheritance.

Nor mine, in fact the stuff I've seen about primitive OO in databases
make me thing the folks just don't get it.

Not to mention that I'm not convinced that "getting it" is worth it.  OO
fits some paradigms, not others, when programming in the large.  And 
most database stuff is really programming in the small (the query parts,
the data is often huge, of course).  The notion of asking a query, as
in (say) psql is more related to the notion of typing a few lines at
BASIC than the notion of writing a few million lines of integrated 
code.  In database design, even more so than in conventional programming,
it is the data model that reigns supreme and the actual size tends to
be manageable, though the models themselves can be very complex.

I offer this as a reason why commercial DB users are more concerned 
with things like performance, scalability, and the like than with
reworking of the RDBMS paradigm.  Complaints about queries seem to
place heavy emphasis on "why they are slow", and the OO paradigm
doesn't help here.  I'm not certain that psuedo-OO features help.

One reason I raise the issue of possible multiple front-ends (or making
it easy for folks to make there own by making the parser->optimizer/backend
interface more general) is that this whole area would seem to be one 
that begs for RESEARCH and experimentalism.

The reality, AFAIK, is that in the crucible of commercial use, real
OO databases and thinking simply haven't penetrated.  

Nor is Postgres written in C++ :)  (GOOD decision to abandon that
thought, IMO, though at the moment I'm working on C++ tools for
my current client).



>Again, I think you've got the default backwards.  I remind you also
>of something we've been beating on Peter about: psql is an application
>scripting tool, so you don't get to redefine its behavior at whim,
>anymore than you can change libpq's API at whim.

Yeah, this is VERY important.



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Don Baccus wrote:

> Given the stated goals of becoming a fast, efficient, reliable
> SQL engine, this has to be a crucial consideration.

I'm sure this can be made fast.

> On the other hand, as someone who once made his living off his
> designed and implemented optimizing multi-language, multi-platform
> compiler technology...is it entirely out of the question to
> consider more greatly abstracting the language (gram.y/analyze.c)
> and backend (optimizer and executor) interfaces so more than one
> front-end could exist (even if only in experimental and research
> environments)?  Along with front-end specific versions of libpq?

A good thought, but we still need one good front end that supports
all the features.

> >> I would propose that that anytime you do a SELECT * from a base table
> >> that you would get back the full rows from those sub tables.
> >
> >Frankly: ugh.  This doesn't square with *my* ideas of object
> >inheritance.
> 
> Nor mine, in fact the stuff I've seen about primitive OO in databases
> make me thing the folks just don't get it.
> 
> Not to mention that I'm not convinced that "getting it" is worth it.  OO
> fits some paradigms, not others, when programming in the large.

Well, the features I'm talking about don't affect you unless you want
OO.

> And
> most database stuff is really programming in the small (the query parts,
> the data is often huge, of course).  The notion of asking a query, as
> in (say) psql is more related to the notion of typing a few lines at
> BASIC than the notion of writing a few million lines of integrated
> code.  In database design, even more so than in conventional programming,
> it is the data model that reigns supreme and the actual size tends to
> be manageable, though the models themselves can be very complex.

And as those models become so complex it is crucial that the data-model
that
"reigns supreme" is properly integrated with the programming language.

For example, in an IBM Java project I'm working on there is 15000 lines
of 
code that converts about 10 or so  SQL tables into Java objects. Insane
stuff.

> I offer this as a reason why commercial DB users are more concerned
> with things like performance, scalability, and the like than with
> reworking of the RDBMS paradigm. 

Actually developers are very interested in supporting the ODBMS paradigm
as you can see from the Sun proposed standard for RDBMS interface which
is an exact copy of the ODMG ODBMS interface standard.

In fact I think about 90% of "stuff" is best solved with an ODBMS
style of interaction. The trouble is that most ODBMS don't do the other
10% very well (i.e. wierd and wonderful queries), which is where
postgresql _could_ be the ultimate at solving both.

> Complaints about queries seem to
> place heavy emphasis on "why they are slow", and the OO paradigm
> doesn't help here.

Huh? The OO paradigm helps heaps here because you can model something
with a far smaller number of tables.

>  I'm not certain that psuedo-OO features help.

Don't know what a pseudo-OO feature is.

> One reason I raise the issue of possible multiple front-ends (or making
> it easy for folks to make there own by making the parser->optimizer/backend
> interface more general) is that this whole area would seem to be one
> that begs for RESEARCH and experimentalism.

No research is required. I simply want to implement the ODMG STANDARD
for ODBMS databases on PostgreSQL. There are no great design issues
here,
just a matter of nailing down the details so that everyone can live 
with them.

> The reality, AFAIK, is that in the crucible of commercial use, real
> OO databases and thinking simply haven't penetrated.

Not really true. In certain areas ODBMSes are pervasive. For example
many
Telco companies use ODBMSes for the majority of their stuff. It's
necessary
to get the performance they need. Also of course CAD apps can only use
an ODBMS.

No offence, but you havn't actually used one have you?


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Don Baccus wrote:
> 
> At 09:55 PM 2/2/00 -0500, Tom Lane wrote:
> 
> >There is also a nontrivial performance penalty that would be paid
> >for reversing this default, because then every ordinary SQL query
> >would suffer the overhead of looking to see whether there are
> >child tables for each table named in the query.  That *really*
> >doesn't strike me as a good idea.
> 
> Thank you for pointing this out, because my first reaction to
> the proposal was "what's the overhead for SQL users"?


I just did a performance check on this. I found that the overhead
is one tenth of a millisecond on a Sun desktop workstation. Pretty
trivial, and I'm sure it can be improved.


Re: [SQL] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Mathijs Brands
Date:
On Wed, Feb 02, 2000 at 09:57:48PM -0500, Bruce Momjian allegedly wrote:
> > > > 3) The ability to return different types of rows from a SELECT. This
> > > > is to allow implementation of ODBMS functionality where a query could
> > > > be required to instantiate objects of differing types with differing
> > > > attributes.
> > >
> > > This bothers me.  We return relational data, showing the same number of
> > > columns and types for every query.  I don't think we want to change
> > > that, even for OO.
> >
> > What aspects bother you? This is the fundamental important thing about
> > object databases.
>
> I fear it is totally against the way our API works.  How does someone
> see how many columns in the returned row?

This would probably break applications written in PHP and Perl (and
possibly others) that have their queryresults returned to them in a
numerically indexed array (index by offset). If this behaviour could
be turned off, than it shouldn't be a problem.

Mathijs

Re: [SQL] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Mathijs Brands wrote:
>
> On Wed, Feb 02, 2000 at 09:57:48PM -0500, Bruce Momjian allegedly wrote:
> > > > > 3) The ability to return different types of rows from a SELECT. This
> > > > > is to allow implementation of ODBMS functionality where a query could
> > > > > be required to instantiate objects of differing types with differing
> > > > > attributes.
> > > >
> > > > This bothers me.  We return relational data, showing the same number of
> > > > columns and types for every query.  I don't think we want to change
> > > > that, even for OO.
> > >
> > > What aspects bother you? This is the fundamental important thing about
> > > object databases.
> >
> > I fear it is totally against the way our API works.  How does someone
> > see how many columns in the returned row?
>
> This would probably break applications written in PHP and Perl (and
> possibly others) that have their queryresults returned to them in a
> numerically indexed array (index by offset). If this behaviour could
> be turned off, than it shouldn't be a problem.

It wouldn't affect them because the current APIs would continue to
return
the same base-level columns. You would only get access to the extra
columns
with a new API.

Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> Tom Lane wrote:
>>>> 1) An imaginary field in every tuple that tells you the class it came
>> This is a good idea, but it seems to me that it'd fit into the system
>> traditions better if the pseudo-field gave the OID of the source
>> relation.  

> What do you think about having both? I know you can go from one to the 
> other by joining with pg_class, but that's too inconvenient, and I can't
> make up my mind which is the better "system tradition" either.

If we can implement it as I sketched before, there's no reason not to
offer both, since either one would create zero overhead for any query
not using the feature.

I'll comment on the other issues later ... but I will say that I don't
think it's acceptable to add *any* overhead to standard-SQL queries
in order to support inheritance better.  The vast majority of our users
want SQL performance and don't give a damn about inheritance.  We have
to pay attention to that.
        regards, tom lane


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris
Date:
Tom Lane wrote:

> I'll comment on the other issues later ... but I will say that I don't
> think it's acceptable to add *any* overhead to standard-SQL queries
> in order to support inheritance better.  The vast majority of our users
> want SQL performance and don't give a damn about inheritance.  We have
> to pay attention to that.

Well I see that pg_class has columns like "relhasindex". If we added a
"relhassubclass", the overhead should be unmeasureable.


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Don Baccus wrote:
> 
> 
> The reality, AFAIK, is that in the crucible of commercial use, real
> OO databases and thinking simply haven't penetrated.

AFAIK Informix integrated most OO features from Illustra into their UDB
and also latest versions of Oracle have moved a lot in that direction too.
> Nor is Postgres written in C++ :)

what does C++ have to do with OO ;)

----------------------
Hannu


Re: [HACKERS] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Bruce Momjian wrote:
>
> > Bruce Momjian wrote:
> >
> > > So the field is created on the fly to show what table it came from.
> > > Seems like a good idea, though implementing another usually-invisible
> > > column will be tough.
> >
> > What problems do you forsee?
>
> Well, it is usually pretty strange to carry around a column that doesn't
> exist through all the code and finally contruct it at the end.  I would
> suspect something in the rewrite system could do that pretty easily,
> though.  That is the direction I would go with that.
>

Oracle has a ROWNR (IIRC) pseudo-column that is added in th every end of
query and is a convienient way to put numbers on report rows (among other
things).

------------
Hannu

Re: [HACKERS] Re: [GENERAL] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Bruce Momjian wrote:
>
>
> Yikes.  Strange.  Can we just return nulls for the empty fields?

I think more natural way would be to define a new type (NAF - NotAFiled),
like we have NAN for floats (do we ?, at least IEEE has)

-----------------
Hannu

Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Chris Bitmead wrote:
> 
> Tom Lane wrote:
> 
> > > 1) An imaginary field in every tuple that tells you the class it came
> > This is a good idea, but it seems to me that it'd fit into the system
> > traditions better if the pseudo-field gave the OID of the source
> > relation.
> 
> This was my initial thought too, but then it occured to me that SQL
> doesn't normally deal in oids. For example you don't do a DROP TABLE
> oid;

DROP TABLE (SELECT relname FROM pg_class WHERE oid=the_oid);

would be cool ;)

> > > I would propose that that anytime you do a SELECT * from a base table
> > > that you would get back the full rows from those sub tables.

Maybe SELECT ** FROM BASE would be more flexible as it leaves the standard 
SQL with its "standard" meaning ?

> > Frankly: ugh.  This doesn't square with *my* ideas of object
> > inheritance.  When you are dealing with something that ISA person,
> > you do not really want to hear about any additional properties it may
> > have; you are dealing with it as a person and not at any finer grain of
> > detail.  That goes double for dealing with whole collections of persons.
> > If you want to examine a particular member of the collection and
> > dynamically downcast it to some more-specific type, the proposed
> > classname/classoid feature will give you the ability to do that;
> > but I think it's a mistake to assume that this should happen by default.
> 
> This would be the case if the database were the whole world. But it is
> not,
> it is a repository for applications written in other languages. How can
> you
> "dynamically downcast to a more specific type" if the database hasn't
> returned
> the columns of the more specific type? How can I instantiate a C++
> object of
> type "Student" if the database has only returned to me the data members
> of type
> "Person"?

You could do as some DB's (IIRC Oracle) do with large objects - return the 
whole row if doing a select that has many rows.

return just a handle when going over a cursor with FETCH 1 and then have 
calls to get the rest.

We will have to change the API sometime not too distant anyway, the current 
api is unable to deal with anything that does not have a nice textual 
representation (like an image or sound) in spite of all the talks about 
easy extensibility - the extensibility is all in the backend, ther is no 
easy way to get new datatypes in/out.

---------------
Hannu


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris
Date:
Hannu Krosing wrote:

> Maybe SELECT ** FROM BASE would be more flexible as it leaves the standard
> SQL with its "standard" meaning ?

That was my first thought and it's definitely a possibility. My argument
against it is that SQL doesn't have a "standard meaning" in the case of
inheritance, and ** is an incompatibility with OQL.

I suspect we need both. Something like 
SET GET_INHERITED_COLUMNS true; etc. 
> We will have to change the API sometime not too distant anyway, the current
> api is unable to deal with anything that does not have a nice textual
> representation (like an image or sound) in spite of all the talks about
> easy extensibility - the extensibility is all in the backend, ther is no
> easy way to get new datatypes in/out.

What about PQbinaryTuples() and friends?


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> Don Baccus wrote:
> > 
> > At 09:55 PM 2/2/00 -0500, Tom Lane wrote:
> > 
> > >There is also a nontrivial performance penalty that would be paid
> > >for reversing this default, because then every ordinary SQL query
> > >would suffer the overhead of looking to see whether there are
> > >child tables for each table named in the query.  That *really*
> > >doesn't strike me as a good idea.
> > 
> > Thank you for pointing this out, because my first reaction to
> > the proposal was "what's the overhead for SQL users"?
> 
> 
> I just did a performance check on this. I found that the overhead
> is one tenth of a millisecond on a Sun desktop workstation. Pretty
> trivial, and I'm sure it can be improved.

Good point.  Has to be non-mearurable performance penalty because most
people don't use it.  Maybe you will need a system cache entry for this.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> Tom Lane wrote:
> 
> > I'll comment on the other issues later ... but I will say that I don't
> > think it's acceptable to add *any* overhead to standard-SQL queries
> > in order to support inheritance better.  The vast majority of our users
> > want SQL performance and don't give a damn about inheritance.  We have
> > to pay attention to that.
> 
> Well I see that pg_class has columns like "relhasindex". If we added a
> "relhassubclass", the overhead should be unmeasureable.

Yes, but how do you keep that accurate?  If I add indexes, then drop
them, does relhasindex go to false.  Could you do that for relhassubclass?

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris
Date:
Bruce Momjian wrote:

> > Well I see that pg_class has columns like "relhasindex". If we added a
> > "relhassubclass", the overhead should be unmeasureable.
> 
> Yes, but how do you keep that accurate?  If I add indexes, then drop
> them, does relhasindex go to false. 

I don't know. Does it? 

>Could you do that for relhassubclass?

If we made it relnumsubclasses and incremented/decremented on
CREATE/DROP, it seems easy in theory.


-- 
Chris Bitmead
mailto:chris@bitmead.com
http://www.techphoto.org - Photography News, Stuff that Matters


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> Bruce Momjian wrote:
> 
> > > Well I see that pg_class has columns like "relhasindex". If we added a
> > > "relhassubclass", the overhead should be unmeasureable.
> > 
> > Yes, but how do you keep that accurate?  If I add indexes, then drop
> > them, does relhasindex go to false. 
> 
> I don't know. Does it? 

Oops:test=> create table test(x int);CREATEtest=> create index i_test on test(x);CREATEtest=> select relhasindex from
pg_classwhere relname = 'test'; relhasindex ------------- t(1 row)
 
test=> drop index i_test;DROPtest=> select relhasindex from pg_class where relname = 'test'; relhasindex -------------
t(1row)
 

Let me add that to the TODO list.

> 
> >Could you do that for relhassubclass?
> 
> If we made it relnumsubclasses and incremented/decremented on
> CREATE/DROP, it seems easy in theory.

Yes, that would work.  Seems hasindex has problems.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris
Date:
Bruce Momjian wrote:

> Oops:

>         test=> drop index i_test;
>         DROP
>         test=> select relhasindex from pg_class where relname = 'test';
>          relhasindex
>         -------------
>          t
>         (1 row)
> 
> Let me add that to the TODO list.

Why not change that to a relnumindexes as well? Easier to maintain and
more useful information.

> > >Could you do that for relhassubclass?
> >
> > If we made it relnumsubclasses and incremented/decremented on
> > CREATE/DROP, it seems easy in theory.
> 
> Yes, that would work.  Seems hasindex has problems.


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Bruce Momjian
Date:
> > Let me add that to the TODO list.
> 
> Why not change that to a relnumindexes as well? Easier to maintain and
> more useful information.

Yes, we probably should do that, but I bet some interfaces us it. 
Comments?

Actually, looks like only pg_dump uses it, so maybe we would be OK.
Maybe 7.0 is a good time to fix this.

> 
> > > >Could you do that for relhassubclass?
> > >
> > > If we made it relnumsubclasses and incremented/decremented on
> > > CREATE/DROP, it seems easy in theory.
> > 
> > Yes, that would work.  Seems hasindex has problems.
> 
> ************
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
"Mark Hollomon"
Date:
> [ discussion on changing the default to getting subclasses ]

I object.

How about a set variable?

SET GETSUBCLASSES = true

With the '*' and ONLY being explicit overrides to the setting
of the variable. The default would be 'false'. I would not
object to a configuration switch that would change the
default.
-- 

Mark Hollomon
mhh@nortelnetworks.com
ESN 451-9008 (302)454-9008


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Don Baccus
Date:
At 12:00 PM 2/3/00 +0200, Hannu Krosing wrote:
>Don Baccus wrote:

>what does C++ have to do with OO ;)

Nothing, but don't tell them :)  Having worked on C++ compilers,
don't get me started on THAT subject!




- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Don Baccus
Date:
At 04:38 PM 2/3/00 +1100, Chris Bitmead wrote:
>Don Baccus wrote:

>> On the other hand, as someone who once made his living off his
>> designed and implemented optimizing multi-language, multi-platform
>> compiler technology...is it entirely out of the question to
>> consider more greatly abstracting the language (gram.y/analyze.c)
>> and backend (optimizer and executor) interfaces so more than one
>> front-end could exist (even if only in experimental and research
>> environments)?  Along with front-end specific versions of libpq?
>
>A good thought, but we still need one good front end that supports
>all the features.

I wasn't think in terms of this being mutually exclusive with your
desires.  Merely raising up the notion that the possibility exists
of creating a sandbox, so to speak, for people to play in, a tool
for the exploration of such concepts.

>> Nor mine, in fact the stuff I've seen about primitive OO in databases
>> make me thing the folks just don't get it.
>> 
>> Not to mention that I'm not convinced that "getting it" is worth it.  OO
>> fits some paradigms, not others, when programming in the large.
>
>Well, the features I'm talking about don't affect you unless you want
>OO.

No, and I wasn't arguing that you shouldn't move forward, either.  I
was just stating my personal opinion regarding the utility of simple
OO-ish features, that's all.

>> One reason I raise the issue of possible multiple front-ends (or making
>> it easy for folks to make there own by making the parser->optimizer/backend
>> interface more general) is that this whole area would seem to be one
>> that begs for RESEARCH and experimentalism.
>
>No research is required. I simply want to implement the ODMG STANDARD
>for ODBMS databases on PostgreSQL. There are no great design issues
>here,
>just a matter of nailing down the details so that everyone can live 
>with them.

Well...that's sorta like saying no research into procedural language
design is needed 'cause now we've got C++.

Whether or not the existing standard for ODBMS is the greatest thing
since sliced bread, I find it hard to believe that no research is
required or design issues raised by the fundamental problems of 
database technology.

Maybe I'm wrong, though, maybe the problem's been solved.



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Tom Lane
Date:
Chris <chris@bitmead.com> writes:
> Why not change that to a relnumindexes as well? Easier to maintain and
> more useful information.

Maintaining an accurate count of descendants (or indexes for that
matter) would be expensive; in particular, it'd create severe
concurrency problems.  If one transaction is in the middle of creating
or dropping a child C of table P, then all other transactions would be
blocked from creating or dropping any other children of P until the C
transaction commits or aborts.  They'd have to wait or they wouldn't
know what to set relnumchildren to.

For the purpose at hand, I think it would be OK to have a
"relhaschildren" field that is set true when the first child is created
and then never changed.  If you have a table that once had children but
has none at the moment, then you pay the price of looking through
pg_inherits; but the case that we're really concerned about (a pure SQL,
no-inheritance table) would still win.

Not sure whether we can concurrently create/delete indexes on a rel,
but I'd be inclined to leave relhasindexes alone: again its main
function in life is to let you short-circuit looking for indexes on
a table that's never had and never will have any.
        regards, tom lane


Re: [HACKERS] Proposed Changes to PostgreSQL

From
Peter Eisentraut
Date:
While I think that these kinds of changes are a No Go because they'd break
a lot of applications (including mine), IF (big if) you really want to
make major changes to the inheritance scheme, I got a few ideas.

First let me say that I like the conceptual simplicity of relational
databases. Some or all of the ideas thrown around here break with
simplicity and consistency, by suggesting, e.g., that some commands be
allowed only on entire inheritance structures, while others be allowed on
individual tables, and attached to it a discussion which ones those should
be. That doesn't strike me as too promising.

A lot of people use inheritance to create "consistent schemas", that is,
they empty create base tables, such as "address" which are inherited by
tables such as customer, vendor, office, etc. That is probably not what
inheritance is for, perhaps it should be some sort of a macro-like
concept, such as create table vendor (name text,
copy_schema_from(address), more fields), expanded by the parser. This is
pretty much what it does now, only this scheme wouldn't have to actually
store the (useless) inheritance link.

Anyway, an idea I had would be to reimplement inheritance based on joins,
since this is what the "pure relational" solution would be anyway. When I
create a table B that is based on A, all the system does is create the
table B as usual and store a note "I inherit from A". Any row you insert
into B also creates a row in A, and the row in B contains an oid pointer
to it. Thus a select on B performs a join on A.oid and B.row_in_A_pointer.
A select on A just returns all the rows in A, no extras needed. A delete
on B deletes the row in B and in A. A delete in A would cascade to B. Both
of this can be gotten for free with foreign keys. Adding a column to A
just adds the column to A, all other tables get the new column magically
and in the right order. Same with dropping columns, etc.

In short, this approach solves all inheritance problems at once and does
so without adding any extra kludges besides the "I inherited from" field,
which is static, plus the necessary transformations necessary in the
parser. The drawback is of course that a select from an inherited table
would always incur a join, perhaps some optimizing could be done in this
direction. But the bottom line is that the compatibility issue looms big.

    -Peter


On Thu, 3 Feb 2000, Chris Bitmead wrote:

> Hi,
>
> I've been spending a lot of time lately with gdb and tracing the
> back-end seeing if I can understand it enough to make some changes.
> I'm starting to actually understand a lot of stuff, so in order
> to have some possibility of having my changes accepted, I want to
> discuss
> them here first. Based on that, I'm going to hopefully make an attempt
> at implementation. I have a patch for one of these changes already
> if I get the go ahead.
>
> THESE CHANGES DON'T AFFECT YOU IF YOU DON'T USE INHERITANCE.
>
> Speak now about these changes or please, forever hold your peace. Of
> course you can comment later if I screw up implementation.
>
> The proposed changes are....
>
> 1) An imaginary field in every tuple that tells you the class it came
> from.
> This is useful when you select from table* and want to know which
> relation the object actually came from. It wouldn't be stored on disk,
> and like oid it wouldn't be displayed when you do SELECT *. The field
> would be called classname. So you could have...
> SELECT p.classname, p.name FROM person p;
> person   | Fred
> student  | Bill
> employee | Jim
> person   | Chris
>
> If you want to know the exact behaviour it is as if every table in the
> database had done to it...
> ALTER TABLE foo ADD COLUMN classname TEXT;
> UPDATE foo SET classname='foo';
>
> Of course this is not how it would be implemented. It is just
> reference for how it will appear to work. BTW, this idea was also
> in the original berkeley design notes.
>
> 2) Changing the sense of the default for getting inherited tuples.
> Currently you only get inherited tuples if you specify "tablename*".
> This would be changed so that you get all sub-class tuples too by
> default unless you specify "ONLY tablename". There are several
> rationale for this. Firstly this is what Illustra/Informix have
> implemented. Secondly, I believe it is more logical from an OO
> perspective as well as giving a more useful default. If a politician
> IS a person and I say SELECT * from person, then logically I should
> see all the politicians because they are people too (so they claim
> :). Thirdly, there are a whole range of SQL statements that should
> probably be disallowed without including sub-classes. e.g. an ALTER
> TABLE ADD COLUMN that does not include sub-classes is almost certainly
> undesirable. It seems ashame to have to resort to non-standard SQL
> with the "*" syntax in this case when it is really your only
> choice. Basicly, wanting ONLY a classname is a far more unusual
> choice, and leaving off the "*" is a common error. Fourthly, it seems
> out of character for the SQL language to have this single character
> operator. The SQL style is to use wordy descriptions of the operators
> meaning. "ONLY" fits well here because it describes its own meaning
> perfectly whereas to the unitiated, "*" is harder to guess at. While
> this change is an incompatibility I hope for those few people using
> inheritance they can accept the need to move forward without
> over-burden of backwards compatibility.
>
> 3) The ability to return different types of rows from a SELECT. This
> is to allow implementation of ODBMS functionality where a query could
> be required to instantiate objects of differing types with differing
> attributes.
>
> I would propose that that anytime you do a SELECT * from a base table
> that you would get back the full rows from those sub tables. Since the
> current PQ interface which doesn't support this notion would remain
> unchanged this wouldn't affect current users.
>
> It's probably also desirable to have a syntax for getting just the
> columns of the base table when this is desired. Say perhaps SELECT %
> from table. This would be a performance hack for users of libpq and a
> functionality difference for users of psql.
>
> The reason I think the "*" syntax should take on the new functionality
> is because it would be more consistent with what the OQL (object query
> language) standard specifies, and also because it seems the more
> useful default. Also there is no compatibility reason not to do it.
>
> In addition it would be legal to specify columns that only exist in
> sub-classes. For example, if we had
>
> CREATE TABLE person (name TEXT);
> CREATE TABLE student (studentid TEXT, faculty TEXT) INHERITS (person);
>
> it would be legal to say...
> > SELECT * FROM person;
> NAME
> ----
> Fred
> Bill
>
> NAME | STUDENTID | FACULTY
> --------------------------
> Jim  | 23455     | Science
> Chris| 45666     | Arts
>
> > SELECT *, studentid FROM person;
> NAME
> ----
> Fred
> Bill
>
> NAME | STUDENTID
> ----------------
> Jim  | 23455
> Chris| 45666
>
> > SELECT *, studentid FROM ONLY person;
> ERROR: person does not contain studentid.
>
> > SELECT % FROM person;
> NAME
> ----
> Fred
> Bill
> Jim
> Chris
>
> As you can see, it is desirable that psql be modified to be able to
> print these differing tuple types. Presumably new column headings will
> be printed when a tuple is differing to the previous one. Likely it
> will be often desirable to do a
> SELECT * FROM person p ORDER BY p.classname;
> in order to have all the tuples of a particular type grouped together.
>
> In addition some extenions will be done to the PQ interface to support
> these differing return types. The current PQ interface will be left
> unchanged and backwards compatible for retrieving rows of a single
> type.
>
> Also there should be an settable option that specifies that "*" should
> also return the normally ignored columns of oid and classname. This is
> so that OO programs that embed SQL into them also get back the oid and
> classname which are required for the behind the scenes implementation
> of an ODMG client. Something like...
>
> SET SHOW_OID TRUE;
> SHOW_CLASSNAME TRUE;
>
> SELECT * FROM person;
>
> OID  CLASSNAME NAME
> -------------------
> 2344 person    Fred
> 3445 person    Bill
>
> OID  CLASSNAME NAME | STUDENTID | FACULTY
> -----------------------------------------
> 2355 student   Jim  | 23455     | Science
> 5655 student   Chris| 45666     | Arts
>
> ************
>
>

--
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Taral
Date:
On Thu, 3 Feb 2000, Tom Lane wrote:

> Maintaining an accurate count of descendants (or indexes for that
> matter) would be expensive; in particular, it'd create severe
> concurrency problems.

What about fixing these things on VACUUM then?

Taral



Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Tom Lane
Date:
"Mark Hollomon" <mhh@nortelnetworks.com> writes:
> How about a set variable?

> SET GETSUBCLASSES = true

> With the '*' and ONLY being explicit overrides to the setting
> of the variable. The default would be 'false'.

I like that a lot.  Clean, flexible, doesn't break any existing
applications.

Perhaps the business of whether to fetch extra columns from subclasses
could be done similarly.  I am beginning to understand why Chris wants
to do that, and I see that it would support a particular style of
database programming very nicely.  But I really fail to see why it's
necessary to change the default behavior to cater to those apps rather
than existing ones.  Let the new apps use a variant syntax; don't
expect people to change existing code in order to avoid getting tripped
up by a new feature.

Note that "oh they won't see the extra columns if they're using an
old API" doesn't answer my objection.  I'm concerned about the
performance hit from fetching those columns and transferring them to
the client, as well as the memory hit of storing them in query results
on the client side.  We should *not* set things up in such a way that
that happens by default when the client didn't ask for it and isn't
even using an API that can support it.  That's why it'd be a mistake
to redefine the existing query syntax to act this way.

The suggestion of "SELECT ** FROM ..." sounds pretty good to me,
actually.  I don't really see any need for changing the behavior of
anything that looks like a standard SQL query.  Applications that
need this feature will know that they need it and can issue a query
that specifically requests it.

> I would not object to a configuration switch that would change the
> default.

Mmm, I think that would probably not be such a hot idea.  That would
introduce a pretty fundamental semantics incompatibility between
different installations, which would hurt script portability, complicate
debugging and support, yadda yadda.  I think a SET variable is enough...
        regards, tom lane


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Tom Lane
Date:
Taral <taral@taral.net> writes:
> On Thu, 3 Feb 2000, Tom Lane wrote:
>> Maintaining an accurate count of descendants (or indexes for that
>> matter) would be expensive; in particular, it'd create severe
>> concurrency problems.

> What about fixing these things on VACUUM then?

Could probably do that ... not sure if it's worth the trouble ...
        regards, tom lane


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
"Mark Hollomon"
Date:
Tom Lane wrote:
> 
> Chris <chris@bitmead.com> writes:
> > Why not change that to a relnumindexes as well? Easier to maintain and
> > more useful information.
> 
> Maintaining an accurate count of descendants (or indexes for that
> matter) would be expensive; in particular, it'd create severe
> concurrency problems.  If one transaction is in the middle of creating
> or dropping a child C of table P, then all other transactions would be
> blocked from creating or dropping any other children of P until the C
> transaction commits or aborts.  They'd have to wait or they wouldn't
> know what to set relnumchildren to.
> 
> For the purpose at hand, I think it would be OK to have a
> "relhaschildren" field that is set true when the first child is created
> and then never changed.  If you have a table that once had children but
> has none at the moment, then you pay the price of looking through
> pg_inherits; but the case that we're really concerned about (a pure SQL,
> no-inheritance table) would still win.
> 
> Not sure whether we can concurrently create/delete indexes on a rel,
> but I'd be inclined to leave relhasindexes alone: again its main
> function in life is to let you short-circuit looking for indexes on
> a table that's never had and never will have any.
> 

WOuld it be possible to consider this a 'statistic' and let
vacuum update it?

In other words, creating an index (or subtable) sets 
relhasindex (relhaschild) but vacuum will set it to false
if it finds no children or indexes. or would this
run into concurrency problems as well?

-- 

Mark Hollomon
mhh@nortelnetworks.com
ESN 451-9008 (302)454-9008


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Chris wrote:
> 
> Hannu Krosing wrote:
> 
> > Maybe SELECT ** FROM BASE would be more flexible as it leaves the standard
> > SQL with its "standard" meaning ?
> 
> That was my first thought and it's definitely a possibility. My argument
> against it is that SQL doesn't have a "standard meaning" in the case of
> inheritance, and ** is an incompatibility with OQL.
> 
> I suspect we need both. Something like
> SET GET_INHERITED_COLUMNS true; etc.
> 
> > We will have to change the API sometime not too distant anyway, the current
> > api is unable to deal with anything that does not have a nice textual
> > representation (like an image or sound) in spite of all the talks about
> > easy extensibility - the extensibility is all in the backend, ther is no
> > easy way to get new datatypes in/out.
> 
> What about PQbinaryTuples() and friends?

They don't help you at all when doing inserts and are by definition in native
byte order on queries.

Something like [ PREPARE query; BIND arguments ; EXEC ] which knows about
binary 
formats would be needed here.

One could use LOs except that the current ineffective implementation.

-------------
Hannu


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Mark Hollomon wrote:
> 
> > [ discussion on changing the default to getting subclasses ]
> 
> I object.

Tell me why you object. Performance concerns? Compatibility?

A SET might be a good idea, but to decide whether and also a 
default, it's good to know what the objections are.

> 
> How about a set variable?
> 
> SET GETSUBCLASSES = true
> 
> With the '*' and ONLY being explicit overrides to the setting
> of the variable. The default would be 'false'. I would not
> object to a configuration switch that would change the
> default.
> --
> 
> Mark Hollomon
> mhh@nortelnetworks.com
> ESN 451-9008 (302)454-9008


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Don Baccus wrote:
> 
> >
> >A good thought, but we still need one good front end that supports
> >all the features.
> 
> I wasn't think in terms of this being mutually exclusive with your
> desires.  Merely raising up the notion that the possibility exists
> of creating a sandbox, so to speak, for people to play in, a tool
> for the exploration of such concepts.

So we would be returning to roots. The original Postgres was exactly that -
a tool for the exploration of such concepts.

> No, and I wasn't arguing that you shouldn't move forward, either.  I
> was just stating my personal opinion regarding the utility of simple
> OO-ish features, that's all.

Yes, it needs quite much discussion/design befor going forth, lest we 
will be in the next level of the current situation where some peoples 
usage of the current limited inheritance is an obstacle to moving 
forward to a more developed one.

> >> One reason I raise the issue of possible multiple front-ends (or making
> >> it easy for folks to make there own by making the parser->optimizer/backend
> >> interface more general) is that this whole area would seem to be one
> >> that begs for RESEARCH and experimentalism.
> >
> >No research is required. I simply want to implement the ODMG STANDARD
> >for ODBMS databases on PostgreSQL. There are no great design issues
> >here, just a matter of nailing down the details so that everyone can 
> >live with them.
> 
> Well...that's sorta like saying no research into procedural language
> design is needed 'cause now we've got C++.
> 
> Whether or not the existing standard for ODBMS is the greatest thing
> since sliced bread, I find it hard to believe that no research is
> required or design issues raised by the fundamental problems of
> database technology.
> 
> Maybe I'm wrong, though, maybe the problem's been solved.
> 

My wife has forbidden me to buy any sliced bread, because the slices are of 
wrong thickness.

Hardly the situation can be any better in OODB design.

The ODMG standard may be a good starting point for discussion, but one can't 
run any programs on a standard - one needs a real db. 
And IIRC the standard is only semi-public (not freely
available/distributable).

------------------
Hannu


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Don Baccus wrote:

> >No research is required. I simply want to implement the ODMG STANDARD
> >for ODBMS databases on PostgreSQL. There are no great design issues
> >here,
> >just a matter of nailing down the details so that everyone can live
> >with them.
> 
> Well...that's sorta like saying no research into procedural language
> design is needed 'cause now we've got C++.
> 
> Whether or not the existing standard for ODBMS is the greatest thing
> since sliced bread, I find it hard to believe that no research is
> required or design issues raised by the fundamental problems of
> database technology.
> 
> Maybe I'm wrong, though, maybe the problem's been solved.

No research is required _for what I want to do_. (or if there is
research required, I think I've just done it over the last 5 years :).
i.e. I don't want to explore some new style database, only implement
a current ODMG standard on postgresql. This style of database is
fairly well understood now for good or bad. Once the RDBMS and ODBMS
features exist in one database, maybe then research can be done
to move forward. That's my opinion anyway.


Re: [HACKERS] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Peter Eisentraut wrote:

> A lot of people use inheritance to create "consistent schemas", that is,
> they empty create base tables, such as "address" which are inherited by
> tables such as customer, vendor, office, etc.

This is a really bad idea. You could never have both a postal address
AND
a home address for example. I thought the original postgres supported
this
by having
CREATE TABLE ADDRESS (...)
CREATE TABLE PERSON(add ADDRESS).

Anyway, this is what Oracle and others can do these days, and this is
the right
thing.

> Anyway, an idea I had would be to reimplement inheritance based on joins,
> since this is what the "pure relational" solution would be anyway. When I
> create a table B that is based on A, all the system does is create the
> table B as usual and store a note "I inherit from A". Any row you insert
> into B also creates a row in A, and the row in B contains an oid pointer
> to it.

This is a really stu^H^H^H bad idea. I have hierarchies 5 levels deep
with
multiple inheritance, and I
don't want to do a 10 way join just to retrieve an object.

This is why RDBMS's performance sucks so incredibly badly on some
applications.
an ODBMS can perform 100x as fast in these cases just because of what
you
are proposing.

> Thus a select on B performs a join on A.oid and B.row_in_A_pointer.
> A select on A just returns all the rows in A, no extras needed. A delete
> on B deletes the row in B and in A. A delete in A would cascade to B. Both
> of this can be gotten for free with foreign keys. Adding a column to A
> just adds the column to A, all other tables get the new column magically
> and in the right order. Same with dropping columns, etc.
>
> In short, this approach solves all inheritance problems at once and does
> so without adding any extra kludges besides the "I inherited from" field,
> which is static, plus the necessary transformations necessary in the
> parser. The drawback is of course that a select from an inherited table
> would always incur a join, perhaps some optimizing could be done in this
> direction. But the bottom line is that the compatibility issue looms big.
>
>         -Peter
>
> On Thu, 3 Feb 2000, Chris Bitmead wrote:
>
> > Hi,
> >
> > I've been spending a lot of time lately with gdb and tracing the
> > back-end seeing if I can understand it enough to make some changes.
> > I'm starting to actually understand a lot of stuff, so in order
> > to have some possibility of having my changes accepted, I want to
> > discuss
> > them here first. Based on that, I'm going to hopefully make an attempt
> > at implementation. I have a patch for one of these changes already
> > if I get the go ahead.
> >
> > THESE CHANGES DON'T AFFECT YOU IF YOU DON'T USE INHERITANCE.
> >
> > Speak now about these changes or please, forever hold your peace. Of
> > course you can comment later if I screw up implementation.
> >
> > The proposed changes are....
> >
> > 1) An imaginary field in every tuple that tells you the class it came
> > from.
> > This is useful when you select from table* and want to know which
> > relation the object actually came from. It wouldn't be stored on disk,
> > and like oid it wouldn't be displayed when you do SELECT *. The field
> > would be called classname. So you could have...
> > SELECT p.classname, p.name FROM person p;
> > person   | Fred
> > student  | Bill
> > employee | Jim
> > person   | Chris
> >
> > If you want to know the exact behaviour it is as if every table in the
> > database had done to it...
> > ALTER TABLE foo ADD COLUMN classname TEXT;
> > UPDATE foo SET classname='foo';
> >
> > Of course this is not how it would be implemented. It is just
> > reference for how it will appear to work. BTW, this idea was also
> > in the original berkeley design notes.
> >
> > 2) Changing the sense of the default for getting inherited tuples.
> > Currently you only get inherited tuples if you specify "tablename*".
> > This would be changed so that you get all sub-class tuples too by
> > default unless you specify "ONLY tablename". There are several
> > rationale for this. Firstly this is what Illustra/Informix have
> > implemented. Secondly, I believe it is more logical from an OO
> > perspective as well as giving a more useful default. If a politician
> > IS a person and I say SELECT * from person, then logically I should
> > see all the politicians because they are people too (so they claim
> > :). Thirdly, there are a whole range of SQL statements that should
> > probably be disallowed without including sub-classes. e.g. an ALTER
> > TABLE ADD COLUMN that does not include sub-classes is almost certainly
> > undesirable. It seems ashame to have to resort to non-standard SQL
> > with the "*" syntax in this case when it is really your only
> > choice. Basicly, wanting ONLY a classname is a far more unusual
> > choice, and leaving off the "*" is a common error. Fourthly, it seems
> > out of character for the SQL language to have this single character
> > operator. The SQL style is to use wordy descriptions of the operators
> > meaning. "ONLY" fits well here because it describes its own meaning
> > perfectly whereas to the unitiated, "*" is harder to guess at. While
> > this change is an incompatibility I hope for those few people using
> > inheritance they can accept the need to move forward without
> > over-burden of backwards compatibility.
> >
> > 3) The ability to return different types of rows from a SELECT. This
> > is to allow implementation of ODBMS functionality where a query could
> > be required to instantiate objects of differing types with differing
> > attributes.
> >
> > I would propose that that anytime you do a SELECT * from a base table
> > that you would get back the full rows from those sub tables. Since the
> > current PQ interface which doesn't support this notion would remain
> > unchanged this wouldn't affect current users.
> >
> > It's probably also desirable to have a syntax for getting just the
> > columns of the base table when this is desired. Say perhaps SELECT %
> > from table. This would be a performance hack for users of libpq and a
> > functionality difference for users of psql.
> >
> > The reason I think the "*" syntax should take on the new functionality
> > is because it would be more consistent with what the OQL (object query
> > language) standard specifies, and also because it seems the more
> > useful default. Also there is no compatibility reason not to do it.
> >
> > In addition it would be legal to specify columns that only exist in
> > sub-classes. For example, if we had
> >
> > CREATE TABLE person (name TEXT);
> > CREATE TABLE student (studentid TEXT, faculty TEXT) INHERITS (person);
> >
> > it would be legal to say...
> > > SELECT * FROM person;
> > NAME
> > ----
> > Fred
> > Bill
> >
> > NAME | STUDENTID | FACULTY
> > --------------------------
> > Jim  | 23455     | Science
> > Chris| 45666     | Arts
> >
> > > SELECT *, studentid FROM person;
> > NAME
> > ----
> > Fred
> > Bill
> >
> > NAME | STUDENTID
> > ----------------
> > Jim  | 23455
> > Chris| 45666
> >
> > > SELECT *, studentid FROM ONLY person;
> > ERROR: person does not contain studentid.
> >
> > > SELECT % FROM person;
> > NAME
> > ----
> > Fred
> > Bill
> > Jim
> > Chris
> >
> > As you can see, it is desirable that psql be modified to be able to
> > print these differing tuple types. Presumably new column headings will
> > be printed when a tuple is differing to the previous one. Likely it
> > will be often desirable to do a
> > SELECT * FROM person p ORDER BY p.classname;
> > in order to have all the tuples of a particular type grouped together.
> >
> > In addition some extenions will be done to the PQ interface to support
> > these differing return types. The current PQ interface will be left
> > unchanged and backwards compatible for retrieving rows of a single
> > type.
> >
> > Also there should be an settable option that specifies that "*" should
> > also return the normally ignored columns of oid and classname. This is
> > so that OO programs that embed SQL into them also get back the oid and
> > classname which are required for the behind the scenes implementation
> > of an ODMG client. Something like...
> >
> > SET SHOW_OID TRUE;
> > SHOW_CLASSNAME TRUE;
> >
> > SELECT * FROM person;
> >
> > OID  CLASSNAME NAME
> > -------------------
> > 2344 person    Fred
> > 3445 person    Bill
> >
> > OID  CLASSNAME NAME | STUDENTID | FACULTY
> > -----------------------------------------
> > 2355 student   Jim  | 23455     | Science
> > 5655 student   Chris| 45666     | Arts
> >
> > ************
> >
> >
>
> --
> Peter Eisentraut                  Sernanders vaeg 10:115
> peter_e@gmx.net                   75262 Uppsala
> http://yi.org/peter-e/            Sweden

Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Taral wrote:
> 
> On Thu, 3 Feb 2000, Tom Lane wrote:
> 
> > Maintaining an accurate count of descendants (or indexes for that
> > matter) would be expensive; in particular, it'd create severe
> > concurrency problems.
> 
> What about fixing these things on VACUUM then?

It could produce wrong results to queries if the data is wrong.


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Tom, I agree with most of what you say. If we want to have ** be the
default
syntax for getting sub-columns I can live with that (for suggestion (3))

But for (2), I do feel very strongly that getting sub-tuples should be
the
"default default", and a SET GETSUBCLASSES=true should be the default
setting.

I've been using the postgres inheritance for a real system and I can
say with certainty that this is a massive source of errors. Not 
wanting sub-class tuples seems rarely needed, and leaving off the "*" is
something that too often seems forgotten. I often can trawl through
code and realise that some query is missing the "*" but it hasn't been
discovered yet. In fact I find that almost all queries require the "*"
when you have a proper OO model, and not using "*" is usually laziness.

Also when adding a sub-class where there previously was none, one 
usually has to trawl through the queries and add "*" to all of them
because as I said, there are almost never occasions where "*" is not
required in real life OO models.

So I understand the compatibility issue here, but I really feel strongly
that this should be changed now before there really are a lot of people
using it. Sure, have as many compatibility modes as you like, but I
think
this is a broken enough design that the default should be changed.
Apparently Illustra/Informix agreed.

Tom Lane wrote:
> 
> "Mark Hollomon" <mhh@nortelnetworks.com> writes:
> > How about a set variable?
> 
> > SET GETSUBCLASSES = true
> 
> > With the '*' and ONLY being explicit overrides to the setting
> > of the variable. The default would be 'false'.
> 
> I like that a lot.  Clean, flexible, doesn't break any existing
> applications.
> 
> Perhaps the business of whether to fetch extra columns from subclasses
> could be done similarly.  I am beginning to understand why Chris wants
> to do that, and I see that it would support a particular style of
> database programming very nicely.  But I really fail to see why it's
> necessary to change the default behavior to cater to those apps rather
> than existing ones.  Let the new apps use a variant syntax; don't
> expect people to change existing code in order to avoid getting tripped
> up by a new feature.
> 
> Note that "oh they won't see the extra columns if they're using an
> old API" doesn't answer my objection.  I'm concerned about the
> performance hit from fetching those columns and transferring them to
> the client, as well as the memory hit of storing them in query results
> on the client side.  We should *not* set things up in such a way that
> that happens by default when the client didn't ask for it and isn't
> even using an API that can support it.  That's why it'd be a mistake
> to redefine the existing query syntax to act this way.
> 
> The suggestion of "SELECT ** FROM ..." sounds pretty good to me,
> actually.  I don't really see any need for changing the behavior of
> anything that looks like a standard SQL query.  Applications that
> need this feature will know that they need it and can issue a query
> that specifically requests it.
> 
> > I would not object to a configuration switch that would change the
> > default.
> 
> Mmm, I think that would probably not be such a hot idea.  That would
> introduce a pretty fundamental semantics incompatibility between
> different installations, which would hurt script portability, complicate
> debugging and support, yadda yadda.  I think a SET variable is enough...
> 
>                         regards, tom lane


relhasindex(was RE: [HACKERS] Proposed Changes to PostgreSQL)

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: owner-pgsql-hackers@postgreSQL.org
> [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Bruce Momjian
> 
> > Bruce Momjian wrote:
> > 
> > > > Well I see that pg_class has columns like "relhasindex". If 
> we added a
> > > > "relhassubclass", the overhead should be unmeasureable.
> > > 
> > > Yes, but how do you keep that accurate?  If I add indexes, then drop
> > > them, does relhasindex go to false. 
> > 
> > I don't know. Does it? 
> 
> Let me add that to the TODO list.
> 
> > 
> > >Could you do that for relhassubclass?
> > 
> > If we made it relnumsubclasses and incremented/decremented on
> > CREATE/DROP, it seems easy in theory.
> 
> Yes, that would work.  Seems hasindex has problems.
>

This posting may be off the point,sorry.

Isn't relhasindex a kind of item that we can live without it ?
I proposed to change the use of this item in [[HACKERS] Index
recreation in vacuum].  Though I have heard no clear objection,
I want to confirm again.  My proposal is as follows.

1) DDL commands don't rely on relhasindex.
2) DML commands don't take indexes into account if   relhasindex is set to false.
3) REINDEX command and vacuum with REINDEX option   sets this flag to false at the beginning and sets it to true   when
recreationof all indexes completed.
 

Comments ?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
"Ross J. Reedstrom"
Date:
On Fri, Feb 04, 2000 at 10:55:39AM +1100, Chris Bitmead wrote:
<snipped it all!>

Just wanted to chime in on this thread with the sugestion that Chris
clearly has been thinking about this a lot, and has some strong opinions
about the 'right way to do things'.  How about an offical, postgresql.org
hosted, CVS branch for ORDBMS development? Let Chris and whomever is
interested take a crack at doing it however they want, and _prove_
that the performance is as good, or much better, and is compatible, etc.
Clearly, details of implementation can be discussed to death, until
Chris gets fed up and goes away: not good. So, what do the core
developers think? Sound feasable? As to problems of keeping in sync with
HEAD, etc., that'd be up to Chris and his crew. Does postgresql.org
have the extra 20-30 MB of disk?

Ross
-- 
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> 
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St.,  Houston, TX 77005


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Marten Feldtmann
Date:
> Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> 
> I'll comment on the other issues later ... but I will say that I don't
> think it's acceptable to add *any* overhead to standard-SQL queries
> in order to support inheritance better.  The vast majority of our users
> want SQL performance and don't give a damn about inheritance.  We have
> to pay attention to that.
> 
Well said ! 
Actually I'm a little bit uncertain what ORDBMS really improves ? After
writing a full mapper and wrapper for PostgreSQL and a Smalltalk dialect
I see really no usage for these additional inheritance features databases
like PostgreSQL offer.
Some points about this:
 - all these additional features are very specific to PostgreSQL and   are not compatible with other databases. Writing
anapplication    based on these features results in non-portable systems. - Speed is still a very, very important
featurefor a database. A   single query, which uses about 5 seconds because the optimizer   is not very clever to use
severalindices to improove the    query execution is much more worse and can change the structure   of the whole
applicationprogram.
 
 - when creating automatic sql-queries through a mapper one can get   very complicated sql queries which tests the
parservery hard and   the limits of PostgreSQL has been seen very quickly during   the development of the wrapper
above.
 What I'm missing from these new database are structural changes to
the query system: the possibility to execute complicated 
concatenated queries on the server .. perhaps with different 
parameters.
Just some ideas about all these nice features
Marten






Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris Bitmead
Date:
Marten Feldtmann wrote:

>  Actually I'm a little bit uncertain what ORDBMS really improves ? After
> writing a full mapper and wrapper for PostgreSQL and a Smalltalk dialect
> I see really no usage for these additional inheritance features databases
> like PostgreSQL offer.
> 
>  Some points about this:
> 
>   - all these additional features are very specific to PostgreSQL and
>     are not compatible with other databases. Writing an application
>     based on these features results in non-portable systems.

Not true, because if the wrapper conforms to the ODMG standard, it will 
be compatible with ObjectStore, Versant, the new Sun RDBS standard,
Gemstone, and many others.

>   - Speed is still a very, very important feature for a database. A
>     single query, which uses about 5 seconds because the optimizer
>     is not very clever to use several indices to improove the
>     query execution is much more worse and can change the structure
>     of the whole application program.

The biggest thing you can do for speed is to have less objects/tuples
in the database. Inheritance and the array feature of postgresql
can improve things here by orders of magnitude. The problem is that
these
two features are not viable to use at present. With an ODMG interface,
and TOAST to allow tuples of unlimited size this will then be a viable
feature. In some situations this will improve queries by 100x even
with the most brain-dead optimizer. ODBMS doesn't care a great deal
about wonderful optimizers because joins are less necessary.

>   - when creating automatic sql-queries through a mapper one can get
>     very complicated sql queries which tests the parser very hard and
>     the limits of PostgreSQL has been seen very quickly during
>     the development of the wrapper above.

Exactly, so stop mapping things and creating complicated joins. ODBMSes
do not do ANY joins to re-create objects. That's why mappers suck so
hard.

>   What I'm missing from these new database are structural changes to
> the query system: the possibility to execute complicated
> concatenated queries on the server .. perhaps with different
> parameters.

What is a concatenated query? 

I'm all in favour of more powerful queries, but that is not what this
proposal is about. This is about AVOIDING queries. Mappers and so forth
are great query generators because the database representation is
different from the in-memory object representation. This proposal
is all about making the in-memory object representation the same
as in the database.

If you still don't get it take an example..

class CarPart {int volume;
}
class Wheel : CarPart {int diameter;
}
class SteeringWheel : Wheel {      boolean horn;
}
class RoadWheel : Wheel {      int airpressure;
}
class Car {   List<CarPart> parts;
}

Now with an ODBMS, a Car with 4 wheels and a steering wheel we'll have 6
objects in the database - 1 Car, 4 RoadWheels and 1 SteeringWheel. With
a relational mapper, depending on how you map it you'll have 21 objects
- 5 CarPart objects, 5 wheel objects, 4 road wheel, 1 steering wheel, 1
car and 5 car_carpart relation entities. And when you join it all
together you'll have to join against 6 tables instead of 3.


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
"Mark Hollomon"
Date:
<I trimmed the CC list a bit>

Chris Bitmead wrote:
> 
> Mark Hollomon wrote:
> >
> > > [ discussion on changing the default to getting subclasses ]
> >
> > I object.
> 
> Tell me why you object. Performance concerns? Compatibility?

Definitely compatibility. The load I see (200 - 300 queries a DAY)
isn't enough for me to be concerned about an extra millisecond
or two per query. But I certainly understand others concerns in
this area.

One of my responsibilities at work is the maintenance of a homegrown
document indexing and retrieval system. It is about 100K of Perl
that calls into a custom Perl wrapper around libpq. The system
is an escaped 'proof-of-concept'. I wrote it using inheritance
features of Postgres95.

The upshot is, that this proposed change would require me to examine
almost every line of this system in order to make sure that I put
ONLY in just the right spots. Yes, this would be where ever there
_isn't_ a '*', but how do I grep for the lack of a asterisk? Since
it is a "prototype", The code feels very free to pass around small
snippets of SQL, a disembodied FROM clause, a portion of a VALUES
clause.

I simply would not be allowed the time to do the rewrite necessary
to accomodate this change. And if I _did_ have the time, I would
probably rewrite it for Oracle because then DB Admin would be someone
_else's_ job.

Now, one of the days, I will find a good excuse (eg new feature)
to do a complete rewrite. And _then_ your proposal will actually
be a help.

And that is why I suggest a SET variable. When I'm ready to
use the new feature, I can. But no work is necessary until that
day arrives.

Thanks for listening.

-- 

Mark Hollomon
mhh@nortelnetworks.com
ESN 451-9008 (302)454-9008


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Don Baccus
Date:
At 02:42 PM 2/4/00 +1100, Chris Bitmead wrote:

>Not true, because if the wrapper conforms to the ODMG standard, it will 
>be compatible with ObjectStore, Versant, the new Sun RDBS standard,
>Gemstone, and many others.

Without prejudice, I'd be interested in some order-of-magnitude 
market share for these technologies vs., say, Oracle.

>The biggest thing you can do for speed is to have less objects/tuples
>in the database. Inheritance and the array feature of postgresql
>can improve things here by orders of magnitude.

There's no doubt of this, for applications that can make use
of the paradigms.


>The problem is that
>these
>two features are not viable to use at present. With an ODMG interface,
>and TOAST to allow tuples of unlimited size this will then be a viable
>feature. In some situations this will improve queries by 100x even
>with the most brain-dead optimizer. ODBMS doesn't care a great deal
>about wonderful optimizers because joins are less necessary.

And this last statement I really have to wonder about.  For restricted
application spaces, yeah, no doubt.  But in general, no way.

>Exactly, so stop mapping things and creating complicated joins. ODBMSes
>do not do ANY joins to re-create objects. That's why mappers suck so
>hard.

If they don't do joins, then presumably they map many-to-one relations
by copying data into each of the "many" table rows.  TANSTAAFL, no?

Though this strategy is a very viable one in today's big-memory, big-disk
environment.  It's not clear to me that a extremely smart RDBMS system
couldn't decide to add redundancy itself and gain much of the efficiency,
but, heck, that's just my weak, uncreative compiler-writer mind at work
again.

(and clearly, of course, PG isn't on any threshold of doing it, I'm 
thinking in theoretical space here).


>Now with an ODBMS, a Car with 4 wheels and a steering wheel we'll have 6
>objects in the database - 1 Car, 4 RoadWheels and 1 SteeringWheel. With
>a relational mapper, depending on how you map it you'll have 21 objects
>- 5 CarPart objects, 5 wheel objects, 4 road wheel, 1 steering wheel, 1
>car and 5 car_carpart relation entities. And when you join it all
>together you'll have to join against 6 tables instead of 3.

Not really.  You'd probably denormalize and not worry about it, in 
practice.

Would the result be as beautiful?  I don't know - do most car designers
think that SteeringMechanism and PavementInterface are the same?  It's
true for a variety of reasons in today's cars that aren't actually
related, and high-end race cars are exploring joystick control.

So one could claim that your hierarchy is merely limiting creative
expression...



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Marten Feldtmann
Date:
> Peter Eisentraut wrote:
>
>
> This is a really stu^H^H^H bad idea. I have hierarchies 5 levels deep
> with
> multiple inheritance, and I
> don't want to do a 10 way join just to retrieve an object.
>
> This is why RDBMS's performance sucks so incredibly badly on some
> applications.
> an ODBMS can perform 100x as fast in these cases just because of what
> you
> are proposing.
>

 Hmm, and yes one may find problems where the pure relational system
is 100x faster than your ODBMS.

 After doing a project with VERSANT and VisualWorks (election projection
system for the first television sender here in Germany) I like the
idea of OODBMS, but I've also noticed, that they are not the solution
to all problems.

 Clever database desing leeds to good performance on both systems, but
one should consider, that the designs of the database layout will be
different. There are cases, where a pure relational system is very
fast and an ODBMS never get it, but there are the examples you
mentioned.

 Joins per se are not that bad .. it depends on when and how they
are used and how good the analyzer of the database is and how good
he uses the indices to get the job done.

 One very good point is the query language of the rdbms systems. On
the odbms side no standard is really available, which can be seen as
the sql of the odbms.

 Marten



Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Chris Bitmead wrote:
> 
> Tom, I agree with most of what you say. If we want to have ** be the
> default
> syntax for getting sub-columns I can live with that (for suggestion (3))
> 
> But for (2), I do feel very strongly that getting sub-tuples should be
> the
> "default default", and a SET GETSUBCLASSES=true should be the default
> setting.

Then maybe we need a way to "break off" inheritance, i.e. make the inherited 
table independent but retain the columns as they are at the time of breakage.

At least it could be given as an option in pg_dump. (--dump_flat_creates or
smth.)

> I've been using the postgres inheritance for a real system and I can
> say with certainty that this is a massive source of errors. Not
> wanting sub-class tuples seems rarely needed, and leaving off the "*" is
> something that too often seems forgotten. I often can trawl through
> code and realise that some query is missing the "*" but it hasn't been
> discovered yet. In fact I find that almost all queries require the "*"
> when you have a proper OO model, and not using "*" is usually laziness.

True. I also think that people who used inheritance as a create table shortcut
can most easily ensure compatibility by dumping their not-really-inherited 
tables as independent. They will have to dump-relaod anyway.

> Also when adding a sub-class where there previously was none, one
> usually has to trawl through the queries and add "*" to all of them
> because as I said, there are almost never occasions where "*" is not
> required in real life OO models.
> 
> So I understand the compatibility issue here, but I really feel strongly
> that this should be changed now before there really are a lot of people
> using it. Sure, have as many compatibility modes as you like, but I
> think
> this is a broken enough design that the default should be changed.
> Apparently Illustra/Informix agreed.

And they are probably the only external DB we can aim to be compatible with,
or what does SQL3 say?

----------------------
Hannu


Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Chris
Date:
Don Baccus wrote:

> Without prejudice, I'd be interested in some order-of-magnitude
> market share for these technologies vs., say, Oracle.

Would you be interested in the market share of Win98 compared to Linux?

Nobody uses an ODBMS if they can get it to work with Oracle. They go to
an ODBMS when  they realise that's the only way they can get it to work.

However, as I said, Sun is defining for Java a standard interface for
RDBMS which is exactly the same as ODMG. So expect a lot of people using
Oracle to be writing code that ports to an ODBMS. Maybe when they
realise they can slot a real ODBMS under their
app and greatly increase performance, it might be
good for the ODBMS market.


> There's no doubt of this, for applications that can > make use of the paradigms.

To my mind that is like saying OO is useful for programs that can make
use of the paradigms. In fact I think nearly all programs can make use
of OO.

> And this last statement I really have to wonder 
> about.  For restricted
> application spaces, yeah, no doubt.  But in general, 
> no way.

It's only when you need a great deal of ad-hoc queries that you really
need a RDBMS. But a very great proportion of apps have only very
specific querying needs, and an ODBMS can do those queries MUCH faster.

And if postgresql has *both*, then it should be the
best of both worlds. I'm not going to go around
claiming RDBMS is obsolete, but I do know that ODBMS
is much more convenient to use for programming. Once
you've done your app and you want to spew off a few
reports, that's when you wish you had RDBMS.

> >Exactly, so stop mapping things and creating complicated joins. ODBMSes
> >do not do ANY joins to re-create objects. That's why mappers suck so
> >hard.
> 
> If they don't do joins, then presumably they map many-to-one relations
> by copying data into each of the "many" table rows.  TANSTAAFL, no?
^^^ ?

They have a similar layout on disk to what you might have in memory. So
if you store a 1:M in memory as an array of pointers, that's how you
might do it on disk too.
> Though this strategy is a very viable one in today's 
> big-memory, big-disk
> environment.  It's not clear to me that a extremely 
> smart RDBMS system
> couldn't decide to add redundancy itself and gain 
> much of the efficiency,
> but, heck, that's just my weak, uncreative 
> compiler-writer mind at work again.

Do you mean an RDBMS might try and be smart and store it the same way?
Well if it did that, we might call it an ODBMS. But the other main
benefit of an ODBMS is that retrieving records for many cases
(non-ad-hoc) is very simple to program for because you don't have to map
say a join table into say a C++ List<type>. In 
other words it's not just the performance of ODBMS
that is good, but also the interface. Also
if an RDBMS maps an object to a table and then maps it
back to an array on disk, well you've done an 
unnecessary conversion.

> >Now with an ODBMS, a Car with 4 wheels and a steering wheel we'll have 6
> >objects in the database - 1 Car, 4 RoadWheels and 1 SteeringWheel. With
> >a relational mapper, depending on how you map it you'll have 21 objects
> >- 5 CarPart objects, 5 wheel objects, 4 road wheel, 1 steering wheel, 1
> >car and 5 car_carpart relation entities. And when you join it all
> >together you'll have to join against 6 tables instead of 3.
> 
> Not really.  You'd probably denormalize and not 
> worry about it, in practice.

Then what happens to your RDBMSes wonderful ad-hoc query facility if you
de-normalise? Will you have to do a UNION with about 5000 clauses to
retrieve the volume and price of each type of car part?

> Would the result be as beautiful?  I don't know - do 
> most car designers
> think that SteeringMechanism and PavementInterface 
>are the same?  It's
> true for a variety of reasons in today's cars that 
> aren't actually
> related, and high-end race cars are exploring 
> joystick control.
> 
> So one could claim that your hierarchy is merely 
> limiting creative expression...

My hierarchy? The point is that you can _have_ a hierarchy. It's well
accepted that OO hierarchies are
good. The good thing here is being able to directly
store it in the database.


Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Chris
Date:
Marten Feldtmann wrote:

>  Hmm, and yes one may find problems where the pure
> relational system is 100x faster than your ODBMS.
>
>  After doing a project with VERSANT and VisualWorks
> (election projection system for the first television
> sender here in Germany) I like the idea of OODBMS,
> but I've also noticed, that they are  not the
> solution to all problems.

Give me a clear application spec and VERSANT, and I will ALWAYS flog
Oracle into the dust. But...

Where SQL comes into it's own is _conveniently_ doing queries that I
never thought of when I first designed my app. Of course many ODBMSes
have SQL or similar too.

>  Joins per se are not that bad .. it depends on when
> and how they are used and how good the analyzer of
> the database is and how good he uses the indices to
> get the job done.

Take the simple SUPPLIER, PART and SUPPLIER_PART situation. The very
fact that you've got an extra table here means you've got to touch many
more disk pages and transfer more data. An RDBMS just can't win when the
ODBMS data model is designed right.

>  One very good point is the query language of the
> rdbms systems. On the odbms side no standard is
> really available, which can be seen as the sql of
> the odbms.

There is a standard called OQL which is very similar to SQL. It's just
rather poorly supported.

--
Chris Bitmead
mailto:chris@bitmead.com

Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

From
Don Baccus
Date:
At 12:41 PM 2/5/00 +1100, Chris wrote:
>Don Baccus wrote:
>
>> Without prejudice, I'd be interested in some order-of-magnitude
>> market share for these technologies vs., say, Oracle.
>
>Would you be interested in the market share of Win98 compared to Linux?

Postgres isn't in competition with either of those software products.  It
is probably worth pointing out that at least some of the folks in the Linux
community would like to derail Win98 to some degree.

And I, at least, would love to see Postgres derail Oracle to some degree.

...

>> There's no doubt of this, for applications that can > make use of the
paradigms.
>
>To my mind that is like saying OO is useful for programs that can make
>use of the paradigms. In fact I think nearly all programs can make use
>of OO.

This really isn't the place for a religious fight.  Personally, I believe
the OO paradigm is well-suited to the decomposition of some problems, not
particularly well-suited to others.  I've only been a professional software
engineer for 29 years, though, so I don't pretend to have all the answers.
I'd humbly suggest that OO methodologists don't, either.

But, that's just my opinion.

Far more important to me is that SQL queries not suffer performance hits
as a result of whatever changes to OO support make it into the standard
version of PG.

Let's just leave it at that, OK?  



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Chris wrote:
>
> >  One very good point is the query language of the
> > rdbms systems. On the odbms side no standard is
> > really available, which can be seen as the sql of
> > the odbms.
>
> There is a standard called OQL which is very similar to SQL. It's just
> rather poorly supported.
>

I think the operative word here is "available". I know that SQL specs
are'nt freely available either, but due to SQL being already widely
supported one can get the general idea from many freely available sources,
like the bunch of freely downloadable DB's currently available for linux.
Most of them have some docs included.

It is still quite a job to reconstruct SQL92 from them ;)

I know now description (except a BNF syntax available from some ODBMS website)
that I could use to get some idea about OQL.

----------------------
Hannu

Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Hannu Krosing wrote:
>
>
> It is still quite a job to reconstruct SQL92 from them ;)
>
> I know now description (except a BNF syntax available from some ODBMS website)

SHould be "I know no description ..."

> that I could use to get some idea about OQL.
>
> ----------------------
> Hannu
>
> ************

Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Chris
Date:
Hannu Krosing wrote:

> I think the operative word here is "available". I know that SQL specs
> are'nt freely available either, but due to SQL being already widely
> supported one can get the general idea from many freely available sources,
> like the bunch of freely downloadable DB's currently available for linux.
> Most of them have some docs included.
> 
> It is still quite a job to reconstruct SQL92 from them ;)
> 
> I know now description (except a BNF syntax available from some ODBMS website)
> that I could use to get some idea about OQL.

Poet at http://www.poet.com have their doco online including OQL.


-- 
Chris Bitmead
mailto:chris@bitmead.com


Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Hannu Krosing
Date:
Chris wrote:
> 
> Hannu Krosing wrote:
> 
> > I think the operative word here is "available". I know that SQL specs
> > are'nt freely available either, but due to SQL being already widely
> > supported one can get the general idea from many freely available sources,
> > like the bunch of freely downloadable DB's currently available for linux.
> > Most of them have some docs included.
> >
> > It is still quite a job to reconstruct SQL92 from them ;)
> >
> > I know now description (except a BNF syntax available from some ODBMS website)
> > that I could use to get some idea about OQL.
> 
> Poet at http://www.poet.com have their doco online including OQL.
> 

Thanks, I'll check that.

Btw, has anyone compared PostgreSQL's object features with SQL3 (draft)
features.

For example they seem to use UNDER instead of INHERITS and no parentheses.

They also have a special privilege also called UNDER for being able to define 
a subtype (as they call it). A lot of other features seem to be considered too
-
no wonder it is 2.3 MB text file.

available at:

ftp://ftp.digital.com/pub/standards/sql/

or

http://gatekeeper.dec.com/pub/standards/sql/

the main file is sql-foundation-aug94.txt

------------------------
Hannu


Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Date:
1) free is freedom, not free bear ;-) also, there are some sites
    has oql draft -- like sql draft. sorry, can not remember.
2) good books, like " C++ object databases" (David Jordan) has
    a lot material.
3) a lot of OODBM evaluation copy there. Not totally complied, but
   together with 1 and 2, still can see what is going on.

so, no excuse for not knowing oodbm/oql :-) -- I'm waiting for
trying them on pg .

On Sat, 5 Feb 2000, Hannu Krosing wrote:

> Chris wrote:
> >
> > >  One very good point is the query language of the
> > > rdbms systems. On the odbms side no standard is
> > > really available, which can be seen as the sql of
> > > the odbms.
> >
> > There is a standard called OQL which is very similar to SQL. It's just
> > rather poorly supported.
> >
>
> I think the operative word here is "available". I know that SQL specs
> are'nt freely available either, but due to SQL being already widely
> supported one can get the general idea from many freely available sources,
> like the bunch of freely downloadable DB's currently available for linux.
> Most of them have some docs included.
>
> It is still quite a job to reconstruct SQL92 from them ;)
>
> I know now description (except a BNF syntax available from some ODBMS website)
> that I could use to get some idea about OQL.
>
> ----------------------
> Hannu
>
> ************
>


Re: [SQL] Re: [HACKERS] Proposed Changes to PostgreSQL

From
Marten Feldtmann
Date:
> 2) good books, like " C++ object databases" (David Jordan) has
>     a lot material.


As an example:

 Cattel, "The Object Database Standard  ODMG 2.0"

 Morgan Kaufmann, ISBN 1 - 55860 - 463 -4


 Marten Feldtmann