Thread: postgresql table inheritance

postgresql table inheritance

From
Lincoln Yeoh
Date:
Hi,

Found this post on Slashdot which I found interesting, any comments?

--- post follows ---
by Anonymous Coward on Wed Nov 28, '07 03:23 PM (#21509173)

Speak for your database -- postgresql does.

Postgresql's "table inheritance" is a flawed concept and has nothing to do
with the *type system*. Relations contain tuples, and tuples contain
attributes, which are a name plus a VALUE. Those values are chosen from TYPES
(sets of possible values). Those types are the TYPE SYSTEM.

Table inheritence doesn't even make sense. Tables are analogous to relations.
All relations are the same type, the relation type (think "set" or "array" to
make it easier). How can one value of a type (one table) be a subtype of
another value (another table)? That's like saying, "3" is a subtype of "5",
if your types are integers. What if you use the expression "3+2" Is that "5"
still the subtype of 3? likewise, when you make complex queries with a "base"
table, does the result have any connection with the "sub" table? It's like
gobbledygook, just mashing words together without any understanding. That's
why the postgresql table inheritance concept doesn't see more widespread use.
Many people quickly discover the limitations (and incorrectly think it's just
"unfinished", when it actually is flawed).

The correct way to store types and subtypes in the database is to store them
in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM.
Nothing else in the relational model needs to be changed. Something like
this, in hypothetical SQL-like language:

     CREATE TABLE People ( INT id, PERSON_CLASS person )

     p1 = PERSON_CLASS.new(name: "joe", etc)

     p2 = CUSTOMER_CLASS.new(name: "bob", etc) // assume CUSTOMER_CLASS
subclass of PERSON_CLASS

     INSERT INTO People VALUES (1, p1), (2, p2)

     SELECT person FROM People WHERE person.name = "bob"

     SELECT person, order FROM People JOIN Orders // can't do this in the
typical "object database"

This is a "solved problem" (see "The Third Manifesto"). It's just a matter of
getting somebody to implement it. But the vendors are clueless, thinking
object databases are a "different model" and not wanting to confuse
programmers, and programmers are clueless, not even understanding SQL or
types and values half the time, so they don't demand anything new from
vendors... we never move forward.


Re: postgresql table inheritance

From
Martijn van Oosterhout
Date:
On Fri, Nov 30, 2007 at 09:42:53PM +0800, Lincoln Yeoh wrote:
> Found this post on Slashdot which I found interesting, any comments?

I think this person is slightly confused.

> Table inheritence doesn't even make sense. Tables are analogous to
> relations.
> All relations are the same type, the relation type (think "set" or "array"
> to
> make it easier). How can one value of a type (one table) be a subtype of
> another value (another table)?

Easy, by having the columns of one table be a subset of the columns or
another table. Perhaps someone should point out an example, like a
table with "people" and subtables "employees" and "customers". The
subtables share the columns of the parent tables. This is nothing that
any OO language doesn't do.

> The correct way to store types and subtypes in the database is to store them
> in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM.
> Nothing else in the relational model needs to be changed. Something like
> this, in hypothetical SQL-like language:

His example is a little wierd, but it is possible:

test=# create type foo as (a text, b text);
CREATE TYPE
test=# create table test( id  int4, vals foo );
CREATE TABLE
test=# insert into test values ( 4, ROW('a', 'b'));
INSERT 0 1
test=# select * from test;
 id | vals
----+-------
  4 | (a,b)
(1 row)

The syntax is different but the ideas are there...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
>  -- John F Kennedy

Attachment

Re: postgresql table inheritance

From
Ivan Sergio Borgonovo
Date:
On Fri, 30 Nov 2007 21:42:53 +0800
Lincoln Yeoh <lyeoh@pop.jaring.my> wrote:

> Hi,
>
> Found this post on Slashdot which I found interesting, any comments?

I wrote a memo about inheritance at the bottom of which there are 2
links to good use of the feature:

http://www.webthatworks.it/d1/page/postgresql_inheritance_surprises

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


Re: postgresql table inheritance

From
"Peter Childs"
Date:


On 30/11/2007, Lincoln Yeoh <lyeoh@pop.jaring.my> wrote:
Hi,

Found this post on Slashdot which I found interesting, any comments?
--- post follows ---
by Anonymous Coward on Wed Nov 28, '07 03:23 PM (#21509173)

Speak for your database -- postgresql does.

Postgresql's "table inheritance" is a flawed concept and has nothing to do
with the *type system*. Relations contain tuples, and tuples contain
attributes, which are a name plus a VALUE. Those values are chosen from TYPES
(sets of possible values). Those types are the TYPE SYSTEM.

Table inheritence doesn't even make sense. Tables are analogous to relations.
All relations are the same type, the relation type (think "set" or "array" to
make it easier). How can one value of a type (one table) be a subtype of
another value (another table)? That's like saying, "3" is a subtype of "5",
if your types are integers. What if you use the expression "3+2" Is that "5"
still the subtype of 3? likewise, when you make complex queries with a "base"
table, does the result have any connection with the "sub" table? It's like
gobbledygook, just mashing words together without any understanding. That's
why the postgresql table inheritance concept doesn't see more widespread use.
Many people quickly discover the limitations (and incorrectly think it's just
"unfinished", when it actually is flawed).

The correct way to store types and subtypes in the database is to store them
in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM.
Nothing else in the relational model needs to be changed. Something like
this, in hypothetical SQL-like language:

     CREATE TABLE People ( INT id, PERSON_CLASS person )

     p1 = PERSON_CLASS.new(name: "joe", etc)

     p2 = CUSTOMER_CLASS.new(name: "bob", etc) // assume CUSTOMER_CLASS
subclass of PERSON_CLASS

     INSERT INTO People VALUES (1, p1), (2, p2)

     SELECT person FROM People WHERE person.name = "bob"

     SELECT person, order FROM People JOIN Orders // can't do this in the
typical "object database"

This is a "solved problem" (see "The Third Manifesto"). It's just a matter of
getting somebody to implement it. But the vendors are clueless, thinking
object databases are a "different model" and not wanting to confuse
programmers, and programmers are clueless, not even understanding SQL or
types and values half the time, so they don't demand anything new from
vendors... we never move forward.


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match


Seams like two completely different concepts are getting confused. ie that of Table Inheritance and that of Type Inheritance. They are completely different concepts.

Table Inheritance is table structure ie a child table has all the same columns as the old one with some added columns that sore specialist items. This feature is used heavily used  in Table Partitioning. Perhaps it should be renamed.

Type Inheritance is adding extra features to types eg

Varchar(5) is a child of text that adds a maximum length limit of 4 and char(5) is a type of text with a fixed length of 5. But they are all text. This is a very silly example.

Just thoughts.

Peter.
 

Re: postgresql table inheritance

From
Gregory Stark
Date:
"Lincoln Yeoh" <lyeoh@pop.jaring.my> writes:

> The correct way to store types and subtypes in the database is to store them
> in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM.
> Nothing else in the relational model needs to be changed. Something like
> this, in hypothetical SQL-like language:

That's what we call "denormalized" data in the database world. This is why
there's such a big impedance mismatch between procedural languages and
relational databases.

The natural place to end up in his world would be to have every table have
precisely one column which is some kind of object. Then you access fields and
methods on those objects.

The problem is that then you don't have a relational database since it's
awfully hard to tell the database you have a foreign key from whatever this
method returns here to whatever that method returns there for some record
somewhere... And it's awfully hard to index and join between complex
expressions picking data out from inside objects on both sides, etc.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

Re: postgresql table inheritance

From
Jeff Davis
Date:
On Fri, 2007-11-30 at 16:06 +0000, Gregory Stark wrote:
> "Lincoln Yeoh" <lyeoh@pop.jaring.my> writes:
>
> > The correct way to store types and subtypes in the database is to store them
> > in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM.
> > Nothing else in the relational model needs to be changed. Something like
> > this, in hypothetical SQL-like language:
>
> That's what we call "denormalized" data in the database world. This is why
> there's such a big impedance mismatch between procedural languages and
> relational databases.
>

CJ Date believes that normalized relations can indeed contain complex
types such as images or other relations.

The argument is that atomicity has no absolute meaning, but only has
meaning in the context of what you're trying to do with it in the
database. A string can obviously be decomposed into its parts, as can a
timestamp, etc., so those aren't exactly atomic, either. [1]

The impedance mismatch has more to do with the fact that the meaning of
an application's internal data structures changes frequently (through
revisions of the code), while data in a database needs to be consistent
across long periods of time. So, a well-designed database will hold
facts that have meaning in the real world and from which inferences can
be made. Mapping application data structures (which contain context-
sensitive information and implementation artifacts) to real-world facts
is the impedance mismatch.

Regards,
    Jeff Davis

[1] Paraphrased from "Database in Depth", C.J. Date, pp 29-32


Re: postgresql table inheritance

From
Jeff Davis
Date:
On Fri, 2007-11-30 at 21:42 +0800, Lincoln Yeoh wrote:
> --- post follows ---
> by Anonymous Coward on Wed Nov 28, '07 03:23 PM (#21509173)
>
> Speak for your database -- postgresql does.
>
> Postgresql's "table inheritance" is a flawed concept and has nothing to do
> with the *type system*. Relations contain tuples, and tuples contain
> attributes, which are a name plus a VALUE. Those values are chosen from TYPES
> (sets of possible values). Those types are the TYPE SYSTEM.

"Foundation for Future Database Systems: The Third Manifesto" by C.J.
Date and Hugh Darwen discusses this topic in Appendix E, and a related
topic in Appendix D.

They also propose, in detail with a lot of analysis, how they think that
type inheritance should work in the set of chapters on the Inheritance
Model, or "IM".

Regards,
    Jeff Davis


Re: postgresql table inheritance

From
Jeff Davis
Date:
On Fri, 2007-11-30 at 14:33 +0000, Peter Childs wrote:
> Table Inheritance is table structure ie a child table has all the same
> columns as the old one with some added columns that sore specialist
> items. This feature is used heavily used  in Table Partitioning.
> Perhaps it should be renamed.

I think that was the point of the slashdot post: it creates confusion to
call two separate concepts by the same name.

Regards,
    Jeff Davis


Re: postgresql table inheritance

From
Lincoln Yeoh
Date:
At 03:17 AM 12/1/2007, Jeff Davis wrote:
>The impedance mismatch has more to do with the fact that the meaning of
>an application's internal data structures changes frequently (through
>revisions of the code), while data in a database needs to be consistent
>across long periods of time. So, a well-designed database will hold
>facts that have meaning in the real world and from which inferences can
>be made. Mapping application data structures (which contain context-
>sensitive information and implementation artifacts) to real-world facts
>is the impedance mismatch.

The people who try to make a database that maps so well with the
objects in a single particular program are solving a very different
problem from those of us who use a database partly as a "lingua
franca" (or "vehicular language") for many different programs and people.

The "impedance" then is unavoidable. It's not going to be easy to
change a hundred other programs anyway - probably some unknown (till
they inconveniently stop working because someone decided to "match
the impedances" with some pet program ;) ).

But anyway, I guess postgresql's "table inheritance" thing isn't
broken then just misunderstood...

Link.

One man's impedance mismatch is another man's layer of abstraction or
"comms protocol" :).


Re: postgresql table inheritance

From
Jeff Davis
Date:
On Sat, 2007-12-01 at 04:16 +0800, Lincoln Yeoh wrote:
> The people who try to make a database that maps so well with the
> objects in a single particular program are solving a very different
> problem from those of us who use a database partly as a "lingua
> franca" (or "vehicular language") for many different programs and people.
>

Replace "in a single particular program" with "in a specific revision of
a specific component of an application" ;)

The reason I say this is because most people don't realize that, by just
making their objects "persist", that all of their data is now very
context sensitive (to specific revisions of specific parts of their
code). Contrast that with storing real world facts, which are both
context insensitive and time insensitive.

I do see your point, but in this context I don't think the two uses are
very different. In the first case you mention, you are using the
database as a lingua franca between the application at time T and the
application at time T + N years (which is, in reality, a different
application); rather than in the second case, where it's a lingua franca
between two different applications at the same time.

> Link.
>

Did you intend to include a URL?

> One man's impedance mismatch is another man's layer of abstraction or
> "comms protocol" :).
>

Good point.

Regards,
    Jeff Davis