Thread: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)

Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)

From
Chris Travers
Date:
I have now been working with table inheritance for a while and after
starting to grapple with many of the use cases it has have become
increasingly impressed with this feature.  I also think that some of
the apparent limitations fundamentally follow from the support for
multiple inheritance, and multiple inheritance itself is so useful I
would not want to see this go away.  Inheritance really starts to come
to its own once you start using table methods, and some features that
are useful in some sorts of inheritance modelling are useless in
others.

Below I will offer a few suggestions regarding what can be done to
make life a little easier for those of us using these features and
they are not the typical suggestions.  I still stand by my view that
at least from what I have looked at, PostgreSQL allows you to do
object-relational modelling better than you can do in other ORDBMS's I
have looked at so far.

What table inheritance gives you is an ability to model pieces of a
database, and derived information, on small units which can then be
later combined.  When we think of that as the primary use case
(instead of set/subset modelling) then the current DDL caveats largely
don't apply.  Logic can thus follow a group of columns rather than
having to be repetitively attached to tables.  The fact that this
allows you to create essentially derived values from groups of re-used
columns is itself remarkable and can be used to implement path
traversal etc. which is not directly supported in PostgreSQL in the
sense that it is in Oracle or DB2.  With multiple inheritance you can
actually build superior path traversal systems than you can easily on
DB2 or Oracle because you can re-use enforced foreign keys (Oracle has
an IS DANGLING operator for cross-table references!).

As far as I can tell, this sort of use is PostgreSQL-only because it
relies on multiple inheritance which is not supported on DB2,
Informix, or Oracle.  I am not aware of any other ORDBMS that allows
for multiple inheritance and this has profound impacts on things like
primary key inheritance, which I think becomes meaningless when
combined with multiple inheritance.  If I inherit two tables each with
a different primary key, I obviously cannot inherit both without
having multiple primary keys in the child table.

I have to be the sort of person who sees bugs as features, but in this
respect I cannot see the lack of inheriting a primary key as a bug
anymore.  It seems to me mathematically incompatible with PostgreSQL's
take on table inheritance generally and this is one of those cases
where multiple inheritance changes everything.

Additionally it is important to note that primary key management is
not a huge problem because it can be solved using techniques borrowed
from table partitioning.  If you are doing set/subset modelling (as in
the cities/capitals example) the solution is to have a cities table
which is constrained with a trigger or rule which does not allow
inserts and then capitals and noncapitals tables. The primary key can
then include an is_capital bool field which can be constrained
differently on both tables.  This has the advantage of knowing whether
a city selected is a capital from the top-level query as well, and
allows for the planner to treat the inheritance tree as a partitioned
table set.  Superset-constraint management would also have to use a
second table which would be referenced by all child tables (and
perhaps maintained by triggers).   While superset management tables
can be used to solve a subset of foreign key problems, they highlight
a different (and perhaps more solvable) set of these problems.

As far as I can tell, Oracle and DB2 do not discuss primary key
inheritance and it isn't clear whether this is a problem on those
platforms too.  Foreign key management pretty clearly is a problem
given the way these platforms handle cross-relational REFs.  In other
words, I think that on the whole table inheritance is still cutting
edge on PostgreSQL and has been for some time.

Foreign keys can be managed in a few ways including superset
constraint tables maintained with triggers.  These work well for
enforcing foreign keys against subsets, but inheriting a foreign key
constraint means redefining it repetitively on every child table.  At
the same time, not all foreign keys may want to be inherited.

The following changes to behavior I would personally find very useful
(and I believe would be useful in partitioned tables as well):

   * foreign keys (i.e. REFERENCES clauses) being able to be marked
INHERIT or NOINHERIT on the parent table.  INHERIT foreign keys would
be automatically created on child tables.  The default could be left
to be NOINHERIT to avoid breaking backwards compatibility.

   * unique constraints being able to be marked INHERIT or NOINHERIT.
A unique constraint that is marked INHERIT would be automatically
created again on the child table.  This could be documented to be
domain-specific to each child table, and that if you need super-set
unique constraints, you need to borrow techniques from table
partitioning.

  * an ability to allow a check constraint to be marked NOINHERIT and
thus excluded down-tree. This could be used to exclude inserts onto
parent tables both in partitioning and object inheritance tree
environments and it greatly simplifies set/subset modelling.

  * PRIMARY KEY inheritance would be documented as necessarily
excluded by multiple inheritance.  The concept simply doesn't make
sense when a child table can have multiple parents.  If it did, you'd
have multiple primary keys.  According to Oracle and DB2
documentation, the best they offer regarding such things is an OID
field anyway.....

Even without these changes, however, I am finding PostgreSQL's table
inheritance to be extremely useful, and I will be covering it
extensively in an upcoming blog post.  Also instead of saying that it
is mostly only useful in table partitioning, I would say that the
techniques of table partitioning are useful in addressing the
difficulties one runs into in set/subset modelling.

Best Wishes,
Chris Travers


Re: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)

From
David Johnston
Date:
On Aug 22, 2012, at 23:22, Chris Travers <chris.travers@gmail.com> wrote:

>  * unique constraints being able to be marked INHERIT or NOINHERIT.
> A unique constraint that is marked INHERIT would be automatically
> created again on the child table.  This could be documented to be
> domain-specific to each child table, and that if you need super-set
> unique constraints, you need to borrow techniques from table
> partitioning.
>
>
> * PRIMARY KEY inheritance would be documented as necessarily
> excluded by multiple inheritance.  The concept simply doesn't make
> sense when a child table can have multiple parents.  If it did, you'd
> have multiple primary keys.  According to Oracle and DB2
> documentation, the best they offer regarding such things is an OID
> field anyway.....
>

late night thinking here...

An inherited PK constraint could be made into a unique/not-null constraint if a second PK constraint becomes inherited.
In that case the table no longer has a PK constraint until the user creates one that makes sense. 

This whole issue occurs due to surrogate keys being used as PK.  In a partitioning scheme the partitioning field should
bepart of the natural key and thus cross-relation matching could not occur in the absence of a mis-allocation which a
partitionspecific check constraint on that column would solve. 

In an OO situation, in the absence of partitioning, a key is a concept of identity.  Identity requires that the type of
twoentities matches; and the type of a child object will never match the type of an object of its parent.  Thus
regardlessof single or multiple inheritance PK inheritance makes no sense in an OO situation. 

Even with multiple inheritance you might want to inherit a PK from from parent but from the other parent(s) you might
simplywant to inherit their PK as a unique constraint.  In so doing you assert that you are on the same level as the PK
parentobjects while you have different attributes than your siblings.  Jack and Jill can both inherit PK from human
beingbut could inherit phone_number and email from contact_info (not the best example I know...I tried making gender
workhere but my mind went blank if trying to rationalize why gender wouldn't just be a FK). 

For FK, however, the question is whether I am referencing a specific instance or whether I simply am referencing an
arbitraryset of properties that anything matching those properties could match.  Currently the former is what we have,
andsince we are dealing with entities (as opposed to behavior) that makes sense.  Ignoring partitioning if I define an
FKrelationship to "flying things"  I supposedly do not care whether you store a bird-type flyer or an airplane-type
flyer. If someone names their pet bird Polly and someone else names a plane Polly then what...Inheriting an FK to a
targetnon-partitioned table makes sense but how does one deal with inheriting onto a target table that has children? 

Just some thoughts as I have not, as became obvious writing this, thought through using the database in this fashion.
Mostinheritance I have used is behavioral in nature whereas a database deals with identity.  Segregating between type
compositionand partitioning mentally, and ideally in the language, makes a lot of sense to me.  It seems that currently
bothmodels are partially implemented and done so using the same syntactical foundation... 

David J.









Re: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)

From
Thomas Kellerer
Date:
Chris Travers, 23.08.2012 05:22:
> The fact that this allows you to create essentially derived values
> from groups of re-used columns is itself remarkable and can be used
> to implement path traversal etc. which is not directly supported in
> PostgreSQL in the sense that it is in Oracle or DB2.

What exactly do you mean with "path traversal" here? If you are talking about Oracle's CONNECT BY
then this is fully supported in PostgreSQL using a recursive common table expression.

Thomas


Re: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)

From
Merlin Moncure
Date:
On Wed, Aug 22, 2012 at 10:22 PM, Chris Travers <chris.travers@gmail.com> wrote:
> I have now been working with table inheritance for a while and after
> starting to grapple with many of the use cases it has have become
> increasingly impressed with this feature.  I also think that some of
> the apparent limitations fundamentally follow from the support for
> multiple inheritance, and multiple inheritance itself is so useful I
> would not want to see this go away.  Inheritance really starts to come
> to its own once you start using table methods, and some features that
> are useful in some sorts of inheritance modelling are useless in
> others.

The problem with postgres table inheritance is that it doesn't really
solve the problem that people wanted solved: to be able to define an
set of specific extra attributes for each row depending on some
characteristic of that row.  The feature only tantalizingly
masquerades as such.

Until it found use in table partitioning, I found the inheritance
feature to be basically useless.

merlin


On Thu, Aug 23, 2012 at 12:36 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Aug 22, 2012 at 10:22 PM, Chris Travers <chris.travers@gmail.com> wrote:
>> I have now been working with table inheritance for a while and after
>> starting to grapple with many of the use cases it has have become
>> increasingly impressed with this feature.  I also think that some of
>> the apparent limitations fundamentally follow from the support for
>> multiple inheritance, and multiple inheritance itself is so useful I
>> would not want to see this go away.  Inheritance really starts to come
>> to its own once you start using table methods, and some features that
>> are useful in some sorts of inheritance modelling are useless in
>> others.
>
> The problem with postgres table inheritance is that it doesn't really
> solve the problem that people wanted solved: to be able to define an
> set of specific extra attributes for each row depending on some
> characteristic of that row.  The feature only tantalizingly
> masquerades as such.

This is true to some extent.  I may have found a solution there, which
is to build your inheritance solutions on top of the solutions used
for table partitioning.  It's one reason why I say you should start
working with it on table partitioning before you try to do set/subset
modelling elsewhere.  Interestingly one appealing solution doesn't
really work (which is to put a check constraint which checks the
tableoid column, presumably because this isn't set on insert until
after the check constraint fires).

One thing I have found looking through Oracle and DB2 docs is that
their table inheritance seems to have all the same problems as ours
and their solutions to these problems seem rather.... broken from a
pure relational perspective.

For example, Oracle and DB2 make extensive use of OID's here (which
must be recorded in some sort of system catalog somewhere given what
they do with them), and they have functions to take a "reference" to a
row and operators to "dereference" the row.  This sounds all good and
well until you come across the IS DANGLING operator, which returns
true when the reference no longer is there...

In other words, as far as I can see nobody else has come up with a
sane foreign key solution for inherited tables either.
>
> Until it found use in table partitioning, I found the inheritance
> feature to be basically useless.

I think one can actually borrow techniques from table partitioning to
solve the problems associated with inheritance.

However here's what turned me around on table inheritance:

1)  First, in LedgerSMB, we started using it to create consistent
interfaces to sets of storage tables.  The storage tables would behave
differently, but would inherit essentially interfaces from their
parents.  In this regard, you can think of an inheritance tree as a
partitioned table set, but where the partitioning is necessary because
foreign key fields reference different tables in different children.
We use this for example, to avoid having to have a global notes table
or global file attachments table and it gives us clear control over
where these can be attached along with central maintenance of data
structures.  In cases, like with file attachments, where foreign keys
to inheritance trees ended up being needed, we started out with a more
complex but workable solution but I think are going to a simpler one.
This is a good thing.

In essence what we did was use inheritance to give us variable target
tables for a foreign key column.  I would still like to see
inheritable foreign key constraints because that would make some
things a lot easier, but the idea that foreign keys are not, by
default, copied in, means that you can override the destination in the
child table.  It isn't the use documented but it actually works very
well.

2)  Secondly I re-read Stonebraker's "Object-Relational Database:  The
Next Wave" and I had a rather sudden epiphany.  Relational databases
are about modelling your data so you can ensure consistency and gain
as many answers as you can.  Object-relational modelling adds
interfaces (possibly written in arbitrary programming languages) to
derive additional information from stored information.  The example he
gives could be summarized in English to be "Give me all pictures of
sunsets taken within 20 miles of Sacramento" where whether a picture
is of a sunset is determined by analyzing the graphic itself.  Thus
you have to add features to allow you to plug into the query to answer
that question, and you have to have a planner capable of optimizing
such a query.

I also read some other papers which discussed table inheritance and
what sort of modelling problems it was designed to solve (the main one
is actually part/whole modelling where a row may be a whole in itself
and also a part of another whole--- for example we might sell timing
belts, but they might also come included in an engine assembly).

3)  I was talking with Matt Trout regarding object-oriented
programming in Perl, and he turned me on to Moose::Role as essentially
an interface class.  It cannot be instantiated and one would not
simply inherit it in order to instantiate it.  Rather it provides an
ability to assemble classes from re-usable pieces and thus provide
consistent interfaces across a project.    This has become a key to my
understanding of the use cases for multiple inheritance in PostgreSQL,
namely that you can define interfaces across column combinations and
combine those column combinations into a table.  The parent tables are
then largely uninteresting.

Viewed from this perspective, multiple inheritance gives you something
similar to what you get in the Informix examples I have seen regarding
embedding structured data type objects in columns in the table, but it
is superior to that because the individual columns can still be easily
queried using simple, relational queries.  No need to do something
like select customer.display_name() from invoices where .... just in
order to avoid getting something back in tuple notation.  Although we
could still do:  select i.customer_display_name from invoices i where
.... if we want to use derived values.

These things have turned me around from seeing inheritance as a
partially implemented, not particularly useful misfeature into
something I think is actually both extremely useful and can/should be
further developed in the future.  I also think that once solutions to
the harder modelling problems (like set/subset) are well understood
and documented, that more modest efforts can be used to smooth what
sharp corners remain.  However it is *very* hard to see value in a
feature when the examples in the documentation lead people down paths
which cause real pain for db professionals, especially when the docs
are, on the whole, of the quality they are in this project.
Inheritance can still solve the problems it was intended to solve, but
it must be used differently.  Given the existing solutions out there
(which all seem to be based on Informix's half-solution for this
problem), I think we may be in a position to do this right.

Best Wishes,
Chris Travers


One other thing that seems worth mentioning is that as soon as you
jump from relational to object-relational modelling is that the latter
is more rich and hence more complex than the former.  Because
object-relational modelling is a much expanded semantic superset of
relational modelling, the antipatterns are a much expanded superset
there too.  Additionally because the patterns are not yet well
understood, the whole area needs to be seen as somewhat immature.

That doesn't mean that the features are primarily useful as foot-guns,
but it does mean that productive use of features like this involves
careful documentation of patterns and antipatterns.

Best Wishes,
Chris Travers


Re: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)

From
Ondrej Ivanič
Date:
Hi,

On 24 August 2012 11:44, Chris Travers <chris.travers@gmail.com> wrote:
> One thing I have found looking through Oracle and DB2 docs is that
> their table inheritance seems to have all the same problems as ours
> and their solutions to these problems seem rather.... broken from a
> pure relational perspective.

I can second that. Additionally, some vendors tried to fix
partitioning (which uses table inheritance) issues by creating all
sort of extension like CREATE TABLE ... PARTITION BY, and ALTER TABLE
... ALTER PARTITION ... which create all sorts of issues which are not
documented at all but you get response like "yes, we know about this
bug; fix not yet available".

Many people asked for "SQL wrappers" for table partitioning but that's
not easy to do. I would be happy to have out of the box "auto-routing"
for insert/update/copy statements

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)


Re: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)

From
Wolfgang Keller
Date:
> 1)  First, in LedgerSMB,

<duck>

What a pity that this is implemented in a write-only programming
language and as a "web application" instead of an actual GUI...

</duck>

> we started using it to create consistent interfaces to sets of
> storage tables.  The storage tables would behave differently, but
> would inherit essentially interfaces from their parents.  In this
> regard, you can think of an inheritance tree as a partitioned table
> set, but where the partitioning is necessary because foreign key
> fields reference different tables in different children. We use this
> for example, to avoid having to have a global notes table or global
> file attachments table and it gives us clear control over where these
> can be attached along with central maintenance of data structures.
> In cases, like with file attachments, where foreign keys to
> inheritance trees ended up being needed, we started out with a more
> complex but workable solution but I think are going to a simpler one.
> This is a good thing.
>
> In essence what we did was use inheritance to give us variable target
> tables for a foreign key column.  I would still like to see
> inheritable foreign key constraints because that would make some
> things a lot easier, but the idea that foreign keys are not, by
> default, copied in, means that you can override the destination in the
> child table.  It isn't the use documented but it actually works very
> well.

Is there some documentation (database schema, DDL, etc.) where this is
described in detail?

> 2)  Secondly I re-read Stonebraker's "Object-Relational Database:  The
> Next Wave" and I had a rather sudden epiphany.  Relational databases
> are about modelling your data so you can ensure consistency and gain
> as many answers as you can.  Object-relational modelling adds
> interfaces (possibly written in arbitrary programming languages) to
> derive additional information from stored information.

As a non-computer scientist by training, I thought object orientation
from the purely utilitarian point of view would be essentially about
inheriting common features such as attributes, methods etc. from base
classes, respectively base tables in the case of an object-relational
database, to reduce implementation effort for code, increase consistency
etc...

And about "encapsulating" methods (procedures) together with the classes
(tables) that they work on.

> 3)  I was talking with Matt Trout regarding object-oriented
> programming in Perl, and he turned me on to Moose::Role as essentially
> an interface class.  It cannot be instantiated and one would not
> simply inherit it in order to instantiate it.  Rather it provides an
> ability to assemble classes from re-usable pieces and thus provide
> consistent interfaces across a project.    This has become a key to my
> understanding of the use cases for multiple inheritance in PostgreSQL,
> namely that you can define interfaces across column combinations and
> combine those column combinations into a table.

What's the difference from the use of custom composite types to
aggregate tables? Except for access syntax?

Sincerely,

Wolfgang



On Aug 27, 2012 12:58 AM, "Wolfgang Keller" <feliphil@gmx.net> wrote:
>
> > 1)  First, in LedgerSMB,
>
> <duck>
>
> What a pity that this is implemented in a write-only programming
> language and as a "web application" instead of an actual GUI...
>
> </duck>

Actually this has continued to remind me how ugly HTTP is for actual application programming.  There are areas where we'd have far improved performance --- including db performance ---if it was in a GUI...
<snip>
> >
> > In essence what we did was use inheritance to give us variable target
> > tables for a foreign key column.  I would still like to see
> > inheritable foreign key constraints because that would make some
> > things a lot easier, but the idea that foreign keys are not, by
> > default, copied in, means that you can override the destination in the
> > child table.  It isn't the use documented but it actually works very
> > well.
>
Accidently deleted your question about docs (on phone, travelling) but answer is it is in the ddl docs but probably needs more detail.
>

> As a non-computer scientist by training, I thought object orientation
> from the purely utilitarian point of view would be essentially about
> inheriting common features such as attributes, methods etc. from base
> classes, respectively base tables in the case of an object-relational
> database, to reduce implementation effort for code, increase consistency
> etc...
>
I am also not a computer scientist by training but I dont see anything wrong with that.

I would add that in all cases interface is key and what OOP does is tie procedure and structure together for interface purposes.  Whether this is always good or not is an open question.

> And about "encapsulating" methods (procedures) together with the classes
> (tables) that they work on.
>

The big issue is that inheritance gets "sticky" when modelling information rather than behavior and it is all too easy to create ambiguous fkeys (a relational antipattern) which are difficult to enforce and make a mess of things...

>
> What's the difference from the use of custom composite types to
> aggregate tables? Except for access syntax?

To make composite types work gracefully you really want to give them methods and access based on that.  With inheritance the type gets inlined in the table. This makes it easy to access via standars relational tools.

Also composite types do not provide centralized chack or not null constraints...

So the access syntax difference is significant but so is the operation.

Best Wishes,
Chris Travers
>
> Sincerely,
>
> Wolfgang
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general