Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL) - Mailing list pgsql-general

From Chris Travers
Subject Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)
Date
Msg-id CAKt_ZfvmUiagjsB9eZUNGHVbMeqzmjZqj89P4P5JoNWRJEr+Aw@mail.gmail.com
Whole thread Raw
Responses Re: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)  (David Johnston <polobo@yahoo.com>)
Re: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)  (Thomas Kellerer <spam_eater@gmx.net>)
Re: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-general
I have now been working with table inheritance for a while and after
starting to grapple with many of the use cases it has have become
increasingly impressed with this feature.  I also think that some of
the apparent limitations fundamentally follow from the support for
multiple inheritance, and multiple inheritance itself is so useful I
would not want to see this go away.  Inheritance really starts to come
to its own once you start using table methods, and some features that
are useful in some sorts of inheritance modelling are useless in
others.

Below I will offer a few suggestions regarding what can be done to
make life a little easier for those of us using these features and
they are not the typical suggestions.  I still stand by my view that
at least from what I have looked at, PostgreSQL allows you to do
object-relational modelling better than you can do in other ORDBMS's I
have looked at so far.

What table inheritance gives you is an ability to model pieces of a
database, and derived information, on small units which can then be
later combined.  When we think of that as the primary use case
(instead of set/subset modelling) then the current DDL caveats largely
don't apply.  Logic can thus follow a group of columns rather than
having to be repetitively attached to tables.  The fact that this
allows you to create essentially derived values from groups of re-used
columns is itself remarkable and can be used to implement path
traversal etc. which is not directly supported in PostgreSQL in the
sense that it is in Oracle or DB2.  With multiple inheritance you can
actually build superior path traversal systems than you can easily on
DB2 or Oracle because you can re-use enforced foreign keys (Oracle has
an IS DANGLING operator for cross-table references!).

As far as I can tell, this sort of use is PostgreSQL-only because it
relies on multiple inheritance which is not supported on DB2,
Informix, or Oracle.  I am not aware of any other ORDBMS that allows
for multiple inheritance and this has profound impacts on things like
primary key inheritance, which I think becomes meaningless when
combined with multiple inheritance.  If I inherit two tables each with
a different primary key, I obviously cannot inherit both without
having multiple primary keys in the child table.

I have to be the sort of person who sees bugs as features, but in this
respect I cannot see the lack of inheriting a primary key as a bug
anymore.  It seems to me mathematically incompatible with PostgreSQL's
take on table inheritance generally and this is one of those cases
where multiple inheritance changes everything.

Additionally it is important to note that primary key management is
not a huge problem because it can be solved using techniques borrowed
from table partitioning.  If you are doing set/subset modelling (as in
the cities/capitals example) the solution is to have a cities table
which is constrained with a trigger or rule which does not allow
inserts and then capitals and noncapitals tables. The primary key can
then include an is_capital bool field which can be constrained
differently on both tables.  This has the advantage of knowing whether
a city selected is a capital from the top-level query as well, and
allows for the planner to treat the inheritance tree as a partitioned
table set.  Superset-constraint management would also have to use a
second table which would be referenced by all child tables (and
perhaps maintained by triggers).   While superset management tables
can be used to solve a subset of foreign key problems, they highlight
a different (and perhaps more solvable) set of these problems.

As far as I can tell, Oracle and DB2 do not discuss primary key
inheritance and it isn't clear whether this is a problem on those
platforms too.  Foreign key management pretty clearly is a problem
given the way these platforms handle cross-relational REFs.  In other
words, I think that on the whole table inheritance is still cutting
edge on PostgreSQL and has been for some time.

Foreign keys can be managed in a few ways including superset
constraint tables maintained with triggers.  These work well for
enforcing foreign keys against subsets, but inheriting a foreign key
constraint means redefining it repetitively on every child table.  At
the same time, not all foreign keys may want to be inherited.

The following changes to behavior I would personally find very useful
(and I believe would be useful in partitioned tables as well):

   * foreign keys (i.e. REFERENCES clauses) being able to be marked
INHERIT or NOINHERIT on the parent table.  INHERIT foreign keys would
be automatically created on child tables.  The default could be left
to be NOINHERIT to avoid breaking backwards compatibility.

   * unique constraints being able to be marked INHERIT or NOINHERIT.
A unique constraint that is marked INHERIT would be automatically
created again on the child table.  This could be documented to be
domain-specific to each child table, and that if you need super-set
unique constraints, you need to borrow techniques from table
partitioning.

  * an ability to allow a check constraint to be marked NOINHERIT and
thus excluded down-tree. This could be used to exclude inserts onto
parent tables both in partitioning and object inheritance tree
environments and it greatly simplifies set/subset modelling.

  * PRIMARY KEY inheritance would be documented as necessarily
excluded by multiple inheritance.  The concept simply doesn't make
sense when a child table can have multiple parents.  If it did, you'd
have multiple primary keys.  According to Oracle and DB2
documentation, the best they offer regarding such things is an OID
field anyway.....

Even without these changes, however, I am finding PostgreSQL's table
inheritance to be extremely useful, and I will be covering it
extensively in an upcoming blog post.  Also instead of saying that it
is mostly only useful in table partitioning, I would say that the
techniques of table partitioning are useful in addressing the
difficulties one runs into in set/subset modelling.

Best Wishes,
Chris Travers


pgsql-general by date:

Previous
From: Scott Marlowe
Date:
Subject: Re: Alternatives to very large tables with many performance-killing indicies?
Next
From: Chris Angelico
Date:
Subject: Re: Can column name aliases be supported?