Inheritance vs. LIKE - need advice - Mailing list pgsql-general

From William Bug
Subject Inheritance vs. LIKE - need advice
Date
Msg-id 744FF49E-3971-4C5C-8F3D-F19D49B1A5FE@drexel.edu
Whole thread Raw
List pgsql-general
Dear pgsql-general moderators,

I sent this post out over the weekend.

Is there a reason why it is not getting posted to the list?

Many thanks for your assistance.  I really need some advice on this
issue from people with experience using both inheritance mechanisms
in PostgreSQL.

Cheers,
Bill Bug

>
> Hi All,
>
> Sorry to bring up the topic of PostgreSQL inheritance again, but
> after going through the archives and Google results, I still don't
> have a clear sense of whether my plan to implement a schema I'm
> working on is the preferred way to go.
>
> First, I'd like to find out if the way I'm thinking about
> Inheritance vs. the SQL DDL CREATE TABLE modifier LIKE is correct.
>
> The simplest analogy I can think of from OO techniques is PGSQL
> Inheritance corresponds to Class Inheritance, while LIKE is more
> like an inheritable Interface (Java) or Mixin (Ruby).  Inheritance
> maintains strict hierarchical relationships propagating the "Class"
> identity down through to all progeny. LIKE on the other hand simply
> provides a means to re-use a set of fields in an unlimited number
> tables without having to redefine those fields for each table you
> use them in.
>
> This view is incomplete and far from a perfect fit to the way PGSQL
> Inheritance & LIKE work, but I think it's a helpful way of thinking
> of these 2 related mechanisms, when trying to decide how and when
> to use them in their current form.  As has been mentioned many
> times in posts here, as well as in the PGSQL docs, PGSQL
> Inheritance is only partial. Table fields are propagated as well as
> the group identity, but no other RDBMS objects created on the
> parent ( INDEXES, CONSTRAINTS & SEQUENCES primarily) are
> inherited.  As has been endlessly stated in posts here and
> elsewhere, this is a significant short-coming for the PGSQL
> Inheritance mechanism which those of us desirous of using
> Inheritance would love to see fixed (I understand it has been on
> the TODO list for many years, as this mechanism has been in the
> PGSQL code base for over 15 years).
>
> I don't agree this makes PGSQL Inheritance unusable.  There are
> situations where I think it can still be useful, and I describe one
> below.  I'd welcome feedback on that opinion, however, as I'd hate
> to have my relative ignorance doom the data schema I'm about to
> fill with a few million rows of data to serious problems later.
>
> The following is an example of using both Inheritance and LIKE in
> the context described above.
>
> CREATE TABLE curation_info (
>    created_by       TEXT    NOT NULL,
>    create_date     TIMESTAMP WITH TIME ZONE,
>    modified_by     TEXT    NOT NULL,
>    mod_date          TIMESTAMP WITH TIME ZONE
> );
>
> CREATE TABLE book (
>     id_pk                    SERIAL    PRIMARY KEY,
>     title                        TEXT        NOT NULL,
>     author_id_fk        INT            NOT NULL,
>    publisher_id_fk    INT            NOT NULL,
>     pub_year              DATE        NOT NULL,
>     total_pages         INT            NOT NULL
>     LIKE curation_info
> );
>
> CREATE TABLE novel (
>     id_pk                    SERIAL    PRIMARY KEY,
>     genre_id_fk        INT            NOT NULL
> ) INHERITS (book);
>
> CREATE TABLE textbook (
>     id_pk                    SERIAL    PRIMARY KEY,
>     subject_id_fk    INT    NOT NULL
> ) INHERITS (book);
>
>
> CREATE TABLE publisher (
>     id_pk                    SERIAL    PRIMARY KEY,
>     name                        TEXT        NOT NULL,
>     address_id_fk        INT            NOT NULL,
>     LIKE curation_info
> );
>
> CREATE TABLE author (
>     id_pk                    SERIAL    PRIMARY KEY,
>     last_name           TEXT        NOT NULL,
>     first_name             TEXT        NOT NULL,
>     middle_name         TEXT        NOT NULL,
>     address_id_fk        INT            NOT NULL,
>     LIKE curation_info
> );
>
> This is not the best way to model book info (for instance, books
> are only allowed to have 1 author in this schema), but it will help
> me to make my point.
>
> Books, novels and textbooks will be considered equivalent in the
> context of many queries.  At the same time, there will be other
> queries where it will be important to consider novels & textbooks
> as distinct entities.  The PGSQL Inheritance mechanism easily
> supports both of these situations.
>
> The curation fields listed in the 'curation_info' table are found
> ubiquitously in tables throughout many data schema.  However, it is
> not likely there would be a circumstance where you would want to
> consider all tables containing these fields "curatable entities" to
> be queried as a group.  That simply makes no sense.  In this case,
> LIKE seems to be the best way to propagate these fields, since it
> doesn't couple all tables containing them to the parent
> 'curation_info' table.
>
> As I see it, there are at least 3 major problems with adopting such
> a schema - despite the obvious efficiencies it offers (most of
> which have been reported elsewhere):
>     1) none of the parent table ('book') CONSTRAINTS or INDEXES are
> propagated to the children.  This means if you want the children to
> have the same CONSTRAINTS - as you probably will - you need to
> build them yourself for each child table.
>     2) the primary keys generated across the book, novel & textbook
> tables are completely uncouple and will definitely collide.  In
> other words, due to the fact that neither the SEQUENCE behind the
> 'book.id_pk' SERIAL field, not the PK CONSTRAINTS & INDEX that
> comes with that field will automatically propagate to the child
> tables.  That is why the SQL DDL given above has an 'id_pk' SERIAL
> field in all 3 tables.  There may be some conditions where you want
> those PKs to be independent from one another, but those will be
> much less frequent than the times when you will require they all
> derive from the same SEQUENCE.
>     3) The fields inherited from the 'curation_info' table via the
> LIKE modifier are in no way linked back to the table from which
> they originated, unlike a an Interface (in Java) or Mixin (in Ruby)
> would be.  If the 'mod_date' field is remove from 'curation_info'
> it will still remain in all the tables created using
> 'curation_info' prior to making that change.  Same is true if a new
> field is added to 'curation_info'.  If you want that field to be
> represented in all those tables that had previously been created
> using the LIKE 'curation_info' modifier, you will have to re-CREATE
> those tables from scratch.
>
> As I see it, '1' & '3' above are significant drawbacks with no
> obvious work-around, but they are not deal breakers.  I would still
> have reason to want to use both Inheritance and LIKE because of the
> efficiencies they provide.
>
> '2' above has a simple and obvious work around which I've been
> surprised I've not been able to find posted anywhere (leading me to
> believe I must be missing something).  PGSQL automatically builds a
> SEQUENCE object on SERIAL fields.  The name of the SEQUENCE is
> simply the concatenated name of the table + SERIAL field with
> '_seq' appended to the end - e.g., for the 'book' table, it would
> be 'book_id_pk_seq'.  In order to guarantee the child tables use
> that same sequence, you simply declare them as follows instead of
> using the SERIAL type:
>
> CREATE TABLE textbook (
>     id_pk                    INT8     DEFAULT nextval
> ('book_id_pk_seq')         NOT NULL,
>     subject_id_fk      INT        NOT NULL
> ) INHERITS (book);
> ALTER TABLE textbook ADD CONSTRAINT textbook_pk_c PRIMARY KEY(id_pk);
>
> Is it a pain to have to write - and maintain - this extra SQL DDL?
> Yes.  Having said that, will it provide the desired behavior?  Most
> definitely yes - again, unless I'm missing something.
>
> So this is how I plan to use INHERIT & LIKE.
>
> My main reason for posting this here, is to get a chance to draw on
> the breadth of PostgreSQL experience out there to bring the major
> pitfalls in taking this approach to my attention.  I think whatever
> feedback folks have to offer will be very helpful to me and to
> others in search of guidance on this issue.
>
> I also thought it would be helpful to present this for archive
> purposes, since, despite the fact much of what I say here is
> mentioned elsewhere, I had to search far and wide to find it all
> and consolidate my thinking on the topic, so it might save others
> some time to see it all in one place.
>
> It might also be worth adding some of this "advice" - if the
> consensus is this view is reasonable given the current state of the
> Inheritance mechanism in PostgreSQL - to one of the official
> PostgreSQL docs - e.g., FAQ, etc..
>
> Many thanks ahead of time for your feedback and patience in reading
> this through to the end.
>
> Cheers,
> Bill Bug
>
>
> P.S.: Should this end up double posted, I apologize.  There was a
> problem with the list accepting my aliased email address.
>
>
> Bill Bug
> Senior Analyst/Ontological Engineer
>
> Laboratory for Bioimaging  & Anatomical Informatics
> www.neuroterrain.org
> Department of Neurobiology & Anatomy
> Drexel University College of Medicine
> 2900 Queen Lane
> Philadelphia, PA    19129
> 215 991 8430 (ph)
>
>
>


pgsql-general by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Check postgres compile-time options
Next
From: Oliver Siegmar
Date:
Subject: Re: Problem with dropping a tablespace