Inheritance vs. LIKE - need advice - Mailing list pgsql-general
From | William Bug |
---|---|
Subject | Inheritance vs. LIKE - need advice |
Date | |
Msg-id | 744FF49E-3971-4C5C-8F3D-F19D49B1A5FE@drexel.edu Whole thread Raw |
List | pgsql-general |
Dear pgsql-general moderators, I sent this post out over the weekend. Is there a reason why it is not getting posted to the list? Many thanks for your assistance. I really need some advice on this issue from people with experience using both inheritance mechanisms in PostgreSQL. Cheers, Bill Bug > > Hi All, > > Sorry to bring up the topic of PostgreSQL inheritance again, but > after going through the archives and Google results, I still don't > have a clear sense of whether my plan to implement a schema I'm > working on is the preferred way to go. > > First, I'd like to find out if the way I'm thinking about > Inheritance vs. the SQL DDL CREATE TABLE modifier LIKE is correct. > > The simplest analogy I can think of from OO techniques is PGSQL > Inheritance corresponds to Class Inheritance, while LIKE is more > like an inheritable Interface (Java) or Mixin (Ruby). Inheritance > maintains strict hierarchical relationships propagating the "Class" > identity down through to all progeny. LIKE on the other hand simply > provides a means to re-use a set of fields in an unlimited number > tables without having to redefine those fields for each table you > use them in. > > This view is incomplete and far from a perfect fit to the way PGSQL > Inheritance & LIKE work, but I think it's a helpful way of thinking > of these 2 related mechanisms, when trying to decide how and when > to use them in their current form. As has been mentioned many > times in posts here, as well as in the PGSQL docs, PGSQL > Inheritance is only partial. Table fields are propagated as well as > the group identity, but no other RDBMS objects created on the > parent ( INDEXES, CONSTRAINTS & SEQUENCES primarily) are > inherited. As has been endlessly stated in posts here and > elsewhere, this is a significant short-coming for the PGSQL > Inheritance mechanism which those of us desirous of using > Inheritance would love to see fixed (I understand it has been on > the TODO list for many years, as this mechanism has been in the > PGSQL code base for over 15 years). > > I don't agree this makes PGSQL Inheritance unusable. There are > situations where I think it can still be useful, and I describe one > below. I'd welcome feedback on that opinion, however, as I'd hate > to have my relative ignorance doom the data schema I'm about to > fill with a few million rows of data to serious problems later. > > The following is an example of using both Inheritance and LIKE in > the context described above. > > CREATE TABLE curation_info ( > created_by TEXT NOT NULL, > create_date TIMESTAMP WITH TIME ZONE, > modified_by TEXT NOT NULL, > mod_date TIMESTAMP WITH TIME ZONE > ); > > CREATE TABLE book ( > id_pk SERIAL PRIMARY KEY, > title TEXT NOT NULL, > author_id_fk INT NOT NULL, > publisher_id_fk INT NOT NULL, > pub_year DATE NOT NULL, > total_pages INT NOT NULL > LIKE curation_info > ); > > CREATE TABLE novel ( > id_pk SERIAL PRIMARY KEY, > genre_id_fk INT NOT NULL > ) INHERITS (book); > > CREATE TABLE textbook ( > id_pk SERIAL PRIMARY KEY, > subject_id_fk INT NOT NULL > ) INHERITS (book); > > > CREATE TABLE publisher ( > id_pk SERIAL PRIMARY KEY, > name TEXT NOT NULL, > address_id_fk INT NOT NULL, > LIKE curation_info > ); > > CREATE TABLE author ( > id_pk SERIAL PRIMARY KEY, > last_name TEXT NOT NULL, > first_name TEXT NOT NULL, > middle_name TEXT NOT NULL, > address_id_fk INT NOT NULL, > LIKE curation_info > ); > > This is not the best way to model book info (for instance, books > are only allowed to have 1 author in this schema), but it will help > me to make my point. > > Books, novels and textbooks will be considered equivalent in the > context of many queries. At the same time, there will be other > queries where it will be important to consider novels & textbooks > as distinct entities. The PGSQL Inheritance mechanism easily > supports both of these situations. > > The curation fields listed in the 'curation_info' table are found > ubiquitously in tables throughout many data schema. However, it is > not likely there would be a circumstance where you would want to > consider all tables containing these fields "curatable entities" to > be queried as a group. That simply makes no sense. In this case, > LIKE seems to be the best way to propagate these fields, since it > doesn't couple all tables containing them to the parent > 'curation_info' table. > > As I see it, there are at least 3 major problems with adopting such > a schema - despite the obvious efficiencies it offers (most of > which have been reported elsewhere): > 1) none of the parent table ('book') CONSTRAINTS or INDEXES are > propagated to the children. This means if you want the children to > have the same CONSTRAINTS - as you probably will - you need to > build them yourself for each child table. > 2) the primary keys generated across the book, novel & textbook > tables are completely uncouple and will definitely collide. In > other words, due to the fact that neither the SEQUENCE behind the > 'book.id_pk' SERIAL field, not the PK CONSTRAINTS & INDEX that > comes with that field will automatically propagate to the child > tables. That is why the SQL DDL given above has an 'id_pk' SERIAL > field in all 3 tables. There may be some conditions where you want > those PKs to be independent from one another, but those will be > much less frequent than the times when you will require they all > derive from the same SEQUENCE. > 3) The fields inherited from the 'curation_info' table via the > LIKE modifier are in no way linked back to the table from which > they originated, unlike a an Interface (in Java) or Mixin (in Ruby) > would be. If the 'mod_date' field is remove from 'curation_info' > it will still remain in all the tables created using > 'curation_info' prior to making that change. Same is true if a new > field is added to 'curation_info'. If you want that field to be > represented in all those tables that had previously been created > using the LIKE 'curation_info' modifier, you will have to re-CREATE > those tables from scratch. > > As I see it, '1' & '3' above are significant drawbacks with no > obvious work-around, but they are not deal breakers. I would still > have reason to want to use both Inheritance and LIKE because of the > efficiencies they provide. > > '2' above has a simple and obvious work around which I've been > surprised I've not been able to find posted anywhere (leading me to > believe I must be missing something). PGSQL automatically builds a > SEQUENCE object on SERIAL fields. The name of the SEQUENCE is > simply the concatenated name of the table + SERIAL field with > '_seq' appended to the end - e.g., for the 'book' table, it would > be 'book_id_pk_seq'. In order to guarantee the child tables use > that same sequence, you simply declare them as follows instead of > using the SERIAL type: > > CREATE TABLE textbook ( > id_pk INT8 DEFAULT nextval > ('book_id_pk_seq') NOT NULL, > subject_id_fk INT NOT NULL > ) INHERITS (book); > ALTER TABLE textbook ADD CONSTRAINT textbook_pk_c PRIMARY KEY(id_pk); > > Is it a pain to have to write - and maintain - this extra SQL DDL? > Yes. Having said that, will it provide the desired behavior? Most > definitely yes - again, unless I'm missing something. > > So this is how I plan to use INHERIT & LIKE. > > My main reason for posting this here, is to get a chance to draw on > the breadth of PostgreSQL experience out there to bring the major > pitfalls in taking this approach to my attention. I think whatever > feedback folks have to offer will be very helpful to me and to > others in search of guidance on this issue. > > I also thought it would be helpful to present this for archive > purposes, since, despite the fact much of what I say here is > mentioned elsewhere, I had to search far and wide to find it all > and consolidate my thinking on the topic, so it might save others > some time to see it all in one place. > > It might also be worth adding some of this "advice" - if the > consensus is this view is reasonable given the current state of the > Inheritance mechanism in PostgreSQL - to one of the official > PostgreSQL docs - e.g., FAQ, etc.. > > Many thanks ahead of time for your feedback and patience in reading > this through to the end. > > Cheers, > Bill Bug > > > P.S.: Should this end up double posted, I apologize. There was a > problem with the list accepting my aliased email address. > > > Bill Bug > Senior Analyst/Ontological Engineer > > Laboratory for Bioimaging & Anatomical Informatics > www.neuroterrain.org > Department of Neurobiology & Anatomy > Drexel University College of Medicine > 2900 Queen Lane > Philadelphia, PA 19129 > 215 991 8430 (ph) > > >
pgsql-general by date: