Thread: Table inheritance implementation.
Hi. I'm developing an application using PostgreSQL and it happened table inheritance is THE solution to some design problems I have. Unfortunately the feature doesn't exactly work as true class/object inheritance would. Main problems are well recognized and documented: child table doesn't inherit parent constraints and parent's index doesn't get updated with child's keys. While I didn't dig in the Postgres internals, from the symptoms I guess the inheritance is implemented as implicit UNION of the tables. To be more specific, I have: CREATE TABLE parent ( p int PRIMARY KEY ); CREATE TABLE child ( c int ); If I'm right, in the backend there are two tables: parent(pid) and child(pid,cdata) and INSERT INTO child ... just go to child. Then when I SELECT ... FROM parent Postgres does SELECT ... FROM parent UNION SELECT ... FROM child for me (might be syntax error, I'm not so familiar with SQL). This scenario of course explains these problems and I understand solving them won't be easy. But I have another question: why can't be inheritance implemented as implicit JOIN? I mean, in the backend there would be tables parent(p) and child(c) plus some glue added (if oids/tids are not enough). So INSERT INTO child VALUES (1,2) would INSERT INTO parent VALUES (1) INSERT INTO child (2) And SELECT ... FROM parent would work as is, but SELECT ... FROM child would effect in SELECT ... FROM parent JOIN child ON glue It seems to me that it would solve both mentioned problems in one shot: parent contains all keys it should have (and so index does) and parent's constraints are enforced at the same time. The glue can be issue or may be not. The real issue would be with overriding parent's constraints (from my point of view it's minor one compared to contemporary problems). There may be other deficiencies I'm not aware of. On the bright side, I think this implementation (or at least some functionality of) can be made with rules. Anyone share thought about the whole idea? Or details? Best regards. -- Grzegorz Nowakowski
Grzegorz Nowakowski <krecik@e-wro.net> writes: > But I have another question: why can't be > inheritance implemented as implicit JOIN? Interesting thought, but joins are expensive --- this would be quite a lot slower than the current way, I fear, especially when you consider more than one level of inheritance. Also, switching over to this would destroy the current usefulness of inheritance for partitioning. regards, tom lane
Speaking of partitioning, I see there some improvements planed for this feature in 8.3 - any info on what exactly users can expect? Any possibility to improve it so we don't have to add insert trigger that selects the right table for operation? Also, propagation of Alter table on inherited tables is a sweat feature... :) > Interesting thought, but joins are expensive --- this would be quite a > lot slower than the current way, I fear, especially when you consider > more than one level of inheritance. Also, switching over to this would > destroy the current usefulness of inheritance for partitioning. > > regards, tom lane -- vlad
ops. alter table seems to be propagating OK in 8.2... On 1/4/07, Vlad <marchenko@gmail.com> wrote: > Speaking of partitioning, I see there some improvements planed for > this feature in 8.3 - any info on what exactly users can expect? Any > possibility to improve it so we don't have to add insert trigger that > selects the right table for operation? Also, propagation of Alter > table on inherited tables is a sweat feature... :) > -- vlad
On czw, 2007-01-04 at 10:44 -0500, Tom Lane wrote: > Grzegorz Nowakowski <krecik@e-wro.net> writes: > > But I have another question: why can't be > > inheritance implemented as implicit JOIN? > > Interesting thought, but joins are expensive --- this would be quite a > lot slower than the current way, I fear, especially when you consider > more than one level of inheritance. Also, switching over to this would > destroy the current usefulness of inheritance for partitioning. Well, I never used partitioning and I don't know what it's worth but just after sending my original mail I got another variant of the idea: to duplicate columns (parent(p), child(p,c)), so inserts into child update both parent's and child's index. This way we trade space (common columns are replicated along inheritance hierarchy) and some speed (inserts into child are slower because they also have to update parent) for some other speed (selects work without join penalty). Yet still we have the primary benefit: parent's constraints and indexes work as expected. I'm well aware that even if my idea has some merit, it won't be soon when it goes into code, if ever. I expect it would take small revolution to make it. Anyway, I tried. :) Best regards. -- Grzegorz Nowakowski
On Fri, Jan 05, 2007 at 09:27:31AM +0100, Grzegorz Nowakowski wrote: > Well, I never used partitioning and I don't know what it's worth but > just after sending my original mail I got another variant of the idea: > to duplicate columns (parent(p), child(p,c)), so inserts into child > update both parent's and child's index. This way we trade space (common > columns are replicated along inheritance hierarchy) and some speed > (inserts into child are slower because they also have to update parent) > for some other speed (selects work without join penalty). Yet still we > have the primary benefit: parent's constraints and indexes work as > expected. One of the reasons it hasn't happened yet is related to locking of indexes. It is currently assumed that if you lock a table, you've locked all the indexes implicitly. If you have an index that can be updated by multiple tables, what are the locking semantics then? If you want to drop the parent index, do you have to lock every child table? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On pią, 2007-01-05 at 10:55 +0100, Martijn van Oosterhout wrote: > On Fri, Jan 05, 2007 at 09:27:31AM +0100, Grzegorz Nowakowski wrote: > > Well, I never used partitioning and I don't know what it's worth but > > just after sending my original mail I got another variant of the idea: > > to duplicate columns (parent(p), child(p,c)), so inserts into child > > update both parent's and child's index. This way we trade space (common > > columns are replicated along inheritance hierarchy) and some speed > > (inserts into child are slower because they also have to update parent) > > for some other speed (selects work without join penalty). Yet still we > > have the primary benefit: parent's constraints and indexes work as > > expected. > > One of the reasons it hasn't happened yet is related to locking of > indexes. It is currently assumed that if you lock a table, you've > locked all the indexes implicitly. If you have an index that can be > updated by multiple tables, what are the locking semantics then? If you > want to drop the parent index, do you have to lock every child table? <disclaimer>My SQL experience isn't wide and broad. I just happen to use it and when programming queries I often think no in DB-like terms but instead as if I'm manipulating shadows of data used by application. I know it can cause problems and misunderstandings.</> Frankly, my answer would be 'yes, if you lock parent, you have to lock every child'. Only this way it makes sense: parent contains all children polymorphed into its base type so if I lock whole the stuff I'm locking every child's instance, tough luck. On the other hand that is behavior I would expect. If I want the primary key to be unique at the parent level, I want the Postgres to enforce it on every descendant, so I must to take it into account that additional lookups/locking on child tables would be performed. Best regards. -- Grzegorz Nowakowski