Thread: Unintuitive behavior regarding inheritance
Hi all; I started trying to use table partitioning to handle a rather odd case in the software I am working on. I ran into an issue (one I can correct in my code) that strikes me as extremely unintuitive. I figured I would report it here as behavior I would like to see change. The basic problem is that while nearly all table contraints are inherited, unique constraints are not. This means that primary keys and so forth end up having to be redefined on all child tables explicitly as part of the table creation process. This is sufficiently unintuitive that as a result I am having to go back and amend contributions of people far more knowledgeable on this than I am. It would be really helpful in the future if unique constraints and primary keys were inherited. At least I can adjust code, but this is a pretty big gotcha which could go unnoticed until you have duplicates for primary key fields in specific child tables. Best wishes, Chris Travers
On Sat, 2011-07-09 at 05:59 -0700, Chris Travers wrote: > Hi all; > > I started trying to use table partitioning to handle a rather odd case > in the software I am working on. I ran into an issue (one I can > correct in my code) that strikes me as extremely unintuitive. I > figured I would report it here as behavior I would like to see change. > > The basic problem is that while nearly all table contraints are > inherited, unique constraints are not. This means that primary keys > and so forth end up having to be redefined on all child tables > explicitly as part of the table creation process. This is > sufficiently unintuitive that as a result I am having to go back and > amend contributions of people far more knowledgeable on this than I > am. > > It would be really helpful in the future if unique constraints and > primary keys were inherited. > > At least I can adjust code, but this is a pretty big gotcha which > could go unnoticed until you have duplicates for primary key fields in > specific child tables. > To have a primary key or a unique key on an partitioned table, it would mean that we should be able to have one index on multiple tables. Because primary key and unique constraints are enforced with an index. That's not something easy to do, and I guess it would make the index bigger, which isn't performance savvy. -- Guillaume http://blog.guillaume.lelarge.info http://www.dalibo.com
On Sat, Jul 9, 2011 at 6:09 AM, Guillaume Lelarge <guillaume@lelarge.info> wrote: > > To have a primary key or a unique key on an partitioned table, it would > mean that we should be able to have one index on multiple tables. > Because primary key and unique constraints are enforced with an index. > That's not something easy to do, and I guess it would make the index > bigger, which isn't performance savvy. I don't think you necessarily have to go as far as creating cross-table indexes. In the case I am looking at, one of the values I partition on is part of the primary key, so collisions across partitions can be avoided in this way. However, the problem here is that this also makes it far more difficult to create partitioned tables which reference keys on table partitions because those keys have to be defined on each partition. Simply creating per-partition indexes would be sufficient for that use case. Otherwise it becomes relatively maintenance heavy and quite error prone. I agree that this wouldn't get us to what would make everyone happy, but it would create reasonable workarounds for when one needs fkeys against partitioned tables.... Best Wishes, Chris Travers
On Jul 9, 2011, at 9:21, Chris Travers <chris.travers@gmail.com> wrote: > On Sat, Jul 9, 2011 at 6:09 AM, Guillaume Lelarge > <guillaume@lelarge.info> wrote: > >> >> To have a primary key or a unique key on an partitioned table, it would >> mean that we should be able to have one index on multiple tables. >> Because primary key and unique constraints are enforced with an index. >> That's not something easy to do, and I guess it would make the index >> bigger, which isn't performance savvy. > > I don't think you necessarily have to go as far as creating > cross-table indexes. In the case I am looking at, one of the values I > partition on is part of the primary key, so collisions across > partitions can be avoided in this way. > > However, the problem here is that this also makes it far more > difficult to create partitioned tables which reference keys on table > partitions because those keys have to be defined on each partition. > Simply creating per-partition indexes would be sufficient for that use > case. Otherwise it becomes relatively maintenance heavy and quite > error prone. > > I agree that this wouldn't get us to what would make everyone happy, > but it would create reasonable workarounds for when one needs fkeys > against partitioned tables.... > > Table inheritance has it's flaws. They are well-documented and often discussed. Your last two paragraphs and unintelligible to me. If you FK the parent table in an Inheritance scheme the system will onlycheck the parent table and not any inheritors. While I am unfamiliar with why indexes and FK constraints are not copied they can be added quite easily manually to any tableI create, and from the docs I am fully aware they are indeed not copied. And yes, I understand that because something is written doesn't mean it is understood. Understanding usually comes via experiencewhich you either earn or recruit. Partitioning is an advanced feature with, in it's current implementation, enoughcaveats to make it dangerous to use. But, band-aids are not going to be enough. A full graft is needed in order tomaintain backward compatibility while implementing a system that meets the identified needs of the community. David J.
Ok So what I am trying to do is: 1) Paritioned tables 2) Joins against the table partitions The way I have this done is: The first table is partitioned on the basis of one part of the primary key. So in theory since nobody has permission to insert into the base table, all records should have unique primary keys in the inheritance tree. The join table against the partition table is more complex and I had to partition it across two pieces of the primary key. To be clear I am joining a partitioned table against a partitioned table (since that seems to be the only sane way of joining against a partitioned table if referential integrity has to be enforced. Basically imagine the following (the actual schema is quite a bit more complex): create table invoice ( id serial primary key, reference text, .... ); create table order (id serial primary key, reference text, .... ); create table file_class ( id serial not null unique, label text primary key ); create table file_attachment ( id serial not null unique, file_class int references file_class(id), ref_key int, file_name text, primary key(file_class, ref_key, file_name) ); create table invoice_attachment ( check (file_class = 1), foreign key(ref_key) references invoice(id), ) inherits (file_attachment); create table order_attachment ( check(file_class = 2), foreign key (ref_key) references order(id), ) inherits (file_attachment); create table order_to_invoice_file_map( src_class int references invoice_attachment(id), dest_class int references order_attachment(id), file_id int references order_attachment(id), .... ); -- this is actually inherited in my schema too It seems for this to work all primary and foreign key constraints have to be redeclared on each child table. OTOH check constraints are additive. The idea here is to provide a system where by a consistent relational interface is provided for file attachments, and where neither higher levels of the application nor pieces of the model have to be aware of the existance or not of partitions. Ideally given the fact that each child table constrains part of the primary key to a reserved set of values means I shouldn't have to worry about primary key collision across the inheritance tree. Unfortunately, it seems that to do this a lot of copy/paste is required. The major drawback with this approach is that since check constraints are always inherited they can't be used to enforce the idea that base tables should have no rows. At the same time, since unique constraints are never inherited they have to be redeclared on every child. I recognize that one of the big hurdles here is the need sometimes to drop indexes while loading data.... but at the same time there are several places where this sort of thing has been overlooked so far by people far more knowledgeable about PostgreSQL than I am. Hope this helps. Chris Travers
> > create table invoice_attachment ( > check (file_class = 1), > foreign key(ref_key) references invoice(id), > ) inherits (file_attachment); > > create table order_attachment ( > check(file_class = 2), > foreign key (ref_key) references order(id), > ) inherits (file_attachment); > > While I get your proposal I am not a developer so I cannot really comment as to the cost-benefit of implementing it but mytake is that there is too much variety to effectively code the automation you desire (ignoring the fact that you'd wantthe indexes to remain independent) and so it is left to the developer to specify exactly what is desired. The real issue is that your child tables are distinct sub-types of the parent as opposed to being identical to the parentin all ways except for the range of allowable values- which what a partition is and the driver behind the current inheritanceimplementation. It would have been better if they had restricted partitions such that you could not add columnsand used a syntax such as "Create Table () PARTITION OF (parent_table)" and not even attempt to support object-likeinheritance. As it is now object-inheritance is only partially supported and so while you can fake sub-classingthe database is incapable of properly enforcing the normal use cases. A generally better way to implement object-inheritance is to use one-to-one tables and encapsulate using functions/view/triggersand possibly rules. Suggestions and ideas are encouraged but are more readily received if they at least acknowledge that there are reasons forthe existing behavior and that any solution needs to addres more than the single problem that is driving the suggestion,and that it generally wants to solve the problem without introducing more problems/complexity. David J.