Thread: Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Robert Haas
Date:
On Tue, Aug 3, 2010 at 3:05 PM, Yeb Havinga <yebhavinga@gmail.com> wrote: > Yeb Havinga wrote: >> The underlying cause is the failure of the code to recognize that if >> relation C inherits from both A and B, where A and B both have column x, >> that A.x 'is the same as' B.x, where the 'is the same as' relation is the >> same that holds for (A.x, C.x) and (B.x, C.x), which the code does a lot of >> trouble for to recognize. This means that if some definition is altered on >> A.x, only C.x is updated and B.x not touched. IMO this is wrong and either a >> multiple inheritance structure like this should be prohibited, since the >> user did not explicitly declare that A.x and B.x 'are the same' (by e.g. >> defining a relation D.x and have A and B inherit from that), or the code >> should update parents of relations when the childs are updated. > > Thinking about this a bit more, the name 'is the same as' is a bit > confusing, since that relation might not be commutative. C.x 'inherits > properties from' A.x, or C.x 'is defined by' A.x are perhaps better names, > that reflect that the converse might not hold. OTOH, what does C.x 'inherits > (all) properties from' A.x mean? If it means that for all properties P, > P(C.x) iff P(A.x), then C.x = A.x commutatively and by similar reasoning > A.x = B.x. > >> ALTER TABLE top1 RENAME COLUMN a_table_column TO another_table_column; > > When looking for previous discussions that was referred to upthread, the > first thing I found was this recent thread about the exactly the same > problem http://archives.postgresql.org/pgsql-hackers/2010-01/msg03117.php > > Sorry for the double post, however the previous discussion postponed work to > .. now, so maybe there is some value in first trying to specify exactly what > 'inherits' means, and derive consequences for code behaviour from that. Yeah, I was thinking about that thread, too, on my drive home from Metuchen. I wouldn't get too bogged down in formal logic; it seems there are a couple of distinct cases here: 1. If you're changing properties of a column, you need to verify for each relation in the inheritance tree that the "expected attinhcount" and the actual attinhcount match. If, for any relation in the inheritance tree rooted at the named table, they don't, then they are doubly inherited there, from some other table outside the hierarchy rooted at the named table, and the operation must fail. We'd need similar logic for constraints, if we had support for renaming or otherwise modifying them, but right now we don't. 2. If you're adding a column, you need to propagate the new column to relations that don't have it yet, but if you find one that already has it than you adjust attinhcount and don't recurse to its chidlren. 3. If you're dropping a column, you essentially decrement the attinhcount of all your children; then you recurse into any that reach attincount = 0 and not attislocal and drop the column there as well. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Yeb Havinga
Date:
Robert Haas wrote: > On Tue, Aug 3, 2010 at 3:05 PM, Yeb Havinga <yebhavinga@gmail.com> wrote: > >> Yeb Havinga wrote: >> >>> The underlying cause is the failure of the code to recognize that if >>> relation C inherits from both A and B, where A and B both have column x, >>> that A.x 'is the same as' B.x, where the 'is the same as' relation is the >>> same that holds for (A.x, C.x) and (B.x, C.x), which the code does a lot of >>> trouble for to recognize. This means that if some definition is altered on >>> A.x, only C.x is updated and B.x not touched. IMO this is wrong and either a >>> multiple inheritance structure like this should be prohibited, since the >>> user did not explicitly declare that A.x and B.x 'are the same' (by e.g. >>> defining a relation D.x and have A and B inherit from that), or the code >>> should update parents of relations when the childs are updated. >>> >> Thinking about this a bit more, the name 'is the same as' is a bit >> confusing, since that relation might not be commutative. C.x 'inherits >> properties from' A.x, or C.x 'is defined by' A.x are perhaps better names, >> that reflect that the converse might not hold. OTOH, what does C.x 'inherits >> (all) properties from' A.x mean? If it means that for all properties P, >> P(C.x) iff P(A.x), then C.x = A.x commutatively and by similar reasoning >> A.x = B.x. >> >> >>> ALTER TABLE top1 RENAME COLUMN a_table_column TO another_table_column; >>> >> When looking for previous discussions that was referred to upthread, the >> first thing I found was this recent thread about the exactly the same >> problem http://archives.postgresql.org/pgsql-hackers/2010-01/msg03117.php >> >> Sorry for the double post, however the previous discussion postponed work to >> .. now, so maybe there is some value in first trying to specify exactly what >> 'inherits' means, and derive consequences for code behaviour from that. >> > > Yeah, I was thinking about that thread, too, on my drive home from > Metuchen. I just read that thread. In the beginning there is a short discussion what the non-astonishing behaviour of the RENAME in the case of multiple origin inheritance should be, which is preventing renames or any property change in that case. I think we should explore the possibilty of allowing the RENAME more. What if on a RENAME of a column (maybe with a necessary explicit CASCADE) in the multiple origin parent case, the parents with the same columns are altered too? I don't think it is ashtonishing for users; after all they've created the tree in the first place, but mostly for programmers with some experience with inheritance in computer languages: inheritance should go down, not up. That's why I tried to make reasoning exact, to figure out why it would be ok (or not) to update another parent as well. The reasoning can be made more formal/exact, but I believe in its current form it makes a strong case to technically allow to prograpage property changes to other parents as well (if they have the same inherited column). > 1. If you're changing properties of a column, you need to verify for > each relation in the inheritance tree that the "expected attinhcount" > and the actual attinhcount match. If, for any relation in the > inheritance tree rooted at the named table, they don't, then they are > doubly inherited there, from some other table outside the hierarchy > rooted at the named table, and the operation must fail. > If we want to block these RENAMES, yes. This is essentially KaiGai's patch http://archives.postgresql.org/pgsql-hackers/2010-01/msg02878.php > 2. If you're adding a column, you need to propagate the new column to > relations that don't have it yet, but if you find one that already has > it than you adjust attinhcount and don't recurse to its chidlren. > Sound ok. > 3. If you're dropping a column, you essentially decrement the > attinhcount of all your children; then you recurse into any that reach > attincount = 0 and not attislocal and drop the column there as well. > This too. regards, Yeb Havinga
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Robert Haas
Date:
On Wed, Aug 4, 2010 at 3:48 AM, Yeb Havinga <yebhavinga@gmail.com> wrote: > I just read that thread. In the beginning there is a short discussion what > the non-astonishing behaviour of the RENAME in the case of multiple origin > inheritance should be, which is preventing renames or any property change in > that case. I think we should explore the possibilty of allowing the RENAME > more. If child inherits column A from parent1 and parent2, and it is then renamed to B in parent2, what should the name be in the child after the rename is completed? For bonus points, how should pg_dump handle this to make sure the state after a dump and reload matches the state before the dump and reload? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Yeb Havinga
Date:
Robert Haas wrote: > On Wed, Aug 4, 2010 at 3:48 AM, Yeb Havinga <yebhavinga@gmail.com> wrote: > >> I just read that thread. In the beginning there is a short discussion what >> the non-astonishing behaviour of the RENAME in the case of multiple origin >> inheritance should be, which is preventing renames or any property change in >> that case. I think we should explore the possibilty of allowing the RENAME >> more. >> > > If child inherits column A from parent1 and parent2, and it is then > renamed to B in parent2, what should the name be in the child after > the rename is completed? > The column should be renamed to B in parent2, child and parent1. > For bonus points, how should pg_dump handle this to make sure the > state after a dump and reload matches the state before the dump and > reload? > If the change happens in a single transaction there should be no problems here, as opposed to e.g. have the user issue two renames. Did I get the bonus points? :-) regards, Yeb Havinga
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Robert Haas
Date:
On Wed, Aug 4, 2010 at 6:41 AM, Yeb Havinga <yebhavinga@gmail.com> wrote: >> If child inherits column A from parent1 and parent2, and it is then >> renamed to B in parent2, what should the name be in the child after >> the rename is completed? > > The column should be renamed to B in parent2, child and parent1. Uh, really? Wow. You want to follow the inheritance hierarchy in both directions, both down and up? That seems like it could be confusing. >> For bonus points, how should pg_dump handle this to make sure the >> state after a dump and reload matches the state before the dump and >> reload? > > If the change happens in a single transaction there should be no problems > here, as opposed to e.g. have the user issue two renames. Did I get the > bonus points? :-) Sure, though I'm not sure I like the basic idea. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Yeb Havinga
Date:
Robert Haas wrote: > On Wed, Aug 4, 2010 at 6:41 AM, Yeb Havinga <yebhavinga@gmail.com> wrote: > >>> If child inherits column A from parent1 and parent2, and it is then >>> renamed to B in parent2, what should the name be in the child after >>> the rename is completed? >>> >> The column should be renamed to B in parent2, child and parent1. >> > > Uh, really? Wow. You want to follow the inheritance hierarchy in > both directions, both down and up? That seems like it could be > confusing. > Yes, the idea is to follow the up direction in the case of column changes, only if the column is already present in another parent. To avoid confusion we could block the first attempt of a change with a message other parents exists, and the hint to add e.g. CASCADE to update the definition in the other parent as well. -- Yeb
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Andrew Dunstan
Date:
On 08/04/2010 06:41 AM, Robert Haas wrote: > On Wed, Aug 4, 2010 at 6:41 AM, Yeb Havinga<yebhavinga@gmail.com> wrote: >>> If child inherits column A from parent1 and parent2, and it is then >>> renamed to B in parent2, what should the name be in the child after >>> the rename is completed? >> The column should be renamed to B in parent2, child and parent1. > Uh, really? Wow. You want to follow the inheritance hierarchy in > both directions, both down and up? That seems like it could be > confusing. > > It seems more than confusing. It seems fundamentally wrong. It would certainly be a violation of POLA. Unless there's an extremely persuasive case made for it I'm inclined just to say no. cheers andrew
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes: > On 08/04/2010 06:41 AM, Robert Haas wrote: >> Uh, really? Wow. You want to follow the inheritance hierarchy in >> both directions, both down and up? That seems like it could be >> confusing. > It seems more than confusing. It seems fundamentally wrong. It would > certainly be a violation of POLA. I agree, this idea seems completely nuts. It is *not* reasonable for an action applied to a child to change the definition of the parent. regards, tom lane
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Yeb Havinga
Date:
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 08/04/2010 06:41 AM, Robert Haas wrote: >>> Uh, really? Wow. You want to follow the inheritance hierarchy in >>> both directions, both down and up? That seems like it could be >>> confusing. > >> It seems more than confusing. It seems fundamentally wrong. It would >> certainly be a violation of POLA. > > I agree, this idea seems completely nuts. It is *not* reasonable for > an action applied to a child to change the definition of the parent. Also not in the case that we're talking about here? A.a_column B.a_column | / v v C.a_column C inherits from A and B. The user wants to change a_column to better_name. ALTER TABLE A RENAME COLUMN a_column TO better_name; ERROR: could not rename column because an inherited child inherits the same column from other inheritance parents HINT: use CASCADE to rename the column in the other parents and their childs as well ALTER TABLE A RENAME COLUMN a_column TO better_name CASCADE; (succeeds) This doesn't seem nuts to me. After all, the set of columns with name 'a_column' is like a domain, in the sense that all names and types of all three columns are the same. If the user wants to rename a_column, with the current code he gets an inconsistent database. There is a patch that prevents renaming in this case, and then the user could work around it by adding an artificial relation from which A and B inherit, rename a_column there and then remove that relation again. IMHO to allow the rename if the user explicitly asks for it is more user friendly, with no compromises at all. Since the upward inheritance relation scanning is only used to gather the set of a_columns to be updated in the cascade case, I do not see why this is nuts, nor why it should violate any definition of inheritance. After all: all conditions regarding inheritance I can think of are valid *after* the DDL update. regards, Yeb Havinga
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Tom Lane
Date:
Yeb Havinga <yebhavinga@gmail.com> writes: > Tom Lane wrote: >> I agree, this idea seems completely nuts. It is *not* reasonable for >> an action applied to a child to change the definition of the parent. > Also not in the case that we're talking about here? > A.a_column B.a_column > | / > v v > C.a_column > C inherits from A and B. > The user wants to change a_column to better_name. Well, if A and B inherited the column from a common ancestor, he can easily do that. If not, maybe he should have thought harder before he started. I do NOT agree that issuing a rename against C is a sane way of dealing with this. > This doesn't seem nuts to me. You're in the minority. regards, tom lane
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Yeb Havinga
Date:
Tom Lane wrote: > Yeb Havinga <yebhavinga@gmail.com> writes: > >> A.a_column B.a_column >> | / >> v v >> C.a_column >> >> C inherits from A and B. >> > > Well, if A and B inherited the column from a common ancestor, he can > easily do that. If not, maybe he should have thought harder before he > started. I do NOT agree that issuing a rename against C is a sane way > of dealing with this. > Ok, I understand the intuition behind not wanting this kind of update. The root cause seems to center around multiple inheritance of the same column without a common ancestor. Another way to approach the problem, is to prevent the user to create a setup, i.e. when adding a column to B that already exists in A, or when adding a inheritance relation A-C or B-c, if A and B share column names. He could then get a hint he should add a common ancestor with that column. This preemptively prevents problems with renames and other changes. /me ducks regards, Yeb Havinga
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Robert Haas
Date:
On Thu, Aug 5, 2010 at 6:15 AM, Yeb Havinga <yebhavinga@gmail.com> wrote: > Tom Lane wrote: >> >> Yeb Havinga <yebhavinga@gmail.com> writes: >> >>> >>> A.a_column B.a_column >>> | / >>> v v >>> C.a_column >>> C inherits from A and B. >>> >> >> Well, if A and B inherited the column from a common ancestor, he can >> easily do that. If not, maybe he should have thought harder before he >> started. I do NOT agree that issuing a rename against C is a sane way >> of dealing with this. >> > > Ok, I understand the intuition behind not wanting this kind of update. > > The root cause seems to center around multiple inheritance of the same > column without a common ancestor. Another way to approach the problem, is to > prevent the user to create a setup, i.e. when adding a column to B that > already exists in A, or when adding a inheritance relation A-C or B-c, if A > and B share column names. He could then get a hint he should add a common > ancestor with that column. This preemptively prevents problems with renames > and other changes. It also breaks compatibility with previous releases for no particular reason. These cases are all marginal enough that it doesn't really make sense to change historical behavior; I think we should confine our efforts to fixing the bugs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > Yeb Havinga <yebhavinga@gmail.com> writes: >> The root cause seems to center around multiple inheritance of the same >> column without a common ancestor. Another way to approach the problem, is to >> prevent the user to create a setup, i.e. when adding a column to B that >> already exists in A, or when adding a inheritance relation A-C or B-c, if A >> and B share column names. He could then get a hint he should add a common >> ancestor with that column. This preemptively prevents problems with renames >> and other changes. > It also breaks compatibility with previous releases for no particular > reason. Well, if it were only a hint, and thus didn't actually "prevent" anything, then it wouldn't be breaking compatibility. But I don't like the idea much either. It would be extremely expensive, if not impossible, to determine whether all parents having the similarly-named column got it from the same common ancestor. (In particular, if the user had previously ignored the hint, you could have situations where there isn't a unique ancestor that the column can be traced to; then what do you do?) I think we'd be putting huge amounts of effort into a case that no more than one or two people would ever hit. regards, tom lane
Re: tracking inherited columns (was: patch for check constraints using multiple inheritance)
From
Robert Haas
Date:
On Thu, Aug 5, 2010 at 9:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Well, if it were only a hint, and thus didn't actually "prevent" > anything, then it wouldn't be breaking compatibility. But I don't > like the idea much either. It would be extremely expensive, if not > impossible, to determine whether all parents having the similarly-named > column got it from the same common ancestor. (In particular, if the > user had previously ignored the hint, you could have situations where > there isn't a unique ancestor that the column can be traced to; then > what do you do?) > > I think we'd be putting huge amounts of effort into a case that no more > than one or two people would ever hit. I don't agree that it would be a huge amount of effort, but I do agree that only a very small number of people will ever hit it, and that it just doesn't seem worth it. We have bigger fish to fry. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company