Thread: Normalize or not?
Based upon my reading of wikipedia, the solution I think I want to implement is not in 2nd normal form. I'm wondering ifanyone on this list has suggestions, etc. I have a table called containers where object A will contain object B. There is a middle column that will describe the typeof association. For example, the account team will have Joe as a member and Fred will be the lead. This will startsoff with: Containers: Left How Right Account member Joe Account lead Fred There is another table (which I've talked about before) which tells me that "Account" is a "Team". I call this the namestable and has two columns: name and type. So: Names: Name Type Account Team Joe User Fred User in this case. I want to restrict the type of association e.g. I want teams to contain users but not allow users to containteams. And I want some type of associations not apply to others like a Group can simply "contain" a Team but a Teamcan not contain a Group. Thus, I have a third table with entries like: Types of Associations: LType How RType Team member User Team lead User Group contain Team There is a foreign key constraint so that Containers(Left) and Containers(Right) must be in Names(Name) (I hope that syntaxmakes sense). But now comes the hard part. For each row in Containers, I want to take the Type from Names associated with the Left + Containers(How) + the Type fromNames associated with the Right to be an entry in the Types of Associations table. For eacmple, Account member Joe wouldtranslate to Team member User which is in the Types of Associations table so it is a valid entry. But I don't believeI can do this with foreign key constraints. If I can, then stop me here and tell me how. My solution to solving this that I am considering is to add in the LType and RType into the Containers so now Containersbecomes: Containers: Left LType How Right RType Account Team member Joe User Account Team lead Fred User and now my foreign key constraint is simply that Containers(Ltype),Containers(How),Containers(RType) be in Types of Association. I would also change my constraint so that Left,LType of Containers must be in Names as well as Right,RType bein Names. According to my interpretation, the RType and LType columns of Containers are non-prime but are functionally dependent uponeither Left or Right so this table is not even second normal form. But, if I add in the constraint that both the Nameand Type must be in Names, does it really matter? I know that I will not be able to change just LType or just RTypeand create an update anomaly. I guess I should mention that change the Type of a name is not even being considered. The "Names" and "Types of Associations" tables I think of as a type of constants which will be defined and rarelychange. I hope this is reasonably easy to follow. I'm looking forward to your thoughts and comments. Thank you, Perry Smith
On 04/14/11 4:28 PM, Perry Smith wrote: > I hope this is reasonably easy to follow. I'm looking forward to your thoughts and comments. at least on first glance, that looks like object oriented methodology, not relational.
It is not easy to follow...but I'll toss out some thoughts anyway. I would generally not de-normalize the data model in order to make referential integrity easier. If your "requirements" are that complex then writing a wrapper around the insert/update layer for the tables in question is a better idea. You can perform a higher level of validation without sacrificing data integrity. I would suggest looking at both the kinds of selection queries you plan to run as well as what kinds of data (i.e., relationships between containers and objects) to see how readily the final model will work. Two things that SQL does have challenges with is indefinite hierarchies and many-to-many relationships with constraints. You seem to be using both. It isn't that SQL cannot handle them but as a programmer you need to be extra careful to introduce appropriate constraints on data entry and to spend extra time figuring out what valid combinations are likely to result in incorrect results given the kinds of queries you are using. There isn't anything wrong with the normal form violation but you probably want to understand where your model is vulnerable to update anomalies. If you can add some additional tables and turn the de-normalized data into foreign-keys you can use referential integrity to at least limit your exposure but it may not remove it completely. In the end sometimes the only real question of import is whether the solution works. If you are concerned then anything built upon the schema should be designed such that you can refactor the underlying schema if you desire. But it isn't necessarily worth it to exhaust every possible avenue to find the better solution if you have one that works. The question becomes whether you have the resources (time and/or money) to search for the better solution now or whether you should just use the immediately viable solution until it breaks. David J. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Perry Smith Sent: Thursday, April 14, 2011 7:29 PM To: pgsql-general@postgresql.org Subject: [GENERAL] Normalize or not? I hope this is reasonably easy to follow. I'm looking forward to your thoughts and comments. Thank you, Perry Smith -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Mostly, this entire project is for me to learn. What started out as a toy has gown into something that about 40 people useeach day but it is definitely not normal production stress. Or course, I'd like to assume and do things as if it wasgoing to be stressed. For whatever reason, I've done simple functions but I've never done complex wrappers like you are talking about. Just tobe sure, when you say "writ[e] a wrapper around the insert/update layer", you are referring to the techniques documentedin chapter 36 "The Rule System" ? That has always intimidated me but perhaps its time to grow. Thank you again, Perry On Apr 14, 2011, at 7:00 PM, David Johnston wrote: > It is not easy to follow...but I'll toss out some thoughts anyway. > > I would generally not de-normalize the data model in order to make > referential integrity easier. If your "requirements" are that complex then > writing a wrapper around the insert/update layer for the tables in question > is a better idea. You can perform a higher level of validation without > sacrificing data integrity. > > I would suggest looking at both the kinds of selection queries you plan to > run as well as what kinds of data (i.e., relationships between containers > and objects) to see how readily the final model will work. Two things that > SQL does have challenges with is indefinite hierarchies and many-to-many > relationships with constraints. You seem to be using both. It isn't that > SQL cannot handle them but as a programmer you need to be extra careful to > introduce appropriate constraints on data entry and to spend extra time > figuring out what valid combinations are likely to result in incorrect > results given the kinds of queries you are using. > > There isn't anything wrong with the normal form violation but you probably > want to understand where your model is vulnerable to update anomalies. If > you can add some additional tables and turn the de-normalized data into > foreign-keys you can use referential integrity to at least limit your > exposure but it may not remove it completely. > > In the end sometimes the only real question of import is whether the > solution works. If you are concerned then anything built upon the schema > should be designed such that you can refactor the underlying schema if you > desire. But it isn't necessarily worth it to exhaust every possible avenue > to find the better solution if you have one that works. The question > becomes whether you have the resources (time and/or money) to search for the > better solution now or whether you should just use the immediately viable > solution until it breaks. > > David J. > > > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Perry Smith > Sent: Thursday, April 14, 2011 7:29 PM > To: pgsql-general@postgresql.org > Subject: [GENERAL] Normalize or not? > > I hope this is reasonably easy to follow. I'm looking forward to your > thoughts and comments. > > Thank you, > Perry Smith > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make > changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
More along the lines of pl/pgsql and/or whatever application language you are using; not rules. On Apr 15, 2011, at 8:45, Perry Smith <pedzsan@gmail.com> wrote: > Mostly, this entire project is for me to learn. What started out as a toy has gown into something that about 40 peopleuse each day but it is definitely not normal production stress. Or course, I'd like to assume and do things as ifit was going to be stressed. > > For whatever reason, I've done simple functions but I've never done complex wrappers like you are talking about. Justto be sure, when you say "writ[e] a wrapper around the insert/update layer", you are referring to the techniques documentedin chapter 36 "The Rule System" ? > > That has always intimidated me but perhaps its time to grow. > > Thank you again, > Perry > > On Apr 14, 2011, at 7:00 PM, David Johnston wrote: > >> It is not easy to follow...but I'll toss out some thoughts anyway. >> >> I would generally not de-normalize the data model in order to make >> referential integrity easier. If your "requirements" are that complex then >> writing a wrapper around the insert/update layer for the tables in question >> is a better idea. You can perform a higher level of validation without >> sacrificing data integrity. >> >> I would suggest looking at both the kinds of selection queries you plan to >> run as well as what kinds of data (i.e., relationships between containers >> and objects) to see how readily the final model will work. Two things that >> SQL does have challenges with is indefinite hierarchies and many-to-many >> relationships with constraints. You seem to be using both. It isn't that >> SQL cannot handle them but as a programmer you need to be extra careful to >> introduce appropriate constraints on data entry and to spend extra time >> figuring out what valid combinations are likely to result in incorrect >> results given the kinds of queries you are using. >> >> There isn't anything wrong with the normal form violation but you probably >> want to understand where your model is vulnerable to update anomalies. If >> you can add some additional tables and turn the de-normalized data into >> foreign-keys you can use referential integrity to at least limit your >> exposure but it may not remove it completely. >> >> In the end sometimes the only real question of import is whether the >> solution works. If you are concerned then anything built upon the schema >> should be designed such that you can refactor the underlying schema if you >> desire. But it isn't necessarily worth it to exhaust every possible avenue >> to find the better solution if you have one that works. The question >> becomes whether you have the resources (time and/or money) to search for the >> better solution now or whether you should just use the immediately viable >> solution until it breaks. >> >> David J. >> >> >> -----Original Message----- >> From: pgsql-general-owner@postgresql.org >> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Perry Smith >> Sent: Thursday, April 14, 2011 7:29 PM >> To: pgsql-general@postgresql.org >> Subject: [GENERAL] Normalize or not? >> >> I hope this is reasonably easy to follow. I'm looking forward to your >> thoughts and comments. >> >> Thank you, >> Perry Smith >> >> >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make >> changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general >> >
Ah. I'm interfacing to the database via Ruby. So, write a function in pl/pgsql to do an insert, update, and delete thatmaintain the integrity that I want and then call those via Ruby. On Apr 15, 2011, at 7:59 AM, David Johnston wrote: > More along the lines of pl/pgsql and/or whatever application language you are using; not rules. > > On Apr 15, 2011, at 8:45, Perry Smith <pedzsan@gmail.com> wrote: > >> Mostly, this entire project is for me to learn. What started out as a toy has gown into something that about 40 peopleuse each day but it is definitely not normal production stress. Or course, I'd like to assume and do things as ifit was going to be stressed. >> >> For whatever reason, I've done simple functions but I've never done complex wrappers like you are talking about. Justto be sure, when you say "writ[e] a wrapper around the insert/update layer", you are referring to the techniques documentedin chapter 36 "The Rule System" ? >> >> That has always intimidated me but perhaps its time to grow. >> >> Thank you again, >> Perry >> >> On Apr 14, 2011, at 7:00 PM, David Johnston wrote: >> >>> It is not easy to follow...but I'll toss out some thoughts anyway. >>> >>> I would generally not de-normalize the data model in order to make >>> referential integrity easier. If your "requirements" are that complex then >>> writing a wrapper around the insert/update layer for the tables in question >>> is a better idea. You can perform a higher level of validation without >>> sacrificing data integrity. >>> >>> I would suggest looking at both the kinds of selection queries you plan to >>> run as well as what kinds of data (i.e., relationships between containers >>> and objects) to see how readily the final model will work. Two things that >>> SQL does have challenges with is indefinite hierarchies and many-to-many >>> relationships with constraints. You seem to be using both. It isn't that >>> SQL cannot handle them but as a programmer you need to be extra careful to >>> introduce appropriate constraints on data entry and to spend extra time >>> figuring out what valid combinations are likely to result in incorrect >>> results given the kinds of queries you are using. >>> >>> There isn't anything wrong with the normal form violation but you probably >>> want to understand where your model is vulnerable to update anomalies. If >>> you can add some additional tables and turn the de-normalized data into >>> foreign-keys you can use referential integrity to at least limit your >>> exposure but it may not remove it completely. >>> >>> In the end sometimes the only real question of import is whether the >>> solution works. If you are concerned then anything built upon the schema >>> should be designed such that you can refactor the underlying schema if you >>> desire. But it isn't necessarily worth it to exhaust every possible avenue >>> to find the better solution if you have one that works. The question >>> becomes whether you have the resources (time and/or money) to search for the >>> better solution now or whether you should just use the immediately viable >>> solution until it breaks. >>> >>> David J. >>> >>> >>> -----Original Message----- >>> From: pgsql-general-owner@postgresql.org >>> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Perry Smith >>> Sent: Thursday, April 14, 2011 7:29 PM >>> To: pgsql-general@postgresql.org >>> Subject: [GENERAL] Normalize or not? >>> >>> I hope this is reasonably easy to follow. I'm looking forward to your >>> thoughts and comments. >>> >>> Thank you, >>> Perry Smith >>> >>> >>> >>> -- >>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make >>> changes to your subscription: >>> http://www.postgresql.org/mailpref/pgsql-general >>> >>