Thread: Normalize or not?

Normalize or not?

From
Perry Smith
Date:
Based upon my reading of wikipedia, the solution I think I want to implement is not in 2nd normal form.  I'm wondering
ifanyone on this list has suggestions, etc. 

I have a table called containers where object A will contain object B.  There is a middle column that will describe the
typeof association.  For example, the account team will have Joe as a member and Fred will be the lead.  This will
startsoff with: 

Containers:
Left How Right
Account member Joe
Account lead Fred

There is another table (which I've talked about before) which tells me that "Account" is a "Team".  I call this the
namestable and has two columns: name and type.  So: 

Names:
Name Type
Account Team
Joe User
Fred User

in this case.  I want to restrict the type of association e.g. I want teams to contain users but not allow users to
containteams.  And I want some type of associations not apply to others like a Group can simply "contain" a Team but a
Teamcan not contain a Group.  Thus, I have a third table with entries like: 

Types of Associations:
LType How RType
Team member User
Team lead User
Group contain Team

There is a foreign key constraint so that Containers(Left) and Containers(Right) must be in Names(Name) (I hope that
syntaxmakes sense).  But now comes the hard part. 

For each row in Containers, I want to take the Type from Names associated with the Left + Containers(How) + the Type
fromNames associated with the Right to be an entry in the Types of Associations table.  For eacmple, Account member Joe
wouldtranslate to Team member User which is in the Types of Associations table so it is a valid entry.  But I don't
believeI can do this with foreign key constraints.  If I can, then stop me here and tell me how. 

My solution to solving this that I am considering is to add in the LType and RType into the Containers so now
Containersbecomes: 

Containers:
Left LType How Right RType
Account Team member Joe User
Account Team lead Fred User

and now my foreign key constraint is simply that Containers(Ltype),Containers(How),Containers(RType) be in Types of
Association. I would also change my constraint so that Left,LType of Containers must be in Names as well as Right,RType
bein Names. 

According to my interpretation, the RType and LType columns of Containers are non-prime but are functionally dependent
uponeither Left or Right so this table is not even second normal form.  But, if I add in the constraint that both the
Nameand Type must be in Names, does it really matter?  I know that I will not be able to change just LType or just
RTypeand create an update anomaly.  I guess I should mention that change the Type of a name is not even being
considered. The "Names" and "Types of Associations" tables I think of as a type of constants which will be defined and
rarelychange. 


I hope this is reasonably easy to follow.  I'm looking forward to your thoughts and comments.

Thank you,
Perry Smith



Re: Normalize or not?

From
John R Pierce
Date:
On 04/14/11 4:28 PM, Perry Smith wrote:
> I hope this is reasonably easy to follow.  I'm looking forward to your thoughts and comments.

at least on first glance, that looks like object oriented methodology,
not relational.





Re: Normalize or not?

From
"David Johnston"
Date:
It is not easy to follow...but I'll toss out some thoughts anyway.

I would generally not de-normalize the data model in order to make
referential integrity easier.  If your "requirements" are that complex then
writing a wrapper around the insert/update layer for the tables in question
is a better idea.  You can perform a higher level of validation without
sacrificing data integrity.

I would suggest looking at both the kinds of selection queries you plan to
run as well as what kinds of data (i.e., relationships between containers
and objects) to see how readily the final model will work.  Two things that
SQL does have challenges with is indefinite hierarchies and many-to-many
relationships with constraints.  You seem to be using both.  It isn't that
SQL cannot handle them but as a programmer you need to be extra careful to
introduce appropriate constraints on data entry and to spend extra time
figuring out what valid combinations are likely to result in incorrect
results given the kinds of queries you are using.

There isn't anything wrong with the normal form violation but you probably
want to understand where your model is vulnerable to update anomalies.  If
you can add some additional tables and turn the de-normalized data into
foreign-keys you can use referential integrity to at least limit your
exposure but it may not remove it completely.

In the end sometimes the only real question of import is whether the
solution works.  If you are concerned then anything built upon the schema
should be designed such that you can refactor the underlying schema if you
desire.  But it isn't necessarily worth it to exhaust every possible avenue
to find the better solution if you have one that works.  The question
becomes whether you have the resources (time and/or money) to search for the
better solution now or whether you should just use the immediately viable
solution until it breaks.

David J.


-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Perry Smith
Sent: Thursday, April 14, 2011 7:29 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Normalize or not?

I hope this is reasonably easy to follow.  I'm looking forward to your
thoughts and comments.

Thank you,
Perry Smith



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make
changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: Normalize or not?

From
Perry Smith
Date:
Mostly, this entire project is for me to learn.  What started out as a toy has gown into something that about 40 people
useeach day but it is definitely not normal production stress.  Or course, I'd like to assume and do things as if it
wasgoing to be stressed. 

For whatever reason, I've done simple functions but I've never done complex wrappers like you are talking about.  Just
tobe sure, when you say "writ[e] a wrapper around the insert/update layer", you are referring to the techniques
documentedin chapter 36 "The Rule System" ? 

That has always intimidated me but perhaps its time to grow.

Thank you again,
Perry

On Apr 14, 2011, at 7:00 PM, David Johnston wrote:

> It is not easy to follow...but I'll toss out some thoughts anyway.
>
> I would generally not de-normalize the data model in order to make
> referential integrity easier.  If your "requirements" are that complex then
> writing a wrapper around the insert/update layer for the tables in question
> is a better idea.  You can perform a higher level of validation without
> sacrificing data integrity.
>
> I would suggest looking at both the kinds of selection queries you plan to
> run as well as what kinds of data (i.e., relationships between containers
> and objects) to see how readily the final model will work.  Two things that
> SQL does have challenges with is indefinite hierarchies and many-to-many
> relationships with constraints.  You seem to be using both.  It isn't that
> SQL cannot handle them but as a programmer you need to be extra careful to
> introduce appropriate constraints on data entry and to spend extra time
> figuring out what valid combinations are likely to result in incorrect
> results given the kinds of queries you are using.
>
> There isn't anything wrong with the normal form violation but you probably
> want to understand where your model is vulnerable to update anomalies.  If
> you can add some additional tables and turn the de-normalized data into
> foreign-keys you can use referential integrity to at least limit your
> exposure but it may not remove it completely.
>
> In the end sometimes the only real question of import is whether the
> solution works.  If you are concerned then anything built upon the schema
> should be designed such that you can refactor the underlying schema if you
> desire.  But it isn't necessarily worth it to exhaust every possible avenue
> to find the better solution if you have one that works.  The question
> becomes whether you have the resources (time and/or money) to search for the
> better solution now or whether you should just use the immediately viable
> solution until it breaks.
>
> David J.
>
>
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Perry Smith
> Sent: Thursday, April 14, 2011 7:29 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Normalize or not?
>
> I hope this is reasonably easy to follow.  I'm looking forward to your
> thoughts and comments.
>
> Thank you,
> Perry Smith
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make
> changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: Normalize or not?

From
David Johnston
Date:
More along the lines of pl/pgsql and/or whatever application language you are using; not rules.

On Apr 15, 2011, at 8:45, Perry Smith <pedzsan@gmail.com> wrote:

> Mostly, this entire project is for me to learn.  What started out as a toy has gown into something that about 40
peopleuse each day but it is definitely not normal production stress.  Or course, I'd like to assume and do things as
ifit was going to be stressed. 
>
> For whatever reason, I've done simple functions but I've never done complex wrappers like you are talking about.
Justto be sure, when you say "writ[e] a wrapper around the insert/update layer", you are referring to the techniques
documentedin chapter 36 "The Rule System" ? 
>
> That has always intimidated me but perhaps its time to grow.
>
> Thank you again,
> Perry
>
> On Apr 14, 2011, at 7:00 PM, David Johnston wrote:
>
>> It is not easy to follow...but I'll toss out some thoughts anyway.
>>
>> I would generally not de-normalize the data model in order to make
>> referential integrity easier.  If your "requirements" are that complex then
>> writing a wrapper around the insert/update layer for the tables in question
>> is a better idea.  You can perform a higher level of validation without
>> sacrificing data integrity.
>>
>> I would suggest looking at both the kinds of selection queries you plan to
>> run as well as what kinds of data (i.e., relationships between containers
>> and objects) to see how readily the final model will work.  Two things that
>> SQL does have challenges with is indefinite hierarchies and many-to-many
>> relationships with constraints.  You seem to be using both.  It isn't that
>> SQL cannot handle them but as a programmer you need to be extra careful to
>> introduce appropriate constraints on data entry and to spend extra time
>> figuring out what valid combinations are likely to result in incorrect
>> results given the kinds of queries you are using.
>>
>> There isn't anything wrong with the normal form violation but you probably
>> want to understand where your model is vulnerable to update anomalies.  If
>> you can add some additional tables and turn the de-normalized data into
>> foreign-keys you can use referential integrity to at least limit your
>> exposure but it may not remove it completely.
>>
>> In the end sometimes the only real question of import is whether the
>> solution works.  If you are concerned then anything built upon the schema
>> should be designed such that you can refactor the underlying schema if you
>> desire.  But it isn't necessarily worth it to exhaust every possible avenue
>> to find the better solution if you have one that works.  The question
>> becomes whether you have the resources (time and/or money) to search for the
>> better solution now or whether you should just use the immediately viable
>> solution until it breaks.
>>
>> David J.
>>
>>
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org
>> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Perry Smith
>> Sent: Thursday, April 14, 2011 7:29 PM
>> To: pgsql-general@postgresql.org
>> Subject: [GENERAL] Normalize or not?
>>
>> I hope this is reasonably easy to follow.  I'm looking forward to your
>> thoughts and comments.
>>
>> Thank you,
>> Perry Smith
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make
>> changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>

Re: Normalize or not?

From
Perry Smith
Date:
Ah.  I'm interfacing to the database via Ruby.  So, write a function in pl/pgsql to do an insert, update, and delete
thatmaintain the integrity that I want and then call those via Ruby. 

On Apr 15, 2011, at 7:59 AM, David Johnston wrote:

> More along the lines of pl/pgsql and/or whatever application language you are using; not rules.
>
> On Apr 15, 2011, at 8:45, Perry Smith <pedzsan@gmail.com> wrote:
>
>> Mostly, this entire project is for me to learn.  What started out as a toy has gown into something that about 40
peopleuse each day but it is definitely not normal production stress.  Or course, I'd like to assume and do things as
ifit was going to be stressed. 
>>
>> For whatever reason, I've done simple functions but I've never done complex wrappers like you are talking about.
Justto be sure, when you say "writ[e] a wrapper around the insert/update layer", you are referring to the techniques
documentedin chapter 36 "The Rule System" ? 
>>
>> That has always intimidated me but perhaps its time to grow.
>>
>> Thank you again,
>> Perry
>>
>> On Apr 14, 2011, at 7:00 PM, David Johnston wrote:
>>
>>> It is not easy to follow...but I'll toss out some thoughts anyway.
>>>
>>> I would generally not de-normalize the data model in order to make
>>> referential integrity easier.  If your "requirements" are that complex then
>>> writing a wrapper around the insert/update layer for the tables in question
>>> is a better idea.  You can perform a higher level of validation without
>>> sacrificing data integrity.
>>>
>>> I would suggest looking at both the kinds of selection queries you plan to
>>> run as well as what kinds of data (i.e., relationships between containers
>>> and objects) to see how readily the final model will work.  Two things that
>>> SQL does have challenges with is indefinite hierarchies and many-to-many
>>> relationships with constraints.  You seem to be using both.  It isn't that
>>> SQL cannot handle them but as a programmer you need to be extra careful to
>>> introduce appropriate constraints on data entry and to spend extra time
>>> figuring out what valid combinations are likely to result in incorrect
>>> results given the kinds of queries you are using.
>>>
>>> There isn't anything wrong with the normal form violation but you probably
>>> want to understand where your model is vulnerable to update anomalies.  If
>>> you can add some additional tables and turn the de-normalized data into
>>> foreign-keys you can use referential integrity to at least limit your
>>> exposure but it may not remove it completely.
>>>
>>> In the end sometimes the only real question of import is whether the
>>> solution works.  If you are concerned then anything built upon the schema
>>> should be designed such that you can refactor the underlying schema if you
>>> desire.  But it isn't necessarily worth it to exhaust every possible avenue
>>> to find the better solution if you have one that works.  The question
>>> becomes whether you have the resources (time and/or money) to search for the
>>> better solution now or whether you should just use the immediately viable
>>> solution until it breaks.
>>>
>>> David J.
>>>
>>>
>>> -----Original Message-----
>>> From: pgsql-general-owner@postgresql.org
>>> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Perry Smith
>>> Sent: Thursday, April 14, 2011 7:29 PM
>>> To: pgsql-general@postgresql.org
>>> Subject: [GENERAL] Normalize or not?
>>>
>>> I hope this is reasonably easy to follow.  I'm looking forward to your
>>> thoughts and comments.
>>>
>>> Thank you,
>>> Perry Smith
>>>
>>>
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make
>>> changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>