Re: query speed joining tables - Mailing list pgsql-sql

From Josh Berkus
Subject Re: query speed joining tables
Date
Msg-id web-2314937@davinci.ethosmedia.com
Whole thread Raw
In response to query speed joining tables  (Christopher Smith <christopherl_smith@yahoo.com>)
Responses Re: query speed joining tables  (Vernon Wu <vernonw@gatewaytech.com>)
List pgsql-sql
Vernon, > In regarding of recomposing multivalued field as a separated table,
I> have observed some advantages and 
> disadvantages of the approach. Good on search as you have pointed out> and bad on updating data, two operations 
> needed: deletion and insertion. A query may need to join a lot of> table together. In Christ's personal application,
for
 
> example, there are many mulitvalued fields such as relationship> status other then ethnicity. There will be some very
long
 
> and complex queries. 
Hey, it's your database. In my 8-year experience as a professionalDBA, few considerations ever outweigh normalization
ina relationaldatabase. You are merely trading the immediate inconvenience of havingto construct complex queries and
data-savingfunctions for the
 
eventualhuge inconvenience (or possibly disaster) of having your data
corruptedor at least having to modify it by hand, row-by-row.
(Pardon me if I'm a little strident, but I've spend a good portion ofmy career cleaning up other's, and sometimes my
own,database designmistakes and I had to see a disaster-in-the-making repeated)
 
To put it another way: Your current strategy is saving a penny now inorder to pay a dollar tommorrow. 
For example, you currently store multiple ethnicities in a free-formtext field. What happens when:1) Your organization
decidesthey need to split "Asian" into "Chinese"and "Other Asian"?2) Someone types "aisan" by mistake?3) You stop
trackinganother ethnicity, and want to purge it from thedatabase?4) Your administrator decides that Ethnicity needs to
beordered as"primary ethnicity" and "other ethnicities"?5) You need to do complex queries like (Asian and/or Caucasian
butnotHispanic or African)? Your current strategy would require 4 seperatefunctional indexes to support that query, or
doa table scan with 4row-by-row fuzzy text matches ... slow and memory-intensive either
 
way.
As I said, it's your database, and if it's a low-budget projectdestined to be thrown away in 3 months, then go for it.
If,however,you expect this database to be around for a while, you owe it toyourself and your co-workers to design it
right.
If you want an education on database normalization, pick up FabianPascal's "Practical Issues in Database Design".
-Josh Berkus


pgsql-sql by date:

Previous
From: Andreas Joseph Krogh
Date:
Subject: Re: index on to_char(created, 'YYYY') doesn't work
Next
From: "Josh Berkus"
Date:
Subject: Re: index on to_char(created, 'YYYY') doesn't work