Re: query speed joining tables - Mailing list pgsql-sql
From | Vernon Wu
Subject | Re: query speed joining tables
Date |
Msg-id | LK4WXSGCFVTCBB9ZUOI71542NHVUJE.3e25a967@kimiko
In response to | Re: query speed joining tables ("Josh Berkus" <josh@agliodbs.com>)
Responses | Re: query speed joining tables; Re: query speed joining tables
List | pgsql-sql
Hi, Josh,

I appreciate your sharing your experience here. I definitely don't have that many years of DBA experience behind me, and I am learning to get the DB design right in the first place. What I stated is my observation on my project, which has over twenty multivalued detail tables.

I have a selection query of about 200 characters, involving 10 tables and using a subquery. The performance is not bad after proper indexing: less than 3 seconds, according to the planner. I will have longer queries later and hope they won't have any performance problems.

Thank you for recommending another DB book after "Database Design for Mere Mortals". I will read it.

Vernon

15/01/2003 9:50:22 AM, "Josh Berkus" <josh@agliodbs.com> wrote:

>Vernon,
>
>> Regarding recomposing a multivalued field as a separate table, I
>> have observed some advantages and disadvantages of the approach.
>> Good on search, as you have pointed out, and bad on updating data:
>> two operations are needed, a deletion and an insertion. A query may
>> need to join a lot of tables together. In Christ's personal
>> application, for example, there are many multivalued fields, such
>> as relationship status, other than ethnicity. There will be some
>> very long and complex queries.
>
>Hey, it's your database. In my 8 years' experience as a professional
>DBA, few considerations ever outweigh normalization in a relational
>database. You are merely trading the immediate inconvenience of having
>to construct complex queries and data-saving functions for the
>eventual huge inconvenience (or possibly disaster) of having your data
>corrupted, or at least having to modify it by hand, row by row.
>
>(Pardon me if I'm a little strident, but I've spent a good portion of
>my career cleaning up others', and sometimes my own, database design
>mistakes, and I hate to see a disaster-in-the-making repeated.)
>
>To put it another way: your current strategy is saving a penny now in
>order to pay a dollar tomorrow.
>
>For example, you currently store multiple ethnicities in a free-form
>text field. What happens when:
>1) Your organization decides they need to split "Asian" into "Chinese"
>and "Other Asian"?
>2) Someone types "aisan" by mistake?
>3) You stop tracking another ethnicity and want to purge it from the
>database?
>4) Your administrator decides that ethnicity needs to be ordered as
>"primary ethnicity" and "other ethnicities"?
>5) You need to do complex queries like (Asian and/or Caucasian but not
>Hispanic or African)? Your current strategy would require 4 separate
>functional indexes to support that query, or a table scan with 4
>row-by-row fuzzy text matches ... slow and memory-intensive either
>way.
>
>As I said, it's your database, and if it's a low-budget project
>destined to be thrown away in 3 months, then go for it. If, however,
>you expect this database to be around for a while, you owe it to
>yourself and your co-workers to design it right.
>
>If you want an education on database normalization, pick up Fabian
>Pascal's "Practical Issues in Database Management".
>
>-Josh Berkus
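
For concreteness, the normalized layout Josh describes might be sketched in PostgreSQL as follows; all table and column names here (person, ethnicity, person_ethnicity) are hypothetical, not taken from Vernon's actual schema:

    CREATE TABLE person (
        person_id serial PRIMARY KEY,
        name      text NOT NULL
    );

    -- One canonical row per ethnicity label; a typo like "aisan" cannot
    -- creep in, and renaming or splitting a category (Josh's points 1
    -- and 4) is a single UPDATE/INSERT here rather than a row-by-row
    -- edit of free-form text.
    CREATE TABLE ethnicity (
        ethnicity_id serial PRIMARY KEY,
        label        text NOT NULL UNIQUE
    );

    -- The multivalued field becomes a join table.
    CREATE TABLE person_ethnicity (
        person_id    integer NOT NULL REFERENCES person,
        ethnicity_id integer NOT NULL REFERENCES ethnicity,
        PRIMARY KEY (person_id, ethnicity_id)
    );

Josh's point 5, "(Asian and/or Caucasian but not Hispanic or African)", then reduces to joins on indexed integer keys instead of four fuzzy text matches:

    SELECT p.person_id, p.name
    FROM person p
    WHERE EXISTS (SELECT 1
                  FROM person_ethnicity pe
                  JOIN ethnicity e USING (ethnicity_id)
                  WHERE pe.person_id = p.person_id
                    AND e.label IN ('Asian', 'Caucasian'))
      AND NOT EXISTS (SELECT 1
                      FROM person_ethnicity pe
                      JOIN ethnicity e USING (ethnicity_id)
                      WHERE pe.person_id = p.person_id
                        AND e.label IN ('Hispanic', 'African'));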
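
The update cost Vernon mentions (a deletion plus an insertion) is real but bounded: wrapped in a transaction, replacing one person's whole set of values stays atomic. A sketch under the same hypothetical schema (person_id 42 is an arbitrary example):

    BEGIN;
    -- Drop the old set of values for this person ...
    DELETE FROM person_ethnicity WHERE person_id = 42;
    -- ... and insert the new set, resolved against the lookup table.
    INSERT INTO person_ethnicity (person_id, ethnicity_id)
    SELECT 42, ethnicity_id
    FROM ethnicity
    WHERE label IN ('Asian', 'Caucasian');
    COMMIT;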