Re: Almost relational PostgreSQL (was: one-to-one) - Mailing list pgsql-novice
From | Josh Berkus |
---|---|
Subject | Re: Almost relational PostgreSQL (was: one-to-one) |
Date | |
Msg-id | 200310211310.20299.josh@agliodbs.com Whole thread Raw |
In response to | Almost relational PostgreSQL (was: one-to-one) (Michael Glaesmann <grzm@myrealbox.com>) |
Responses |
Re: Almost relational PostgreSQL (was: one-to-one)
|
List | pgsql-novice |
Michael, > From what I've read of PostgreSQL, it's a bit closer to true relational > than SQL, or at least you can limit yourself from using some of the > non-relational bits of SQL, such as always using NOT NULL in column > definitions, and using DISTINCT to eliminate duplicates in results > relations. I was wondering if you would share some of techniques you > would recommend using in PostgreSQL to make a database 'more > relational', or know of any sources on the web that I might find more > about this specifically. Well, if you've already read Date, Darwin & Pascal, I think you've pretty much covered things. If you don't have it already, Pascal's "Practical Issues in Database Management" is a good place to start; for one thing, it's the clearest description of the Normal Forms I've seen. Unfortunately, a lot of the relational functionality that SQL (and PostgreSQL) is missing can only be awkwardly implemented using PostgreSQL Triggers, Views and Rules. More is coming, though; when we have fully-functional statement-level triggers in 7.5, and when/if Neil finishes implementing an updatable view default, implementing distributed keys and set-based keys will become a lot more possible. As much as it is consistent with Codd, though, I don't recommend using DISTINCT with everything; it's a performance-killer. Neil and I were just hashing this out on IRC. Another tip: beware of over-reliance on surrogate autoincrement keys. They are convenient and necessary some of the time, but use real keys where you can. The current industry reliance on "ID" primary keys encourages sloppy thinking .... and sloppy schema ... by DBAs. I have caught myself building tables without any real keys this way. For that matter, the term "Primary Key" is inherently meaningless anyway. Don't get attached to it. Maybe more later if I feel like it .... more likely, you've inspired me to a magazine article. > Also, I often read about denormalization a database for performance > reasons. My understanding of this is not that normalization is a > problem in and of itself, but that the DBMS hasn't been properly > designed to handle highly normalized databases, which I gather has to > do with the fact the number of joins tends to increase with > normalization. However, I haven't seen much about this with respect to > PostgreSQL, or any particular DBMS, for that matter. Discussion seems > to always be on the theoretical level, though there should be > differences in performance as the engines underlying the different > DBMSs are different. I'd be interested in hearing your views on > practical normalization of PostgreSQL databases, or alternatively know > of places where I might find such discussion. My attitude? Construct your database according to strict normal principles, and then see if you have a performance problem. IMNSHO, 90% of the people advocating denormalization haven't really tested; they are either making assumptions based on outdated knowledge, or using "performance" as an excuse for bad database design. My personal limit of denormalization stops at a few NULL columns and using cache tables to hold copies of views which are too slow. Want to discuss this further? Take it to the PGSQL-SQL list and/or the #postgresql channel on irc.freenode.net. -- -Josh Berkus Aglio Database Solutions San Francisco
pgsql-novice by date: