Thread: Relational database design book
This is slightly off topic perhaps, but maybe not too much ... I'm wondering if anyone can recommend a good general book on relational database design. I am using PostgreSQL, so a PG-centricbook would be best, but I'm mainly interested in the art/science of good SQL database design regardless of theactual database product used. I'm not a total beginner, but have not used databases to do much more than store web page content and authenticate usersto date. I don't mind heavy-going, I prefer that to fluff. I suppose I'm looking for a highly regarded "classic" text, a bit like Martin Fowler's or the Gang of Four's texts on object-orienteddesign patterns, which are applicable to many programming languages. Not so classic that it's out of dateof course! TIA Mick
On Sun, 14 Dec 2008, Michael Hall wrote: > I'm wondering if anyone can recommend a good general book on relational database design. > ... Mick, I highly recommend all the books by Joe Celko. He's been writing on database issues for at least 20 years and communicates very well. I know that he has a book that covers data normalization[1] and that's probably what you want to re-learn. His other topics are extremely helpful, too. For an outstanding book about SQL (recommended to me by Joe when I wrote to him about a source for knowledge of time-based databases[2]), read Rick van der Laans's "Introduction to SQL, 4th Edition." It's a very large book chock full of detailed information written very well. Rich -- Richard B. Shepard, Ph.D. | Integrity Credibility Applied Ecosystem Services, Inc. | Innovation <http://www.appl-ecosys.com> Voice: 503-667-4517 Fax: 503-667-8863 [1] There are several steps one undertakes to determine which attributes belong together in a table. When there is no duplication and referential integrity is ensured you'll have a stable and useful database schema that's easy to maintain. [2] Strangely enough -- to me, at least -- the lack of full support for date- and time-based SQL in database tools such as PostgreSQL is puzzling. Virtually all business-related databases (think accounting systems as a prime example) depend on dates. So do many scientific databases.
rshepard@appl-ecosys.com (Rich Shepard) writes: > > [2] Strangely enough -- to me, at least -- the lack of full support for > date- and time-based SQL in database tools such as PostgreSQL is puzzling. > Virtually all business-related databases (think accounting systems as a > prime example) depend on dates. So do many scientific databases. The support for temporality in PostgreSQL seems above average as far as I can see... PostgreSQL has pretty nice time types between the timestamptz type and interval. What strikes me as being missing is the ability to create temporally-aware foreign keys. That is, suppose the schema is: create table1 ( nearly_pk integer not null, from_date timestamptz not null default now(), to_date timestamptz not null default 'Infinity', constraint dating_t1 check (from_date < to_date) -- probably some other data... ); I'd like to be able to do two more things: a) Treat the date range as part of the primary key (which isn't forcibly hard), b) Have references to table1 that point to the time range for the "nearly_pk" value but which are a little more liberal with the dates. create table2 ( t2pk integer primary key, nearly_pk integer not null, from_date timestamptz not null default now(), to_date timestamptz not null default 'Infinity', -- And have a "foreign key" that requires that -- for tuple in table2 the combination (nearly_pk, from_date, to_date) -- is *contained* by relevant ranges of (nearly_pk, from_date, to_date) -- on table1 foreign key (nearly_pk) references table1(nearly_pk) with temporal (table2.from_date, table2.to_date) contained_by (table1.from_date, table1.to_date) ); I don't think the syntax there is necessarily quite right; I'm just hoping to express the idea successfully. I could presumably do this with a trigger; have been trying to avoid that thus far. There are, of course, other ways of treating temporality; that is part of why it's early to treat this approach as worth putting into syntax. -- output = ("cbbrowne" "@" "acm.org") http://cbbrowne.com/info/finances.html "When the grammar checker identifies an error, it suggests a correction and can even makes some changes for you." -- Microsoft Word for Windows 2.0 User's Guide, p.35:
On 15/12/2008 15:09, Chris Browne wrote: > PostgreSQL has pretty nice time types between the timestamptz type and > interval. Yeah... I've often gotten really frustrated trying to do date- and time-related stuff in PHP - it's much easier to get PG to do it instead. :-) Ray. ------------------------------------------------------------------ Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland rod@iol.ie Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals ------------------------------------------------------------------
It might be useful to look at the capabilities of the Informix Timeseries Datablade (http://www-01.ibm.com/software/data/informix/blades/) if you want to look at ways of enhancing the temporal data capabilities of Postgres. Cheers, Brent Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand >>> Chris Browne <cbbrowne@acm.org> 12/16/08 10:05 AM >>> rshepard@appl-ecosys.com (Rich Shepard) writes: > > [2] Strangely enough -- to me, at least -- the lack of full support for > date- and time-based SQL in database tools such as PostgreSQL is puzzling. > Virtually all business-related databases (think accounting systems as a > prime example) depend on dates. So do many scientific databases. The support for temporality in PostgreSQL seems above average as far as I can see... PostgreSQL has pretty nice time types between the timestamptz type and interval. What strikes me as being missing is the ability to create temporally-aware foreign keys. That is, suppose the schema is: create table1 ( nearly_pk integer not null, from_date timestamptz not null default now(), to_date timestamptz not null default 'Infinity', constraint dating_t1 check (from_date < to_date) -- probably some other data... ); I'd like to be able to do two more things: a) Treat the date range as part of the primary key (which isn't forcibly hard), b) Have references to table1 that point to the time range for the "nearly_pk" value but which are a little more liberal with the dates. create table2 ( t2pk integer primary key, nearly_pk integer not null, from_date timestamptz not null default now(), to_date timestamptz not null default 'Infinity', -- And have a "foreign key" that requires that -- for tuple in table2 the combination (nearly_pk, from_date, to_date) -- is *contained* by relevant ranges of (nearly_pk, from_date, to_date) -- on table1 foreign key (nearly_pk) references table1(nearly_pk) with temporal (table2.from_date, table2.to_date) contained_by (table1.from_date, table1.to_date) ); I don't think the syntax there is necessarily quite right; I'm just hoping to express the idea successfully. I could presumably do this with a trigger; have been trying to avoid that thus far. There are, of course, other ways of treating temporality; that is part of why it's early to treat this approach as worth putting into syntax. -- output = ("cbbrowne" "@" "acm.org") http://cbbrowne.com/info/finances.html "When the grammar checker identifies an error, it suggests a correction and can even makes some changes for you." -- Microsoft Word for Windows 2.0 User's Guide, p.35: -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd.