Re: Critique needed for contact-DB draft - Mailing list pgsql-novice

From Steve Crawford
Subject Re: Critique needed for contact-DB draft
Date
Msg-id 200407142230.26143.scrawford@pinpointresearch.com
Whole thread Raw
In response to Critique needed for contact-DB draft  ("Felix E. Klee" <felix.klee@inka.de>)
Responses Re: Critique needed for contact-DB draft  ("Felix E. Klee" <felix.klee@inka.de>)
List pgsql-novice
On Wednesday 14 July 2004 5:27 pm, Felix E. Klee wrote:
> ...I bluntly ask you to criticize the design attached below...
>
> I'm especially interested on your take of my use of arrays. They
> avoid the need for additional tables, but maybe they are not good,
> nevertheless?

Correct, they are not. Failure to create a well thought out database
structure will cause you to lose the ability to realize the power of
a SQL database and will be much harder to correct later than earlier.
I know you are under "temporal pressure" but this is truely a case of
"pay me now or pay me later".

By way of example, say you want to write a query showing all contacts
for a particular location. With an array of 0..n location elements
that is very hard to do.

Additionally, basic integrity checks that are easy to establish at the
database level if the data structure is sound will be difficult to
impossible with this array structure. An example here is ensuring
that a location appears only once for a given contact and that all
locations for a contact exist in the location table (and further,
that the location record cannot be deleted if it is referenced by any
other table).

> BTW, what I'm really missing is as a newbie is a way to define new
> data types as "structures" as known from e.g. the C programming
> language (or is this supported by PostgreSQL?). That would IMHO be
> a more natural concept for storing data (for example, then I could
> create an array of addresses for each contact).

No, what you are actually missing is experience "thinking in SQL." You
are trying to develop in SQL while thinking in C. This will yield
less than satisfactory results.

Generally you should ask of each piece of data is it multivariate (an
address field instead of separate fields for streetaddress, city,
state and zip for example) and is it multivalued (ie. a field with
0..n phone numbers). If either of these exist then there is almost
certainly something wrong with the design as this is basically the
first step in data normalization.

You may want to check out "Database Design for Mere Mortals".

SQL has its roots in set theory. I can't try to write your entire
schema but will offer an example to get you started. Say you have
some people (name, sex, birthdate) and those people have zero or more
phone numbers. You also have some organizations (name, president) to
which those people belong. You might consider the following:

people:
person_id integer (primary key==> not null, unique - probably use a
sequence number)
name text,
sex char(1) (validated as M or F)
birthdate date (validated as appropriate for your data)

phones:
phone_id integer (primary key ==> not null, unique - use a sequence)
phone number (validate as appropriate or even split into ac, prefix,
number if necessary for your app_
number type (home, work, fax, cell, etc. - this can be hard coded but
could also be validated against a "phone types" table)

organizations:
org_id integer (primary key ==> not null, unique, use a sequence)
org_name
president integer (links to people database)

people_organizations (a "linking table")
person_id integer
org_id integer

people_phones (another "linking table")
person_id integer
phone_id integer

The linking table is the key to your array problem. It allows you to
set up all sorts of many-to-many relationships and still enforce data
integrity and perform efficient queries.

A person can belong to several organizations and organizations will
presumably have many members but organizing the data this way allows
you to easily answer questions like how many organizations does Bob
belong to, how many members does Fooorg have, who belongs to more
than two organizations, etc.

Organize your data right and you can run all sorts of queries easily.
Do it wrong and you will suffer.

Hope this helps.

Cheers,
Steve


pgsql-novice by date:

Previous
From: Bruno Wolff III
Date:
Subject: Re: Critique needed for contact-DB draft
Next
From: "Felix E. Klee"
Date:
Subject: Re: Critique needed for contact-DB draft