Seeking comments on schema design and data integrity - Mailing list pgsql-general

From Tim Hart
Subject Seeking comments on schema design and data integrity
Date
Msg-id 99efd912c72636f1625d0d30cd4a271c@mac.com
Whole thread Raw
Responses Re: Seeking comments on schema design and data integrity  (Joe Conway <mail@joeconway.com>)
List pgsql-general
List,

I am by no means a DBA, but I'm developing a database app for our local
little league. I would appreciate some comments regarding the following
design.

My major goal here is data integrity. I want reasonable performance,
but I don't expect this database to be used as heavily as a business
app. Data integrity is important to me mainly because I've spent too
much of my professional life trying to deal with data integrity issues
in databases. Usually due to schemas that allow rows or relationships
to exist that simply shouldn't in the business world.

Like most communities, we have several leagues in little league. It's
convenient to keep track of players, coaches, umpires, and committee
members. Players for a particular league must have a date of birth that
falls within a specific range. Coaches cannot officiate in the same
league the coach. Committee members may serve as umpires or coaches.

So this is my first stab at a set of tables in pseudo-code. There are
other tables, but these represent the stuff I'd like discussion on:

league(id, name, min_age, max_age, cuttoff_birth_month,
cuttoff_birth_day)
person(id, last_name, first_name, dob, home_phone, constraint
name_dob_ukey unique(last_name, first_name, dob))
committee_member(id, position);--such as secretary, treasurer, etc.
player(id, team_id, position, games_played);--also includes offensive
and defensive stats
coach(id, team_id, is_head_coach);
umpire(id);

person_league(id, person_id, league_id, role);--The role column
represents coach, committee, etc..
--The 'id' field here is not a primary key, but references player.id,
coach.id, etc, depending on the role column.

So through the role column on the person_league table, I can make sure
that people aren't inappropriately serving multiple roles where those
roles conflict. I can also help make sure that I'm tracking the same
person in their role across multiple leagues (umpire on t-ball, coach
on the 8-9 year old league, etc). A combination of triggers and unique
constraints should handle most of my concerns.

A relatively major drawback I see in this design is all the joins I'll
have to do. Just to get a roster for a team, I'd have to join the team,
player, person_league, and person tables. I can't imagine that the data
size or load on this database will cause the joins to deliver poor
performance, but as I've already said, I'm not a DBA.

I'm not terribly happy with the person_league.id column, since I can't
define it as a foreign key. I'm sure I can handle that in a trigger,
but I wonder if I'm going overboard...

For all practical purposes, users will only use a nice GUI app to
access and update the database. I'd probably be the only person writing
ad-hoc SQL queries.

Now for my questions:

1. Is this design fairly typical?
2. Suggestions for improvement? I don't mind suggestions that might
weaken data integrity, as long as you give me a justification. Also, if
you see an opportunity to strengthen data integrity, please speak up.
3. Regarding enforcing data integrity. I've never really figured out
the best compromise between how much data integrity belongs 'in the
database', whether it be in schema design or triggers, and how much
should be in the software that accesses the database. All opinions on
this are welcome.


Tim Hart
Senior Java/J2EE developer
tjhart@mac.com :email
(312)560-4267   :phone


pgsql-general by date:

Previous
From: "James Harper"
Date:
Subject: Re: implicit cast of empty string to timestamp
Next
From: Joe Conway
Date:
Subject: Re: Seeking comments on schema design and data integrity