Re: Advice on Database Schema - Mailing list pgsql-general

From James Gregory
Subject Re: Advice on Database Schema
Date
Msg-id 1048673546.28857.9.camel@pirate.bridge.anchor.net.au
In response to Advice on Database Schema  ("Jennifer Lee" <jlee@scri.sari.ac.uk>)
List pgsql-general
On Tue, 2003-03-25 at 12:07, Jennifer Lee wrote:
> Hello,
>
> We have been proposed two database schemas and are faced with
> making a decision between them. I am fairly new to Postgresql and am
> looking for advice on which would work best. Both would be done in
> postgresql.
>
> Our database will hold lots of different data types and will be queried
> frequently and have new data inserted rarely (in the long run). So we
> would like to have it be most efficient with queries. Only a few users
> would have permission to add data, most will only be allowed to query
> the database.
>
> One schema has on the order of 100 tables and the different types of
> data will be stored in different tables. Most tables will reference at
> least one other table. The second schema has on the order of 10 tables
> with more data stored in each and multiple types of data in a single
> table. In this case tables will reference others, but also then
> themselves in the case where different types of data are stored in a
> single table.

Hi Jennifer,

I'm not sure I have the answer as to which one is fastest. I suppose if
you perform fewer joins it is theoretically faster, but I don't know if
speed should be your primary concern. I would go for a schema that
logically splits different entities into different tables. I'm
currently cleaning up a database that sounds a lot like your "simple"
system. What I've found is that you end up with a lot of wasted data -
i.e. data that is replicated, or tables where thousands of rows are
mostly empty space. Empty fields take up space as well.
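
Just to illustrate what I mean (the table and column names here are
made up, not a recommendation for your actual schema): a single wide
table that tries to hold every kind of measurement carries a NULL in
most columns of most rows, whereas per-entity tables only store what
each row actually needs.

    -- One wide table: for any given row, most of these columns are NULL
    CREATE TABLE observation (
        id          serial PRIMARY KEY,
        sample_name text,
        temperature numeric,   -- only set for temperature readings
        ph          numeric,   -- only set for pH readings
        weight_g    numeric    -- only set for weight measurements
    );

    -- Split by entity: each table holds only the columns its rows use
    CREATE TABLE sample (
        id   serial PRIMARY KEY,
        name text NOT NULL
    );

    CREATE TABLE temperature_reading (
        sample_id integer NOT NULL REFERENCES sample(id),
        value     numeric NOT NULL
    );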

The other concern is that if you keep your data logically arranged into
as many tables as it takes, you'll find it's easier to "grow" your
schema; that is, you won't find yourself needing to make ad hoc design
decisions to implement a new feature. I find that, in general,
referential integrity is easier to maintain this way, since you have
finer-grained control over what links to what.
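
Again with made-up names, the point is that each relationship gets its
own foreign key constraint, so you can decide per link exactly what the
database should enforce:

    -- Every experiment must point at a real sample, and a sample that
    -- still has experiments attached cannot be deleted out from under them.
    CREATE TABLE experiment (
        id        serial PRIMARY KEY,
        sample_id integer NOT NULL
                  REFERENCES sample(id) ON DELETE RESTRICT
    );

    -- This fails unless a sample with id 9999 actually exists:
    INSERT INTO experiment (sample_id) VALUES (9999);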

Probably the best of both worlds is to index the data you're going to
search, split it up into as many tables as you need, and use views to
present the data to your software in a way that is less onerous than
dealing with 100 tables all at once. A view is stored as a pre-parsed
query definition; when you select from it, the rewriter expands it into
the underlying query, so it costs nothing over writing the joins out by
hand while keeping them defined in one place.
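
Roughly like this, building on the made-up tables above - index
whatever your users will actually search on, then wrap the joins in a
view:

    -- Index the column queries will filter on
    CREATE INDEX sample_name_idx ON sample (name);

    -- Hide the joins behind a view so applications query one relation
    CREATE VIEW sample_temperatures AS
        SELECT s.id, s.name, t.value AS temperature
        FROM sample s
        JOIN temperature_reading t ON t.sample_id = s.id;

    -- Callers then just write:
    SELECT * FROM sample_temperatures WHERE name = 'soil-07';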

HTH,

James.

