Thread: proper db standard

proper db standard

From

Jodi Kanter

Date:

06 December 2002, 10:46:33

I am creating a simple database that will hold information about various publications. There are keywords that are associated with these publications and there can be anywhere from 1 to about 6 of these different keywords.

As I see it I have two choices:

1) create keyword fields 1-6 in the publications database and accept that some of these fields will be empty.

2) create two tables: "publication" and "keyword". In this scenario I have no limit on the amount of keywords that are used and I don't have empty fields. However, many of the keywords repeat for different publications. In this situation I would have some repeating words in the columns.

I lean toward #2 but wanted to see if there was a preferred standard or another possibility that I am overlooking??

Thanks

Jodi

_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu

Re: proper db standard

From

"Dan Langille"

Date:

06 December 2002, 11:04:09

On 6 Dec 2002 at 10:45, Jodi Kanter wrote:

> I am creating a simple database that will hold information about
> various pu= blications. There are keywords that are associated with
> these publications = and there can be anywhere from 1 to about 6 of
> these different keywords.
>
> As I see it I have two choices:
>
> 1) create keyword fields 1-6 in the publications database and accept
> that s= ome of these fields will be empty.

That is unnormalized data and will make queries more awkward.


> 2) create two tables:
> "publication" and "keyword". In this scenario I have = no limit on the
> amount of keywords that are used

You can control the number of keywords in the application and via
triggers in the database.

> and I don't have empty fie= lds.

What is the signifiance of empty fields?  You can always determine
the number of keywords for a given publication with this:

select count(*)
from keywords, pubication
where keywords.publication_id = publication.id;

> However, many of the keywords repeat for different publications. In
> th= is situation I would have some repeating words in the columns.

There's nothing wrong with that

> I lean toward #2 but wanted to see if there was a preferred standard
> or ano= ther possibility that I am overlooking??

I would recommend #2.
--
Dan Langille : http://www.langille.org/

Re: proper db standard

From

"Nick Fankhauser"

Date:

06 December 2002, 12:12:57

Jodi-

Given you two choices, I would go for #2, but consider this third option:

Publication:
pub_id
other_stuff

Keyword:
keyword_id
keyword_text

Keyword_assignment:
pub_id
keyword_id

Keyword only contains 6 records, but you can add new keywords as needed in
the future. (Option #1 didn't give you that ability.)
Duplication of data is kept to a minimum.

-Nick






-----Original Message-----
From: pgsql-admin-owner@postgresql.org
[mailto:pgsql-admin-owner@postgresql.org]On Behalf Of Jodi Kanter
Sent: Friday, December 06, 2002 10:46 AM
To: Postgres Admin List
Subject: [ADMIN] proper db standard


I am creating a simple database that will hold information about various
publications. There are keywords that are associated with these publications
and there can be anywhere from 1 to about 6 of these different keywords.

As I see it I have two choices:

1) create keyword fields 1-6 in the publications database and accept that
some of these fields will be empty.
2) create two tables: "publication" and "keyword". In this scenario I have
no limit on the amount of keywords that are used and I don't have empty
fields. However, many of the keywords repeat for different publications. In
this situation I would have some repeating words in the columns.

I lean toward #2 but wanted to see if there was a preferred standard or
another possibility that I am overlooking??

Thanks
Jodi

_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu