Thread: proper db standard
I am creating a simple database that will hold information about various publications. There are keywords that are associated with these publications and there can be anywhere from 1 to about 6 of these different keywords.
As I see it I have two choices:
1) create keyword fields 1-6 in the publications database and accept that some of these fields will be empty.
2) create two tables: "publication" and "keyword". In this scenario I have no limit on the amount of keywords that are used and I don't have empty fields. However, many of the keywords repeat for different publications. In this situation I would have some repeating words in the columns.
I lean toward #2 but wanted to see if there was a preferred standard or another possibility that I am overlooking??
Thanks
Jodi
_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu
On 6 Dec 2002 at 10:45, Jodi Kanter wrote: > I am creating a simple database that will hold information about > various pu= blications. There are keywords that are associated with > these publications = and there can be anywhere from 1 to about 6 of > these different keywords. > > As I see it I have two choices: > > 1) create keyword fields 1-6 in the publications database and accept > that s= ome of these fields will be empty. That is unnormalized data and will make queries more awkward. > 2) create two tables: > "publication" and "keyword". In this scenario I have = no limit on the > amount of keywords that are used You can control the number of keywords in the application and via triggers in the database. > and I don't have empty fie= lds. What is the signifiance of empty fields? You can always determine the number of keywords for a given publication with this: select count(*) from keywords, pubication where keywords.publication_id = publication.id; > However, many of the keywords repeat for different publications. In > th= is situation I would have some repeating words in the columns. There's nothing wrong with that > I lean toward #2 but wanted to see if there was a preferred standard > or ano= ther possibility that I am overlooking?? I would recommend #2. -- Dan Langille : http://www.langille.org/
Jodi- Given you two choices, I would go for #2, but consider this third option: Publication: pub_id other_stuff Keyword: keyword_id keyword_text Keyword_assignment: pub_id keyword_id Keyword only contains 6 records, but you can add new keywords as needed in the future. (Option #1 didn't give you that ability.) Duplication of data is kept to a minimum. -Nick -----Original Message----- From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org]On Behalf Of Jodi Kanter Sent: Friday, December 06, 2002 10:46 AM To: Postgres Admin List Subject: [ADMIN] proper db standard I am creating a simple database that will hold information about various publications. There are keywords that are associated with these publications and there can be anywhere from 1 to about 6 of these different keywords. As I see it I have two choices: 1) create keyword fields 1-6 in the publications database and accept that some of these fields will be empty. 2) create two tables: "publication" and "keyword". In this scenario I have no limit on the amount of keywords that are used and I don't have empty fields. However, many of the keywords repeat for different publications. In this situation I would have some repeating words in the columns. I lean toward #2 but wanted to see if there was a preferred standard or another possibility that I am overlooking?? Thanks Jodi _______________________________ Jodi L Kanter BioInformatics Database Administrator University of Virginia (434) 924-2846 jkanter@virginia.edu