Re: Database structure - Mailing list pgsql-sql
From | Andrei Bintintan |
---|---|
Subject | Re: Database structure |
Date | |
Msg-id | 00a401c431e5$51dc7250$0b00a8c0@forge Whole thread Raw |
In response to | Database structure ("Alain Reymond" <alain.reymond@ceia.com>) |
Responses |
Re: Database structure
|
List | pgsql-sql |
I would go for the second one. I think the size of the table is not a problem. You will have just to write the right indexes for easy joins. OBS: " For one assessment, I'll store 60 rows with only two useful integers in it" ... why? Better make a "lab_test" table where you have the tab tests and you write in the results(#assessment_nr, labtest_nr, p, d) only those datas that you have. For example if you have the assesment no. 3000 and you have only the results for lab_test 10->40 then why to write in the DB also the lab_test from 40->70(if you don't have it)??? (if I didn't understand this clear, sorry for the observation). The second option is better if you change one time the lab_test list(have to think also this option --- if making the database for at least 10 years). Because in the first solution you will have to add always a new column... and that is not the "best" option. In the second way you just add a new ID in the lab_test list and finish. No problems. If you go for the first option and you have to change something in the result table... it won't be easy. The alter table is not so tragical as it seems... use constrains...don't ever erase from DB. So... my final answer: the second option. Best regards, Andy. ----- Original Message ----- From: "Alain Reymond" <alain.reymond@ceia.com> To: <pgsql-sql@postgresql.org> Sent: Friday, April 30, 2004 6:06 PM Subject: [SQL] Database structure > Hello, > > I would like an advise on the following problem : > > I have a table of patients. > Each patient can make different biological assessments. > Each assessment is always decomposed into different laboratory tests. > A laboratory test is made of a test number and two values coming from analysers. > > The schema is : > Patients(#patient_nr,name,etc...) > Assessment(#assessment_nr, #patient_nr, assessment_type, other usefull values). > Assessment_types(assessment_type, labtest_nr) > An assessment is composed of different tests, let's say assessment type 1 is > composed of lab test 1,2,3,5 and assessment type 2 of lab test number 10 to 70. > > I have an assessment with 60 different lab tests (always the same). I have two ways > for storing the values : > > 1 - a table with 120 columns for the two values. > results(#assessment_nr, p10,d10, p11,d11, .....,p70,d70). > where 10 to 70 represents the lab test number. > > 2 - a table with 60 rows for one assessment : > results(#assessment_nr, labtest_nr, p, d) where p and d are my two results. > > Here comes my question. Which of the two would you choose? > > The firsrt solution has the advantage of returning one single row for one complete > assessment. If I have to make statistics, it is easy. But, if I have to modify the > composition of an assessment (which occurs very rarely), I shall have to use an alter > table instruction. As I have 4 different assessment types, I have to create five > different tables, one per assessment type. > > The second solution is normalized and more elegant. But I am preoccupied by the > size of the table. For one assessment, I'll store 60 rows with only two useful integers > in it. And you must add the size of the index. With 25.000 assessments a year, it > makes 1.500.000 rows with only 4 columns amoung them 2 only for the results and 2 > for identification. I would like to store 10 years online, so 15.000.000 rows. What > about the size of index ? > > Any advise ? I thank you in advance. > > > Alain Reymond > > (I hope that it is clear enough with my bad English). > > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings >