Re: Database/Table Design for Global Country Statistics - Mailing list pgsql-general

From Tino Wildenhain
Subject Re: Database/Table Design for Global Country Statistics
Date
Msg-id 46E7AE5C.3030106@wildenhain.de
Whole thread Raw
In response to Database/Table Design for Global Country Statistics  (Stefan Schwarzer <stefan.schwarzer@grid.unep.ch>)
Responses Re: Database/Table Design for Global Country Statistics
List pgsql-general
Stefan Schwarzer schrieb:
> Hi there,
>
> I learned in another posting that my table design - in a polite way -
> "could be improved".
>
> So, before doing any additional design errors, I would like to get
> feedback, if possible.
>
> I am dealing with some 500 tables for worldwide national statistics
> (GDP, population, environment etc.), covering approx. 30 years each. For
> each of these variables, I usually have as well (pre-prepared)
> subregional and regional aggregations too. These could - and should - at
> the end be calculated on-the-fly, and not pre-calculated and imported
> from Excel as it is for the moment.
>
> My (national) table for a given variable is in the moment as follows (id
> being the identifier for a specific country):
>
> id   |    1970    |    1971    |    ...    |    2004    |    2005
> -------------------------------------------------------------------
>  1   |   NULL    |      36       |   ...     |      42      |      45
>  2 ......
>
> The new design would be like this:
>
> id   |    year    |    value
> -------------------------------
>  1   |   1970    |     NULL
>  1   |   1971    |      36
>  1   ....
>  1   |   2005    |      45
>  2   |   1970    |      ....
>  2   .....
>
>
> Would that be considered as "good table design" then?

I'd not save null values but otherwise you are fine.
If you want to select over stable ranges of years
w/o bothering with non existent data (and therefore year)
it might be usefull to have a separate mapping table like
year -> year_id (sequential) and use the Ids for year.

This can even be simplified to a table just providing you the
coverage of years you have as a whole and use the number
directly as foreign key in your table.

table:years

year
1970
1971
1972
...


table:measures
measure_id,description
1          foovar
2          barvar
...


table:values
year (fk from years), measure_id (fk from measures), value
1971                  1                              36


every column of "values" table would be not null and
primary key (year,measure_id)

This way you can easily create any report you want
by selectiong a year range, then left join the values table
as often as you want for a given measure and so on.

Regards
Tino




pgsql-general by date:

Previous
From: Richard Huxton
Date:
Subject: Re: Database/Table Design for Global Country Statistics
Next
From: "Albe Laurenz"
Date:
Subject: Re: Question to schema public