Re: Normalized Tables & SELECT [was: Find "smallest common year"] - Mailing list pgsql-general

From Alban Hertroys
Subject Re: Normalized Tables & SELECT [was: Find "smallest common year"]
Date
Msg-id 4701061E.4080000@magproductions.nl
Whole thread Raw
In response to Re: Normalized Tables & SELECT [was: Find "smallest common year"]  (Stefan Schwarzer <stefan.schwarzer@grid.unep.ch>)
Responses Re: Normalized Tables & SELECT [was: Find "smallest common year"]  (Stefan Schwarzer <stefan.schwarzer@grid.unep.ch>)
List pgsql-general
Stefan Schwarzer wrote:
>
>
>> BTW, You didn't actually use type text for your year column, did you? No
>> quotes needed then. Otherwise you'd have to make sure your year values
>> are all the same length or sorting gets... interesting.
>
> Yep, my comment just before concerns especially this paragraph, I guess.
> With not only years but also time spans for my data - so, not only 1970,
> 1971.... but also 1970-75, 1975-80 etc. I would need indeed to use text
> for my year column. Why get sorting than "interesting"?
>
> Stef

Text sorts alphabetically, not numerically:

integer        text
-------        -------
1        '1'
2        '10'
3        '2'
4        '3'
5        '4'
6        '5'
7        '6'
8        '7'
9        '8'
10        '9'


I guess you won't have any problems with sorting, as you use full
century prefixes and the differences in syntax ('1970' vs '1970-75') are
at the rear of your values.


An entirely different question is whether it is a good idea to write a
range as a value that the database cannot interpret correctly (referring
to the '1970-75' notation). You cannot group records by value this way
if you need to (for example) combine data from '1970' with data from
'1970-75'.

But you seem to use these values just for labels, which I assume are
unique across years (eg. if you have a value '1970-75' you don't have
values '1970', 1971'..'1974'), in which case this is safe to use. As
pointed out by several people earlier, they make an excellent foreign
key too (provided they're unique).

Other options to handle these years involve having a start/end year or
date (to_date('01/01/1970', 'MM/DD/YYYY'), to_date('12/31/1975',
'MM/DD/YYYY')) or a start date and an interval (to_date('01/01/1970',
'MM/DD/YYYY'), INTERVAL '1 year').

That makes operators like OVERLAPS and BETWEEN available to your
queries, which may be convenient ;)

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

pgsql-general by date:

Previous
From: Richard Huxton
Date:
Subject: Re: sha1 function
Next
From: Mike Charnoky
Date:
Subject: Re: more problems with count(*) on large table