Thread: UPDATE comparing dates and non-dates data

UPDATE comparing dates and non-dates data

From

javier garcia

Date:

13 December 2002, 12:44:25

Hi;
I've got date data, extracted from rain gauge stations. The date of a row in 
my data are structured in three integer fields, and as a result of a query I 
can get the following (what is quite good for my):cod_station | year | month | day | rain
-------------+------+-------+-----+------7250        | 1933 |     8 |   1 |    457250        | 1933 |     8 |   2 |
37250       | 1933 |     8 |   3 |    0
 
... 
and this for several rain station for about forty years. My problem is that I 
need to find missing data (a missing data here is a non-existent row).

I think I could prepare a table with a complete series of date field, and 
ALTER it to ADD aditional fields for every station I need to add:
date         |  stat_7250   | stat_7237 ...
----------------+--------------------+---------------+------...
1/11/1999
2/12/1999
...
, and make an UPDATE that in someway, will compare the "date" of this table 
with an extracted date from "year", "day" and "rain" from the aforementioned 
SELECT, leaving the non-coincident dates in blank and filling the rain data 
of the coincident dates. Is this possible? How?

Please help.
Thank you for any possible advice


-- 
A. Javier García
Water and Soil Conservation Department
CEBAS-CSIC
Apartado 4195
30080 Murcia
Spain

Tel.: +34 968 39 63 90
Fax: +34 968 39 62 13

Re: UPDATE comparing dates and non-dates data

From

Josh Berkus

Date:

13 December 2002, 15:53:38

On Friday 13 December 2002 09:43, javier garcia wrote:
> Hi;
> I've got date data, extracted from rain gauge stations. The date of a row in
> my data are structured in three integer fields, and as a result of a query I
> can get the following (what is quite good for my):
>  cod_station | year | month | day | rain
> -------------+------+-------+-----+------
>  7250        | 1933 |     8 |   1 |    45
>  7250        | 1933 |     8 |   2 |    3
>  7250        | 1933 |     8 |   3 |    0
> ...
> and this for several rain station for about forty years. My problem is that
I
> need to find missing data (a missing data here is a non-existent row).
>
> I think I could prepare a table with a complete series of date field, and
> ALTER it to ADD aditional fields for every station I need to add:
>
>  date         |  stat_7250   | stat_7237 ...
> ----------------+--------------------+---------------+------...
> 1/11/1999
> 2/12/1999
> ...
> , and make an UPDATE that in someway, will compare the "date" of this table
> with an extracted date from "year", "day" and "rain" from the aforementioned
> SELECT, leaving the non-coincident dates in blank and filling the rain data
> of the coincident dates. Is this possible? How?

I don't think this is the best method to find missing dates.  However, I'm  a
little confused as to what you consider a "missing" date.  Could you be more
explicit?


--
-Josh BerkusAglio Database SolutionsSan Francisco

Re: UPDATE comparing dates and non-dates data

From

Bret Hughes

Date:

13 December 2002, 16:34:34

On Fri, 2002-12-13 at 11:43, javier garcia wrote:
> Hi;
> I've got date data, extracted from rain gauge stations. The date of a row in 
> my data are structured in three integer fields, and as a result of a query I 
> can get the following (what is quite good for my):
>  cod_station | year | month | day | rain
> -------------+------+-------+-----+------
>  7250        | 1933 |     8 |   1 |    45
>  7250        | 1933 |     8 |   2 |    3
>  7250        | 1933 |     8 |   3 |    0
> ... 
> and this for several rain station for about forty years. My problem is that I 
> need to find missing data (a missing data here is a non-existent row).
> 
> I think I could prepare a table with a complete series of date field, and 
> ALTER it to ADD aditional fields for every station I need to add:
> 
>  date         |  stat_7250   | stat_7237 ...
> ----------------+--------------------+---------------+------...
> 1/11/1999
> 2/12/1999
> ...
> , and make an UPDATE that in someway, will compare the "date" of this table 
> with an extracted date from "year", "day" and "rain" from the aforementioned 
> SELECT, leaving the non-coincident dates in blank and filling the rain data 
> of the coincident dates. Is this possible? How?
> 
> Please help.
> Thank you for any possible advice
> 

seems like a big job.  not knowing how many rain guages there are (I am
assuming they are mainly in the plain) this is probably a fair amount of
data, roughly 25,000 data points per gauge?

I think I would approach this differently based on what I knew about the
data but it may change after I think about it for a minute or two.  I am
sort of interested in this issue since I am right now scheming on the
most effcient way to sum integer counts of events in seperate rows by 3
columns and where dates are contiguous.

Anyway, are there tens, hundreds or  thousands of guages? do most guages
have some missing data?  Is diskspace an issue for duplicating the data
in an easier to use fashion?

you could approach it as you were talking about but I think I would
first try to identify the missing data.  that seems pretty straight
forward to code in the language of you choice I would probably do it in
perl but that is just because that is what I am most familier with.

buzz the table with a 

select * order  by cod_station,year, month, day

initialize the comparison values to the first row and 

and then for each row compare the station and if it is the same 
compare the date and see if it is equal to the old date + 1 day if not
write it out to a file or inster it into a badentry table as a bad
cod_station/date combo 

you will need to get the dates into a format you can  add a day to
without having to worry about the month year wrapping stuff

if the station is different reinitialize all the variables
and keep going

if the station is the same and the date check worked then set the vars
to the current values and keep on cruising.

This can probably be done in PL/psql or what ever the postgres language
is but I nave not gotten that far yet.

then you can do allsorts of stuff to the table based on the dates that
are found bad.

if all stations are supposed to current then an additional check would
have to be made to see if the last date for a station is the current
date and create entries in the bad file/table for each of the missing
days. 

Bret

What I have been dreaming of is a select max(date) min(date) where dates
are contiguous and blah blah.  I can't seem to find it in the docs
though :)