Re: UPDATE comparing dates and non-dates data - Mailing list pgsql-sql
From | Bret Hughes |
---|---|
Subject | Re: UPDATE comparing dates and non-dates data |
Date | |
Msg-id | 1039815180.2743.12.camel@bretsony Whole thread Raw |
In response to | UPDATE comparing dates and non-dates data (javier garcia <andresjavier.garcia@wanadoo.es>) |
List | pgsql-sql |
On Fri, 2002-12-13 at 11:43, javier garcia wrote: > Hi; > I've got date data, extracted from rain gauge stations. The date of a row in > my data are structured in three integer fields, and as a result of a query I > can get the following (what is quite good for my): > cod_station | year | month | day | rain > -------------+------+-------+-----+------ > 7250 | 1933 | 8 | 1 | 45 > 7250 | 1933 | 8 | 2 | 3 > 7250 | 1933 | 8 | 3 | 0 > ... > and this for several rain station for about forty years. My problem is that I > need to find missing data (a missing data here is a non-existent row). > > I think I could prepare a table with a complete series of date field, and > ALTER it to ADD aditional fields for every station I need to add: > > date | stat_7250 | stat_7237 ... > ----------------+--------------------+---------------+------... > 1/11/1999 > 2/12/1999 > ... > , and make an UPDATE that in someway, will compare the "date" of this table > with an extracted date from "year", "day" and "rain" from the aforementioned > SELECT, leaving the non-coincident dates in blank and filling the rain data > of the coincident dates. Is this possible? How? > > Please help. > Thank you for any possible advice > seems like a big job. not knowing how many rain guages there are (I am assuming they are mainly in the plain) this is probably a fair amount of data, roughly 25,000 data points per gauge? I think I would approach this differently based on what I knew about the data but it may change after I think about it for a minute or two. I am sort of interested in this issue since I am right now scheming on the most effcient way to sum integer counts of events in seperate rows by 3 columns and where dates are contiguous. Anyway, are there tens, hundreds or thousands of guages? do most guages have some missing data? Is diskspace an issue for duplicating the data in an easier to use fashion? you could approach it as you were talking about but I think I would first try to identify the missing data. that seems pretty straight forward to code in the language of you choice I would probably do it in perl but that is just because that is what I am most familier with. buzz the table with a select * order by cod_station,year, month, day initialize the comparison values to the first row and and then for each row compare the station and if it is the same compare the date and see if it is equal to the old date + 1 day if not write it out to a file or inster it into a badentry table as a bad cod_station/date combo you will need to get the dates into a format you can add a day to without having to worry about the month year wrapping stuff if the station is different reinitialize all the variables and keep going if the station is the same and the date check worked then set the vars to the current values and keep on cruising. This can probably be done in PL/psql or what ever the postgres language is but I nave not gotten that far yet. then you can do allsorts of stuff to the table based on the dates that are found bad. if all stations are supposed to current then an additional check would have to be made to see if the last date for a station is the current date and create entries in the bad file/table for each of the missing days. Bret What I have been dreaming of is a select max(date) min(date) where dates are contiguous and blah blah. I can't seem to find it in the docs though :)