Re: timestamps, formatting, and internals - Mailing list pgsql-general
From | David Salisbury |
---|---|
Subject | Re: timestamps, formatting, and internals |
Date | |
Msg-id | 4FB6DB8F.4060307@globe.gov Whole thread Raw |
In response to | timestamps, formatting, and internals (David Salisbury <salisbury@globe.gov>) |
Responses |
Re: timestamps, formatting, and internals
Re: timestamps, formatting, and internals |
List | pgsql-general |
Actually, figured I'd post the whole function, painful as it might be for anyone to read. If anyone sees something that's a bit of a risk ( like perhaps the whole thing ;) On 5/18/12 5:19 PM, David Salisbury wrote: > > I'm trying to debug an intermittent problem I'm seeing in one of our > rollup scripts. > > I'll try to summarize. A table has a measured_at field, of which I > calculate another > time value based on that field and a longitude value, called solar_noon, > and I summarize > min/max values grouped around this solarnoon. While I'm doing this I > also calculate a > minimum time difference between the calcualted solar noon value and all > the measured_at times > within the group. I then join this summary table back with the original > table it's > summarizing, trying to pick out the one record in it that has the > measured_at value that's closest > to the solarnoon value of the grouping. > > Clear as mud? Anyways, there seems to be a problem on that last part. > I'm thinking > the join on these date values is a bit funky. Perhaps things aren't > matching up on micro > second values, but it's hard to know with queries if I'm seeing what the > DB is seeing, as > date values are stored in seconds and what queries give you is a format > of that. > > So one question I have is if there a way to set PG in the way Oracle > does it.. > set nls_date_format = 'YYYY...' so I can query and see exactly what PG > is seeing, > even to the microseconds? Is there a config parameter I can set in PG so > that calculations > are done only to the second? It seems this join doesn't always find a > record that's closest > to solar noon, and therefore drops the summary and join record all > together. > > PG 9.0, Linux > > execute " CREATE OR REPLACE FUNCTION rollup_and_insert_subdays() returns void as $$ declare v_created_at timestamp without time zone; BEGIN v_created_at := now(); INSERT INTO air_temp_dailies ( uuid, site_id, organizationid, protocol_id, measured_at, current_temp_c, maximum_temp_c, minimum_temp_c, created_at, comments ) SELECT subd_summary.uuid, subd_summary.site_id, subd_summary.organizationid, subd_summary.protocol_id, subd_summary.measured_at, subd_summary.current_temp_c, subd_summary.max_temp_c, subd_summary.min_temp_c, v_created_at as updated_at, 'automated station rollup' as comments FROM ( SELECT DISTINCT on ( site_id, solarnoon ) site_id, organizationid, protocol_id, uuid, solarnoon, measured_at, current_temp_c, max_temp_c, min_temp_c FROM ( SELECT sd.site_id, sd.organizationid, sd.protocol_id, sd.uuid, sds.solarnoon, sd.measured_at, sd.current_temp_c, sds.max_temp_c as max_temp_c, sds.min_temp_c as min_temp_c FROM ( SELECT site_id, longitude, calculate_local_solar_noon(measured_at,longitude) as solarnoon, max(current_temp_c) as max_temp_c, min(current_temp_c) as min_temp_c, min( abs( cast( extract( epoch FROM ( measured_at - calculate_local_solar_noon(measured_at,longitude) ) ) as integer ) ) ) as minimum_time_between_measured_and_solarnoon, trunc(count(*)/93) as enough_measurements FROM ( SELECT site_id, current_temp_c, measured_at, ST_X(point) as longitude FROM air_temp_sub_days INNER JOIN sites on ( air_temp_sub_days.site_id = sites.id ) ) as appending_longitude_to_sub_day_values WHERE measured_at is not null GROUP BY site_id, calculate_local_solar_noon(measured_at,longitude), -- we assume that for any site, for any given solarnoon, the -- longitude is constant, so adding this value in the -- group by has no effect other than allowing these values -- to percolate to the outer select longitude ) sds, air_temp_sub_days sd WHERE sds.site_id = sd.site_id and calculate_local_solar_noon( sd.measured_at, sds.longitude ) = sds.solarnoon -- match with the record that's closest to solarnoon. -- At this point we know the time difference, -- but not whether it's more or less. The higher level -- DISTINCT clause removes any duplicates caused should -- solarnoon fall exactly between two measured_at times. and enough_measurements > 0 and ( ( sd.measured_at = ( calculate_local_solar_noon(sd.measured_at,sds.longitude) + ( sds.minimum_time_between_measured_and_solarnoon::text || ' secs' )::interval ) ) or ( sd.measured_at = ( calculate_local_solar_noon(sd.measured_at,sds.longitude) - ( sds.minimum_time_between_measured_and_solarnoon::text || ' secs' )::interval ) ) ) ) end_distinct ) subd_summary LEFT OUTER JOIN air_temp_dailies on subd_summary.site_id = air_temp_dailies.site_id and subd_summary.measured_at = air_temp_dailies.measured_at WHERE air_temp_dailies.site_id is null and air_temp_dailies.measured_at is null order by subd_summary.site_id, subd_summary.measured_at ; END $$ LANGUAGE plpgsql;
pgsql-general by date: