Re: timestamps, formatting, and internals - Mailing list pgsql-general

From David Salisbury
Subject Re: timestamps, formatting, and internals
Date
Msg-id 4FB6DB8F.4060307@globe.gov
Whole thread Raw
In response to timestamps, formatting, and internals  (David Salisbury <salisbury@globe.gov>)
Responses Re: timestamps, formatting, and internals
Re: timestamps, formatting, and internals
List pgsql-general
Actually, figured I'd post the whole function, painful as it
might be for anyone to read.  If anyone sees something that's a bit
of a risk ( like perhaps the whole thing ;)

On 5/18/12 5:19 PM, David Salisbury wrote:
>
> I'm trying to debug an intermittent problem I'm seeing in one of our
> rollup scripts.
>
> I'll try to summarize. A table has a measured_at field, of which I
> calculate another
> time value based on that field and a longitude value, called solar_noon,
> and I summarize
> min/max values grouped around this solarnoon. While I'm doing this I
> also calculate a
> minimum time difference between the calcualted solar noon value and all
> the measured_at times
> within the group. I then join this summary table back with the original
> table it's
> summarizing, trying to pick out the one record in it that has the
> measured_at value that's closest
> to the solarnoon value of the grouping.
>
> Clear as mud? Anyways, there seems to be a problem on that last part.
> I'm thinking
> the join on these date values is a bit funky. Perhaps things aren't
> matching up on micro
> second values, but it's hard to know with queries if I'm seeing what the
> DB is seeing, as
> date values are stored in seconds and what queries give you is a format
> of that.
>
> So one question I have is if there a way to set PG in the way Oracle
> does it..
> set nls_date_format = 'YYYY...' so I can query and see exactly what PG
> is seeing,
> even to the microseconds? Is there a config parameter I can set in PG so
> that calculations
> are done only to the second? It seems this join doesn't always find a
> record that's closest
> to solar noon, and therefore drops the summary and join record all
> together.
>
> PG 9.0, Linux
>
>

     execute "
       CREATE OR REPLACE FUNCTION rollup_and_insert_subdays()
         returns void as $$
         declare v_created_at timestamp without time zone;
       BEGIN
       v_created_at := now();
       INSERT INTO air_temp_dailies (
         uuid,
         site_id,
         organizationid,
         protocol_id,
         measured_at,
         current_temp_c,
         maximum_temp_c,
         minimum_temp_c,
         created_at,
         comments
       )
       SELECT
         subd_summary.uuid,
         subd_summary.site_id,
         subd_summary.organizationid,
         subd_summary.protocol_id,
         subd_summary.measured_at,
         subd_summary.current_temp_c,
         subd_summary.max_temp_c,
         subd_summary.min_temp_c,
         v_created_at as updated_at,
         'automated station rollup' as comments

       FROM (
         SELECT DISTINCT on ( site_id, solarnoon )
           site_id,
           organizationid,
           protocol_id,
           uuid,
           solarnoon,
           measured_at,
           current_temp_c,
           max_temp_c,
           min_temp_c
         FROM (
           SELECT
             sd.site_id,
             sd.organizationid,
             sd.protocol_id,
             sd.uuid,
             sds.solarnoon,
             sd.measured_at,
             sd.current_temp_c,

             sds.max_temp_c as max_temp_c,
             sds.min_temp_c as min_temp_c
           FROM (
             SELECT
               site_id,
               longitude,
               calculate_local_solar_noon(measured_at,longitude) as solarnoon,
               max(current_temp_c) as max_temp_c,
               min(current_temp_c) as min_temp_c,
                 min(
                   abs(
                     cast(
                       extract(
                         epoch FROM (
                           measured_at -
                             calculate_local_solar_noon(measured_at,longitude)
                         )
                       ) as integer
                     )
                   )
                 ) as
                 minimum_time_between_measured_and_solarnoon,
                 trunc(count(*)/93) as enough_measurements
             FROM (
               SELECT
                 site_id,
                 current_temp_c,
                 measured_at,
                 ST_X(point) as longitude
               FROM
                 air_temp_sub_days INNER JOIN sites on (
                   air_temp_sub_days.site_id = sites.id
                 )
             ) as appending_longitude_to_sub_day_values
             WHERE
               measured_at  is not null
             GROUP BY
               site_id,
               calculate_local_solar_noon(measured_at,longitude),
               -- we assume that for any site, for any given solarnoon, the
               -- longitude is constant, so adding this value in the
               -- group by has no effect other than allowing these values
               -- to percolate to the outer select
               longitude
           ) sds,
           air_temp_sub_days sd
         WHERE
           sds.site_id = sd.site_id
           and
           calculate_local_solar_noon(
             sd.measured_at,
             sds.longitude
           ) = sds.solarnoon
           -- match with the record that's closest to solarnoon.
           -- At this point we know the time difference,
           -- but not whether it's more or less. The higher level
           -- DISTINCT clause removes any duplicates caused should
           -- solarnoon fall exactly between two measured_at times.
           and
           enough_measurements > 0
           and
           (
             (
               sd.measured_at = (
                 calculate_local_solar_noon(sd.measured_at,sds.longitude) + (
                   sds.minimum_time_between_measured_and_solarnoon::text ||
                     ' secs'
                 )::interval
               )
             )
             or
             (
               sd.measured_at = (
                 calculate_local_solar_noon(sd.measured_at,sds.longitude) - (
                   sds.minimum_time_between_measured_and_solarnoon::text ||
                     ' secs'
                 )::interval
               )
             )
           )
         ) end_distinct
       ) subd_summary
       LEFT OUTER JOIN air_temp_dailies on
         subd_summary.site_id = air_temp_dailies.site_id
         and
         subd_summary.measured_at = air_temp_dailies.measured_at
       WHERE
         air_temp_dailies.site_id is null
         and
         air_temp_dailies.measured_at is null
       order by subd_summary.site_id, subd_summary.measured_at
       ;

       END $$ LANGUAGE plpgsql;

pgsql-general by date:

Previous
From: David Salisbury
Date:
Subject: timestamps, formatting, and internals
Next
From: David Salisbury
Date:
Subject: Re: timestamps, formatting, and internals