big un stacking query - help save me from myself - Mailing list pgsql-general

From Kirk Wythers
Subject big un stacking query - help save me from myself
Date
Msg-id 5CCBF822-1E66-4B8B-A259-958DD384A0BA@gmail.com
Whole thread Raw
Responses Re: big un stacking query - help save me from myself
List pgsql-general
I hace a fairly large table with two columns that I need to "de-normalize" (235 million rows) There has got to be a
better(i.e. faster) approach than what I am doing. I am using a MAX CASE on each of the 24 variables (column names
variableand value) that I want to unstack.  Any suggestions would be most appreciated.  Here is the ANALYZE  

 GroupAggregate  (cost=107156950.16..174988710.12 rows=23511876 width=286)
   ->  Sort  (cost=107156950.16..107744747.04 rows=235118752 width=286)
         Sort Key: site, canopy, block, plot, measurement_interval, warming_treatment, treatment_code, treatment_abbr,
water_treatment,variable_name, variable_channel, variable_id, rowid, time2, "timestamp", block_name, table_name,
batt_volt,program, flag1, flag2, flag3, amb_a_avg, amb_closed_avg, airtc_avg, airtemp_avg, airtemp_max, airtemp_min,
all_avgt,am25tref1, ptemp_avg, rh, s_all_avgt_avg, vp_avg, tabove_sdupper, tabove_sdlower, tabove_meantrim,
tabove_mean_dc,tsoil_sdupper, tsoil_sdlower, tsoil_meantrim, tsoil_mean_dc 
         ->  Seq Scan on derived_15min_joined  (cost=0.00..11782048.52 rows=235118752 width=286)
(4 rows)


And here is the table:


b4warmed3=# \d derived_15min_joined
              Table "public.derived_15min_joined"
        Column        |            Type             | Modifiers
----------------------+-----------------------------+-----------
 site                 | character varying(6)        |
 canopy               | character varying(24)       |
 block                | character(2)                |
 plot                 | character(6)                |
 measurement_interval | interval                    |
 warming_treatment    | character varying(24)       |
 treatment_code       | character varying(24)       |
 treatment_abbr       | character varying(24)       |
 water_treatment      | character varying(24)       |
 variable_name        | character varying(24)       |
 variable_channel     | character varying(24)       |
 variable_id          | character varying(24)       |
 rowid                | character varying(48)       |
 time2                | timestamp without time zone |
 timestamp            | timestamp without time zone |
 block_name           | character varying(8)        |
 table_name           | character varying(10)       |
 batt_volt            | real                        |
 program              | character varying(48)       |
 flag1                | integer                     |
 flag2                | integer                     |
 flag3                | integer                     |
 amb_a_avg            | real                        |
 amb_closed_avg       | real                        |
 airtc_avg            | real                        |
 airtemp_avg          | real                        |
 airtemp_max          | real                        |
 airtemp_min          | real                        |
 all_avgt             | real                        |
 am25tref1            | real                        |
 ptemp_avg            | real                        |
 rh                   | real                        |
 s_all_avgt_avg       | real                        |
 vp_avg               | real                        |
 tabove_sdupper       | real                        |
 tabove_sdlower       | real                        |
 tabove_meantrim      | real                        |
 tabove_mean_dc       | real                        |
 tsoil_sdupper        | real                        |
 tsoil_sdlower        | real                        |
 tsoil_meantrim       | real                        |
 tsoil_mean_dc        | real                        |
 variable             | text                        |
 value                | real                        |
Indexes:
    "derived_15min_joined_lower_idx" btree (lower(variable_name::text))
    "derived_15min_joined_time2_idx" btree (time2)

b4warmed3=#

And here is my query.

SELECT
    site,
    canopy,
    block,
    plot,
    measurement_interval,
    warming_treatment,
    treatment_code,
    treatment_abbr,
    water_treatment,
    variable_name,
    variable_channel,
    variable_id,
    rowid,
    time2,
    timestamp,
    block_name,
    table_name,
    batt_volt,
    program,
    flag1,
    flag2,
    flag3,
    amb_a_avg,
    amb_closed_avg,
    airtc_avg,
    airtemp_avg,
    airtemp_max,
    airtemp_min,
    all_avgt,
    am25tref1,
    ptemp_avg,
    rh,
    s_all_avgt_avg,
    vp_avg,
    tabove_sdupper,
    tabove_sdlower,
    tabove_meantrim,
    tabove_mean_dc,
    tsoil_sdupper,
    tsoil_sdlower,
    tsoil_meantrim,
    tsoil_mean_dc,
    MAX (
        CASE
        WHEN lower(variable_name) = 'tabove' THEN

        value

        END
    ) AS tabove,
    MAX (
        CASE
        WHEN lower(variable_name) = 'tabove_sc' THEN

        value

        END
    ) AS tabove_sc,
    MAX (
        CASE
        WHEN lower(variable_name) = 'tabove_delta' THEN

        value

        END
    ) AS tabove_delta,
    MAX (
        CASE
        WHEN lower(variable_name) = 'tsoil' THEN

        value

        END
    ) AS tsoil,
    MAX (
        CASE
        WHEN lower(variable_name) = 'tsoil_sc' THEN

        value

        END
    ) AS tsoil_sc,
    MAX (
        CASE
        WHEN lower(variable_name) = 'tsoil_delta' THEN

        value

        END
    ) AS tsoil_delta,
    MAX (
        CASE
        WHEN lower(variable_name) = 's_sdm_out' THEN

        value

        END
    ) AS s_sdm_out,
    MAX (
        CASE
        WHEN lower(variable_name) = 'sbtemp' THEN

        value

        END
    ) AS sbtemp,
    MAX (
        CASE
        WHEN lower(variable_name) = 'heat_a_avg' THEN

        value

        END
    ) AS heat_a_avg,
    MAX (
        CASE
        WHEN lower(variable_name) = 'b_dc_avg' THEN

        value

        END
    ) AS b_dc_avg,
    MAX (
        CASE
        WHEN lower(variable_name) = 'targettemp' THEN

        value

        END
    ) AS targettemp,
    MAX (
        CASE
        WHEN lower(variable_name) = 's_scldout' THEN

        value

        END
    ) AS s_scldout,
    MAX (
        CASE
        WHEN lower(variable_name) = 'tmv' THEN

        value

        END
    ) AS tmv,
    MAX (
        CASE
        WHEN lower(variable_name) = 'a_dc' THEN

        value

        END
    ) AS a_dc,
    MAX (
        CASE
        WHEN lower(variable_name) = 'a_targettemp' THEN

        value

        END
    ) AS a_targettemp,
    MAX (
        CASE
        WHEN lower(variable_name) = 'scldout' THEN

        value

        END
    ) AS scldout,
    MAX (
        CASE
        WHEN lower(variable_name) = 'pid_lmt' THEN

        value

        END
    ) AS pid_lmt,
    MAX (
        CASE
        WHEN lower(variable_name) = 'targettemp_adj' THEN

        value

        END
    ) AS targettemp_adj,
    MAX (
        CASE
        WHEN lower(variable_name) = 'sdm_out' THEN

        value

        END
    ) AS sdm_out,
    MAX (
        CASE
        WHEN lower(variable_name) = 's_pid_lmt' THEN

        value

        END
    ) AS s_pid_lmt,
    MAX (
        CASE
        WHEN lower(variable_name) = 'tsoilr' THEN

        value

        END
    ) AS tsoilr,
    MAX (
        CASE
        WHEN lower(variable_name) = 's_pid_out' THEN

        value

        END
    ) AS s_pid_out,
    MAX (
        CASE
        WHEN lower(variable_name) = 'sctemp' THEN

        value

        END
    ) AS sctemp,
    MAX (
        CASE
        WHEN lower(variable_name) = 'amb_avg' THEN

        value

        END
    ) AS amb_avg
FROM
    derived_15min_joined
GROUP BY
    site,
    canopy,
    block,
    plot,
    measurement_interval,
    warming_treatment,
    treatment_code,
    treatment_abbr,
    water_treatment,
    variable_name,
    variable_channel,
    variable_id,
    rowid,
    time2,
    timestamp,
    block_name,
    table_name,
    batt_volt,
    program,
    flag1,
    flag2,
    flag3,
    amb_a_avg,
    amb_closed_avg,
    airtc_avg,
    airtemp_avg,
    airtemp_max,
    airtemp_min,
    all_avgt,
    am25tref1,
    ptemp_avg,
    rh,
    s_all_avgt_avg,
    vp_avg,
    tabove_sdupper,
    tabove_sdlower,
    tabove_meantrim,
    tabove_mean_dc,
    tsoil_sdupper,
    tsoil_sdlower,
    tsoil_meantrim,
    tsoil_mean_dc
;




pgsql-general by date:

Previous
From: Joshua Berkus
Date:
Subject: Re: [PERFORM] PostgreSQL 9.2.3 performance problem caused Exclusive locks
Next
From: Kirk Wythers
Date:
Subject: big un stacking query - save me from myself