Re: Long running INSERT+SELECT query - Mailing list pgsql-general

From Steven Lembark
Subject Re: Long running INSERT+SELECT query
Date
Msg-id 20180427092930.1ee8aa43@wrkhors.com
Whole thread Raw
In response to Re: Long running INSERT+SELECT query  (Vitaliy Garnashevich <vgarnashevich@gmail.com>)
List pgsql-general
On Thu, 26 Apr 2018 21:08:01 +0300
Vitaliy Garnashevich <vgarnashevich@gmail.com> wrote:

>      INSERT INTO cmdb_sp_usage_history
>        (created_by, updated_by, created_on, updated_on, mod_count,
>        summary_on, quarter, product, used_from, "user",
>        keystrokes, minutes_in_use, times_started, avg_keystrokes,
> max_keystrokes, spkg_operational)
>      SELECT
>         2, 2, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP, 1,
>         CURRENT_TIMESTAMP, quarter.id, "spv"."product",
> "usage"."used_from", "usage"."user",
>         coalesce(sum("usage"."keystrokes"), 0),
>         coalesce(sum("usage"."minutes_in_use"), 0),
>         coalesce(sum("usage"."times_started"), 0),
>         coalesce(avg("usage"."keystrokes"), 0),
>         coalesce(max("usage"."keystrokes"), 0),
>         bool_or("cmdb_ci"."operational")
>      FROM
>        "cmdb_program_daily_usage" "usage"
>      LEFT OUTER JOIN
>        "cmdb_program_instance" "p" ON "p"."id" =
> "usage"."program_instance" LEFT OUTER JOIN
>        "cmdb_ci_spkg" "s" ON "s"."id" = "p"."spkg"
>      LEFT OUTER JOIN
>        "cmdb_ci" "cmdb_ci" ON "s"."id" = "cmdb_ci"."id"
>      LEFT OUTER JOIN
>        "cmdb_software_product_version" "spv" ON "spv"."id" =
> "s"."software" WHERE ("usage"."minutes_in_use" > 0)
>        AND ((NOT ("s"."software" IS NULL))
>             AND ((NOT ("s"."os" = TRUE))
>                  OR ("s"."os" IS NULL)))
>        AND ("usage"."usage_date" >= quarter.start_date)
>        AND ("usage"."usage_date" < quarter.end_date)
>      GROUP BY "spv"."product", "usage"."used_from", "usage"."user"
>      HAVING (coalesce(sum("usage"."keystrokes"), 0) > 0) OR
> (coalesce(sum("usage"."minutes_in_use"), 0) > 0) OR
> (coalesce(sum("usage"."times_started"), 0) > 0)
>      ORDER BY "spv"."product", "usage"."used_from", "usage"."user";

create temporary table
    foobar

    select
        <unaggregatedl, un-coalesced data>
    from
        <join from hell, above, sans group by>
;

This isolates the lock time to performing the bare select, after
which you can coalesce and sum to your heart's content without
locking any of it.

The point is performing the absolute minimum of processing to
generate the temp table so as to release any locks quickly and
avoid "group by" in the main join.

Yes, this might end up creating a large-ish temp table :-)

One other approach would be selecting only incremental data
(e.g., daily) which locks a much smaller subset of the rows
and aggregating the daily totals into quarterly, whatever.
Call it daily usage, select where usage_date = today's or
timestamp && a tstzrange of ( 0000, 2400, [) ). That might
also simplify your query logic: all the coalesce op's end up
in your daily/weekly/monthly/whatever summary, the quarterly
values in the reporting are just sum X group by.

--
Steven Lembark                                       1505 National Ave
Workhorse Computing                                 Rockford, IL 61103
lembark@wrkhors.com                                    +1 888 359 3508


pgsql-general by date:

Previous
From: "a"
Date:
Subject: Re: Parameter passing in trigger function write in C
Next
From: Steven Lembark
Date:
Subject: Re: Long running INSERT+SELECT query