Thread: caching subtotals: update vs sum -- aaugh!

caching subtotals: update vs sum -- aaugh!

From
will trillich
Date:
to save a few cycles, i'm hoping to cache subtotals of
subsidiary records into the corresponding parent records -- but
i can't figure out how to update the parent table with the sums
of the child table fields:

    create table invoice (
        id serial,
        bal numeric(8,2)
        cost numeric(8,2),
        charge numeric(8,2),
    );
    create table line_item (
        id serial,
        item_id integer references invoice ( id ),
        bal numeric(8,2)
        cost numeric(8,2),
        charge numeric(8,2),
    );

selecting the sums is easy:

    select
        item_id,
        sum(bal) as bal,
        sum(cost) as cost,
        sum(charge) as charge
    from
        line_item
    group by
        item_id;

but i can't seem to get by cerebellum around how to flow all the
subtotals upstream -- these don't work:

    -- all totals wind up in ONE invoice record
    update invoice set
      bal = sum(line_item.bal),
      cost = sum(line_item.cost),
      charge = sum(line_item.charge)
    where line_item.item_id = set.id;

    -- syntax error at 'group'
    update invoice set
      bal = sum(line_item.bal),
      cost = sum(line_item.cost),
      charge = sum(line_item.charge)
    from line_item
    where line_item.item_id = set.id GROUP BY line_item.item_id;

the next one works, but not even within earshot of 'elegant'.
there's GOTTA be a slicker way, right?

    -- ridiculosity of redundancy but it limps into the right result
    update invoice set
      bal = (select sum(line_item.bal) where item_id = invoice.id),
      cost = (select sum(line_item.cost) where item_id = invoice.id),
      charge = (select sum(line_item.charge) where item_id = invoice.id)
    from line_item
    where line_item.item_id = set.id GROUP BY line_item.item_id;

or should i settle for something like

    select
        item_id,
        sum(bal) as bal,
        sum(cost) as cost,
        sum(charge) as charge
    into
        fooey
    from
        line_item
    group by
        item_id
    ;
    update invoice set
        bal = fooey.bal,
        cost = fooey.cost,
        charge = fooey.charge
    where fooey.item_id = id
    ;
    drop table fooey
    ;

...? seems a bit of the old "long-way-around"... surely there's
a way--

--
DEBIAN NEWBIE TIP #110 from Dimitri Maziuk <dmaziuk@yola.bmrb.wisc.edu>
:
Here's how to TUNNEL SECURE X11 CONNECTIONS THROUGH SSH: on the
client, do this:
    client# export DISPLAY=client:0.0
    client# ssh -X server
then once you're logged in at the server, do:
    server# netscape &
The environment created at the server will include the DISPLAY
variable, so netscape (or whatever) will dialogue with the
client machine. (See "man ssh" for more.)

Also see http://newbieDoc.sourceForge.net/ ...

Re: caching subtotals: update vs sum -- aaugh!

From
"Alaric B. Snell"
Date:
On Thu, 10 Jan 2002, will trillich wrote:

> or should i settle for something like
>
>     select
>         item_id,
>         sum(bal) as bal,
>         sum(cost) as cost,
>         sum(charge) as charge
>     into
>         fooey
>     from
>         line_item
>     group by
>         item_id
>     ;
>     update invoice set
>         bal = fooey.bal,
>         cost = fooey.cost,
>         charge = fooey.charge
>     where fooey.item_id = id
>     ;
>     drop table fooey
>     ;
>
> ...? seems a bit of the old "long-way-around"... surely there's
> a way--

A stored procedure sounds the way ahead to me. DO it with a trigger if you
want updates to happen automagically whenever the line_items are changed:

http://www.postgresql.org/idocs/index.php?plpgsql-trigger.html
http://www.postgresql.org/idocs/index.php?sql-createtrigger.html

--
Alaric B. Snell, Developer
abs@frontwire.com


Re: caching subtotals: update vs sum -- aaugh!

From
Andrew Gould
Date:
Will,

Where does 'set.id' come from?  If this was a typo,
was it in the sql query you executed?  I would think
that your first attempt should have worked if 'set.id'
was replaced with 'invoice.id'.

Best of luck,

Andrew Gould

--- will trillich <will@serensoft.com> wrote:
> to save a few cycles, i'm hoping to cache subtotals
> of
> subsidiary records into the corresponding parent
> records -- but
> i can't figure out how to update the parent table
> with the sums
> of the child table fields:
>
>     create table invoice (
>         id serial,
>         bal numeric(8,2)
>         cost numeric(8,2),
>         charge numeric(8,2),
>     );
>     create table line_item (
>         id serial,
>         item_id integer references invoice ( id ),
>         bal numeric(8,2)
>         cost numeric(8,2),
>         charge numeric(8,2),
>     );
>
> selecting the sums is easy:
>
>     select
>         item_id,
>         sum(bal) as bal,
>         sum(cost) as cost,
>         sum(charge) as charge
>     from
>         line_item
>     group by
>         item_id;
>
> but i can't seem to get by cerebellum around how to
> flow all the
> subtotals upstream -- these don't work:
>
>     -- all totals wind up in ONE invoice record
>     update invoice set
>       bal = sum(line_item.bal),
>       cost = sum(line_item.cost),
>       charge = sum(line_item.charge)
>     where line_item.item_id = set.id;
>
>     -- syntax error at 'group'
>     update invoice set
>       bal = sum(line_item.bal),
>       cost = sum(line_item.cost),
>       charge = sum(line_item.charge)
>     from line_item
>     where line_item.item_id = set.id GROUP BY
> line_item.item_id;
>
> the next one works, but not even within earshot of
> 'elegant'.
> there's GOTTA be a slicker way, right?
>
>     -- ridiculosity of redundancy but it limps into the
> right result
>     update invoice set
>       bal = (select sum(line_item.bal) where item_id =
> invoice.id),
>       cost = (select sum(line_item.cost) where item_id
> = invoice.id),
>       charge = (select sum(line_item.charge) where
> item_id = invoice.id)
>     from line_item
>     where line_item.item_id = set.id GROUP BY
> line_item.item_id;
>
> or should i settle for something like
>
>     select
>         item_id,
>         sum(bal) as bal,
>         sum(cost) as cost,
>         sum(charge) as charge
>     into
>         fooey
>     from
>         line_item
>     group by
>         item_id
>     ;
>     update invoice set
>         bal = fooey.bal,
>         cost = fooey.cost,
>         charge = fooey.charge
>     where fooey.item_id = id
>     ;
>     drop table fooey
>     ;
>
> ...? seems a bit of the old "long-way-around"...
> surely there's
> a way--
>
> --
> DEBIAN NEWBIE TIP #110 from Dimitri Maziuk
> <dmaziuk@yola.bmrb.wisc.edu>
> :
> Here's how to TUNNEL SECURE X11 CONNECTIONS THROUGH
> SSH: on the
> client, do this:
>     client# export DISPLAY=client:0.0
>     client# ssh -X server
> then once you're logged in at the server, do:
>     server# netscape &
> The environment created at the server will include
> the DISPLAY
> variable, so netscape (or whatever) will dialogue
> with the
> client machine. (See "man ssh" for more.)
>
> Also see http://newbieDoc.sourceForge.net/ ...
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster


__________________________________________________
Do You Yahoo!?
Send FREE video emails in Yahoo! Mail!
http://promo.yahoo.com/videomail/

Re: caching subtotals: update vs sum -- aaugh!

From
Tom Lane
Date:
will trillich <will@serensoft.com> writes:
> but i can't seem to get by cerebellum around how to flow all the
> subtotals upstream -- these don't work:

>     -- all totals wind up in ONE invoice record
>     update invoice set
>       bal = sum(line_item.bal),
>       cost = sum(line_item.cost),
>       charge = sum(line_item.charge)
>     where line_item.item_id = set.id;

UPDATEs containing top-level aggregate functions don't really work
correctly.  SQL92 forbids such things entirely, suggesting that they
think it's not well-defined.  Postgres doesn't currently reject the
query, but the behavior is rather broken IMHO.  See past discussions
in the archives about whether we should reject this, and what it should
mean if we don't.

As for your example with UPDATE ... GROUP BY, I don't believe that that
will get past the parser.

The temp table is probably the right way to go, ugly as it is.
The only standards-conformant way to do it in one query would be three
independent subselects:

update invoice set
  bal = (select sum(line_item.bal) from line_item where item_id = invoice.id),
  cost = (select sum(line_item.cost) from line_item where item_id = invoice.id),
  charge = (select sum(line_item.charge) from line_item where item_id = invoice.id);

and the amount of extra computation needed to do it that way is large.

Or ... wait a second.  How about

update invoice set
  bal = ss.bal,
  cost = ss.cost,
  charge = ss.charge
from
    (select
        item_id,
        sum(bal) as bal,
        sum(cost) as cost,
        sum(charge) as charge
    from
        line_item
    group by
        item_id) ss
where ss.item_id = invoice.id;

I haven't tried this but it seems like it should work.  Better check the
quality of the generated plan though.  The temp table might be faster.

            regards, tom lane

Re: caching subtotals: update vs sum -- aaugh!

From
will trillich
Date:
subtotalling child records into a parent field -- HOWTO?

> --- will trillich <will@serensoft.com> wrote:
> >     create table invoice (
> >         id serial,
> >         bal numeric(8,2)
> >         cost numeric(8,2),
> >         charge numeric(8,2),
> >     );
> >     create table line_item (
> >         id serial,
> >         item_id integer references invoice ( id ),
> >         bal numeric(8,2)
> >         cost numeric(8,2),
> >         charge numeric(8,2),
> >     );

> > i can't seem to get by cerebellum around how to
> > flow all the
> > subtotals upstream -- these don't work:
> >
> >     -- all totals wind up in ONE invoice record
> >     update invoice set
> >       bal = sum(line_item.bal),
> >       cost = sum(line_item.cost),
> >       charge = sum(line_item.charge)
> >     where line_item.item_id = INVOICE.id;
> >
> >     -- syntax error at 'group'
> >     update invoice set
> >       bal = sum(line_item.bal),
> >       cost = sum(line_item.cost),
> >       charge = sum(line_item.charge)
> >     from line_item
> >     where line_item.item_id = INVOICE.id GROUP BY
> > line_item.item_id;

On Thu, Jan 10, 2002 at 05:06:01AM -0800, Andrew Gould wrote:
> Where does 'set.id' come from?  If this was a typo,
> was it in the sql query you executed?  I would think
> that your first attempt should have worked if 'set.id'
> was replaced with 'invoice.id'.

yes, that was a typo (which i fixed in my quoted sql above). and
no, it doesn't do what i want.  (it DOES if i have just ONE
single invoice record in the table -- all totals wind up in just
one invoice record.) if you have an example that DOES work i'd
love to see it!

i'm guessing that this isn't really an obscure task
(subtotalling child records into a parent field) but getting it
to work just ain't happenin' fer me yet. i'm SURE there's gotta
be an sql-friendly way to do this... ?

--
DEBIAN NEWBIE TIP #119 from Jonathan D. Proulx <jon@ai.mit.edu>
:
Having trouble RUNNING REMOTE X APPLICATIONS?  You've tried "xhost
+<host>", set the DISPLAY variable on the remote session, and
checked that the "-nolisten tcp" flag is *not* being sent at X
startup, right?
  Verify that X is really listening: "netstat -tl" will show
all listening tcp ports; you should see port 6000 open if
display :0 is listening (6001 for :1 etc.)
  If it is listening, I'd start wondering about packet filtering
rules. Check ipchains or iptables...

Also see http://newbieDoc.sourceForge.net/ ...

Re: caching subtotals: update vs sum -- aaugh!

From
will trillich
Date:
On Thu, Jan 10, 2002 at 10:41:50AM -0500, Tom Lane wrote:
> Or ... wait a second.  How about
>
> update invoice set
>     bal = ss.bal,
>     cost = ss.cost,
>     charge = ss.charge
> from
>     (select
>         item_id,
>         sum(bal) as bal,
>         sum(cost) as cost,
>         sum(charge) as charge
>     from
>         line_item
>     group by
>         item_id) ss
> where ss.item_id = invoice.id;
>
> I haven't tried this but it seems like it should work.  Better check the
> quality of the generated plan though.  The temp table might be faster.

this is why we like having you around, Tom. you're sneaky in all
the right places. :)

i knew there was a solution in there somewhere...

    UPDATE _acct SET
        cost       = ppp.cost,
        charge     = ppp.charge
    FROM
        (SELECT
            acct_id,
            SUM(cost) AS cost,
            SUM(charge) AS charge
        FROM
            _prop
        GROUP BY
            acct_id) ppp
    WHERE
        acct_id = ppp.acct_id;

this seems like it'd properly update all _acct records with
appropriate subtotals from child _prop records. BUT-- after
doing just that (above), given this data in then _prop table,

    db=# select id,acct_id,charge,cost from _prop;
     id | acct_id | charge | cost
    ----+---------+--------+-------
      3 |       4 |   0.00 |  0.00
      4 |       3 |   0.00 |  0.00
      5 |       2 | 210.98 |  7.25
      2 |       2 | 384.95 | 40.00
    (4 rows)

(then i do the update, above) i wind up with the following in
the _acct table:

    db=# select id,charge,cost from _acct;
     id | charge | cost
    ----+--------+-------
      3 | 595.93 | 47.25
      4 | 595.93 | 47.25
      2 | 595.93 | 47.25
    (3 rows)

the amounts are right for _acct.id = 2, but should be zero for
both of the others. arggh!

for completeness, i tried aliasing the main table, to no avail:

    UPDATE _acct a SET -- alias for subquery?
        cost       = ppp.cost,
        charge     = ppp.charge
    FROM
        (SELECT
            acct_id,
            SUM(cost) AS cost,
            SUM(charge) AS charge
        FROM
            _prop
        WHERE                 -- hmm?
            acct_id = a.id
        GROUP BY
            acct_id) ppp
    WHERE
        acct_id = ppp.acct_id;

    ERROR:  parser: parse error at or near "a"

i'm hoping to add this to a rule:

ON UPDATE TO acct DO INSTEAD (
    UPDATE _acct SET
        f1 = NEW.f1,
        f2 = NEW.f2,
        -- yada yada
        fN = NEW.fN
    WHERE
        id = OLD.id
    ;
    UPDATE _acct SET
        cost       = ppp.cost,
        charge     = ppp.charge
    FROM
        (SELECT
            acct_id,
            SUM(cost) AS cost,
            SUM(charge) AS charge
        FROM
            _prop
    --  WHERE
    --      acct_id = OLD.acct_id
        GROUP BY
            acct_id
        ) ppp
    WHERE
        id = OLD.acct_id;
);

i'm not having any luck with this, though.  i guess it's okay to
do the CREATE TEMP TABLE... DROP TABLE... within a rule, right?
:(

--
DEBIAN NEWBIE TIP #36 from Sean Quinlan <smq@gmx.co.uk>
:
Looking to CHANGE THE DEFAULT LS COLORS? It's simple: first,
    dircolors -p >~/.dircolors
and then edit the results to suit your tastes; finally, insert
    eval `dircolors -b ~/.dircolors`
in your ~/.bashrc. Next time you log in (or source ~/.bashrc)
your new colors will take effect.

Also see http://newbieDoc.sourceForge.net/ ...

Re: caching subtotals: update vs sum -- aaugh!

From
will trillich
Date:
On Fri, Jan 11, 2002 at 03:11:16PM -0600, i whined about:
> > the amounts are right for _acct.id = 2, but should be zero for
> > both of the others. arggh!

then Tom Lane <tgl@sss.pgh.pa.us> replied
> [ scratches head ]  This may represent a bug.  I'm too lazy today to
> whip up a test case based on your emails --- could I trouble you for
> a script that creates and loads the test tables?

here i had my hopes up. alas, it was only me and my bungles.

i think i found it-- this finally works (sure would like to have
the subquery use the WHERE, though...)

create rule acct_edit as
on update to acct
do instead (
    update _acct set
        code   = NEW.code,
        charge = p.charge,
        cost   = p.cost
    from (
        select
            sum(charge) as charge,
            sum(cost  ) as cost,
            acct_id
        from
            _prop
--        where
--            acct_id = OLD.id -- can't see *OLD* record here
        group by
            acct_id
    ) p
    where
        id        = OLD.id
        and
        p.acct_id = OLD.id;
);

i also tried creating a temp table within the rule, but didn't
get very far. (and BOY is it crucial to have an index on
_prop(acct_id)!)

--

as for where the problem sat:

    UPDATE _acct SET
        cost    = ppp.cost,
        charge  = ppp.charge
    FROM
        (SELECT
            acct_id,
            SUM(cost)   AS cost,
            SUM(charge) AS charge
        FROM
            _prop
        GROUP BY
            acct_id) ppp
    WHERE
        acct_id = ppp.acct_id;

the WHERE was wrong -- shoulda been

    WHERE
        id          -- _acct.id, in top query
        =
        ppp.acct_id -- subquery from _prop.acct_id
    ;

--
DEBIAN NEWBIE TIP #60 from Vineet Kumar <debian-user@virtual.doorstop.net>
:
Been hoping to find A FEATURE-PACKED MUTT CONFIG FILE? Check
out the ones at Sven Guckes' site:
    http://www.fefe.de/muttfaq/muttrc
There's also some great vimrc ideas there, too.

Also see http://newbieDoc.sourceForge.net/ ...