Thread: Why adding BEFORE TRIGGER affect client CPU?

Why adding BEFORE TRIGGER affect client CPU?

From
George Woodring
Date:
We are running 9.6.8-10 on our five postgres clusters.
We have ~25 database per cluster.
We have two servers that update a table in each database.

During the update process we were running a pgpsql function that calculated the difference in the new value with the previous.  This function would do a SELECT to get the previous value from the table and do the calculation before the UPDATE.  We also have an AFTER TRIGGER to save the row into a logging table.

We recently changed the process to do a BEFORE TRIGGER to do the calculation between OLD and NEW instead of the separate function with the SELECT.

After doing this, CPU on our two client servers went crazy.  CPU on the database servers look unchanged.  Rolling back this change fixed our client CPU issue.

Anyone know of a reason why this change would affect the client?  I would think that this would only affect the server side.  I could not see any messages printed out anywhere in logs showing any type of error.

Thanks,
George Woodring
iGLASS Networks
www.iglass.net

Re: Why adding BEFORE TRIGGER affect client CPU?

From
Adrian Klaver
Date:
On 1/18/19 10:23 AM, George Woodring wrote:
> We are running 9.6.8-10 on our five postgres clusters.
> We have ~25 database per cluster.
> We have two servers that update a table in each database.
> 
> During the update process we were running a pgpsql function that 
> calculated the difference in the new value with the previous.  This 
> function would do a SELECT to get the previous value from the table and 
> do the calculation before the UPDATE.  We also have an AFTER TRIGGER to 
> save the row into a logging table.
> 
> We recently changed the process to do a BEFORE TRIGGER to do the 
> calculation between OLD and NEW instead of the separate function with 
> the SELECT.

So what was the exact change?

> 
> After doing this, CPU on our two client servers went crazy.  CPU on the 
> database servers look unchanged.  Rolling back this change fixed our 
> client CPU issue.

What is the client server and what is it doing?

> 
> Anyone know of a reason why this change would affect the client?  I 
> would think that this would only affect the server side.  I could not 
> see any messages printed out anywhere in logs showing any type of error.
> 
> Thanks,
> George Woodring
> iGLASS Networks
> www.iglass.net <http://www.iglass.net>


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Why adding BEFORE TRIGGER affect client CPU?

From
Tom Lane
Date:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
> On 1/18/19 10:23 AM, George Woodring wrote:
>> We recently changed the process to do a BEFORE TRIGGER to do the 
>> calculation between OLD and NEW instead of the separate function with 
>> the SELECT.

> So what was the exact change?

>> After doing this, CPU on our two client servers went crazy.  CPU on the 
>> database servers look unchanged.  Rolling back this change fixed our 
>> client CPU issue.

> What is the client server and what is it doing?

Indeed.  There's no direct way that messing with a trigger would have
caused extra CPU on the client side.  I speculate that the trigger
caused the data to look different in a way that your client app wasn't
expecting, causing it to do something funny --- maybe loop trying to
find a matching record, or something like that.  Or maybe removing the
step that did the calculation client-side had side effects you weren't
expecting --- what uses that value client-side, exactly?

            regards, tom lane


Re: Why adding BEFORE TRIGGER affect client CPU?

From
George Woodring
Date:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
> What is the client server and what is it doing?

Our client creates a temp table 
CREATE TEMP TABLE myraw(LIKE raw INCLUDING DEFAULTS)
We load data with a copy
COPY myraw (transferid, pollgrpid, date, data, rrdtypeid, error ) FROM STDIN WITH DELIMITER '|'
We do some checking on the data in the temp table
SELECT rawdate, raw_maint(rawdate) AS checked FROM (SELECT date_trunc('minute', date) AS rawdate FROM myraw GROUP BY rawdate) AS foo;
Then we save the data into the permanent location
SELECT transferid, pollgrpid, process_lastpoll(ROW(myraw.*)::raw) AS processed FROM myraw ORDER BY transferid

Our change was in the process_lastpoll() function.  We went from 
      IF cntr THEN
         oval := calc_last_cntr(ptype, pgid, ds, pdate, data);
      ELSE
         oval := data;
      END IF;

      -- Insert data into table
      -- RAISE DEBUG 'Process: dsnum=%, polltype=%, cntr?=%, pdate=%, data=%, oval=%', dsnum, ptype, cntr, pdate, data, oval;
      UPDATE lastpoll SET polltype=ptype, timeslot=ts, lasttime=pdate, lastval=data, outval=oval, error=err WHERE pollgrpid=pgid AND dsnum=ds;
      IF found THEN
         RETURN true;
      END IF;
      BEGIN
         INSERT INTO lastpoll (timeslot, pollgrpid, dsnum, polltype, lasttime, lastval, outval, error) VALUES (ts, pgid, ds, ptype, pdate, data, oval, err);
         RETURN true;
      EXCEPTION WHEN unique_violation THEN
         -- Don't do anything
      END;
to this
      IF NOT cntr THEN
         oval := data;
      END IF;

      -- Insert data into table
      -- RAISE DEBUG 'Process: dsnum=%, polltype=%, cntr?=%, pdate=%, data=%, oval=%', dsnum, ptype, cntr, pdate, data, oval;
      UPDATE lastpoll SET polltype=ptype, timeslot=ts, lasttime=pdate, lastval=data, outval=oval, error=err WHERE pollgrpid=pgid AND dsnum=ds;
      IF found THEN
         RETURN true;
      END IF;
      BEGIN
         INSERT INTO lastpoll (timeslot, pollgrpid, dsnum, polltype, lasttime, lastval, outval, error) VALUES (ts, pgid, ds, ptype, pdate, data, oval, err);
         RETURN true;
      EXCEPTION WHEN unique_violation THEN
         -- Don't do anything
      END;

The calc_last_cntr() was replaced with the trigger.
from:
CREATE OR REPLACE FUNCTION calc_last_cntr(ptype text, pgid int4, ds int4, pdate timestamp with time zone, data double precision) RETURNS double precision AS $$
   DECLARE 
      mcurr RECORD;
      res double precision;
      dwindow int4;
   BEGIN
      
      SELECT lasttime, lastval INTO mcurr FROM lastpoll WHERE pollgrpid=pgid AND dsnum=ds;
      IF mcurr IS NULL THEN
         -- RAISE DEBUG 'No previous data found for pgid=%', pgid;
         return NULL;
      END IF;

      -- Calculate the counter rate
      -- Check for div by 0
      dwindow := EXTRACT(EPOCH FROM (pdate - mcurr.lasttime));
      IF dwindow = 0 THEN
         RAISE WARNING '(%) Time difference is zero (% - %)', current_database(), pdate, mcurr.lasttime;
         return NULL;
      END IF;

      res := (data - mcurr.lastval)/dwindow;
      IF res < 0 THEN
         -- RAISE DEBUG 'Counter loop for pgid=%', pgid;
         return data/dwindow;
      END IF;

      return res;
   END;
$$ LANGUAGE plpgsql;

to:
CREATE OR REPLACE FUNCTION cntr_lastpoll() RETURNS trigger AS $$
   DECLARE
      dwindow int4;
   BEGIN
      -- Log new value into history if has a date
      IF NEW.outval IS NULL THEN
         -- Calculate counter value for this DS.
         dwindow := EXTRACT(EPOCH FROM (NEW.lasttime - OLD.lasttime));
         IF dwindow = 0 THEN
            RAISE WARNING '(%) Time difference is zero (% - %)', current_database(), NEW.lasttime, OLD.lasttime;
            return NEW;
         END IF;

         NEW.outval := (NEW.lastval - OLD.lastval)/dwindow;
         IF NEW.outval < 0 THEN
            NEW.outval := NEW.lastval/dwindow;
         END IF;

      END IF;
      RETURN NEW;
   END;
$$ LANGUAGE plpgsql;

-- Create Trigger to calculate counter values if needed.
CREATE TRIGGER lastpoll_counter BEFORE UPDATE ON lastpoll
   FOR EACH ROW EXECUTE PROCEDURE cntr_lastpoll();

On Fri, Jan 18, 2019 at 2:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
 what uses that value client-side, exactly?

The outval number is later read looking for values outside of a thresholds.  I did not think about it affecting selects later in the code.  All in all everything worked fine, but since the client box is an AWS EC2 instance, it started eating all of our CPU credits.

 
image.png


iGLASS Networks
www.iglass.net




Attachment

Re: Why adding BEFORE TRIGGER affect client CPU?

From
Adrian Klaver
Date:
On 1/18/19 12:21 PM, George Woodring wrote:
> Adrian Klaver <adrian.klaver@aklaver.com 
> <mailto:adrian.klaver@aklaver.com>> writes:
>  > What is the client server and what is it doing?
> 
> Our client creates a temp table
> CREATE TEMP TABLE myraw(LIKE raw INCLUDING DEFAULTS)
> We load data with a copy
> COPY myraw (transferid, pollgrpid, date, data, rrdtypeid, error ) FROM 
> STDIN WITH DELIMITER '|'
> We do some checking on the data in the temp table
> SELECT rawdate, raw_maint(rawdate) AS checked FROM (SELECT 
> date_trunc('minute', date) AS rawdate FROM myraw GROUP BY rawdate) AS foo;
> Then we save the data into the permanent location
> SELECT transferid, pollgrpid, process_lastpoll(ROW(myraw.*)::raw) AS 
> processed FROM myraw ORDER BY transferid
> 
> Our change was in the process_lastpoll() function.  We went from
>        IF cntr THEN
>           oval := calc_last_cntr(ptype, pgid, ds, pdate, data);
>        ELSE
>           oval := data;
>        END IF;
> 
>        -- Insert data into table
>        -- RAISE DEBUG 'Process: dsnum=%, polltype=%, cntr?=%, pdate=%, 
> data=%, oval=%', dsnum, ptype, cntr, pdate, data, oval;
>        UPDATE lastpoll SET polltype=ptype, timeslot=ts, lasttime=pdate, 
> lastval=data, outval=oval, error=err WHERE pollgrpid=pgid AND dsnum=ds;
>        IF found THEN
>           RETURN true;
>        END IF;
>        BEGIN
>           INSERT INTO lastpoll (timeslot, pollgrpid, dsnum, polltype, 
> lasttime, lastval, outval, error) VALUES (ts, pgid, ds, ptype, pdate, 
> data, oval, err);
>           RETURN true;
>        EXCEPTION WHEN unique_violation THEN
>           -- Don't do anything
>        END;
> to this
>        IF NOT cntr THEN
>           oval := data;
>        END IF;
> 
>        -- Insert data into table
>        -- RAISE DEBUG 'Process: dsnum=%, polltype=%, cntr?=%, pdate=%, 
> data=%, oval=%', dsnum, ptype, cntr, pdate, data, oval;
>        UPDATE lastpoll SET polltype=ptype, timeslot=ts, lasttime=pdate, 
> lastval=data, outval=oval, error=err WHERE pollgrpid=pgid AND dsnum=ds;
>        IF found THEN
>           RETURN true;
>        END IF;
>        BEGIN
>           INSERT INTO lastpoll (timeslot, pollgrpid, dsnum, polltype, 
> lasttime, lastval, outval, error) VALUES (ts, pgid, ds, ptype, pdate, 
> data, oval, err);
>           RETURN true;
>        EXCEPTION WHEN unique_violation THEN
>           -- Don't do anything
>        END;
> 
> The calc_last_cntr() was replaced with the trigger.
> from:
> CREATE OR REPLACE FUNCTION calc_last_cntr(ptype text, pgid int4, ds 
> int4, pdate timestamp with time zone, data double precision) RETURNS 
> double precision AS $$
>     DECLARE
>        mcurr RECORD;
>        res double precision;
>        dwindow int4;
>     BEGIN
>        SELECT lasttime, lastval INTO mcurr FROM lastpoll WHERE 
> pollgrpid=pgid AND dsnum=ds;
>        IF mcurr IS NULL THEN
>           -- RAISE DEBUG 'No previous data found for pgid=%', pgid;
>           return NULL;
>        END IF;
> 
>        -- Calculate the counter rate
>        -- Check for div by 0
>        dwindow := EXTRACT(EPOCH FROM (pdate - mcurr.lasttime));
>        IF dwindow = 0 THEN
>           RAISE WARNING '(%) Time difference is zero (% - %)', 
> current_database(), pdate, mcurr.lasttime;
>           return NULL;
>        END IF;
> 
>        res := (data - mcurr.lastval)/dwindow;
>        IF res < 0 THEN
>           -- RAISE DEBUG 'Counter loop for pgid=%', pgid;
>           return data/dwindow;
>        END IF;
> 
>        return res;
>     END;
> $$ LANGUAGE plpgsql;
> 
> to:
> CREATE OR REPLACE FUNCTION cntr_lastpoll() RETURNS trigger AS $$
>     DECLARE
>        dwindow int4;
>     BEGIN
>        -- Log new value into history if has a date
>        IF NEW.outval IS NULL THEN
>           -- Calculate counter value for this DS.
>           dwindow := EXTRACT(EPOCH FROM (NEW.lasttime - OLD.lasttime));
>           IF dwindow = 0 THEN
>              RAISE WARNING '(%) Time difference is zero (% - %)', 
> current_database(), NEW.lasttime, OLD.lasttime;
>              return NEW;
>           END IF;
> 
>           NEW.outval := (NEW.lastval - OLD.lastval)/dwindow;
>           IF NEW.outval < 0 THEN
>              NEW.outval := NEW.lastval/dwindow;
>           END IF;
> 
>        END IF;
>        RETURN NEW;
>     END;
> $$ LANGUAGE plpgsql;
> 
> -- Create Trigger to calculate counter values if needed.
> CREATE TRIGGER lastpoll_counter BEFORE UPDATE ON lastpoll
>     FOR EACH ROW EXECUTE PROCEDURE cntr_lastpoll();
> 
> On Fri, Jan 18, 2019 at 2:01 PM Tom Lane <tgl@sss.pgh.pa.us 
> <mailto:tgl@sss.pgh.pa.us>> wrote:
> 
>       what uses that value client-side, exactly?
> 

Not claiming to understand all that is going on above.

What I think I see:

1) New  process_lastpoll() has:
UPDATE lastpoll SET polltype=ptype, timeslot=ts, lasttime=pdate,
lastval=data, outval=oval, error=err WHERE pollgrpid=pgid AND dsnum=ds;

2) calc_last_cntr() is now function for trigger:
CREATE TRIGGER lastpoll_counter BEFORE UPDATE ON lastpoll
    FOR EACH ROW EXECUTE PROCEDURE cntr_lastpoll();

Seems to me you are double clutching on the outval value. You UPDATE it 
in 1) then recalculate it again in 2).

As said earlier I do not fully understand what is going on, still seems 
to me process_lastpoll() could go away and its code moved into 
calc_last_cntr().

> 
> The outval number is later read looking for values outside of a 
> thresholds.  I did not think about it affecting selects later in the 
> code.  All in all everything worked fine, but since the client box is an 
> AWS EC2 instance, it started eating all of our CPU credits.
> 
> image.png
> 
> 
> iGLASS Networks
> www.iglass.net <http://www.iglass.net>
> 
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com