Thread: psql connection being reset during function?

psql connection being reset during function?

From
Randall Skelton
Date:
Hi all,

I am running 7.1.2 and I have the unenviable task of cleaning up a
database which has columns:

sgmty  integer,   -- YEAR
sgmtmo integer,   -- MONTH
sgmtd  integer,   -- DAY
sgmth  integer,   -- HOUR
sgmtm  integer,   -- MINUTE
sgmts  float,     -- SECOND
+ 190 other floats per row

The data is basically an atmospheric model dump at 1 frame per second. Why
exactly time is stored like this remains a mystery... The values for
seconds resemble a 16 bit float output directly inserted (via libpq) into
the database:
 0.999992423414 1.999986999391 3.000021808504 ... 4.00001638448158.00000869548259.00000327145959.999997847437...

My first thought was simply to add a new column called gmt_timestamp
and write a simple function in plpgsql to update each record.

My first-cut function:

CREATE FUNCTION mydatetime() RETURNS text AS'  DECLARE      -- defines a record and text variable      rec    RECORD;
  date_time TEXT;
 
  BEGIN      -- loop over all entries in path      FOR rec IN SELECT * FROM path LOOP        date_time:= ''UPDATE path
SETsgmt_timestamp = '' ||                   quote_literal(                   rec.sgmty   || ''-'' ||
rec.sgmtmo || ''-'' ||                   rec.sgmtd   || '' '' ||                   rec.sgmth   || '':'' ||
    rec.sgmtm   || '':'' ||                   cast(rec.sgmts as integer)                   )                   || ''
WHERE''||                   '' sgmty      = '' || quote_literal(rec.sgmty) ||                   '' AND sgmtmo = '' ||
quote_literal(rec.sgmtmo)||                   '' AND sgmtd  = '' || quote_literal(rec.sgmtd) ||                   ''
ANDsgmth  = '' || quote_literal(rec.sgmth) ||                   '' AND sgmtm  = '' || quote_literal(rec.sgmtm) ||
           '' AND sgmts  = '' || quote_literal(rec.sgmts)                   || '';'';        EXECUTE date_time;     END
LOOP;
      --return date_time;      return ''done'';
  END;  ' LANGUAGE 'plpgsql';

Surely this can be improved upon, but it leads to problem #1.  In the case
of sgmts = 59.999997847437 my explicit cast of, 'cast(rec.sgmts as
integer)' creates a problem in that I make a timestamp with '60' in the
seconds column.  A time stamp of this sort is not handled by the postgres
timestamp type and the function falls over.

My revised function explicitly propogates 60 seconds to be a minute, 60
minutes to be an hour, 24 hours to be a day and so on.

CREATE FUNCTION mydatetime() RETURNS text AS'  DECLARE      -- defines a record and text variable      rec    RECORD;
  year   INTEGER;      month  INTEGER;      day    INTEGER;      hour   INTEGER;      minute INTEGER;      second
INTEGER;     addone INTEGER;      date_time     TEXT;      date_time_two TEXT;
 
  BEGIN      -- loop over all entries in atlas3_path      FOR rec IN SELECT * FROM atlas3_path LOOP        /* THIS IS A
DIRTYHACK!!! Should never have excluded a time stamp;         * casting allows the seconds to be 60 which is causes
   * problems for the timestamp postgres type.         * Emperically checked to ensure month does not flip for this
   * dataset (i.e. 1996-01-31 23:59:60 does not occur)         * Original code has now been modified to         *
includea proper timestamp calculation.         */
 
        day := 0;        hour := 0;        minute := 0;        second := 0;
        IF cast(rec.sgmts as integer) > 59 THEN           second := cast(rec.sgmts as integer) - 60;           minute
:=1;        ELSE           second := cast(rec.sgmts as integer);        END IF;
 
        IF minute + rec.sgmtm > 59 THEN           minute := minute + rec.sgmtm - 60;           hour := 1;        ELSE
       minute := minute + rec.sgmtm;        END IF;
 
        IF hour + rec.sgmth > 23 THEN           hour := hour + rec.sgmth - 24;           day := 1;        ELSE
hour := hour + rec.sgmth;        END IF;
 
        day :=  day + rec.sgmtd;
        date_time:= ''UPDATE atlas3_path SET sgmt_timestamp = '' ||                   quote_literal(
rec.sgmty  || ''-'' ||                   rec.sgmtmo  || ''-'' ||                   day         || '' '' ||
    hour        || '':'' ||                   minute      || '':'' ||                   second                   )
            || '' WHERE'' ||                   '' sgmty   = '' || quote_literal(rec.sgmty) ||                   '' AND
sgmtmo = '' || quote_literal(rec.sgmtmo) ||                   '' AND sgmtd    = '' || quote_literal(rec.sgmtd) ||
           '' AND sgmth   = '' || quote_literal(rec.sgmth) ||                   '' AND sgmtm = '' ||
quote_literal(rec.sgmtm)||                   '' AND sgmts = '' || quote_literal(rec.sgmts)                   || '';'';
     EXECUTE date_time;     END LOOP;
 
      --return date_time;      return ''done'';
  END;  ' LANGUAGE 'plpgsql';

Again this is rather verbose and unelegant.  Nevertheless it leads to
problem #2: While this works perfectly for a small table of 10 entries, it
crashes the database connection when I try to update 311537 rows using
psql and 'select mydatetime()'.

pqReadData() -- backend closed the channel unexpectedly.       This probably means the backend terminated abnormally
  before or while processing the request.
 
The connection to the server was lost. Attempting reset: Failed.

My logs read:

...
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  MoveOfflineLogs: remove 000000000000006C
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG:  XLogWrite: new log file created - consider increasing WAL_FILES
Server process (pid 10219) exited with status 9 at Tue Aug 14 19:34:41 2001
Terminating any active server processes...
Server processes were terminated at Tue Aug 14 19:34:42 2001
Reinitializing shared memory and semaphores
The Data Base System is starting up
DEBUG:  database system was interrupted at 2001-08-14 19:33:51 BST
DEBUG:  CheckPoint record at (0, 2081443972)
DEBUG:  Redo record at (0, 2080561900); Undo record at (0, 1828733240);
Shutdown
FALSE
DEBUG:  NextTransactionId: 5627; NextOid: 15659730
DEBUG:  database system was not properly shut down; automatic recovery in progress...
DEBUG:  redo starts at (0, 2080561900)
DEBUG:  open(logfile 0 seg 126) failed: No such file or directory
DEBUG:  redo done at (0, 2113927744)
DEBUG:  database system is in production state

Any and all thoughts are greatly appreciated!

Cheers,
Randall




Re: psql connection being reset during function?

From
Tom Lane
Date:
Randall Skelton <rhskelto@atm.ox.ac.uk> writes:
> problem #2: While this works perfectly for a small table of 10 entries, it
> crashes the database connection when I try to update 311537 rows

In present releases, plpgsql tends to leak a lot of memory intra-call;
you're probably just running out of memory.  (I think we have cleaned up
the leak problems in the CVS-tip code, but that won't help you unless
you're brave enough to run a snapshot version.)  Consider restructuring
your approach so that the plpgsql function just does a single conversion
and is invoked separately at each row:

UPDATE atlas3_path SET sgmt_timestamp = mydatetime(sgmty,sgmtmo,sgmtd,...);

where mydatetime takes five integers and a float and returns a timestamp.
This way, any memory leaked during plpgsql function execution is
reclaimed when the function exits, so it doesn't build up across rows.

This will probably be much faster than your other approach anyway, since
it doesn't require re-finding each row with a fresh UPDATE.  A function
call is a whole lot cheaper than parsing, planning, and executing a new
query.
        regards, tom lane


Re: psql connection being reset during function?

From
Randall Skelton
Date:
Excellent!  The function you suggest is certainly a better approach... 5
minutes of processing and it was done. Thanks Tom.

Cheers,
Randall

On Tue, 14 Aug 2001, Tom Lane wrote:

> Randall Skelton <rhskelto@atm.ox.ac.uk> writes:
> > problem #2: While this works perfectly for a small table of 10 entries, it
> > crashes the database connection when I try to update 311537 rows
>
> In present releases, plpgsql tends to leak a lot of memory intra-call;
> you're probably just running out of memory.  (I think we have cleaned up
> the leak problems in the CVS-tip code, but that won't help you unless
> you're brave enough to run a snapshot version.)  Consider restructuring
> your approach so that the plpgsql function just does a single conversion
> and is invoked separately at each row:
>
> UPDATE atlas3_path SET sgmt_timestamp = mydatetime(sgmty,sgmtmo,sgmtd,...);
>
> where mydatetime takes five integers and a float and returns a timestamp.
> This way, any memory leaked during plpgsql function execution is
> reclaimed when the function exits, so it doesn't build up across rows.
>
> This will probably be much faster than your other approach anyway, since
> it doesn't require re-finding each row with a fresh UPDATE.  A function
> call is a whole lot cheaper than parsing, planning, and executing a new
> query.
>
>             regards, tom lane
>