Pl/pgsql functions causing crashes in 8.2.2 - Mailing list pgsql-hackers

From Jonathan Gray
Subject Pl/pgsql functions causing crashes in 8.2.2
Date
Msg-id 003701c749c2$e0dafec0$a290fc40$@com
Whole thread Raw
Responses Re: Pl/pgsql functions causing crashes in 8.2.2
Re: Pl/pgsql functions causing crashes in 8.2.2
List pgsql-hackers
Following an upgrade to 8.2.2, many of my plpgsql functions started to cause
server process crashes.

I make use of a custom data-type "uniqueidentifier", available here:

http://gborg.postgresql.org/project/uniqueidentifier
ftp://gborg.postgresql.org/pub/uniqueidentifier/stable/uniqueidentifier-0.2.
tar.gz

This type has given me the same kind of process crash once before, but that
was related to NULL values in a foreign-key referenced field (unresolved to
this day, but behavior is allowed for all builtin types).

I see the crash in plpgsql functions that return an aggregate type which
contain a uniqueidentifier (including triggers which have uniqueidentifiers
in the NEW).  Here is a test case I was able to create:

<< TEST SETUP >>

CREATE SCHEMA test;

CREATE TYPE test.guid_plus AS (
  id      public.uniqueidentifier,
  num     integer
);

CREATE TABLE test.guid_table (
  id      public.uniqueidentifier,
  num     integer
);

INSERT INTO test.guid_table (id,num) VALUES (newid(),1);
INSERT INTO test.guid_table (id,num) VALUES (newid(),2);
INSERT INTO test.guid_table (id,num) VALUES (newid(),3);
INSERT INTO test.guid_table (id,num) VALUES (newid(),4);

CREATE OR REPLACE FUNCTION test.break_guid (idlower integer, idupper
integer) RETURNS SETOF test.guid_plus AS
$$
DECLARE
    x RECORD;
    gplus_ret test.guid_plus;
BEGIN
    FOR x IN SELECT id,num FROM test.guid_table WHERE id > idlower AND id <
idupper LOOP
        gplus_ret :=
(x.id::uniqueidentifier,x.num::integer)::test.guid_plus;
        -- I usually do the following: (but tried above with same result)
        --   gplus_ret := (x.id,x.num);
        RETURN NEXT gplus_ret;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;

<< CAUSE THE CRASH >>

SELECT * FROM test.break_guid(0,5);


server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.


<< FROM THE LOGS, SET AT DEBUG5 >>

2007-02-05 22:31:07 PST [31363]: [45-1] DEBUG: StartTransactionCommand
2007-02-05 22:31:07 PST [31363]: [46-1] DEBUG: StartTransaction
2007-02-05 22:31:07 PST [31363]: [47-1] DEBUG: name: unnamed; blockState:
DEFAULT; state: INPROGR, xid/subid/cid: 7198/1/0, nestlvl: 1, children: <>
2007-02-05 22:31:07 PST [31278]: [775-1] DEBUG: reaping dead processes
2007-02-05 22:31:07 PST [31278]: [776-1] DEBUG: server process (PID 31363)
was terminated BY signal 11
2007-02-05 22:31:07 PST [31278]: [777-1] LOG: server process (PID 31363) was
terminated BY signal 11
2007-02-05 22:31:07 PST [31278]: [778-1] LOG: terminating any other active
server processes
2007-02-05 22:31:07 PST [31278]: [779-1] DEBUG: sending SIGQUIT TO process
31361
2007-02-05 22:31:07 PST [31278]: [780-1] DEBUG: reaping dead processes
2007-02-05 22:31:07 PST [31278]: [781-1] LOG: ALL server processes
terminated; reinitializing
2007-02-05 22:31:07 PST [31278]: [782-1] DEBUG: shmem_exit(0)
2007-02-05 22:31:07 PST [31278]: [783-1] DEBUG: invoking
IpcMemoryCreate(size=537141248)
2007-02-05 22:31:07 PST [31364]: [1-1] LOG: DATABASE system was interrupted
at 2007-02-05 22:30:35 PST
2007-02-05 22:31:07 PST [31364]: [2-1] LOG: checkpoint record IS at
0/286D97FC
2007-02-05 22:31:07 PST [31364]: [3-1] LOG: redo record IS at 0/286D97FC;
undo record IS at 0/0; shutdown TRUE
2007-02-05 22:31:07 PST [31364]: [4-1] LOG: next transaction ID: 0/7192;
next OID: 155654
2007-02-05 22:31:07 PST [31364]: [5-1] LOG: next MultiXactId: 1; next
MultiXactOffset: 0
2007-02-05 22:31:07 PST [31364]: [6-1] LOG: DATABASE system was NOT properly
shut down; automatic recovery IN progress
2007-02-05 22:31:07 PST [31365]: [1-1] FATAL: the DATABASE system IS
starting up
2007-02-05 22:31:07 PST [31365]: [2-1] DEBUG: proc_exit(1)
2007-02-05 22:31:07 PST [31365]: [3-1] DEBUG: shmem_exit(1)
2007-02-05 22:31:07 PST [31365]: [4-1] DEBUG: exit(1)
2007-02-05 22:31:07 PST [31278]: [784-1] DEBUG: forked new backend,
pid=31365 socket=8
2007-02-05 22:31:07 PST [31278]: [785-1] DEBUG: reaping dead processes
2007-02-05 22:31:07 PST [31278]: [786-1] DEBUG: server process (PID 31365)
exited WITH exit code 1
2007-02-05 22:31:07 PST [31364]: [7-1] LOG: record WITH zero length at
0/286D9844
2007-02-05 22:31:07 PST [31364]: [8-1] LOG: redo IS NOT required
2007-02-05 22:31:07 PST [31364]: [9-1] LOG: DATABASE system IS ready
2007-02-05 22:31:07 PST [31364]: [10-1] DEBUG: transaction ID wrap LIMIT IS
2147484171, limited BY DATABASE "postgres"
2007-02-05 22:31:07 PST [31364]: [11-1] DEBUG: proc_exit(0)
2007-02-05 22:31:07 PST [31364]: [12-1] DEBUG: shmem_exit(0)
2007-02-05 22:31:07 PST [31364]: [13-1] DEBUG: exit(0)
2007-02-05 22:31:07 PST [31278]: [787-1] DEBUG: reaping dead processes
2007-02-05 22:31:07 PST [31368]: [1-1] DEBUG: proc_exit(0)
2007-02-05 22:31:07 PST [31368]: [2-1] DEBUG: shmem_exit(0)
2007-02-05 22:31:07 PST [31368]: [3-1] DEBUG: exit(0)
2007-02-05 22:31:07 PST [31278]: [788-1] DEBUG: reaping dead processes

The data in test.guid_table looks fine and I am able to SELECT it without
any problems.  All my problems seem related to plpgsql functions and the
uniqueidentifier type, but I'm not able to diagnose any further (and could
be wrong).

Any help is greatly appreciated.  This is my first post to the mailing lists
but I have been a daily lurker for quite some time now :)


Jonathan Gray
jon5pg@streamy.com



pgsql-hackers by date:

Previous
From: Markus Schiltknecht
Date:
Subject: Re: Proposal: Commit timestamp
Next
From: Magnus Hagander
Date:
Subject: Re: libpq docs about PQfreemem