Thread: backend crash on CREATE OR REPLACE of a C-function on Linux

backend crash on CREATE OR REPLACE of a C-function on Linux

From: Stefan Kaltenbrunner
Date:

Hi all!

While hacking on some C-level functions I noticed that every time I
replaced the .so file and used CREATE OR REPLACE FUNCTION the backend
immediately crashed.

To test that it was not caused by something my function does (or one of
the libraries it links in) I created the following test case based on the
example in the docs:

------
#include "postgres.h"
#include <string.h>
#include "fmgr.h"


#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif


PG_FUNCTION_INFO_V1(add_one);

Datum
add_one(PG_FUNCTION_ARGS)
{
    int32   arg = PG_GETARG_INT32(0);

    PG_RETURN_INT32(arg + 1);
}
------

compiled using:

gcc -I/usr/local/pgsql83/include/server/ -fpic -c test.c
gcc -shared -o test.so test.o

afterwards simply copy the .so into place using:

cp test.so /usr/local/pgsql83/lib

entered psql and executed:


postgres=# CREATE OR REPLACE FUNCTION add_one(integer) RETURNS integer
AS '/usr/local/pgsql83/lib/test.so','add_one' LANGUAGE C STRICT;
CREATE FUNCTION

in a second shell session, simply copy the very same binary into place
again:

cp test.so /usr/local/pgsql83/lib

and in the original psql session the next call of:

postgres=# CREATE OR REPLACE FUNCTION add_one(integer) RETURNS integer
AS '/usr/local/pgsql83/lib/test.so','add_one' LANGUAGE C STRICT;
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

will crash the backend.

I can reproduce this on Debian Etch/i386 (against 8.3), Debian
Lenny/AMD64 (against 8.1) and Debian Lenny/ARMv5tel. Various people on
IRC have failed to reproduce it on other platforms, though.

A backtrace of a crashed backend looks like:


Program received signal SIGSEGV, Segmentation fault.
0xb7fb079a in _dl_rtld_di_serinfo () from /lib/ld-linux.so.2
(gdb) bt
#0  0xb7fb079a in _dl_rtld_di_serinfo () from /lib/ld-linux.so.2
#1  0xb7fb0b07 in _dl_rtld_di_serinfo () from /lib/ld-linux.so.2
#2  0xb7f1a98d in __libc_dlclose () from /lib/tls/i686/cmov/libc.so.6
#3  0xb7f1aaca in _dl_sym () from /lib/tls/i686/cmov/libc.so.6
#4  0xb7f6eee8 in dlsym () from /lib/tls/i686/cmov/libdl.so.2
#5  0xb7fb444f in _dl_rtld_di_serinfo () from /lib/ld-linux.so.2
#6  0xb7f6f42d in dlerror () from /lib/tls/i686/cmov/libdl.so.2
#7  0xb7f6ee7b in dlsym () from /lib/tls/i686/cmov/libdl.so.2
#8  0x082e1cbd in load_external_function (filename=0x84bf8e8
"/usr/local/pgsql83/lib/test.so", funcname=0x84bfb44 "add_one",
     signalNotFound=1 '\001', filehandle=0xb7f70ff4) at dfmgr.c:117
#9  0x08101310 in fmgr_c_validator (fcinfo=0xbffc89f8) at pg_proc.c:509
#10 0x082e4b8e in OidFunctionCall1 (functionId=2247, arg1=2326529) at
fmgr.c:1532
#11 0x08101bcd in ProcedureCreate (procedureName=0x8485d00 "add_one",
procNamespace=2200, replace=1 '\001', returnsSet=0 '\0',
     returnType=23, languageObjectId=13, languageValidator=2247,
prosrc=0x8485f00 "add_one",
     probin=0x8485ed4 "/usr/local/pgsql83/lib/test.so", isAgg=0 '\0',
security_definer=0 '\0', isStrict=1 '\001', volatility=118 'v',
     parameterTypes=0x84bf6ac, allParameterTypes=0, parameterModes=0,
parameterNames=0, proconfig=0, procost=1, prorows=0)
     at pg_proc.c:413
#12 0x0814c014 in CreateFunction (stmt=0x8486068) at functioncmds.c:785
#13 0x08234e3a in PortalRunUtility (portal=0x84b6a0c,
utilityStmt=0x8486068, isTopLevel=1 '\001', dest=0x84860c4,


The crash happens inside the dynamic loader (ld-linux.so.2), which could
hint towards this not being our bug, but nevertheless I thought I would at
least report it.



Stefan

Re: backend crash on CREATE OR REPLACE of a C-function on Linux

From: Andrew Chernow
Date:

Stefan Kaltenbrunner wrote:
> Hi all!
>
> While hacking on some C-level functions I noticed that every time I
> replaced the .so file and used CREATE OR REPLACE FUNCTION the backend
> immediately crashed.
>
> To test that it was not caused by something my function does (or one of
> the libraries it links in) I created the following test case based on the
> example in the docs:
>

I think I've seen this before and reported it.  Try removing your .so file and
then copying it, rather than overwriting it.  I think dlopen has something
funky going on with inodes; the new copy may need to get a new inode.  If it is
what I think it is, the problem is with libc, not postgres.
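
A rough sketch of that workaround, in case it helps.  It rests on the
assumption that a plain cp onto an existing file rewrites it in place and
keeps the same inode, while a copy made under a new name and then renamed
over the old one gets a different inode.  Note that this uses
copy-then-rename rather than a literal rm followed by cp; the helper names
inode_of and replace_so are made up for illustration and are not part of
PostgreSQL or libc.

```shell
# Hypothetical helper: print the inode number of a file (via POSIX `ls -i`).
inode_of() {
    ls -i -- "$1" | awk '{ print $1 }'
}

# Hypothetical helper: put a fresh copy of SRC at DEST under a new inode.
# The copy is written to a temporary name next to DEST and then renamed
# over it, so the old file is never truncated or rewritten in place.
replace_so() {
    src=$1
    dest=$2
    tmp="$dest.new.$$"    # assumed temp name, same directory as DEST
    cp -- "$src" "$tmp" && mv -f -- "$tmp" "$dest"
}
```

With these defined, something like
replace_so test.so /usr/local/pgsql83/lib/test.so would stand in for the
plain cp, and inode_of on the installed file before and after should show
whether the inode actually changed.  Whether that avoids the crash on the
affected platforms is untested here.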

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/