Thread: selfmade datatype in C and server-crash

selfmade datatype in C and server-crash

From
Markus Schulz
Date:
Hello,
I'm trying to develop a self-made PostgreSQL datatype derived from type
text (as a first step) with PostgreSQL 7.4.7.
To start, I took the original code of textin and textout
($SRC/backend/utils/adt/varlena.c), renamed the functions to etextin
and etextout, and compiled them into a new .so file.
This works fine, and then I created the new type like this:

CREATE OR REPLACE FUNCTION etextin(cstring)
  RETURNS etext AS
'$libdir/new_types.so', 'etextin'
  LANGUAGE 'c' VOLATILE;

CREATE OR REPLACE FUNCTION etextout(etext)
  RETURNS cstring AS
'$libdir/new_types.so', 'etextout'
  LANGUAGE 'c' VOLATILE;

CREATE TYPE etext (
    INPUT = etextin,
    OUTPUT = etextout,
    INTERNALLENGTH = -1,
    ALIGNMENT=int4,
    STORAGE=EXTENDED
    );

This also works.

But when I try to use the type in a table (for instance, a table with
only one etext column), the server crashes after inserting the second
tuple (the first insert works) or on every select.

What have I missed or done wrong? Are there any HOWTOs on this topic?



--
Markus Schulz

Re: selfmade datatype in C and server-crash

From
Tom Lane
Date:
Markus Schulz <msc@antzsystem.de> writes:
> This works fine and then i've created the new Type like:

> CREATE OR REPLACE FUNCTION etextin(cstring)
>   RETURNS etext AS
> '$libdir/new_types.so', 'etextin'
>   LANGUAGE 'c' VOLATILE;

> CREATE OR REPLACE FUNCTION etextout(etext)
>   RETURNS cstring AS
> '$libdir/new_types.so', 'etextout'
>   LANGUAGE 'c' VOLATILE;

You'd likely be well advised to declare these STRICT (hint: is the C
code checking for null input?) ... and unless the datatype has weird
semantics, its I/O functions should be IMMUTABLE.  This doesn't matter
too much for the system's ordinary use of I/O functions, but for
security you want to make sure the functions are properly marked in
case they get called directly.

> But if i'm trying to use the type in a table (for instance table with
> only one etext column) the server crashed after inserting the second
> (first insert works) tuple or on every select.

Getting a stack trace from the crash would be helpful.  But the fact
that it only fails on the second try makes me suspicious that it's
a memory-management issue.  Count thy pallocs.

            regards, tom lane

Re: selfmade datatype in C and server-crash

From
Markus Schulz
Date:
On Wednesday 05 October 2005 05:01, Tom Lane wrote:
> Markus Schulz <msc@antzsystem.de> writes:
> > This works fine and then i've created the new Type like:
> >
> > CREATE OR REPLACE FUNCTION etextin(cstring)
> >   RETURNS etext AS
> > '$libdir/new_types.so', 'etextin'
> >   LANGUAGE 'c' VOLATILE;
> >
> > CREATE OR REPLACE FUNCTION etextout(etext)
> >   RETURNS cstring AS
> > '$libdir/new_types.so', 'etextout'
> >   LANGUAGE 'c' VOLATILE;
>
> You'd likely be well advised to declare these STRICT (hint: is the C
> code checking for null input?) ... and unless the datatype has weird

 No, it wasn't checked. The code is copied from the original text code.

> semantics, its I/O functions should be IMMUTABLE.  This doesn't
> matter too much for the system's ordinary use of I/O functions, but
> for security you want to make sure the functions are properly marked
> in case they get called directly.

OK, I changed them to STRICT (that really makes sense), but it didn't
change anything. Now there is no need for NULL checks, if I understand
correctly?

> > But if i'm trying to use the type in a table (for instance table
> > with only one etext column) the server crashed after inserting the
> > second (first insert works) tuple or on every select.
>
> Getting a stack trace from the crash would be helpful.  But the fact
> that it only fails on the second try makes me suspicious that it's
> a memory-management issue.  Count thy pallocs.

Here's a backtrace.

Program received signal SIGSEGV, Segmentation fault.
0x402eacff in strlen () from /lib/libc.so.6
(gdb) sharedlibrary new_types
Reading symbols from /usr/lib/postgresql/lib/new_types.so...done.
Loaded symbols for /usr/lib/postgresql/lib/new_types.so
(gdb) bt
#0  0x402eacff in strlen () from /lib/libc.so.6
#1  0x40ffeb00 in etextin (fcinfo=0x0) at new_types.c:27
#2  0x081ec020 in fmgr_internal_function ()
#3  0x081ed720 in OidFunctionCall3 ()
#4  0x080d0ec2 in stringTypeDatum ()
#5  0x080d1745 in coerce_type ()
#6  0x080d130b in coerce_to_target_type ()
#7  0x080d351e in updateTargetListEntry ()
#8  0x080b1549 in parse_sub_analyze ()
#9  0x080b0fe6 in parse_sub_analyze ()
#10 0x080b0de9 in parse_sub_analyze ()
#11 0x080b0cd9 in parse_analyze ()
#12 0x0817c7c1 in pg_analyze_and_rewrite ()
#13 0x0817cb88 in pg_plan_queries ()
#14 0x0817f1e0 in PostgresMain ()
#15 0x08158d7b in ClosePostmasterPorts ()
#16 0x08158763 in ClosePostmasterPorts ()
#17 0x08156c68 in PostmasterMain ()
#18 0x081562f9 in PostmasterMain ()
#19 0x08126266 in main ()


My test query was (run twice; it crashed on the second run):
insert into test ( var1) values ('andf');

The table has only one column of type etext.

In my opinion, something must already be going wrong in the first
"insert" statement, and then perhaps the heap becomes corrupt?
If I understand STRICT correctly, there should be no chance of passing
a NULL pointer into strlen?!


Here is the etextin code (cut and pasted from adt/varlena.c):

Datum
etextin(PG_FUNCTION_ARGS)
{
    char       *inputText = PG_GETARG_CSTRING(0);
    text       *result;
    int         len=0;

    /* verify encoding */
    len = strlen(inputText);
    pg_verifymbstr(inputText, len, false);

    result = (text *) palloc(len + VARHDRSZ);
    VARATT_SIZEP(result) = len + VARHDRSZ;

    memcpy(VARDATA(result), inputText, len);

    PG_RETURN_TEXT_P(result);
}


Compiled and linked like this:
gcc -Wall -g -O2 -I. -I/usr/include/postgresql/server -c -o new_types.o new_types.c
gcc -shared new_types.o -lm -o new_types.so



Markus Schulz

Re: selfmade datatype in C and server-crash

From
Markus Schulz
Date:
Now I have debugged the _first_ "insert" statement in my etextin
function, because I got a NULL value for inputText on the second insert.
The etextin function now looks like this:
Datum
etextin(PG_FUNCTION_ARGS)
{
    char       *inputText = PG_GETARG_CSTRING(0);
    text       *result;
    int         len=0;

    if(!inputText)
            return CStringGetDatum("");//only for testing purpose
    /* verify encoding */
    len = strlen(inputText);
    pg_verifymbstr(inputText, len, false);

    result = (text *) palloc(len + VARHDRSZ);
    VARATT_SIZEP(result) = len + VARHDRSZ;

    memcpy(VARDATA(result), inputText, len);

    PG_RETURN_TEXT_P(result);
}


Debug session (SQL: insert into test ( var1) values ('aiksnd');):

(gdb) break new_types.c:26
Breakpoint 1 at 0x40ffe888: file new_types.c, line 26.
(gdb) cont
Continuing.

Breakpoint 1, etextin (fcinfo=0x8332c90) at new_types.c:26
26              if(!inputText)
(gdb) display inputText
1: inputText = 0x8332c90 "@-3\b\020"

This looks weird to me. The content should be "aiksnd", or am I wrong?

I think I have missed some precondition, but I don't know which.


With this additional if(!inputText) condition, inserts don't crash
anymore. But every select on the table crashes now.


Markus Schulz

Re: selfmade datatype in C and server-crash

From
Tom Lane
Date:
Markus Schulz <msc@antzsystem.de> writes:
> here is the etextin code:(cut'n'paste from adt/varlena.c Code)

> Datum
> etextin(PG_FUNCTION_ARGS)
> {
>     char       *inputText = PG_GETARG_CSTRING(0);
>     text       *result;
>     int         len=0;

>     /* verify encoding */
>     len = strlen(inputText);
>     pg_verifymbstr(inputText, len, false);

>     result = (text *) palloc(len + VARHDRSZ);
>     VARATT_SIZEP(result) = len + VARHDRSZ;

>     memcpy(VARDATA(result), inputText, len);

>     PG_RETURN_TEXT_P(result);
> }

That code looks fine as far as it goes.  I think you forgot to add a
PG_FUNCTION_INFO_V1() macro, which means the system is calling this with
the wrong argument layout.

            regards, tom lane
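
To illustrate the "wrong argument layout" point, here is a small toy
model in plain C. It is only a sketch and not PostgreSQL source: the
names CallInfo, Symbol, library, lookup and call_by_name are invented
for this example. The idea it models is that a loader checks whether a
function was exported together with a version-1 marker (which is what
PG_FUNCTION_INFO_V1() provides). If the marker is missing, the caller
falls back to the old convention and passes the arguments directly
instead of passing one pointer to a call frame.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uintptr_t Datum;            /* toy stand-in for a generic value slot */

/* Toy call frame; hypothetical, only loosely inspired by fcinfo. */
typedef struct CallInfo {
    int   nargs;
    Datum arg[4];
} CallInfo;

typedef Datum (*FnV0)(Datum first_arg);  /* old style: argument passed directly */
typedef Datum (*FnV1)(CallInfo *ci);     /* new style: pointer to the call frame */

/* One exported function and whether a version-1 marker came with it. */
typedef struct Symbol {
    const char *name;
    int         has_info_record;
    FnV0        v0;
    FnV1        v1;
} Symbol;

static Datum cstring_len_v0(Datum first_arg)
{
    return (Datum) strlen((const char *) first_arg);
}

static Datum cstring_len_v1(CallInfo *ci)
{
    return (Datum) strlen((const char *) ci->arg[0]);
}

/* Hypothetical "shared library" contents. */
static const Symbol library[] = {
    { "len_old", 0, cstring_len_v0, NULL },
    { "len_new", 1, NULL, cstring_len_v1 },
};

static const Symbol *lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(library) / sizeof(library[0]); i++)
        if (strcmp(library[i].name, name) == 0)
            return &library[i];
    return NULL;
}

/*
 * Calls the named function with one string argument.
 * Returns the convention that was used (0 or 1), or -1 if not found.
 */
static int call_by_name(const char *name, const char *input, Datum *result)
{
    const Symbol *sym = lookup(name);

    if (sym == NULL)
        return -1;
    if (sym->has_info_record) {
        /* Marker present: build a frame and pass a pointer to it. */
        CallInfo ci = { 1, { (Datum) input, 0, 0, 0 } };
        *result = sym->v1(&ci);
        return 1;
    }
    /* No marker: assume the old style and pass the argument itself. */
    *result = sym->v0((Datum) input);
    return 0;
}
```

In this toy each entry stores a correctly typed pointer, so nothing can
go wrong. The situation in the thread corresponds to a function written
in the cstring_len_v1 style but registered without the marker: the
caller would take the last branch and hand over the string pointer,
which the function would then treat as a pointer to its call frame and
read garbage from. Under that reading, the fix for etextin would be to
place PG_FUNCTION_INFO_V1(etextin); before the function definition (and
likewise for etextout), then rebuild the library.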