Re: [HACKERS] help to identify the reason that extension's C functionreturns array get segmentation fault - Mailing list pgsql-hackers

From 钱新林
Subject Re: [HACKERS] help to identify the reason that extension's C functionreturns array get segmentation fault
Date
Msg-id CABzpzpfzme=n70E1TdyA=e74wk67MH=pjnbD+0JoHFX+q=TDnw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] help to identify the reason that extension's C function returns array get segmentation fault  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Thanks for your clues. 

The system I have used to debug the code is x86 64bit based, Ubuntu 1404 and postgres 9.3.13, I have revised the code and it looks like as following:

Datum vquery(PG_FUNCTION_ARGS) {

int array_len = PG_GETARG_INT32(0);
int64 * node_ids;
ArrayType * retarr;
Datum * vals ;

SPI_connect();

//some code to retrieve data from various tables 
// node_ids are allocated and filled up

vals = SPI_palloc(array_len * sizeof(Datum));
memset (vals, 0, array_len * sizeof(Datum));

// fill the vals up
for (i = 0 ; i < array_len ; i++) 
      vals[i] = Int64GetDatum((node_ids[i]));

retarr = construct_array(vals, retcnt, INT8OID, sizeof(int64), true, 'd');

SPI_finish();

PG_RETURN_ARRAYTYPE_P(retarr);
}

It seems to solve the problem,  I have tested the code for a while and no more segmentation faults are reported. 

I have built Postgresql with --enable-debug and --enable-cassert, but use the binary with gdb and get no symbol file loaded. I will take further researches and use it to facilitate debug. Thanks.  

On Tue, Feb 28, 2017 at 12:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
钱新林 <qianxinlin@gmail.com> writes:
> I have written an extension to manage openstreetmap data. There is a C
> function to perform spatial top k query on several  tables and return an
> array of int8 type as result. The code skeleton of this function is as
> follows:

There are a remarkable lot of bugs in this code fragment.  Many of them
would not bite you as long as you are running on 64-bit Intel hardware,
but that doesn't make them not bugs.

> Datum vquery(PG_FUNCTION_ARGS) {

> int array_len = PG_GETARG_INT32(0);
> long * node_ids;

> SPI_connect();

> //some code to retrieve data from various tables
> // node_ids are allocated and filled up

> ArrayType * retarr;
> Datum * vals ;

> vals = palloc0(array_len * sizeof(long));

Datum is not necessarily the same as "long".

> // fill the vals up
> for (i = 0 ; i < array_len ; i++)
>       vals[i] = Int64GetDatum((node_ids[i]));

int64 is not necessarily the same as "long", either.

> retarr = construct_array(vals, retcnt, INT8OID, sizeof(long), true, 'i');

Again, INT8 is not the same size as "long", and it's not necessarily
pass-by-val, and it's *certainly* not integer alignment.

> SPI_finish();

> PG_RETURN_ARRAYTYPE_P(retarr);

But I think what's really biting you, probably, is that construct_array()
made the array in CurrentMemoryContext which at that point was the SPI
execution context; which would be deleted by SPI_finish.  So you're
returning a dangling pointer.  You need to do something to either copy
the array value out to the caller's context, or build it there in the
first place.

BTW, this failure would be a lot less intermittent if you were testing
in a CLOBBER_FREED_MEMORY build.  I would go so far as to say you should
*never* develop or test C code for the Postgres backend without using
the --enable-cassert configure option for your build.  You're simply
tossing away a whole lot of debug support if you don't.

                        regards, tom lane

pgsql-hackers by date:

Previous
From: Haribabu Kommi
Date:
Subject: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Next
From: Michael Paquier
Date:
Subject: Re: [HACKERS] Creation of "Future" commit fest, named 2017-07