Re: Weird problems with C extension and bytea as input type - Mailing list pgsql-general

From Adrian Schreyer
Subject Re: Weird problems with C extension and bytea as input type
Msg-id AANLkTingmfRK=0rcz0ewNfNAq+qkv59LhV52tS2UcC3E@mail.gmail.com
In response to Re: Weird problems with C extension and bytea as input type  (David W Noon <dwnoon@ntlworld.com>)
Responses Re: Weird problems with C extension and bytea as input type  (dennis jenkins <dennis.jenkins.75@gmail.com>)
Re: Weird problems with C extension and bytea as input type  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Tue, Mar 22, 2011 at 22:21, David W Noon <dwnoon@ntlworld.com> wrote:
> On Tue, 22 Mar 2011 16:14:47 -0500, Merlin Moncure wrote about Re:
> [GENERAL] Weird problems with C extension and bytea as input type:
>
> [snip]
>>>> On Tue, Mar 22, 2011 at 8:22 AM, Adrian Schreyer <ams214@cam.ac.uk>
>>>> wrote:
> [snip]
>>>>> bytea *b = PG_GETARG_BYTEA_P(0);
>>>>> char *ism;
>>>>>
>>>>> ism = function(b);
>>>>>
>>>>> PG_RETURN_CSTRING(ism);
>
> What is the prototype for function()?  If it returns a char * then you
> will likely have either scope problems, reentrancy problems or memory
> leaks. If you are going to buy the C++ religion then you usually need to
> buy it wholesale: do all of your string processing as std::string
> objects and only return to char * when you revert to C.

You are right, it returns a char *.

The prototype:

char *function(bytea *b);

The actual C++ function looks roughly like this:

extern "C"
char *function(bytea *b)
{
   string ism;
   [...]
   return ism.c_str();
}

The PostgreSQL wrapper in C looks like this:

PG_FUNCTION_INFO_V1(bin_to_string);
Datum bin_to_string(PG_FUNCTION_ARGS)
{
   bytea *b = PG_GETARG_BYTEA_P(0);

   char *ism = function(b);

   PG_RETURN_CSTRING(ism);
}

I have another function in C++ that parses the binary string (file)
into an object that is then processed further. This works for all
functions returning boolean or numeric values; only the string methods
produce these odd results. So, as you said, the way in which strings
are passed between C++ and C in my code must be horribly wrong. What
would be the correct way?
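
For what it is worth, here is a minimal sketch of one possible fix, under
stated assumptions. The pointer from c_str() stops being valid once the
local std::string is destroyed, so the bytes have to be copied into memory
that outlives the call. In a real extension that memory would presumably
come from palloc(); this standalone sketch passes the allocator in as a
parameter and uses malloc as a stand-in so that it compiles without the
PostgreSQL headers. The names copy_to_cstring, build_ism and malloc_alloc
are hypothetical and are not part of my actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Allocator signature. In a PostgreSQL extension this role would be
// played by palloc(); that substitution is an assumption of this sketch.
using alloc_fn = void *(*)(std::size_t);

// Stand-in allocator so the sketch builds without PostgreSQL headers.
void *malloc_alloc(std::size_t n)
{
    return std::malloc(n);
}

// Copy a std::string into a freshly allocated, NUL-terminated buffer
// that stays valid after the std::string itself has been destroyed.
char *copy_to_cstring(const std::string &s, alloc_fn alloc)
{
    assert(alloc != nullptr);
    char *out = static_cast<char *>(alloc(s.size() + 1));
    if (out == nullptr)
        return nullptr;
    std::memcpy(out, s.data(), s.size());
    out[s.size()] = '\0';
    return out;
}

// Hypothetical stand-in for the real processing function.
char *build_ism(const char *data, std::size_t len, alloc_fn alloc)
{
    std::string ism(data, len);   // local object, destroyed on return
    ism += "-processed";          // placeholder for the real work
    return copy_to_cstring(ism, alloc);   // the copy survives the return
}
```

The caller owns the returned buffer. With malloc_alloc it must be released
with free(); with palloc() the memory context would normally take care of
it.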

> As a rough example:
>
>  bytea *b = PG_GETARG_BYTEA_P(0);
>  std::string ism;
>
>  ism = function(std::string(VARDATA(b), VARSIZE(b)-VARHDRSZ));
>
>  PG_RETURN_CSTRING(ism.c_str());
>
> Note that this returns an ASCIIZ string, which is not necessarily the
> same as the C++ string.  You would be better off creating a
> PostgreSQL text object and then return that.
>
>>Well, the C++ string constructor is proper in the sense that it makes
>>a copy of the source data.  However, it's a little weird that you are
>>passing bytea like this... bytea can contain NUL bytes and C++ string
>>initialization stops at any 0 byte.
>
> Not so.  If the constructor also specifies a length then the data
> pointer's area is not assumed to be NUL-terminated.
>
>>Maybe you should be encoding the data to text (say, to hex) first?
>
> Better to use the supplied length in the varlena descriptor.
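
To convince myself of the point about the length argument, a small
standalone sketch (no PostgreSQL headers involved) that contrasts the two
std::string constructors. The helper names from_buffer and
length_without_size are made up for illustration; the raw buffer plays
the part of VARDATA(b) and its size the part of VARSIZE(b)-VARHDRSZ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Build a std::string from a raw buffer plus an explicit length.
// Embedded zero bytes are copied like any other byte.
std::string from_buffer(const char *data, std::size_t len)
{
    assert(data != nullptr);
    return std::string(data, len);
}

// Length of the std::string built from the pointer alone. This
// constructor treats the data as a C string and stops at the first
// zero byte.
std::size_t length_without_size(const char *data)
{
    assert(data != nullptr);
    return std::string(data).size();
}
```

With a five-byte buffer holding 'a', 'b', 0, 'c', 'd', the first helper
yields a string of size 5 and the second reports a length of 2, so
passing the varlena length does look like the safer route for bytea.
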
> --
> Regards,
>
> Dave  [RLU #314465]
> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
> dwnoon@ntlworld.com (David W Noon)
> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
>
