Thread: Initializing Datums for use with SPI_execute_plan

Initializing Datums for use with SPI_execute_plan

From
"Jack Orenstein"
Date:
I am trying to write a postgresql extension in C, which uses
SPI_prepare and SPI_execute_plan. What I want to do is to create a
prepared INSERT statement, bind some int, bigint, varchar and
bytea values, and then execute the INSERT.

I'm stuck on how to bind the values. The problem is that the values
being bound are not present as Datums. I don't get them as function
parameters, or from a table. I have data encoded, in a proprietary
format in one bytea column of a table. My function decodes the bytea
yielding the values that I need to bind to the INSERT statement.

int, bigint: From looking at postgres.h, I realize that Datum is an
unsigned long. I'm guessing that I should just be able to assign Datums
carrying ints or bigints, e.g. using Int32GetDatum to cast the int to
a Datum. Is that correct?

varchar: I have a zero-terminated string that I need to turn into a
Datum. Is CStringGetDatum the right thing to use?

bytea: I have an unsigned char* (not zero-terminated). Can I use
PointerGetDatum?

For the varchar and bytea cases, I assume that I simply pfree the
palloc'ed data as usual, after the SPI_execute_plan call.

Jack Orenstein

Re: Initializing Datums for use with SPI_execute_plan

From
Martijn van Oosterhout
Date:
On Tue, Sep 12, 2006 at 09:57:33PM -0400, Jack Orenstein wrote:
> I'm stuck on how to bind the values. The problem is that the values
> being bound are not present as Datums. I don't get them as function
> parameters, or from a table. I have data encoded, in a proprietary
> format in one bytea column of a table. My function decodes the bytea
> yielding the values that I need to bind to the INSERT statement.
>
> int, bigint: From looking at postgres.h, I realize that Datum is an
> unsigned long. I'm guessing that I should just be able to assign Datums
> carrying ints or bigints, e.g. using Int32GetDatum to cast the int to
> a Datum. Is that correct?

Yes, need to use the *GetDatum functions.

> varchar: I have a zero-terminated string that I need to turn into a
> Datum. Is CStringGetDatum the right thing to use?

CStringGetDatum will get you something of the right format for cstring,
if you want varchar, you need to have an object of "VarChar*" first.

> bytea: I have an unsigned char* (not zero-terminated). Can I use
> PointerGetDatum?

Similarly you should use "bytea*" here. Then you can use
PointerGetDatum.

> For the varchar and bytea cases, I assume that I simply pfree the
> palloc'ed data as usual, after the SPI_execute_plan call.

Pretty much.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

Re: Initializing Datums for use with SPI_execute_plan

From
"Jack Orenstein"
Date:
On 9/13/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Tue, Sep 12, 2006 at 09:57:33PM -0400, Jack Orenstein wrote:
> > ...
> > int, bigint: From looking at postgres.h, I realize that Datum is an
> > unsigned long. I'm guessing that I should just be able to assign Datums
> > carrying ints or bigints, e.g. using Int32GetDatum to cast the int to
> > a Datum. Is that correct?
>
> Yes, need to use the *GetDatum functions.
>
> > varchar: I have a zero-terminated string that I need to turn into a
> > Datum. Is CStringGetDatum the right thing to use?
>
> CStringGetDatum will get you something of the right format for cstring,
> if you want varchar, you need to have an object of "VarChar*" first.
>
> > bytea: I have an unsigned char* (not zero-terminated). Can I use
> > PointerGetDatum?
>
> Similarly you should use "bytea*" here. Then you can use
> PointerGetDatum.

I don't think I explained myself clearly. I have a C string (char*,
terminating zero) and a byte array (char*, possibly containing zeros,
and I know the length). I want to obtain Datums wrapping these values
that will be used to bind varchar and bytea columns, and I'm trying to
find out how to generate these Datums.

This doesn't seem right:

    void* plan = SPI_prepare(...);
    char* string = ...;
    char* byte_array = ...;
    int byte_array_length = ...;
    Datum* values = (Datum*) palloc(sizeof(Datum) * 2);
    values[0] = CStringGetDatum(string);
    values[1] = PointerGetDatum(byte_array);
    SPI_execute_plan(plan, values, ...);

because the code does not communicate byte_array_length to
SPI_execute_plan.

I suspect I'm missing something important, because I don't know how to
get a bytea* from byte_array and byte_array_length.

Jack Orenstein

Re: Initializing Datums for use with SPI_execute_plan

From
Andrew - Supernews
Date:
On 2006-09-14, "Jack Orenstein" <jack.orenstein@gmail.com> wrote:
> I don't think I explained myself clearly. I have a C string (char*,
> terminating zero) and a byte array (char*, possibly containing zeros,
> and I know the length). I want to obtain Datums wrapping these values
> that will be used to bind varchar and bytea columns, and I'm trying to
> find out how to generate these Datums.

One way:

    bytea *my_bytea = (bytea *) palloc(byte_array_len + VARHDRSZ);
    memcpy(VARDATA(my_bytea), byte_array, byte_array_len);
    VARATT_SIZEP(my_bytea) = byte_array_len + VARHDRSZ;

    values[0] = DirectFunctionCall1(textin,CStringGetDatum(yourstring));
    values[1] = PointerGetDatum(my_bytea);

    SPI_execute_plan...

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

Re: Initializing Datums for use with SPI_execute_plan

From
"Jack Orenstein"
Date:
On 9/14/06, Andrew - Supernews <andrew+nonews@supernews.com> wrote:
> On 2006-09-14, "Jack Orenstein" <jack.orenstein@gmail.com> wrote:
> > I don't think I explained myself clearly. I have a C string (char*,
> > terminating zero) and a byte array (char*, possibly containing zeros,
> > and I know the length). I want to obtain Datums wrapping these values
> > that will be used to bind varchar and bytea columns, and I'm trying to
> > find out how to generate these Datums.
>
> One way:

Thank you, I hope you can help with some clarifications.

>
>     bytea *my_bytea = (bytea *) palloc(byte_array_len + VARHDRSZ);
>     memcpy(VARDATA(my_bytea), byte_array, byte_array_len);
>     VARATT_SIZEP(my_bytea) = byte_array_len + VARHDRSZ;
>
>     values[0] = DirectFunctionCall1(textin,CStringGetDatum(yourstring));

What is textin -- the name of a function?

>     values[1] = PointerGetDatum(my_bytea);

One other thing is still unclear: What about signed/unsigned long? If
I have a C long, then how do I get a Datum that I can assign to
values[i]?

Jack Orenstein

Re: Initializing Datums for use with SPI_execute_plan

From
Andrew - Supernews
Date:
On 2006-09-18, "Jack Orenstein" <jack.orenstein@gmail.com> wrote:
> On 9/14/06, Andrew - Supernews <andrew+nonews@supernews.com> wrote:
>> One way:
>
> Thank you, I hope you can help with some clarifications.
>
>>
>>     bytea *my_bytea = (bytea *) palloc(byte_array_len + VARHDRSZ);
>>     memcpy(VARDATA(my_bytea), byte_array, byte_array_len);
>>     VARATT_SIZEP(my_bytea) = byte_array_len + VARHDRSZ;
>>
>>     values[0] = DirectFunctionCall1(textin,CStringGetDatum(yourstring));
>
> What is textin -- the name of a function?

textin is the name of the input function for the "text" type; it takes a
C string as input and returns a Datum. Most (but not all) type input
functions can be called this way. (The format of the C string that you pass
to a type input function is exactly the same as you'd send from a client,
without quoting/escaping.)

(The ones that can't be called by DirectFunctionCall1 are those that need
access to type info, such as array or composite value input, or user-defined
types. Those are callable using other fmgr interfaces.)

>>     values[1] = PointerGetDatum(my_bytea);
>
> One other thing is still unclear: What about signed/unsigned long? If
> I have a C long, then how do I get a Datum that I can assign to
> values[i]?

What matters isn't so much the C type that you have, as the SQL type that
you are supposedly providing. Pg doesn't have an "unsigned" type unless
you created your own, so the question is, are you passing an int4 or an
int8 to SQL?

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

Re: Initializing Datums for use with SPI_execute_plan

From
"Jack Orenstein"
Date:
On 9/18/06, Jack Orenstein <jack.orenstein@gmail.com> wrote:
> ---------- Forwarded message ----------
> From: Andrew - Supernews <andrew@supernews.net>
...
>  Jack> I have an int8 that I need as a Datum for use with
>  Jack> SPI_execute_plan.
>
> Int64GetDatum(your_variable)

which then has to be pfree'd, correct?

Jack

Re: Initializing Datums for use with SPI_execute_plan

From
Martijn van Oosterhout
Date:
On Tue, Sep 19, 2006 at 01:27:56PM -0400, Jack Orenstein wrote:
> On 9/18/06, Jack Orenstein <jack.orenstein@gmail.com> wrote:
> >---------- Forwarded message ----------
> >From: Andrew - Supernews <andrew@supernews.net>
> ...
> > Jack> I have an int8 that I need as a Datum for use with
> > Jack> SPI_execute_plan.
> >
> >Int64GetDatum(your_variable)
>
> which then has to be pfree'd, correct?

Possibly, it probably depends on the architechture. The memory is being
allocated in a per-call context IIRC so it'll be freed at the end of
the function anyway. I wouldn't worry about it.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

Re: Initializing Datums for use with SPI_execute_plan

From
"Jack Orenstein"
Date:
On 9/19/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Tue, Sep 19, 2006 at 01:27:56PM -0400, Jack Orenstein wrote:
> > On 9/18/06, Jack Orenstein <jack.orenstein@gmail.com> wrote:
> > >---------- Forwarded message ----------
> > >From: Andrew - Supernews <andrew@supernews.net>
> > ...
> > > Jack> I have an int8 that I need as a Datum for use with
> > > Jack> SPI_execute_plan.
> > >
> > >Int64GetDatum(your_variable)
> >
> > which then has to be pfree'd, correct?
>
> Possibly, it probably depends on the architechture. The memory is being
> allocated in a per-call context IIRC so it'll be freed at the end of
> the function anyway. I wouldn't worry about it.

Can you provide some guidance (or point to some documentation) on how
to manage memory? Is the idea that I should (must?) not pfree
palloc'ed memory from Int64GetDatum, but I should free anything I
allocate myself using palloc? Or not even that?

The C extension I'm writing, which uses the SPI, will be called
thousands or millions of times as part of a data conversion -- I do
have to worry about memory leaks. Once the conversion completes, I
won't use the function any longer. But I'd rather not leak memory and
have to do something drastic to reclaim it, such as bouncing
postgresql.

Jack

Re: Initializing Datums for use with SPI_execute_plan

From
Tom Lane
Date:
"Jack Orenstein" <jack.orenstein@gmail.com> writes:
> The C extension I'm writing, which uses the SPI, will be called
> thousands or millions of times as part of a data conversion -- I do
> have to worry about memory leaks.

Not if it's called in a short-lived context, which should ordinarily be
the case.

            regards, tom lane

Re: Initializing Datums for use with SPI_execute_plan

From
Martijn van Oosterhout
Date:
On Tue, Sep 19, 2006 at 04:00:43PM -0400, Jack Orenstein wrote:
> Can you provide some guidance (or point to some documentation) on how
> to manage memory? Is the idea that I should (must?) not pfree
> palloc'ed memory from Int64GetDatum, but I should free anything I
> allocate myself using palloc? Or not even that?

If you want the gory details, check out the mmgr README file.

http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/mmgr/README?rev=1.9

However, except in fairly special circumstances (eg. index comparison
functions) you don't ever need to bother with pfree(). The memory will
all be freed at the right time.

The documentation on C functions and set-returning functions also has
info about when the context is reset.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

Re: Initializing Datums for use with SPI_execute_plan

From
"Jack Orenstein"
Date:
On 9/20/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Tue, Sep 19, 2006 at 04:00:43PM -0400, Jack Orenstein wrote:
> > Can you provide some guidance (or point to some documentation) on how
> > to manage memory? Is the idea that I should (must?) not pfree
> > palloc'ed memory from Int64GetDatum, but I should free anything I
> > allocate myself using palloc? Or not even that?
>
> If you want the gory details, check out the mmgr README file.
>
> http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/mmgr/README?rev=1.9

Thank you, that is a useful document. So let me make sure I understand:

- My C function is invoked in a MessageContext (which you referred to
in an earlier message as a short-lived context).

- In such a context, I do not need to pfree anything, including memory
that I palloc myself, and plans returned by SPI_prepare invocations,
(assuming that I don't hold onto these things past the function
invocation, e.g. in a static).

Is that correct?

Jack

Re: Initializing Datums for use with SPI_execute_plan

From
Martijn van Oosterhout
Date:
On Wed, Sep 20, 2006 at 11:02:46AM -0400, Jack Orenstein wrote:
> >http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/mmgr/README?rev=1.9
>
> Thank you, that is a useful document. So let me make sure I understand:
>
> - My C function is invoked in a MessageContext (which you referred to
> in an earlier message as a short-lived context).
>
> - In such a context, I do not need to pfree anything, including memory
> that I palloc myself, and plans returned by SPI_prepare invocations,
> (assuming that I don't hold onto these things past the function
> invocation, e.g. in a static).

In general, any memory you allocate in a function will have been free'd
by the next time you're called. If you want memory to survive (like in
a static) you need to place it in a longer lived context.

The SPI reference has some details about how long various bits of
memory survive.

It's not a garbage collector, everything in a context goes when the
context is reset or deleted. If you still have a pointer there, it'll
be invalid. If you enable debugging stuff it'll clear freed memory so
the error becomes more obvious.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment