Thread: Q: text palloc() size vs. SET_VARSIZE()

Q: text palloc() size vs. SET_VARSIZE()

From
Albrecht Dreß
Date:
Hi all,

I have a (hopefully not too dumb) question regarding the size allocation of a text return value in a C user-defined
function.

Basically, the function is somewhat similar to the copytext() example on
<https://www.postgresql.org/docs/10/static/xfunc-c.html>. However, the function shall perform some “decoding” of the
input text, so the result is either as long as the input, or shorter.

In order to avoid time-consuming double-scanning of the input or re-allocation of memory, the idea is to allocate the
result to the maximum possible size, which may or may not be filled completely.  Copied from the example in the manual:

---8<--------------------------------------------------------------------------
Datum
decode_text(PG_FUNCTION_ARGS)
{
     text     *t = PG_GETARG_TEXT_PP(0);
     size_t    out_len = 0U;

     // allocate to the max. possible output size
     text     *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);

     // copy data to VARDATA(new_t), and count bytes in out_len

     // set output size which is out_len <= VARSIZE_ANY_EXHDR(t)
     SET_VARSIZE(new_t, out_len + VARHDRSZ);
     PG_RETURN_TEXT_P(new_t);
}
---8<--------------------------------------------------------------------------

From the docs, it is not clear to me whether the value assigned using SET_VARSIZE() must be the *exact* size of the
newly allocated return value, or just the length of the text plus the header size.  In other words, would the code
above create a memory leak if out_len < VARSIZE_ANY_EXHDR(t)?

If this approach is wrong, would it be possible in the example above to just re-size new_t to the correct size by
calling repalloc()?

Thanks in advance,
Albrecht.

Re: Q: text palloc() size vs. SET_VARSIZE()

From
Tom Lane
Date:
Albrecht Dreß <albrecht.dress@arcor.de> writes:
>      text     *t = PG_GETARG_TEXT_PP(0);
>      size_t    out_len = 0U;

>      // allocate to the max. possible output size
>      text     *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);

>      // copy data to VARDATA(new_t), and count bytes in out_len

>      // set output size which is out_len <= VARSIZE_ANY_EXHDR(t)
>      SET_VARSIZE(new_t, out_len + VARHDRSZ);
>      PG_RETURN_TEXT_P(new_t);

That code looks fine to me.

> From the docs, for me it is not clear whether the value assigned using SET_VARSIZE() must be the *exact* size of the
> newly allocated return value, or just the length of the text plus the header size.  IOW would the code above create a
> memory leak if out_len < VARSIZE_ANY_EXHDR(t)?

No memory leak.  Your returned value would have some wasted memory at
the end of its palloc chunk, but function result values don't normally
live long enough that that's worth worrying about.
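
A minimal standalone sketch of this point, not taken from the thread: the length stored in the header describes how
many bytes are in use, while the allocator tracks the chunk size on its own.  Everything here is a stand-in so the
example compiles without the PostgreSQL headers: model_text imitates a text value with a 4-byte header, malloc()
plays the role of palloc(), and the backslash "decoding" in decode_model() is a hypothetical rule chosen only because
its output can never be longer than its input.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the 4-byte header size (VARHDRSZ in PostgreSQL). */
#define HDRSZ sizeof(uint32_t)

/* Simplified model of a text value: total length, then the payload. */
typedef struct
{
    uint32_t total_len;     /* header + payload, what SET_VARSIZE() records */
    char     data[];        /* payload bytes, what VARDATA() points to */
} model_text;

/*
 * Hypothetical decoder: drops each backslash and keeps the byte after it.
 * A trailing lone backslash is copied unchanged.
 */
model_text *
decode_model(const char *in, size_t in_len)
{
    /* Allocate for the worst case: output as long as the input. */
    model_text *out = malloc(HDRSZ + in_len);
    size_t      out_len = 0;

    assert(out != NULL);
    for (size_t i = 0; i < in_len; i++)
    {
        if (in[i] == '\\' && i + 1 < in_len)
            i++;                        /* skip the escape character */
        out->data[out_len++] = in[i];
    }

    /* Record the bytes actually used, not the bytes allocated. */
    out->total_len = (uint32_t) (HDRSZ + out_len);
    return out;
}
```

Calling decode_model() on the six input bytes a\b\\c produces a value whose total_len is HDRSZ + 4, although the chunk
was allocated with room for six payload bytes.  A later free() still releases the whole chunk, because the allocator
never consults the length field.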

You could repalloc the result down to minimum size if you felt like it,
but I think it'd largely be a waste of cycles.  There are lots of similar
examples in the core backend, and few if any bother with a repalloc.
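
For comparison, a sketch of what the optional shrink step would look like, again as a standalone model rather than
backend code.  Here realloc() stands in for repalloc(), and the helper names make_oversized() and shrink_to_fit() are
hypothetical.  One difference worth noting: realloc() can return NULL, whereas repalloc() reports an error instead of
returning NULL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the 4-byte header size (VARHDRSZ in PostgreSQL). */
#define HDRSZ sizeof(uint32_t)

/* Simplified model of a text value: total length, then the payload. */
typedef struct
{
    uint32_t total_len;     /* header + payload bytes in use */
    char     data[];        /* payload bytes */
} model_text;

/* Allocate room for `capacity` payload bytes but fill only `used`. */
model_text *
make_oversized(const char *src, size_t used, size_t capacity)
{
    model_text *t;

    assert(used <= capacity);
    t = malloc(HDRSZ + capacity);
    assert(t != NULL);
    memcpy(t->data, src, used);
    t->total_len = (uint32_t) (HDRSZ + used);
    return t;
}

/* Shrink the chunk to exactly the size recorded in the header. */
model_text *
shrink_to_fit(model_text *t)
{
    model_text *s = realloc(t, t->total_len);

    /* If shrinking fails, the original chunk is still valid. */
    return s != NULL ? s : t;
}
```

The shrink changes neither total_len nor the payload; it only hands the unused tail back to the allocator, at the
price of one more allocator call per result.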

            regards, tom lane


Re: Q: text palloc() size vs. SET_VARSIZE()

From
Albrecht Dreß
Date:
On 04.03.18 20:52, Tom Lane wrote:
> > From the docs, for me it is not clear whether the value assigned using SET_VARSIZE() must be the *exact* size of
> > the newly allocated return value, or just the length of the text plus the header size.  IOW would the code above
> > create a memory leak if out_len < VARSIZE_ANY_EXHDR(t)?
>
> No memory leak.  Your returned value would have some wasted memory at the end of its palloc chunk, but function
> result values don't normally live long enough that that's worth worrying about.

Thanks a lot for the clarification!  I.e. palloc()/pfree() basically behave like malloc()/free() in this regard…

In my application, the wasted space will actually be just a few bytes, if any.  So this is definitely the best
solution.

> You could repalloc the result down to minimum size if you felt like it, but I think it'd largely be a waste of
> cycles.

Avoiding exactly this overhead is my intention!

Thanks again,
Albrecht.