Thread: Q: text palloc() size vs. SET_VARSIZE()
Hi all,

I have a (hopefully not too dumb) question regarding the size allocation of a text return value in a C user-defined function.

Basically, the function is somewhat similar to the copytext() example on <https://www.postgresql.org/docs/10/static/xfunc-c.html>. However, the function shall perform some “decoding” of the input text, so the result is either as long as the input, or shorter. In order to avoid time-consuming double-scanning of the input or re-allocation of memory, the idea is to allocate the result to the maximum possible size, which may or may not be filled completely. Copied from the example in the manual:

---8<--------------------------------------------------------------------------
Datum
decode_text(PG_FUNCTION_ARGS)
{
    text *t = PG_GETARG_TEXT_PP(0);
    size_t out_len = 0U;

    // allocate to the max. possible output size
    text *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);

    // copy data to VARDATA(new_t), and count bytes in out_len

    // set output size which is out_len <= VARSIZE_ANY_EXHDR(t)
    SET_VARSIZE(new_t, out_len + VARHDRSZ);
    PG_RETURN_TEXT_P(new_t);
}
---8<--------------------------------------------------------------------------

From the docs, for me it is not clear whether the value assigned using SET_VARSIZE() must be the *exact* size of the newly allocated return value, or just the length of the text plus the header size. IOW, would the code above create a memory leak if out_len < VARSIZE_ANY_EXHDR(t)?

If this approach is wrong, would it be possible in the example above to just re-size new_t to the correct size by calling repalloc()?

Thanks in advance,
Albrecht.

Albrecht Dreß <albrecht.dress@arcor.de> writes:
>     text *t = PG_GETARG_TEXT_PP(0);
>     size_t out_len = 0U;
>     // allocate to the max. possible output size
>     text *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
>     // copy data to VARDATA(new_t), and count bytes in out_len
>     // set output size which is out_len <= VARSIZE_ANY_EXHDR(t)
>     SET_VARSIZE(new_t, out_len + VARHDRSZ);
>     PG_RETURN_TEXT_P(new_t);

That code looks fine to me.

> From the docs, for me it is not clear whether the value assigned using SET_VARSIZE() must be the *exact* size of the newly allocated return value, or just the length of the text plus the header size. IOW would the code above create a memory leak if out_len < VARSIZE_ANY_EXHDR(t)?

No memory leak. Your returned value would have some wasted memory at the end of its palloc chunk, but function result values don't normally live long enough that that's worth worrying about.

You could repalloc the result down to minimum size if you felt like it, but I think it'd largely be a waste of cycles. There are lots of similar examples in the core backend, and few if any bother with a repalloc.

regards, tom lane
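
Editorial sketch: the quoted function leaves the step "copy data to VARDATA(new_t), and count bytes in out_len" open. The thread never says what the "decoding" actually is, so the following plain C routine is only an illustration under an assumed scheme: it collapses "%XX" hex escapes into single bytes, which guarantees that the output is never longer than the input. The names decode_bytes() and hex_val() are hypothetical and do not come from the thread or from PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
int
hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Assumed decoding scheme (not from the thread): "%XX" becomes one byte,
 * everything else is copied unchanged.  dst must have room for in_len
 * bytes, i.e. the maximum possible output size.  Returns the number of
 * bytes actually written, which plays the role of out_len.
 */
size_t
decode_bytes(const char *src, size_t in_len, char *dst)
{
    size_t out_len = 0;
    size_t i = 0;

    while (i < in_len)
    {
        if (src[i] == '%' && i + 2 < in_len)
        {
            int hi = hex_val(src[i + 1]);
            int lo = hex_val(src[i + 2]);

            if (hi >= 0 && lo >= 0)
            {
                dst[out_len++] = (char) ((hi << 4) | lo);
                i += 3;
                continue;
            }
        }
        /* Not a complete escape: copy the byte as-is. */
        dst[out_len++] = src[i++];
    }

    /* Single pass, and the result never exceeds the input length. */
    assert(out_len <= in_len);
    return out_len;
}
```

Inside the function from the original post, such a routine would presumably be called as out_len = decode_bytes(VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t), VARDATA(new_t)); right before the SET_VARSIZE() call. VARDATA_ANY() is used for the input because PG_GETARG_TEXT_PP() may hand back a value with a short header.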

On 04.03.18 20:52, Tom Lane wrote:
> > From the docs, for me it is not clear whether the value assigned using SET_VARSIZE() must be the *exact* size of the newly allocated return value, or just the length of the text plus the header size. IOW would the code above create a memory leak if out_len < VARSIZE_ANY_EXHDR(t)?
>
> No memory leak. Your returned value would have some wasted memory at the end of its palloc chunk, but function result values don't normally live long enough that that's worth worrying about.

Thanks a lot for the clarification! I.e. palloc()/pfree() basically behave like malloc()/free() in this regard…

In my application, the wasted space will actually be just a few bytes, if any. So this is definitely the best solution.

> You could repalloc the result down to minimum size if you felt like it, but I think it'd largely be a waste of cycles.

Avoiding exactly this overhead is my intention!

Thanks again,
Albrecht.
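
Editorial sketch: the malloc()/free() analogy from the last message can be modelled in plain C. The struct below is a deliberately simplified stand-in for a length-prefixed value and is not PostgreSQL's actual varlena layout; all model_* names are hypothetical. It only shows that the length stored in the header describes the used part of the value, while the allocator keeps track of the real chunk size on its own, so releasing an over-allocated value does not leak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified model only: a length header followed by payload bytes. */
typedef struct
{
    uint32_t total_len;     /* header size + used payload bytes */
    char     data[];        /* payload; may be over-allocated */
} model_text;

#define MODEL_HDRSZ offsetof(model_text, data)

/* Allocate room for the maximum possible payload (like the palloc call). */
model_text *
model_alloc(size_t max_payload)
{
    return malloc(MODEL_HDRSZ + max_payload);
}

/* Record how much payload is really used (the role of SET_VARSIZE). */
void
model_set_size(model_text *v, size_t used_payload)
{
    assert(v != NULL);
    v->total_len = (uint32_t) (MODEL_HDRSZ + used_payload);
}

size_t
model_payload_len(const model_text *v)
{
    return v->total_len - MODEL_HDRSZ;
}

/*
 * Optional shrink to the minimum size, comparable in spirit to the
 * repalloc() mentioned in the thread.  Keeps the old block if realloc fails.
 */
model_text *
model_shrink(model_text *v)
{
    model_text *p = realloc(v, v->total_len);

    return p ? p : v;
}
```

A caller would allocate with model_alloc(), fill data, call model_set_size() with the bytes actually written, and later free() the pointer. Skipping model_shrink() only leaves a few unused bytes at the end of the block until it is freed, which matches the reasoning given in the reply for not bothering with repalloc().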