Some notes on whole-tuple function parameters - Mailing list pgsql-hackers

From Tom Lane
Subject Some notes on whole-tuple function parameters
Date
Msg-id 21586.987699802@sss.pgh.pa.us
List pgsql-hackers
I figured out the bug that Alex Pilosov was reporting about constructs
like

    create table customers (...)
    create function cust_name(customers) ...
    select cust_name(a) from customers a, addresses b...

The problem is that whole-tuple parameters to functions are represented
as pointers to TupleTableSlot objects.  This works as long as the
function call is performed right away, i.e., a simple table scan with no
join.  But in a join, the TupleTableSlot pointer gets put into an output
tuple of a scan node, and by the time it is extracted and used again,
the scan node may have decided to recycle its per-tuple temporary
storage, which is where the TupleTableSlot was living.
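
The failure mode can be illustrated with a toy model in plain C. This is
only a sketch, not backend code: Arena, ToySlot, scan_next and
saved_pointer_survives are hypothetical names standing in for the per-tuple
workspace, a TupleTableSlot, and a scan node fetching its next row.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a per-tuple memory context: a bump allocator that is
 * reset (recycled) every time the scan advances to the next row. */
typedef struct Arena {
    char   buf[256];
    size_t used;
} Arena;

static void *arena_alloc(Arena *a, size_t n)
{
    assert(a->used + n <= sizeof(a->buf));   /* toy: fixed capacity */
    void *p = a->buf + a->used;
    a->used += n;
    return p;
}

static void arena_reset(Arena *a)
{
    a->used = 0;
}

/* Toy stand-in for a TupleTableSlot: holds one customer name. */
typedef struct ToySlot {
    char name[16];
} ToySlot;

/* Hypothetical scan step: recycle the per-tuple storage, then build the
 * slot for the next row inside it. */
static ToySlot *scan_next(Arena *per_tuple, const char *name)
{
    arena_reset(per_tuple);
    ToySlot *slot = arena_alloc(per_tuple, sizeof(ToySlot));
    strncpy(slot->name, name, sizeof(slot->name) - 1);
    slot->name[sizeof(slot->name) - 1] = '\0';
    return slot;
}

/* Returns 1 if a pointer saved from the first row still shows that row
 * after the scan has moved on, 0 if the storage was reused. */
int saved_pointer_survives(void)
{
    Arena    per_tuple = { {0}, 0 };
    ToySlot *saved = scan_next(&per_tuple, "alice"); /* pointer is stashed in a join output tuple */

    scan_next(&per_tuple, "bob");                    /* scan recycles its per-tuple storage */
    return strcmp(saved->name, "alice") == 0;        /* saved now sees "bob" */
}
```

In this model saved_pointer_survives() returns 0: the saved pointer is
still a legal address, but it now describes the wrong row.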

I have fixed this for the moment by allocating the TupleTableSlot
objects and their subsidiary tuples in TransactionCommandContext, rather
than the per-tuple workspace context.  In other words, a query involving
whole-tuple parameters will leak memory until end of query.  This is not
any worse than the behavior of prior versions (which also leaked memory
for such queries) but it's pretty annoying now that we've fixed most of
the other query-duration leaks.
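
The trade-off of the interim fix can be sketched the same way. Again this
is a toy model under stated assumptions, not the actual change: Arena here
plays the role of a query-lifetime context such as
TransactionCommandContext, and bytes_held_at_end_of_query is a hypothetical
helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a query-lifetime memory context: a bump allocator that
 * is never reset while the query runs. */
typedef struct Arena {
    char   buf[1024];
    size_t used;
} Arena;

static void *arena_alloc(Arena *a, size_t n)
{
    assert(a->used + n <= sizeof(a->buf));   /* toy: fixed capacity */
    void *p = a->buf + a->used;
    a->used += n;
    return p;
}

/* Toy stand-in for a TupleTableSlot. */
typedef struct ToySlot {
    char name[16];
} ToySlot;

/* Runs a toy "query" over nrows rows (at most 64 with this buffer size),
 * allocating one slot per row in the query-lifetime arena.  Returns the
 * number of bytes still held when the query ends, or 0 if the slot saved
 * from the first row did not survive. */
size_t bytes_held_at_end_of_query(int nrows)
{
    Arena    per_query = { {0}, 0 };
    ToySlot *first = NULL;

    for (int i = 0; i < nrows; i++) {
        ToySlot *slot = arena_alloc(&per_query, sizeof(ToySlot));
        snprintf(slot->name, sizeof(slot->name), "row%d", i);
        if (i == 0)
            first = slot;                    /* pointer saved across rows */
    }
    if (first == NULL || strcmp(first->name, "row0") != 0)
        return 0;
    return per_query.used;                   /* grows linearly with nrows */
}
```

With 16-byte slots, bytes_held_at_end_of_query(10) returns 160: the saved
pointer stays valid, at the price of memory that grows with the number of
rows until the query finishes.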

I believe a good fix for this issue would involve allowing whole-tuple
values to become ordinary varlena datums, so that they can be stuffed
into larger tuples (and even stored on disk as columns of tables).

However, the TupleTableSlot representation will not do for this, since
it involves several pointers.  If we design a new representation then we
will break existing C-coded user functions that accept tuples according
to the documented way of doing that (cf. src/tutorial/funcs.c).  Possibly
this is no big problem, since the only thing that such functions are
very likely to do with tuples is pass them to the GetAttributeByName or
GetAttributeByNum functions, and we can fix those to interpret the
pointer correctly.  But it's clearly not a change to make in a patch
release.
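
To make the idea of a pointer-free representation concrete, here is a
minimal sketch. The layout (a length word, an attribute count, then
fixed-width int32 attributes) and the names flat_tuple_form,
flat_tuple_len and flat_get_attr_by_num are all assumptions made up for
illustration; this is not a proposed on-disk format and not the real
GetAttributeByNum.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat layout, with no internal pointers:
 *   [int32 total_len][int32 natts][int32 att1] ... [int32 attN]  */

/* Builds a flat tuple from natts int32 attributes; caller frees it. */
void *flat_tuple_form(const int32_t *atts, int32_t natts)
{
    if (natts < 0)
        return NULL;
    int32_t len = (int32_t) (sizeof(int32_t) * (2 + (size_t) natts));
    char   *datum = malloc((size_t) len);

    if (datum == NULL)
        return NULL;
    memcpy(datum, &len, sizeof(len));
    memcpy(datum + sizeof(int32_t), &natts, sizeof(natts));
    if (natts > 0)
        memcpy(datum + 2 * sizeof(int32_t), atts,
               sizeof(int32_t) * (size_t) natts);
    return datum;
}

/* Total size in bytes, read from the leading length word. */
int32_t flat_tuple_len(const void *datum)
{
    int32_t len;

    memcpy(&len, datum, sizeof(len));
    return len;
}

/* Fetches attribute attnum (1-based).  Returns 1 on success, 0 if
 * attnum is out of range. */
int flat_get_attr_by_num(const void *datum, int32_t attnum, int32_t *out)
{
    int32_t natts;

    memcpy(&natts, (const char *) datum + sizeof(int32_t), sizeof(natts));
    if (attnum < 1 || attnum > natts)
        return 0;
    memcpy(out, (const char *) datum + sizeof(int32_t) * (size_t) (1 + attnum),
           sizeof(*out));
    return 1;
}

/* Copies a flat tuple byte-for-byte, frees the original, and reads
 * attribute 2 from the copy.  Returns that value, or -1 on failure. */
int32_t copy_survives_original(void)
{
    int32_t atts[] = { 7, 42 };
    void   *orig = flat_tuple_form(atts, 2);

    if (orig == NULL)
        return -1;
    int32_t len = flat_tuple_len(orig);
    void   *copy = malloc((size_t) len);

    if (copy == NULL) {
        free(orig);
        return -1;
    }
    memcpy(copy, orig, (size_t) len);        /* plain byte copy suffices */
    free(orig);                              /* original storage goes away */

    int32_t value = 0;
    int     ok = flat_get_attr_by_num(copy, 2, &value);

    free(copy);
    return ok ? value : -1;
}
```

Because the value is one contiguous chunk, a byte copy into longer-lived
storage (or into a larger tuple) is enough, and callers that only go
through an accessor never see the layout. copy_survives_original()
returns 42 here.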

Another problem is that we'd need to de-TOAST any out-of-line toasted
value in such a tuple before we'd dare store it on disk.  However, it'd
be annoying to do that if the whole-tuple value were not destined to
end up on disk, but only to be passed to a function that might or might
not ever touch the toasted column.  I'm not sure how this could be
handled efficiently.  Thoughts anyone?
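
The cost being weighed can be shown with a small counting model. It only
illustrates the tension and does not propose an answer; ToyAttr,
fetch_count, flatten_all and the two fetches_* helpers are hypothetical,
and a "fetch" stands in for pulling an out-of-line value back in.

```c
#include <assert.h>
#include <stddef.h>

/* Toy attribute: either inline, or "out of line" and costly to fetch. */
typedef struct ToyAttr {
    int external;   /* nonzero: value lives out of line */
    int value;      /* the value a fetch would produce */
} ToyAttr;

static int fetch_count = 0;   /* number of simulated out-of-line fetches */

/* Reads an attribute, counting a fetch if it is stored out of line. */
static int attr_value(const ToyAttr *a)
{
    if (a->external)
        fetch_count++;
    return a->value;
}

/* Eager path: pull every out-of-line attribute inline, as would be
 * needed before the whole-tuple value could safely be stored. */
void flatten_all(ToyAttr *atts, size_t natts)
{
    for (size_t i = 0; i < natts; i++) {
        if (atts[i].external) {
            atts[i].value = attr_value(&atts[i]);
            atts[i].external = 0;
        }
    }
}

/* A function that only looks at the first (inline) attribute,
 * given the tuple as-is. */
int fetches_when_passed_lazily(void)
{
    ToyAttr atts[3] = { {0, 1}, {1, 2}, {1, 3} };

    fetch_count = 0;
    (void) attr_value(&atts[0]);
    return fetch_count;
}

/* The same function call, but with the tuple flattened up front. */
int fetches_when_flattened_first(void)
{
    ToyAttr atts[3] = { {0, 1}, {1, 2}, {1, 3} };

    fetch_count = 0;
    flatten_all(atts, 3);
    (void) attr_value(&atts[0]);
    return fetch_count;
}
```

In this model the lazy path performs 0 fetches and the flatten-first path
performs 2, even though the function never touches the two out-of-line
columns.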

Anyway, a better fix is clearly a TODO item for some future release,
not something we can do for 7.1.1.
        regards, tom lane

