Re: Lifecycle management - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: Lifecycle management
Msg-id 20051022205028.GH16589@svana.org
In response to Re: Lifecycle management  (Thomas Hallgren <thomas@tada.se>)
Responses Re: Lifecycle management
List pgsql-hackers
On Sat, Oct 22, 2005 at 10:09:12PM +0200, Thomas Hallgren wrote:
> I guess some of my questions originate in lack of knowledge about the
> rules you mention. I haven't been able to find documentation that
> explains them thoroughly and I haven't been able to fully deduce it from
> looking at the backend code (partly due to my own laziness perhaps).
> Another reason is that I'm trying to marry two ways of handling object
> life cycle, the Java style using a garbage collector and the backend
> style, stacking MemoryContext's. I want the marriage to be somewhat
> generic and resilient to change.

Well, the rules are mostly in the comments of the executor, but
src/backend/executor/README has some info.

> Let's assume that one Java function executes a query through SPI. The
> query in itself calls another Java function that returns SET OF <complex
> type>. Each tuple returned from this query could potentially be used 'as
> is' in the caller, i.e. the inner Java function could use the same
> wrapper instance as the caller Java function if I had full control over
> the life cycle of the HeapTuple's that are passed on. At present, I copy
> those tuples and use different wrappers.

You might have to check the SPI documentation to be sure, but the
resultset you receive will be valid until you leave that SPI invocation
(the call to SPI_finish). The documentation for SPI_exec says as much.

The SPI interface will collect all the rows from the called function
and return them as a block in a single memory context.

Conversely, if a Java function returns a tuple, the memory it is
returned in only needs to stay valid until the next call to your
function, at which point you can overwrite it. If the caller still
wanted the old tuple, it would have copied it already.

There is a bit of documentation somewhere that states that nodes should
be coded like:

Node
{
    ResetContext();
    allocate tuples in context
    return
}

Ah yes, in src/backend/utils/mmgr/README, under "Transient contexts
during execution".
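
To make the reset-then-allocate pattern concrete, here is a minimal
sketch. It is a toy model only: ToyContext, toy_alloc, ToyNode and
friends are hypothetical names for a tiny bump allocator, not the
backend's MemoryContext API. It shows a node that resets its per-call
context on every call, so each returned row is valid only until the
next call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_ALIGN 16    /* assumed alignment for this sketch */

/* Toy stand-in for a memory context: a fixed-size bump allocator.
 * Hypothetical; this is NOT the PostgreSQL MemoryContext API. */
typedef struct ToyContext
{
    char   *base;       /* start of the backing buffer */
    size_t  size;       /* capacity in bytes */
    size_t  used;       /* bytes handed out since the last reset */
} ToyContext;

static ToyContext *toy_context_create(size_t size)
{
    ToyContext *cxt = malloc(sizeof(ToyContext));

    if (cxt == NULL)
        return NULL;
    cxt->base = malloc(size);
    if (cxt->base == NULL)
    {
        free(cxt);
        return NULL;
    }
    cxt->size = size;
    cxt->used = 0;
    return cxt;
}

/* Forget everything allocated so far; old pointers become stale. */
static void toy_context_reset(ToyContext *cxt)
{
    cxt->used = 0;
}

/* Hand out the next n bytes (rounded up), or NULL if the buffer is full. */
static void *toy_alloc(ToyContext *cxt, size_t n)
{
    size_t  aligned = (n + TOY_ALIGN - 1) & ~((size_t) TOY_ALIGN - 1);
    void   *p;

    if (aligned > cxt->size - cxt->used)
        return NULL;
    p = cxt->base + cxt->used;
    cxt->used += aligned;
    return p;
}

static void toy_context_destroy(ToyContext *cxt)
{
    free(cxt->base);
    free(cxt);
}

/* A "node" that produces one row per call. */
typedef struct ToyNode
{
    ToyContext *per_call;   /* reset at the start of every call */
    int         next_value;
} ToyNode;

static const char *toy_node_next(ToyNode *node)
{
    char *row;

    toy_context_reset(node->per_call);      /* ResetContext(); */
    row = toy_alloc(node->per_call, 32);    /* allocate in context */
    if (row == NULL)
        return NULL;
    snprintf(row, 32, "row %d", node->next_value++);
    return row;                             /* valid until the next call */
}
```

In this model the second call hands back the same memory as the first,
so a caller that still wants the earlier row has to copy it before
asking for the next one.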

> If I knew that all objects that I look at indeed are allocated in a
> MemoryContexts and not on the stack or as a part of the allocation of
> another object, then I could make assumptions that would enable a
> generic and safe way of doing this. From my experience though, I can't
> make such assumptions.

You're right, they're not...

> Nothing probably since I always copy such nodes and keep them until the
> finalizer is called that destroys the wrapper. It would be nice though,
> if the original producer of the tuple could be told to allocate it in a
> designated context from the very start and then *never* free it up. That
> way, PL/Java would assume full responsibility for the object destruction
> and no copying would be necessary. Today, a HeapTuple that is returned
> seems to be freed-up by either calls to heap_freetuple or by destroying
> the context in which it was allocated.

Well, considering that the output of a seqscan node is a pointer to the
actual tuple in the disk buffer in shared memory, you have to realise
that such a scheme is *tricky* at best. See the memory management
README (src/backend/utils/mmgr/README) for the reasoning behind the
current system.

Now, the SPI interface takes care of copying the tuples into a context
long-lived enough to survive your function. Additionally, it provides
functions to move them into a context for returning to your caller. You
obviously need to be doing something tricky to run into problems...

> The primary reason for my desire to wrap the HeapTupleHeader in a fully
> fledged HeapTuple is a) then I can call the heap_copytuple to get a safe
> durable copy and b) I don't need two different wrapper objects (AFAIK,
> there is no heap_copytupleheader function).

Actually, I was wondering where you were getting the HeapTupleHeader's
from. All the functions dealing with tuples (like heap_form_tuple,
heap_deform_tuple, etc) take HeapTuples. HeapTupleHeaders are not
really passed around. Are you making your own code to create tuples?

heap_copytupleheader == memcpy. The data is just the raw bytes. You
pretty much need the HeapTuple just to know how big the tuple is, since
the on-disk version doesn't have a length attribute; that's just an
artifact of the in-memory tuples.

Look in include/access/htup.h; you'll see that the length attribute
is in a union overlaying the xmin/xmax values. See the comment above
HeapTupleData about how tuples can be allocated.
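
As a rough illustration of why the length has to travel with the
in-memory tuple, here is a simplified sketch. ToyTupleData and
toy_copytuple are hypothetical names, loosely modelled on the idea of
HeapTupleData and heap_copytuple rather than copied from htup.h, and
malloc stands in for allocation in a memory context. The copy is one
chunk: the small wrapper struct followed directly by the raw bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified model, NOT the real definitions from htup.h. */
typedef struct ToyTupleData
{
    uint32_t    t_len;      /* length of the raw data, kept only in memory */
    void       *t_data;     /* points at the raw header + data bytes */
} ToyTupleData;

typedef ToyTupleData *ToyTuple;

/*
 * Copy a tuple into a single malloc'd chunk. Without t_len there would
 * be no way to know how many bytes to memcpy from t_data.
 */
static ToyTuple toy_copytuple(const ToyTupleData *src)
{
    ToyTuple copy;

    if (src == NULL || src->t_data == NULL)
        return NULL;

    copy = malloc(sizeof(ToyTupleData) + src->t_len);
    if (copy == NULL)
        return NULL;

    copy->t_len = src->t_len;
    /* the raw bytes live immediately after the wrapper struct */
    copy->t_data = (char *) copy + sizeof(ToyTupleData);
    memcpy(copy->t_data, src->t_data, src->t_len);
    return copy;
}
```

Because wrapper and data share one allocation in this sketch, a single
free() releases both, and the copy stays usable after the source bytes
are overwritten.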

> Again, I need advice. I'm not fully aware of all the semantics involved,
> how memory contexts are allocated and destroyed, what objects that can
> be trusted to originate from memory contexts etc. Pointers to doc's or
> code that makes this clearer will help a great deal.

Read the READMEs in utils/mmgr and executor, they explain a lot.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
