Thread: Question about tuplestore clients

Question about tuplestore clients

From: Gregory Stark
Date:
I tried to make tuplestore free up tuples that would no longer be needed
because they're older than the mark and neither random access nor rewind
capability was needed. This is important for three different purposes:
optimizing merge join to not need to materialize the entire data set,
recursive queries, and window functions.

However I've run into something I didn't expect. It seems merge joins keep a
reference to a tuple *after* they set the mark beyond it. I'm trying to figure
out why this is necessary but I haven't absorbed all of nodeMergejoin yet.

Is it possible I've misdiagnosed this? I think my logic is correct because if
I ifdef out the pfree it passes all regression tests. That doesn't really
prove anything of course but it seems hard to believe I would have an
off-by-one bug in setting the mark that wouldn't show up in the results.

But in my reading of nodeMergejoin so far it seems it keeps a reference to the
first tuple in a set, i.e., the tuple it's going to mark. Not any tuple before
that.

Anyways, I just wanted to know if I was missing some other reason references
have to be valid for older tuples. Maybe I'm looking in the wrong place?

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: Question about tuplestore clients

From: Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> However I've run into something I didn't expect. It seems merge joins keep a
> reference to a tuple *after* they set the mark beyond it. I'm trying to figure
> out why this is necessary but I haven't absorbed all of nodeMergejoin yet.

I think at the instant that ExecMarkPos is called, there are likely to
still be tuple slots holding references to the previously marked tuple.
It might work if you swap the two lines
                   ExecMarkPos(innerPlan);
                   MarkInnerTuple(node->mj_InnerTupleSlot, node);

However, the whole thing sounds a bit fragile.  If tuplestore_gettuple
returns a tuple with shouldfree = false, I think you had better assume
that that tuple can be referenced until after the next
tuplestore_gettuple call, independently of mark/restore calls.
        regards, tom lane
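
The lifetime rule suggested above (a tuple handed out with shouldfree = false
stays valid until after the next fetch, regardless of mark calls) can be
illustrated with a small self-contained model. This is only a sketch under
stated assumptions, not PostgreSQL's tuplestore: ToyStore, toy_put, toy_mark,
toy_gettuple and toy_free are hypothetical names invented for the
illustration, and "tuples" are plain C strings. In this model, setting the mark
only records a position. Freeing is deferred to the fetch routine, which never
releases the tuple returned by the previous fetch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy model; NOT the real tuplestore API. */
#define TOY_CAPACITY 16

typedef struct ToyStore
{
    char *tuples[TOY_CAPACITY]; /* owned copies; NULL once freed */
    int   ntuples;              /* number of tuples stored */
    int   readpos;              /* index of the next tuple to return */
    int   markpos;              /* tuples before this index are unneeded */
    int   trimmed;              /* tuples before this index are freed */
} ToyStore;

void toy_init(ToyStore *ts)
{
    memset(ts, 0, sizeof(*ts));
}

/* Append a copy of value; returns 1 on success, 0 if full or out of memory. */
int toy_put(ToyStore *ts, const char *value)
{
    size_t len = strlen(value) + 1;
    char  *copy;

    if (ts->ntuples >= TOY_CAPACITY)
        return 0;
    copy = malloc(len);
    if (copy == NULL)
        return 0;
    memcpy(copy, value, len);
    ts->tuples[ts->ntuples++] = copy;
    return 1;
}

/* Record the mark at the current read position. Nothing is freed here,
 * so a caller still holding the last returned tuple is unaffected. */
void toy_mark(ToyStore *ts)
{
    ts->markpos = ts->readpos;
}

/* Return the next tuple (caller must not free it), or NULL at the end.
 * Tuples before the mark are released here, except the one returned by
 * the previous call, which stays valid through this call. */
const char *toy_gettuple(ToyStore *ts)
{
    int limit = ts->markpos;

    if (ts->readpos > 0 && limit > ts->readpos - 1)
        limit = ts->readpos - 1;    /* spare the previously returned tuple */

    while (ts->trimmed < limit)
    {
        free(ts->tuples[ts->trimmed]);
        ts->tuples[ts->trimmed] = NULL;
        ts->trimmed++;
    }

    if (ts->readpos >= ts->ntuples)
        return NULL;
    return ts->tuples[ts->readpos++];
}

/* Release everything that has not been trimmed yet. */
void toy_free(ToyStore *ts)
{
    int i;

    for (i = ts->trimmed; i < ts->ntuples; i++)
        free(ts->tuples[i]);
    toy_init(ts);
}
```

With this arrangement a caller can fetch a tuple, call toy_mark, and keep
reading the fetched tuple without caring about the order of the mark call and
its own bookkeeping. The cost is that one tuple before the mark is retained a
little longer than strictly necessary.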