Planned change of ExecRestrPos API - Mailing list pgsql-hackers

From Tom Lane
Subject Planned change of ExecRestrPos API
Date
Msg-id 18571.1116184185@sss.pgh.pa.us
List pgsql-hackers
I'm planning to change ExecRestrPos and the routines it calls so that
an updated TupleTableSlot holding the restored-to tuple is explicitly
returned.

Currently, since nothing is explicitly done to the result Slot of a
plan node when we restore its position, you might think that the Slot
still points at the tuple that was current just before the Restore.
You'd be wrong though, at least for seqscan and indexscan plans
(I haven't looked yet at the other node types that support
mark/restore).  The reason is that the restore operation changes
the contents of a HeapTupleData struct in the scan state (rs_ctup
or xs_ctup) and all that the Slot really contains is a pointer to
that struct.
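
The aliasing described above can be sketched with a toy model. This is only an illustration under stated assumptions: ToyTuple, ToyScan, ToySlot and the toy_* functions are hypothetical stand-ins, not the real HeapTupleData, scan descriptor, or TupleTableSlot definitions, and "buffer" is just an integer tag.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-ins, NOT the real PostgreSQL structs. */
typedef struct ToyTuple { int buffer; int value; } ToyTuple;

/* Scan state owns the "current tuple" struct (think rs_ctup / xs_ctup). */
typedef struct ToyScan { ToyTuple ctup; ToyTuple marked; } ToyScan;

/* The slot holds only a pointer to that struct, plus the buffer it believes
 * it has pinned. */
typedef struct ToySlot { const ToyTuple *tuple; int pinned_buffer; } ToySlot;

static void toy_store(ToySlot *slot, const ToyTuple *tup)
{
    slot->tuple = tup;
    slot->pinned_buffer = tup->buffer;
}

static void toy_mark(ToyScan *scan)    { scan->marked = scan->ctup; }

/* Restore overwrites the struct in place and never touches the slot. */
static void toy_restore(ToyScan *scan) { scan->ctup = scan->marked; }

/* Returns 1 if, after a restore, the slot shows the restored tuple while
 * still believing it has the pre-restore buffer pinned. */
static int toy_slot_is_stale(void)
{
    ToyScan scan = { {1, 10}, {0, 0} };
    ToySlot slot;

    toy_store(&slot, &scan.ctup);   /* tuple on buffer 1 */
    toy_mark(&scan);

    scan.ctup.buffer = 2;           /* scan advances to a tuple on buffer 2 */
    scan.ctup.value = 20;
    toy_store(&slot, &scan.ctup);

    toy_restore(&scan);             /* slot is not updated */

    return slot.tuple->value == 10 &&
           slot.pinned_buffer != slot.tuple->buffer;
}
```

In this model the slot silently sees the restored tuple through its pointer, while its record of the pinned buffer still refers to the tuple that was current before the restore.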

Now this is really bad.  In the first place, the Slot thinks it
has a pin on the buffer containing its current tuple.  After a
Restore, it may have a pin on the wrong buffer.  It seems to be
sheer chance that we've not had bugs due to this.  (The underlying
scan does have a pin on the right buffer, but one can easily imagine
sequences in which the scan could be cleared while the Slot is
still assumed valid.)  As of CVS tip the consequences could be
even worse, because the Slot may contain some pointers to extracted
fields of the tuple, and these pointers are now out of sync with
the tuple that the Slot really contains.
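
The out-of-sync field pointers can be illustrated the same way. This is a hedged toy sketch, not the actual slot code: FieldTuple, FieldSlot and field_cache_is_stale are hypothetical names, and the "rows" are plain int arrays standing in for tuple data on a page.

```c
#include <assert.h>

/* Toy stand-ins, NOT the real PostgreSQL structs. */
typedef struct FieldTuple { const int *data; } FieldTuple;   /* points into a "page" */

typedef struct FieldSlot {
    const FieldTuple *tuple;       /* pointer to the scan's tuple struct */
    const int *cached_field0;      /* pointer to an already-extracted field */
} FieldSlot;

/* Returns 1 if the cached field pointer still refers to the pre-restore
 * row while the tuple struct now describes the restored row. */
static int field_cache_is_stale(void)
{
    static const int row_a[2] = {10, 11};   /* row that was marked */
    static const int row_b[2] = {20, 21};   /* row current before restore */

    FieldTuple ctup = { row_b };
    FieldTuple marked = { row_a };

    /* Slot extracts column 0 of the current tuple (row B) and caches it. */
    FieldSlot slot = { &ctup, &ctup.data[0] };

    /* Restore rewrites the tuple struct in place; the cache is untouched. */
    ctup = marked;

    return slot.cached_field0[0] == 20 && slot.tuple->data[0] == 10;
}
```

Here the cached pointer still yields a value from row B even though the tuple the slot points at is now row A.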

So I think that it's essential that we explicitly update the scan
result Slot during ExecRestrPos.

It also seems like a good idea to make the function return the Slot.
As far as I can tell, nodeMergeJoin has been depending on the assumption
that the physical address of the result slot doesn't change during
Restore.  That holds for all the current plan types, but since
the ExecProcNode API isn't designed to assume that a node always
returns the same Slot, it doesn't seem like ExecRestrPos should either.
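
A minimal sketch of the proposed shape, again with toy types: the restore routine re-stores the restored tuple into the result slot and returns that slot, and the caller uses the returned pointer instead of assuming the slot's address is unchanged. DemoTuple, DemoSlot, DemoScan and the demo_* functions are hypothetical and do not reflect the real ExecRestrPos signature.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the real PostgreSQL structs. */
typedef struct DemoTuple { int buffer; int value; } DemoTuple;
typedef struct DemoSlot  { const DemoTuple *tuple; int pinned_buffer; } DemoSlot;
typedef struct DemoScan  { DemoTuple current; DemoTuple marked; DemoSlot result_slot; } DemoScan;

/* Point the result slot at the scan's current tuple and sync its pin. */
static DemoSlot *demo_store(DemoScan *scan)
{
    scan->result_slot.tuple = &scan->current;
    scan->result_slot.pinned_buffer = scan->current.buffer;
    return &scan->result_slot;
}

static void demo_mark(DemoScan *scan) { scan->marked = scan->current; }

/* Restore explicitly updates the result slot and returns it. */
static DemoSlot *demo_restore(DemoScan *scan)
{
    scan->current = scan->marked;
    return demo_store(scan);
}

/* Returns 1 if the slot handed back by restore is consistent with the
 * restored tuple. */
static int demo_restore_is_consistent(void)
{
    DemoScan scan = { {1, 10}, {0, 0}, {NULL, 0} };
    DemoSlot *slot = demo_store(&scan);

    demo_mark(&scan);
    scan.current.buffer = 2;        /* scan advances */
    scan.current.value = 20;
    slot = demo_store(&scan);

    slot = demo_restore(&scan);     /* caller takes whatever slot comes back */

    return slot->tuple->value == 10 &&
           slot->pinned_buffer == slot->tuple->buffer;
}
```

Because the caller always takes the returned pointer, a node type that handed back a different slot after restore would still work in this model.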
        regards, tom lane

