Re: [sqlsmith] Crash in GetOldestSnapshot() - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [sqlsmith] Crash in GetOldestSnapshot()
Date
Msg-id CA+Tgmoadc0eTHjGYo4fMit=eq8KEngBVyHh8N82BBWsgz8aRxg@mail.gmail.com
Whole thread Raw
In response to Re: [sqlsmith] Crash in GetOldestSnapshot()  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: [sqlsmith] Crash in GetOldestSnapshot()  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Aug 6, 2016 at 9:00 AM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
> Hmm.
>
> So this happens because RETURNING queries run to completion immediately
> and populate a tuplestore with the results, and the portal then fetches
> from the tuplestore to send to the destination. The assumption is that
> the tuplestore output can be processed without needing a snapshot, which
> obviously is not true now if it contains toasted data.
>
> In a similar case in the past involving holdable cursors, the solution
> was to detoast _before_ storing in the tuplestore (see
> PersistHoldablePortal). I guess the question now is, under what
> circumstances is it now allowable to detoast a datum with no active
> snapshot? (Wouldn't it be necessary in such a case to know what the
> oldest snapshot ever used in the transaction was?)

Yes, I think you're right.  Suppose we leave "snapshot too old" to one
side; assume that feature is disabled.  If the backend fetches the
heap tuples without de-TOAST-ing, releases its snapshot (presumably
resetting xmin), and then goes into the tank for a while, those tuples
could be pruned.  When the backend wakes up again and tries to TOAST,
we would get the the exact sort of heap:toast disagreement that we set
out to prevent here.  That's not likely to occur because in most cases
the number of tuples returned will be small, and VACUUM is quite
unlikely to remove them before we de-TOAST.  But this report makes me
suspect it can happen (I have not tested).

With "snapshot too old" enabled, it becomes much more likely.  The
offending prune operation can happen at any time after our snapshot
times out and before we de-TOAST, rather than needing to happen after
the query ends and before we de-TOAST.

So I think in the short term what we should do about this is just fix
it so it doesn't crash.  In the longer term, we might want to think a
bit more carefully about the way we handle de-TOASTing overall; we've
had a number of different bugs that are all rooted in failure to think
carefully about the fact that the query's snapshot needs to live at
least as long as any TOAST pointers that we might want to de-TOAST
later (3f8c8e3c61cef5729980ee4372ec159862a979f1,
ec543db77b6b72f24d0a637c4a4a419cf8311d0b).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [sqlsmith] Crash in GetOldestSnapshot()
Next
From: Tom Lane
Date:
Subject: Re: [sqlsmith] Crash in GetOldestSnapshot()