Re: [sqlsmith] Crash in GetOldestSnapshot() - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [sqlsmith] Crash in GetOldestSnapshot()
Date
Msg-id CA+TgmoYiBFyDCNxaQCMorNRB7QtrUn-Lz0GnEb9Vcq-R2y=EXA@mail.gmail.com
Whole thread Raw
In response to Re: [sqlsmith] Crash in GetOldestSnapshot()  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Aug 7, 2016 at 1:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> In a similar case in the past involving holdable cursors, the solution
>> was to detoast _before_ storing in the tuplestore (see
>> PersistHoldablePortal). I guess the question now is, under what
>> circumstances is it now allowable to detoast a datum with no active
>> snapshot? (Wouldn't it be necessary in such a case to know what the
>> oldest snapshot ever used in the transaction was?)
>
> After looking at this a bit, I think probably the appropriate solution
> is to register the snapshot that was used by the query and store it as
> a property of the Portal, releasing it when the Portal is destroyed.
> Essentially, this views possession of a relevant snapshot as a resource
> that is required to make toast dereferences safe.

Hmm, good idea.

> I think there has been a bug here for awhile.  Consider a committed-dead
> row with some associated toast data, and suppose the query's snapshot
> was the last one that could see that row.  Once we destroy the query's
> snapshot, there is nothing preventing a concurrent VACUUM from removing
> the dead row and the toast data.

Yeah: as you see, I came to the same conclusion.

> When the RETURNING code was originally
> written, I think this was safe enough, because the bookkeeping that
> determined when VACUUM could remove data was based on transactions'
> advertised xmins, and those did not move once set for the life of the
> transaction.  So dereferencing a toast pointer you'd fetched was safe
> for the rest of the transaction.  But when we changed over to
> oldest-snapshot-based xmin advertisement, and made it so that a
> transaction holding no snapshots advertised no xmin, we created a hazard
> for data held in portals.

But I missed this aspect of it.

> In this view of things, flattening toast pointers in "held" tuplestores
> is still necessary, but it's because their protective snapshot is going
> away not because the transaction as a whole is going away.  But as long
> as we hold onto the relevant snapshot, we don't have to do that for
> portals used intra-transaction.
>
> It's interesting to think about whether we could let snapshots outlive
> transactions and thereby not need to flatten "held" tuplestores either.
> I kinda doubt it's a good idea because of the potential bad effects
> for vacuum not being able to remove dead rows for a long time; but
> it seems at least possible to do it, in this world-view.

EnterpriseDB has had complaints from customers about the cost of
flattening TOAST pointers when tuplestores are held, so I think
providing an option to skip the flattening (at the risk of increased
bloat) would be a good idea even if we chose not to change the default
behavior.  It's worth noting that the ability to set
old_snapshot_threshold serves as a way of limiting the damage that can
be caused by the open snapshot, so that optional behavior might be
more appealing now than heretofore.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [sqlsmith] Crash in GetOldestSnapshot()
Next
From: Robert Haas
Date:
Subject: Re: [sqlsmith] Crash in GetOldestSnapshot()