Re: SUBTRANS: Minimizing calls to SubTransSetParent() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: SUBTRANS: Minimizing calls to SubTransSetParent()
Date
Msg-id 1197168.1668546218@sss.pgh.pa.us
Whole thread Raw
In response to Re: SUBTRANS: Minimizing calls to SubTransSetParent()  (Simon Riggs <simon.riggs@enterprisedb.com>)
Responses Re: SUBTRANS: Minimizing calls to SubTransSetParent()
List pgsql-hackers
Simon Riggs <simon.riggs@enterprisedb.com> writes:
> New version attached.

I looked at this patch and I do not see how it can possibly be safe.

The most direct counterexample arises from the fact that
HeapCheckForSerializableConflictOut checks SubTransGetTopmostTransaction
in some cases.  You haven't tried to analyze when, but just disabled
the optimization in serializable mode:

+         * 2. When IsolationIsSerializable() we sometimes need to access topxid.
+         *    This occurs only when SERIALIZABLE is requested by app user.
+...
+        if (MyProc->subxidStatus.overflowed ||
+            IsolationIsSerializable() ||

However, what this is checking is whether *our current transaction*
is serializable.  If we're not serializable, but other transactions
in the system are, then this fails to store information that they'll
need for correctness.  You'd have to have some way of knowing that
no transaction in the system is using serializable mode, and that
none will start to do so while this transaction is still in-doubt.

I fear that's already enough to kill the idea; but there's more.
The subxidStatus.overflowed check quoted above has a similar sort
of myopia: it's checking whether our current transaction has
already suboverflowed.  But (a) that doesn't prove it won't suboverflow
later, and (b) the relevant logic in XidInMVCCSnapshot needs to run
SubTransGetTopmostTransaction if *any* proc in the snapshot has
suboverflowed.

Lastly, I don't see what the "transaction on same page" business
has got to do with anything.  The comment is certainly failing
to make the case that it's safe to skip filling subtrans when that
is true.

I think we could salvage this small idea:

+             * Insert entries into subtrans for this xid, noting that the entry
+             * points directly to the topxid, not the immediate parent. This is
+             * done for two reasons:
+             * (1) so it is faster in a long chain of subxids, because the
+             * algorithm is then O(1), no matter how many subxids are assigned.

but some work would be needed to update the comments around
SubTransGetParent and SubTransGetTopmostTransaction to explain that
they're no longer reliably different.  I think that that is okay for
the existing use-cases, but they'd better be documented.  In fact,
couldn't we simplify them down to one function?  Given the restriction
that we don't look back in pg_subtrans further than TransactionXmin,
I don't think that updated code would ever need to resolve cases
written by older code.  So we could remove the recursive checks
entirely, or at least be confident that they don't recurse more
than once.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: archive modules
Next
From: David Rowley
Date:
Subject: Re: Speed up transaction completion faster after many relations are accessed in a transaction