Re: SUBTRANS: Minimizing calls to SubTransSetParent() - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: SUBTRANS: Minimizing calls to SubTransSetParent()
Date
Msg-id CANbhV-EASVG8aVV=m0N-7EUwXEwHRPwHi=EcV-qCuNytaSG_Ug@mail.gmail.com
Whole thread Raw
In response to Re: SUBTRANS: Minimizing calls to SubTransSetParent()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: SUBTRANS: Minimizing calls to SubTransSetParent()
List pgsql-hackers
On Tue, 15 Nov 2022 at 21:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > New version attached.
>
> I looked at this patch and I do not see how it can possibly be safe.

I grant you it is complex, so please bear with me.


> The most direct counterexample arises from the fact that
> HeapCheckForSerializableConflictOut checks SubTransGetTopmostTransaction
> in some cases.  You haven't tried to analyze when, but just disabled
> the optimization in serializable mode:
>
> +         * 2. When IsolationIsSerializable() we sometimes need to access topxid.
> +         *    This occurs only when SERIALIZABLE is requested by app user.
> +...
> +        if (MyProc->subxidStatus.overflowed ||
> +            IsolationIsSerializable() ||
>
> However, what this is checking is whether *our current transaction*
> is serializable.  If we're not serializable, but other transactions
> in the system are, then this fails to store information that they'll
> need for correctness.  You'd have to have some way of knowing that
> no transaction in the system is using serializable mode, and that
> none will start to do so while this transaction is still in-doubt.

Not true.

src/backend/storage/lmgr/README-SSI   says this...

* Any transaction which is run at a transaction isolation level
other than SERIALIZABLE will not be affected by SSI.  If you want to
enforce business rules through SSI, all transactions should be run at
the SERIALIZABLE transaction isolation level, and that should
probably be set as the default.

If HeapCheckForSerializableConflictOut() cannot find a subxid's parent
then it will not be involved in serialization errors.

So skipping the storage of subxids in subtrans for non-serializable
xacts is valid for both SSI and non-SSI xacts.

Thus the owning transaction can decide to skip the insert into
subtrans if it is not serializable.

> I fear that's already enough to kill the idea; but there's more.
> The subxidStatus.overflowed check quoted above has a similar sort
> of myopia: it's checking whether our current transaction has
> already suboverflowed.  But (a) that doesn't prove it won't suboverflow
> later, and (b) the relevant logic in XidInMVCCSnapshot needs to run
> SubTransGetTopmostTransaction if *any* proc in the snapshot has
> suboverflowed.

Not the way it is coded now.

First, we search the subxid cache in snapshot->subxip.
Then, and only if the snapshot overflowed (i.e. ANY xact overflowed),
do we check subtrans.

Thus, the owning xact knows that anyone else will find the first 64
xids in the subxid cache, so it need not insert them into subtrans,
even if someone else overflowed. When the owning xact overflows, it
knows it must now insert the subxid into subtrans before the xid is
used anywhere in storage, which the patch does. This allows each
owning xact to decide what to do, independent of the actions of
others.

> Lastly, I don't see what the "transaction on same page" business
> has got to do with anything.  The comment is certainly failing
> to make the case that it's safe to skip filling subtrans when that
> is true.

That seems strange, I grant you. It's the same logic that is used in
TransactionIdSetTreeStatus(), in reverse. I understand it 'cos I wrote
it.

TRANSACTION_STATUS_SUB_COMMITTED is only ever used if the topxid and
subxid are on different pages. Therefore TransactionIdDidCommit()
won't ever see a value of TRANSACTION_STATUS_SUB_COMMITTED unless they
are on separate pages. So the owning transaction can predict in
advance whether anyone will ever call SubTransGetParent() for one of
its xids. If they might, then we record the values just in case. If
they NEVER will, then we can skip recording them.


And just to be clear, all 3 of the above preconditions must be true
before the owning xact decides to skip writing a subxid to subtrans.

> I think we could salvage this small idea:
>
> +             * Insert entries into subtrans for this xid, noting that the entry
> +             * points directly to the topxid, not the immediate parent. This is
> +             * done for two reasons:
> +             * (1) so it is faster in a long chain of subxids, because the
> +             * algorithm is then O(1), no matter how many subxids are assigned.
>
> but some work would be needed to update the comments around
> SubTransGetParent and SubTransGetTopmostTransaction to explain that
> they're no longer reliably different.  I think that that is okay for
> the existing use-cases, but they'd better be documented.  In fact,
> couldn't we simplify them down to one function?  Given the restriction
> that we don't look back in pg_subtrans further than TransactionXmin,
> I don't think that updated code would ever need to resolve cases
> written by older code.  So we could remove the recursive checks
> entirely, or at least be confident that they don't recurse more
> than once.

Happy to do so, I'd left it that way to soften the blow of the
absorbing the earlier thoughts.

(Since we know all subxids point directly to parent we know we only
ever need to do one lookup).


I know that if there is a flaw in the above logic then you will find it.

Happy to make any comments changes needed to record the above thoughts
more permanently. I tried, but clearly didn't get everything down
clearly.

Thanks for your detailed thoughts.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Slow standby snapshot
Next
From: Tom Lane
Date:
Subject: Re: Slow standby snapshot