Thread: SUBTRANS: Minimizing calls to SubTransSetParent()

SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Thu, 4 Aug 2022 at 13:11, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Wed, 3 Aug 2022 at 20:18, Andres Freund <andres@anarazel.de> wrote:
>
> > I think we should consider redesigning subtrans more substantially - even with
> > the changes you propose here, there's still plenty ways to hit really bad
> > performance. And there's only so much we can do about that without more
> > fundamental design changes.
>
> I completely agree - you will be glad to hear that I've been working
> on a redesign of the subtrans module.
...
> I will post my patch, when complete, in a different thread.

The attached patch reduces the overhead of SUBTRANS by minimizing the
number of times SubTransSetParent() is called, to below 1% of the
current rate in common cases.

Instead of blindly calling SubTransSetParent() for every subxid, this
proposal only calls SubTransSetParent() when that information will be
required for later use. It does this by analyzing all of the callers
of SubTransGetParent() and uses these pre-conditions to filter out
calls/subxids that will never be required, for various reasons. It
redesigns the way XactLockTableWait() calls
SubTransGetTopmostTransactionId() to allow this.

This short patchset compiles and passes make check-world, with lengthy comments.

This might then make viable a simple rewrite of SUBTRANS using a hash
table, as proposed by Andres. But in any case, it will allow us to
design a densely packed SUBTRANS replacement that does not generate as
much contention and I/O.

NOTE that this patchset does not touch SUBTRANS at all, it just
minimizes the calls in preparation for a later redesign in a later
patch. If this patch/later versions of it is committed in Sept CF,
then we should be in good shape to post a subtrans redesign patch by
major patch deadline at end of year.

Patches 001 and 002 are common elements of a different patch,
"Smoothing the subtrans performance catastrophe", but other than that,
the two patches are otherwise independent of each other.

Where does this come from? I learnt a lot about subxids when coding
Hot Standby, specifically commit 06da3c570f21394003. This patch just
builds upon that earlier understanding.

Comments please.

--
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Dilip Kumar
Date:
On Mon, Aug 8, 2022 at 6:41 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Thu, 4 Aug 2022 at 13:11, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> >
> > On Wed, 3 Aug 2022 at 20:18, Andres Freund <andres@anarazel.de> wrote:
> >
> > > I think we should consider redesigning subtrans more substantially - even with
> > > the changes you propose here, there's still plenty ways to hit really bad
> > > performance. And there's only so much we can do about that without more
> > > fundamental design changes.
> >
> > I completely agree - you will be glad to hear that I've been working
> > on a redesign of the subtrans module.
> ...
> > I will post my patch, when complete, in a different thread.
>
> The attached patch reduces the overhead of SUBTRANS by minimizing the
> number of times SubTransSetParent() is called, to below 1% of the
> current rate in common cases.
>
> Instead of blindly calling SubTransSetParent() for every subxid, this
> proposal only calls SubTransSetParent() when that information will be
> required for later use. It does this by analyzing all of the callers
> of SubTransGetParent() and uses these pre-conditions to filter out
> calls/subxids that will never be required, for various reasons. It
> redesigns the way XactLockTableWait() calls
> SubTransGetTopmostTransactionId() to allow this.
>
> This short patchset compiles and passes make check-world, with lengthy comments.

Does this patch set work independently or it has dependency on the
patches on the other thread "Smoothing the subtrans performance
catastrophe"?  Because in this patch I see no code where we are
changing anything to control the access of SubTransGetParent() from
SubTransGetTopmostTransactionId()?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Tue, 9 Aug 2022 at 12:39, Dilip Kumar <dilipbalaut@gmail.com> wrote:

> > This short patchset compiles and passes make check-world, with lengthy comments.
>
> Does this patch set work independently or it has dependency on the
> patches on the other thread "Smoothing the subtrans performance
> catastrophe"?

Patches 001 and 002 are common elements of a different patch,
"Smoothing the subtrans performance catastrophe", but other than that,
the two patches are otherwise independent of each other.

i.e. there are common elements in both patches
001 puts all subxid data in a snapshot (up to a limit of 64 xids per
topxid), even if one or more xids overflows.

> Because in this patch I see no code where we are
> changing anything to control the access of SubTransGetParent() from
> SubTransGetTopmostTransactionId()?

Those calls are unaffected, i.e. they both still work.

Right now, we register all subxids in subtrans. But not all xids are
subxids, so in fact, subtrans has many "holes" in it, where if you
look up the parent for an xid it will just return
InvalidTransactionId. There is a protection against that causing a
problem because if you call TransactionIdDidCommit/Abort you can get a
WARNING, or if you call SubTransGetTopmostTransaction() you can get an
ERROR, but it is possible if you do a lookup for an inappropriate xid.
i.e. if you call TransactionIdDidCommit() without first calling
TransactionIdIsInProgress() as you are supposed to do.

What this patch does is increase the number of "holes" in subtrans,
reducing the overhead and making the subtrans data structure more
amenable to using a dense structure rather than a sparse structure as
we do now, which then leads to I/O overheads. But in this patch, we
only have holes when we can prove that the subxid's parent will never
be requested.

Specifically, with this patch, running PL/pgSQL with a few
subtransactions in will cause those subxids to be logged in subtrans
about 1% as often as they are now, so greatly reducing the number of
subtrans calls.

Happy to provide more detailed review thoughts, so please keep asking questions.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Dilip Kumar
Date:
On Tue, Aug 9, 2022 at 9:46 PM Simon Riggs <simon.riggs@enterprisedb.com> wrote:

> Those calls are unaffected, i.e. they both still work.
>
> Right now, we register all subxids in subtrans. But not all xids are
> subxids, so in fact, subtrans has many "holes" in it, where if you
> look up the parent for an xid it will just return
> c. There is a protection against that causing a
> problem because if you call TransactionIdDidCommit/Abort you can get a
> WARNING, or if you call SubTransGetTopmostTransaction() you can get an
> ERROR, but it is possible if you do a lookup for an inappropriate xid.
> i.e. if you call TransactionIdDidCommit() without first calling
> TransactionIdIsInProgress() as you are supposed to do.

IIUC, if SubTransGetParent SubTransGetParent then
SubTransGetTopmostTransaction() loop will break and return the
previousxid.  So if we pass any topxid to
SubTransGetTopmostTransaction() it will return back the same xid and
that's fine as next we are going to search in the snapshot->xip array.

But if we are calling this function with the subxid which might be
there in the snapshot->subxip array but if we are first calling
SubTransGetTopmostTransaction() then it will just return the same xid
if the parent is not set for it.  And now if we search this in the
snapshot->xip array then we will get the wrong answer?

So I still think some adjustment is required in XidInMVCCSnapdhot()
such that we first search the snapshot->subxip array.

Am I still missing something?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Wed, 10 Aug 2022 at 08:34, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Am I still missing something?

No, you have found a dependency between the patches that I was unaware
of. So there is no bug if you apply both patches.

Thanks for looking.


> So I still think some adjustment is required in XidInMVCCSnapdhot()

That is one way to resolve the issue, but not the only one. I can also
change AssignTransactionId() to recursively register parent xids for
all of a subxid's parents.

I will add in a test case and resolve the dependency in my next patch.

Thanks again.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Dilip Kumar
Date:
On Wed, Aug 10, 2022 at 6:31 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
>
> On Wed, 10 Aug 2022 at 08:34, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > Am I still missing something?
>
> No, you have found a dependency between the patches that I was unaware
> of. So there is no bug if you apply both patches.

Right

>
> > So I still think some adjustment is required in XidInMVCCSnapdhot()
>
> That is one way to resolve the issue, but not the only one. I can also
> change AssignTransactionId() to recursively register parent xids for
> all of a subxid's parents.
>
> I will add in a test case and resolve the dependency in my next patch.

Okay, thanks, I will look into the updated patch after you submit that.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Thu, 11 Aug 2022 at 06:32, Dilip Kumar <dilipbalaut@gmail.com> wrote:

> > > So I still think some adjustment is required in XidInMVCCSnapdhot()
> >
> > That is one way to resolve the issue, but not the only one. I can also
> > change AssignTransactionId() to recursively register parent xids for
> > all of a subxid's parents.
> >
> > I will add in a test case and resolve the dependency in my next patch.
>
> Okay, thanks, I will look into the updated patch after you submit that.

PFA two patches, replacing earlier work
001_new_isolation_tests_for_subxids.v3.patch
002_minimize_calls_to_SubTransSetParent.v8.patch

001_new_isolation_tests_for_subxids.v3.patch
Adds new test cases to master without adding any new code, specifically
addressing the two areas of code that are not tested by existing tests.
This gives us a baseline from which we can do test driven development.
I'm hoping this can be reviewed and committed fairly smoothly.

002_minimize_calls_to_SubTransSetParent.v8.patch
Reduces the number of calls to subtrans below 1% for the first 64 subxids,
so overall will substantially reduce subtrans contention on master for the
typical case, as well as smoothing the overflow case.
Some discussion needed on this; there are various options.
This combines the work originally posted here with another patch posted on the
thread "Smoothing the subtrans performance catastrophe".

I will do some performance testing also, but more welcome.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Dilip Kumar
Date:
On Tue, Aug 30, 2022 at 10:16 PM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:

> PFA two patches, replacing earlier work
> 001_new_isolation_tests_for_subxids.v3.patch
> 002_minimize_calls_to_SubTransSetParent.v8.patch
>
> 001_new_isolation_tests_for_subxids.v3.patch
> Adds new test cases to master without adding any new code, specifically
> addressing the two areas of code that are not tested by existing tests.
> This gives us a baseline from which we can do test driven development.
> I'm hoping this can be reviewed and committed fairly smoothly.
>
> 002_minimize_calls_to_SubTransSetParent.v8.patch
> Reduces the number of calls to subtrans below 1% for the first 64 subxids,
> so overall will substantially reduce subtrans contention on master for the
> typical case, as well as smoothing the overflow case.
> Some discussion needed on this; there are various options.
> This combines the work originally posted here with another patch posted on the
> thread "Smoothing the subtrans performance catastrophe".
>
> I will do some performance testing also, but more welcome.

Thanks for the updated patch, I have some questions/comments.

1.
+             * This has the downside that anyone waiting for a lock on aborted
+             * subtransactions would not be released immediately; that may or
+             * may not be an acceptable compromise. If not acceptable, this
+             * simple call needs to be replaced with a loop to register the
+             * parent for the current subxid stack, so we can walk
back up it to
+             * the topxid.
+             */
+            SubTransSetParent(subxid, GetTopTransactionId());

I do not understand in which situation we will see this downside.  I
mean if we see the logic of XactLockTableWait() then in the current
situation also if the subtransaction is committed we directly wait on
the top transaction by calling SubTransGetTopmostTransaction(xid);

So if the lock-taking subtransaction is committed then we will wait
directly for the top-level transaction and after that, it doesn't
matter if we abort any of the parent subtransactions, because it will
wait for the topmost transaction to complete.  And if the lock-taking
subtransaction is aborted then it will anyway stop waiting because
TransactionIdIsInProgress() should return false.

2.
    /*
     * Notice that we update pg_subtrans with the top-level xid, rather than
     * the parent xid. This is a difference between normal processing and
     * recovery, yet is still correct in all cases. The reason is that
     * subtransaction commit is not marked in clog until commit processing, so
     * all aborted subtransactions have already been clearly marked in clog.
     * As a result we are able to refer directly to the top-level
     * transaction's state rather than skipping through all the intermediate
     * states in the subtransaction tree. This should be the first time we
     * have attempted to SubTransSetParent().
     */
    for (i = 0; i < nsubxids; i++)
        SubTransSetParent(subxids[i], topxid);

I think this comment needs some modification because in this patch now
in normal processing also we are setting the topxid as a parent right?

3.
+    while (TransactionIdIsValid(parentXid))
+    {
+        previousXid = parentXid;
+
+        /*
+         * Stop as soon as we are earlier than the cutoff. This saves multiple
+         * lookups against subtrans when we have a deeply nested subxid with
+         * a later snapshot with an xmin much higher than TransactionXmin.
+         */
+        if (TransactionIdPrecedes(parentXid, cutoff_xid))
+        {
+            *xid = previousXid;
+            return true;
+        }
+        parentXid = SubTransGetParent(parentXid);

Do we need this while loop if we are directly setting topxid as a
parent, so with that, we do not need multiple iterations to go to the
top xid?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Tue, 6 Sept 2022 at 12:37, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Aug 30, 2022 at 10:16 PM Simon Riggs
> <simon.riggs@enterprisedb.com> wrote:
>
> > PFA two patches, replacing earlier work
> > 001_new_isolation_tests_for_subxids.v3.patch
> > 002_minimize_calls_to_SubTransSetParent.v8.patch
> >
> > 001_new_isolation_tests_for_subxids.v3.patch
> > Adds new test cases to master without adding any new code, specifically
> > addressing the two areas of code that are not tested by existing tests.
> > This gives us a baseline from which we can do test driven development.
> > I'm hoping this can be reviewed and committed fairly smoothly.
> >
> > 002_minimize_calls_to_SubTransSetParent.v8.patch
> > Reduces the number of calls to subtrans below 1% for the first 64 subxids,
> > so overall will substantially reduce subtrans contention on master for the
> > typical case, as well as smoothing the overflow case.
> > Some discussion needed on this; there are various options.
> > This combines the work originally posted here with another patch posted on the
> > thread "Smoothing the subtrans performance catastrophe".
> >
> > I will do some performance testing also, but more welcome.
>
> Thanks for the updated patch, I have some questions/comments.

Thanks for the review.

> 1.
> +             * This has the downside that anyone waiting for a lock on aborted
> +             * subtransactions would not be released immediately; that may or
> +             * may not be an acceptable compromise. If not acceptable, this
> +             * simple call needs to be replaced with a loop to register the
> +             * parent for the current subxid stack, so we can walk
> back up it to
> +             * the topxid.
> +             */
> +            SubTransSetParent(subxid, GetTopTransactionId());
>
> I do not understand in which situation we will see this downside.  I
> mean if we see the logic of XactLockTableWait() then in the current
> situation also if the subtransaction is committed we directly wait on
> the top transaction by calling SubTransGetTopmostTransaction(xid);
>
> So if the lock-taking subtransaction is committed then we will wait
> directly for the top-level transaction and after that, it doesn't
> matter if we abort any of the parent subtransactions, because it will
> wait for the topmost transaction to complete.  And if the lock-taking
> subtransaction is aborted then it will anyway stop waiting because
> TransactionIdIsInProgress() should return false.

Yes, correct.

> 2.
>     /*
>      * Notice that we update pg_subtrans with the top-level xid, rather than
>      * the parent xid. This is a difference between normal processing and
>      * recovery, yet is still correct in all cases. The reason is that
>      * subtransaction commit is not marked in clog until commit processing, so
>      * all aborted subtransactions have already been clearly marked in clog.
>      * As a result we are able to refer directly to the top-level
>      * transaction's state rather than skipping through all the intermediate
>      * states in the subtransaction tree. This should be the first time we
>      * have attempted to SubTransSetParent().
>      */
>     for (i = 0; i < nsubxids; i++)
>         SubTransSetParent(subxids[i], topxid);
>
> I think this comment needs some modification because in this patch now
> in normal processing also we are setting the topxid as a parent right?

Correct

> 3.
> +    while (TransactionIdIsValid(parentXid))
> +    {
> +        previousXid = parentXid;
> +
> +        /*
> +         * Stop as soon as we are earlier than the cutoff. This saves multiple
> +         * lookups against subtrans when we have a deeply nested subxid with
> +         * a later snapshot with an xmin much higher than TransactionXmin.
> +         */
> +        if (TransactionIdPrecedes(parentXid, cutoff_xid))
> +        {
> +            *xid = previousXid;
> +            return true;
> +        }
> +        parentXid = SubTransGetParent(parentXid);
>
> Do we need this while loop if we are directly setting topxid as a
> parent, so with that, we do not need multiple iterations to go to the
> top xid?

Correct. I think we can dispense with
SubTransGetTopmostTransactionPrecedes() entirely.

I was initially trying to leave options open but that is confusing and
as a result, some parts are misleading after I merged the two patches.

I will update the patch, thanks for your scrutiny.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Tue, 6 Sept 2022 at 13:14, Simon Riggs <simon.riggs@enterprisedb.com> wrote:

> I will update the patch, thanks for your scrutiny.

I attach a diff showing what has changed between v8 and v9, and will
reattach a full set of new patches in the next post, so patchtester
doesn't squeal.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Tue, 13 Sept 2022 at 11:56, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> On Tue, 6 Sept 2022 at 13:14, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> > I will update the patch, thanks for your scrutiny.
>
> I attach a diff showing what has changed between v8 and v9, and will
> reattach a full set of new patches in the next post, so patchtester
> doesn't squeal.

Full set of v9 patches

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Alvaro Herrera
Date:
On 2022-Aug-30, Simon Riggs wrote:

> 001_new_isolation_tests_for_subxids.v3.patch
> Adds new test cases to master without adding any new code, specifically
> addressing the two areas of code that are not tested by existing tests.
> This gives us a baseline from which we can do test driven development.
> I'm hoping this can be reviewed and committed fairly smoothly.

I gave this a quick run to confirm the claimed increase of coverage.  It
checks out, so pushed.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Wed, 14 Sept 2022 at 15:21, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2022-Aug-30, Simon Riggs wrote:
>
> > 001_new_isolation_tests_for_subxids.v3.patch
> > Adds new test cases to master without adding any new code, specifically
> > addressing the two areas of code that are not tested by existing tests.
> > This gives us a baseline from which we can do test driven development.
> > I'm hoping this can be reviewed and committed fairly smoothly.
>
> I gave this a quick run to confirm the claimed increase of coverage.  It
> checks out, so pushed.

Thank you.

So now we just have the main part of the patch, reattached here for
the auto patch tester's benefit.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Japin Li
Date:
On Thu, 15 Sep 2022 at 18:04, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> On Wed, 14 Sept 2022 at 15:21, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>>
>> On 2022-Aug-30, Simon Riggs wrote:
>>
>> > 001_new_isolation_tests_for_subxids.v3.patch
>> > Adds new test cases to master without adding any new code, specifically
>> > addressing the two areas of code that are not tested by existing tests.
>> > This gives us a baseline from which we can do test driven development.
>> > I'm hoping this can be reviewed and committed fairly smoothly.
>>
>> I gave this a quick run to confirm the claimed increase of coverage.  It
>> checks out, so pushed.
>
> Thank you.
>
> So now we just have the main part of the patch, reattached here for
> the auto patch tester's benefit.

Hi Simon,

Thanks for the updated patch, here are some comments.

There is a typo, s/SubTransGetTopMostTransaction/SubTransGetTopmostTransaction/g.

+         *    call SubTransGetTopMostTransaction() if that xact overflowed;


Is there a punctuation mark missing on the following first line?

+         * 2. When IsolationIsSerializable() we sometimes need to access topxid
+         *    This occurs only when SERIALIZABLE is requested by app user.


When we use function name in comments, some places we use parentheses,
but others do not use it. Why? I think, we should keep them consistent,
at least in the same commit.

+         * 3. When TransactionIdSetTreeStatus will use a status of SUB_COMMITTED,
+         *    which then requires us to consult subtrans to find parent, which
+         *    is needed to avoid race condition. In this case we ask Clog/Xact
+         *    module if TransactionIdsAreOnSameXactPage(). Since we start a new
+         *    clog page every 32000 xids, this is usually <<1% of subxids.

Maybe we declaration a topxid to avoid calling GetTopTransactionId()
twice when we should set subtrans parent?

+        TransactionId subxid = XidFromFullTransactionId(s->fullTransactionId);
+        TransactionId topxid = GetTopTransactionId();
  ...
+        if (MyProc->subxidStatus.overflowed ||
+            IsolationIsSerializable() ||
+            !TransactionIdsAreOnSameXactPage(topxid, subxid))
+        {
  ...
+            SubTransSetParent(subxid, topxid);
+        }

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Thu, 15 Sept 2022 at 12:36, Japin Li <japinli@hotmail.com> wrote:
>
>
> On Thu, 15 Sep 2022 at 18:04, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> > On Wed, 14 Sept 2022 at 15:21, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >>
> >> On 2022-Aug-30, Simon Riggs wrote:
> >>
> >> > 001_new_isolation_tests_for_subxids.v3.patch
> >> > Adds new test cases to master without adding any new code, specifically
> >> > addressing the two areas of code that are not tested by existing tests.
> >> > This gives us a baseline from which we can do test driven development.
> >> > I'm hoping this can be reviewed and committed fairly smoothly.
> >>
> >> I gave this a quick run to confirm the claimed increase of coverage.  It
> >> checks out, so pushed.
> >
> > Thank you.
> >
> > So now we just have the main part of the patch, reattached here for
> > the auto patch tester's benefit.
>
> Hi Simon,
>
> Thanks for the updated patch, here are some comments.

Thanks for your comments.

> There is a typo, s/SubTransGetTopMostTransaction/SubTransGetTopmostTransaction/g.
>
> +                *    call SubTransGetTopMostTransaction() if that xact overflowed;
>
>
> Is there a punctuation mark missing on the following first line?
>
> +                * 2. When IsolationIsSerializable() we sometimes need to access topxid
> +                *    This occurs only when SERIALIZABLE is requested by app user.
>
>
> When we use function name in comments, some places we use parentheses,
> but others do not use it. Why? I think, we should keep them consistent,
> at least in the same commit.
>
> +                * 3. When TransactionIdSetTreeStatus will use a status of SUB_COMMITTED,
> +                *    which then requires us to consult subtrans to find parent, which
> +                *    is needed to avoid race condition. In this case we ask Clog/Xact
> +                *    module if TransactionIdsAreOnSameXactPage(). Since we start a new
> +                *    clog page every 32000 xids, this is usually <<1% of subxids.

I've reworded those comments, hoping to address all of your above points.

> Maybe we declaration a topxid to avoid calling GetTopTransactionId()
> twice when we should set subtrans parent?
>
> +               TransactionId subxid = XidFromFullTransactionId(s->fullTransactionId);
> +               TransactionId topxid = GetTopTransactionId();
>   ...
> +               if (MyProc->subxidStatus.overflowed ||
> +                       IsolationIsSerializable() ||
> +                       !TransactionIdsAreOnSameXactPage(topxid, subxid))
> +               {
>   ...
> +                       SubTransSetParent(subxid, topxid);
> +               }

Seems a minor point, but I've done this anyway.

Thanks for the review.

v10 attached

--
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Fri, 16 Sept 2022 at 13:20, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> Thanks for the review.
>
> v10 attached

v11 attached, corrected for recent commit
14ff44f80c09718d43d853363941457f5468cc03.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Julien Tachoires
Date:
Hi,

Le lun. 26 sept. 2022 à 15:57, Simon Riggs
<simon.riggs@enterprisedb.com> a écrit :
>
> On Fri, 16 Sept 2022 at 13:20, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> >
> > Thanks for the review.
> >
> > v10 attached
>
> v11 attached, corrected for recent commit
> 14ff44f80c09718d43d853363941457f5468cc03.

Please find below the performance tests results I have produced for this patch.
Attaching some charts and the scripts used to reproduce these tests.

1. Assumption

The number of sub-transaction issued by only one long running
transaction may affect global TPS throughput if the number of
sub-transaction exceeds 64 (sub-overflow)

2. Testing scenario

Based on pgbench, 2 different types of DB activity are applied concurrently:
- 1 long running transaction, including N sub-transactions
- X pgbench clients running read-only workload

Tests are executed with a varying number of sub-transactions: from 0 to 128
Key metric is the TPS rate reported by pgbench runs in read-only mode

Tests are executed against
- HEAD (14a737)
- HEAD (14a737) + 002_minimize_calls_to_SubTransSetParent.v11.patch

3. Long transaction anatomy

Two different long transactions are tested because they don't have the
exact same impact on performance.

Transaction number 1 includes one UPDATE affecting each row of
pgbench_accounts, plus an additional UPDATE affecting only one row but
executed in its own rollbacked sub-transaction:
BEGIN;
SAVEPOINT s1;
SAVEPOINT s2;
-- ...
SAVEPOINT sN - 1;
UPDATE pgbench_accounts SET abalance = abalance + 1  WHERE aid > 0;
SAVEPOINT sN;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 12345;
ROLLBACK TO SAVEPOINT sN;
-- sleeping until the end of the test
ROLLBACK;

Transaction 2 includes one UPDATE affecting each row of pgbench_accounts:
BEGIN;
SAVEPOINT s1;
SAVEPOINT s2;
-- ...
SAVEPOINT sN;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid > 0;
-- sleeping until the end of the test
ROLLBACK;

4. Test results with transaction 1

TPS vs number of sub-transaction

nsubx  HEAD  patched
--------------------
  0   441109  439474
  8   439045  438103
 16   439123  436993
 24   436269  434194
 32   439707  437429
 40   439997  437220
 48   439388  437422
 56   439409  437210
 64   439748  437366
 72    92869  434448
 80    66577  434100
 88    61243  434255
 96    57016  434419
104    52132  434917
112    49181  433755
120    46581  434044
128    44067  434268

Perf profiling on HEAD with 80 sub-transactions:
Overhead  Symbol
  51.26%  [.] LWLockAttemptLock
  24.59%  [.] LWLockRelease
   0.36%  [.] base_yyparse
   0.35%  [.] PinBuffer
   0.34%  [.] AllocSetAlloc
   0.33%  [.] hash_search_with_hash_value
   0.22%  [.] LWLockAcquire
   0.20%  [.] UnpinBuffer
   0.15%  [.] SimpleLruReadPage_ReadOnly
   0.15%  [.] _bt_compare

Perf profiling on patched with 80 sub-transactions:
Overhead  Symbol
  2.64%  [.] AllocSetAlloc
  2.09%  [.] base_yyparse
  1.76%  [.] hash_search_with_hash_value
  1.62%  [.] LWLockAttemptLock
  1.26%  [.] MemoryContextAllocZeroAligned
  0.93%  [.] _bt_compare
  0.92%  [.] expression_tree_walker_impl.part.4
  0.84%  [.] SearchCatCache1
  0.79%  [.] palloc
  0.64%  [.] core_yylex

5. Test results with transaction 2

nsubx  HEAD  patched
--------------------
  0  440145  443816
  8  438867  443081
 16  438634  441786
 24  436406  440187
 32  439203  442447
 40  439819  443574
 48  439314  442941
 56  439801  443736
 64  439074  441970
 72  439833  444132
 80  148737  439941
 88  413714  443343
 96  251098  442021
104   70190  443488
112  405507  438866
120  177827  443202
128  399431  441842

From the performance point of view, this patch clearly fixes the
dramatic TPS collapse shown in these tests.

Regards,

--
Julien Tachoires
EDB

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Dilip Kumar
Date:
On Fri, Oct 28, 2022 at 10:55 PM Julien Tachoires <julmon@gmail.com> wrote:
>
> Hi,
>
> Le lun. 26 sept. 2022 à 15:57, Simon Riggs
> <simon.riggs@enterprisedb.com> a écrit :
> >
> > On Fri, 16 Sept 2022 at 13:20, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> > >
> > > Thanks for the review.
> > >
> > > v10 attached
> >
> > v11 attached, corrected for recent commit
> > 14ff44f80c09718d43d853363941457f5468cc03.
>
> Please find below the performance tests results I have produced for this patch.
> Attaching some charts and the scripts used to reproduce these tests.
>
> 1. Assumption
>
> The number of sub-transaction issued by only one long running
> transaction may affect global TPS throughput if the number of
> sub-transaction exceeds 64 (sub-overflow)
>
> 2. Testing scenario
>
> Based on pgbench, 2 different types of DB activity are applied concurrently:
> - 1 long running transaction, including N sub-transactions
> - X pgbench clients running read-only workload
>
> Tests are executed with a varying number of sub-transactions: from 0 to 128
> Key metric is the TPS rate reported by pgbench runs in read-only mode
>
> Tests are executed against
> - HEAD (14a737)
> - HEAD (14a737) + 002_minimize_calls_to_SubTransSetParent.v11.patch
>
> 3. Long transaction anatomy
>
> Two different long transactions are tested because they don't have the
> exact same impact on performance.
>
> Transaction number 1 includes one UPDATE affecting each row of
> pgbench_accounts, plus an additional UPDATE affecting only one row but
> executed in its own rollbacked sub-transaction:
> BEGIN;
> SAVEPOINT s1;
> SAVEPOINT s2;
> -- ...
> SAVEPOINT sN - 1;
> UPDATE pgbench_accounts SET abalance = abalance + 1  WHERE aid > 0;
> SAVEPOINT sN;
> UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 12345;
> ROLLBACK TO SAVEPOINT sN;
> -- sleeping until the end of the test
> ROLLBACK;
>
> Transaction 2 includes one UPDATE affecting each row of pgbench_accounts:
> BEGIN;
> SAVEPOINT s1;
> SAVEPOINT s2;
> -- ...
> SAVEPOINT sN;
> UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid > 0;
> -- sleeping until the end of the test
> ROLLBACK;
>
> 4. Test results with transaction 1
>
> TPS vs number of sub-transaction
>
> nsubx  HEAD  patched
> --------------------
>   0   441109  439474
>   8   439045  438103
>  16   439123  436993
>  24   436269  434194
>  32   439707  437429
>  40   439997  437220
>  48   439388  437422
>  56   439409  437210
>  64   439748  437366
>  72    92869  434448
>  80    66577  434100
>  88    61243  434255
>  96    57016  434419
> 104    52132  434917
> 112    49181  433755
> 120    46581  434044
> 128    44067  434268
>
> Perf profiling on HEAD with 80 sub-transactions:
> Overhead  Symbol
>   51.26%  [.] LWLockAttemptLock
>   24.59%  [.] LWLockRelease
>    0.36%  [.] base_yyparse
>    0.35%  [.] PinBuffer
>    0.34%  [.] AllocSetAlloc
>    0.33%  [.] hash_search_with_hash_value
>    0.22%  [.] LWLockAcquire
>    0.20%  [.] UnpinBuffer
>    0.15%  [.] SimpleLruReadPage_ReadOnly
>    0.15%  [.] _bt_compare
>
> Perf profiling on patched with 80 sub-transactions:
> Overhead  Symbol
>   2.64%  [.] AllocSetAlloc
>   2.09%  [.] base_yyparse
>   1.76%  [.] hash_search_with_hash_value
>   1.62%  [.] LWLockAttemptLock
>   1.26%  [.] MemoryContextAllocZeroAligned
>   0.93%  [.] _bt_compare
>   0.92%  [.] expression_tree_walker_impl.part.4
>   0.84%  [.] SearchCatCache1
>   0.79%  [.] palloc
>   0.64%  [.] core_yylex
>
> 5. Test results with transaction 2
>
> nsubx  HEAD  patched
> --------------------
>   0  440145  443816
>   8  438867  443081
>  16  438634  441786
>  24  436406  440187
>  32  439203  442447
>  40  439819  443574
>  48  439314  442941
>  56  439801  443736
>  64  439074  441970
>  72  439833  444132
>  80  148737  439941
>  88  413714  443343
>  96  251098  442021
> 104   70190  443488
> 112  405507  438866
> 120  177827  443202
> 128  399431  441842
>
> From the performance point of view, this patch clearly fixes the
> dramatic TPS collapse shown in these tests.

I think these are really promising results.  Although the perf result
shows that the bottleneck on the SLRU is no more there with the patch,
I think it would be nice to see the wait event as well.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Julien Tachoires
Date:
Hi,

Le mar. 1 nov. 2022 à 09:39, Dilip Kumar <dilipbalaut@gmail.com> a écrit :
>
> On Fri, Oct 28, 2022 at 10:55 PM Julien Tachoires <julmon@gmail.com> wrote:
> >
> > Hi,
> >
> > Le lun. 26 sept. 2022 à 15:57, Simon Riggs
> > <simon.riggs@enterprisedb.com> a écrit :
> > >
> > > On Fri, 16 Sept 2022 at 13:20, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> > > >
> > > > Thanks for the review.
> > > >
> > > > v10 attached
> > >
> > > v11 attached, corrected for recent commit
> > > 14ff44f80c09718d43d853363941457f5468cc03.
> >
> > Please find below the performance tests results I have produced for this patch.
> > Attaching some charts and the scripts used to reproduce these tests.
> >
> > 1. Assumption
> >
> > The number of sub-transaction issued by only one long running
> > transaction may affect global TPS throughput if the number of
> > sub-transaction exceeds 64 (sub-overflow)
> >
> > 2. Testing scenario
> >
> > Based on pgbench, 2 different types of DB activity are applied concurrently:
> > - 1 long running transaction, including N sub-transactions
> > - X pgbench clients running read-only workload
> >
> > Tests are executed with a varying number of sub-transactions: from 0 to 128
> > Key metric is the TPS rate reported by pgbench runs in read-only mode
> >
> > Tests are executed against
> > - HEAD (14a737)
> > - HEAD (14a737) + 002_minimize_calls_to_SubTransSetParent.v11.patch
> >
> > 3. Long transaction anatomy
> >
> > Two different long transactions are tested because they don't have the
> > exact same impact on performance.
> >
> > Transaction number 1 includes one UPDATE affecting each row of
> > pgbench_accounts, plus an additional UPDATE affecting only one row but
> > executed in its own rollbacked sub-transaction:
> > BEGIN;
> > SAVEPOINT s1;
> > SAVEPOINT s2;
> > -- ...
> > SAVEPOINT sN - 1;
> > UPDATE pgbench_accounts SET abalance = abalance + 1  WHERE aid > 0;
> > SAVEPOINT sN;
> > UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 12345;
> > ROLLBACK TO SAVEPOINT sN;
> > -- sleeping until the end of the test
> > ROLLBACK;
> >
> > Transaction 2 includes one UPDATE affecting each row of pgbench_accounts:
> > BEGIN;
> > SAVEPOINT s1;
> > SAVEPOINT s2;
> > -- ...
> > SAVEPOINT sN;
> > UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid > 0;
> > -- sleeping until the end of the test
> > ROLLBACK;
> >
> > 4. Test results with transaction 1
> >
> > TPS vs number of sub-transaction
> >
> > nsubx  HEAD  patched
> > --------------------
> >   0   441109  439474
> >   8   439045  438103
> >  16   439123  436993
> >  24   436269  434194
> >  32   439707  437429
> >  40   439997  437220
> >  48   439388  437422
> >  56   439409  437210
> >  64   439748  437366
> >  72    92869  434448
> >  80    66577  434100
> >  88    61243  434255
> >  96    57016  434419
> > 104    52132  434917
> > 112    49181  433755
> > 120    46581  434044
> > 128    44067  434268
> >
> > Perf profiling on HEAD with 80 sub-transactions:
> > Overhead  Symbol
> >   51.26%  [.] LWLockAttemptLock
> >   24.59%  [.] LWLockRelease
> >    0.36%  [.] base_yyparse
> >    0.35%  [.] PinBuffer
> >    0.34%  [.] AllocSetAlloc
> >    0.33%  [.] hash_search_with_hash_value
> >    0.22%  [.] LWLockAcquire
> >    0.20%  [.] UnpinBuffer
> >    0.15%  [.] SimpleLruReadPage_ReadOnly
> >    0.15%  [.] _bt_compare
> >
> > Perf profiling on patched with 80 sub-transactions:
> > Overhead  Symbol
> >   2.64%  [.] AllocSetAlloc
> >   2.09%  [.] base_yyparse
> >   1.76%  [.] hash_search_with_hash_value
> >   1.62%  [.] LWLockAttemptLock
> >   1.26%  [.] MemoryContextAllocZeroAligned
> >   0.93%  [.] _bt_compare
> >   0.92%  [.] expression_tree_walker_impl.part.4
> >   0.84%  [.] SearchCatCache1
> >   0.79%  [.] palloc
> >   0.64%  [.] core_yylex
> >
> > 5. Test results with transaction 2
> >
> > nsubx  HEAD  patched
> > --------------------
> >   0  440145  443816
> >   8  438867  443081
> >  16  438634  441786
> >  24  436406  440187
> >  32  439203  442447
> >  40  439819  443574
> >  48  439314  442941
> >  56  439801  443736
> >  64  439074  441970
> >  72  439833  444132
> >  80  148737  439941
> >  88  413714  443343
> >  96  251098  442021
> > 104   70190  443488
> > 112  405507  438866
> > 120  177827  443202
> > 128  399431  441842
> >
> > From the performance point of view, this patch clearly fixes the
> > dramatic TPS collapse shown in these tests.
>
> I think these are really promising results.  Although the perf result
> shows that the bottleneck on the SLRU is no more there with the patch,
> I think it would be nice to see the wait event as well.

Please find attached samples returned by the following query when
testing transaction 1 with 80 subxacts:
SELECT wait_event_type, wait_event, locktype, mode, database,
relation, COUNT(*) from pg_stat_activity AS psa JOIN pg_locks AS pl ON
(psa.pid = pl.pid) GROUP BY 1, 2, 3, 4, 5, 6 ORDER BY 7 DESC;

Regards,


--
Julien Tachoires
EDB

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Tue, 1 Nov 2022 at 08:55, Julien Tachoires <julmon@gmail.com> wrote:
>
> > > 4. Test results with transaction 1
> > >
> > > TPS vs number of sub-transaction
> > >
> > > nsubx  HEAD  patched
> > > --------------------
> > >   0   441109  439474
> > >   8   439045  438103
> > >  16   439123  436993
> > >  24   436269  434194
> > >  32   439707  437429
> > >  40   439997  437220
> > >  48   439388  437422
> > >  56   439409  437210
> > >  64   439748  437366
> > >  72    92869  434448
> > >  80    66577  434100
> > >  88    61243  434255
> > >  96    57016  434419
> > > 104    52132  434917
> > > 112    49181  433755
> > > 120    46581  434044
> > > 128    44067  434268
> > >
> > > Perf profiling on HEAD with 80 sub-transactions:
> > > Overhead  Symbol
> > >   51.26%  [.] LWLockAttemptLock
> > >   24.59%  [.] LWLockRelease
> > >    0.36%  [.] base_yyparse
> > >    0.35%  [.] PinBuffer
> > >    0.34%  [.] AllocSetAlloc
> > >    0.33%  [.] hash_search_with_hash_value
> > >    0.22%  [.] LWLockAcquire
> > >    0.20%  [.] UnpinBuffer
> > >    0.15%  [.] SimpleLruReadPage_ReadOnly
> > >    0.15%  [.] _bt_compare
> > >
> > > Perf profiling on patched with 80 sub-transactions:
> > > Overhead  Symbol
> > >   2.64%  [.] AllocSetAlloc
> > >   2.09%  [.] base_yyparse
> > >   1.76%  [.] hash_search_with_hash_value
> > >   1.62%  [.] LWLockAttemptLock
> > >   1.26%  [.] MemoryContextAllocZeroAligned
> > >   0.93%  [.] _bt_compare
> > >   0.92%  [.] expression_tree_walker_impl.part.4
> > >   0.84%  [.] SearchCatCache1
> > >   0.79%  [.] palloc
> > >   0.64%  [.] core_yylex
> > >
> > > 5. Test results with transaction 2
> > >
> > > nsubx  HEAD  patched
> > > --------------------
> > >   0  440145  443816
> > >   8  438867  443081
> > >  16  438634  441786
> > >  24  436406  440187
> > >  32  439203  442447
> > >  40  439819  443574
> > >  48  439314  442941
> > >  56  439801  443736
> > >  64  439074  441970
> > >  72  439833  444132
> > >  80  148737  439941
> > >  88  413714  443343
> > >  96  251098  442021
> > > 104   70190  443488
> > > 112  405507  438866
> > > 120  177827  443202
> > > 128  399431  441842
> > >
> > > From the performance point of view, this patch clearly fixes the
> > > dramatic TPS collapse shown in these tests.
> >
> > I think these are really promising results.  Although the perf result
> > shows that the bottleneck on the SLRU is no more there with the patch,
> > I think it would be nice to see the wait event as well.
>
> Please find attached samples returned by the following query when
> testing transaction 1 with 80 subxacts:
> SELECT wait_event_type, wait_event, locktype, mode, database,
> relation, COUNT(*) from pg_stat_activity AS psa JOIN pg_locks AS pl ON
> (psa.pid = pl.pid) GROUP BY 1, 2, 3, 4, 5, 6 ORDER BY 7 DESC;

These results are compelling, thank you.

Setting this to Ready for Committer.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Mon, 7 Nov 2022 at 21:14, Simon Riggs <simon.riggs@enterprisedb.com> wrote:

> These results are compelling, thank you.
>
> Setting this to Ready for Committer.

New version attached.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Japin Li
Date:
On Tue, 15 Nov 2022 at 17:34, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> On Mon, 7 Nov 2022 at 21:14, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
>> These results are compelling, thank you.
>>
>> Setting this to Ready for Committer.
>
> New version attached.

Take a quick look, I think it should be PGPROC instead of PG_PROC, right?

+         * 1. When there's no room in PG_PROC, as mentioned above.
+         *    During XactLockTableWait() we sometimes need to know the topxid.
+         *    If there is room in PG_PROC we can get a subxid's topxid direct
+         *    from the procarray if the topxid is still running, using

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Tom Lane
Date:
Simon Riggs <simon.riggs@enterprisedb.com> writes:
> New version attached.

I looked at this patch and I do not see how it can possibly be safe.

The most direct counterexample arises from the fact that
HeapCheckForSerializableConflictOut checks SubTransGetTopmostTransaction
in some cases.  You haven't tried to analyze when, but just disabled
the optimization in serializable mode:

+         * 2. When IsolationIsSerializable() we sometimes need to access topxid.
+         *    This occurs only when SERIALIZABLE is requested by app user.
+...
+        if (MyProc->subxidStatus.overflowed ||
+            IsolationIsSerializable() ||

However, what this is checking is whether *our current transaction*
is serializable.  If we're not serializable, but other transactions
in the system are, then this fails to store information that they'll
need for correctness.  You'd have to have some way of knowing that
no transaction in the system is using serializable mode, and that
none will start to do so while this transaction is still in-doubt.

I fear that's already enough to kill the idea; but there's more.
The subxidStatus.overflowed check quoted above has a similar sort
of myopia: it's checking whether our current transaction has
already suboverflowed.  But (a) that doesn't prove it won't suboverflow
later, and (b) the relevant logic in XidInMVCCSnapshot needs to run
SubTransGetTopmostTransaction if *any* proc in the snapshot has
suboverflowed.

Lastly, I don't see what the "transaction on same page" business
has got to do with anything.  The comment is certainly failing
to make the case that it's safe to skip filling subtrans when that
is true.

I think we could salvage this small idea:

+             * Insert entries into subtrans for this xid, noting that the entry
+             * points directly to the topxid, not the immediate parent. This is
+             * done for two reasons:
+             * (1) so it is faster in a long chain of subxids, because the
+             * algorithm is then O(1), no matter how many subxids are assigned.

but some work would be needed to update the comments around
SubTransGetParent and SubTransGetTopmostTransaction to explain that
they're no longer reliably different.  I think that that is okay for
the existing use-cases, but they'd better be documented.  In fact,
couldn't we simplify them down to one function?  Given the restriction
that we don't look back in pg_subtrans further than TransactionXmin,
I don't think that updated code would ever need to resolve cases
written by older code.  So we could remove the recursive checks
entirely, or at least be confident that they don't recurse more
than once.

            regards, tom lane



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Tue, 15 Nov 2022 at 21:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > New version attached.
>
> I looked at this patch and I do not see how it can possibly be safe.

I grant you it is complex, so please bear with me.


> The most direct counterexample arises from the fact that
> HeapCheckForSerializableConflictOut checks SubTransGetTopmostTransaction
> in some cases.  You haven't tried to analyze when, but just disabled
> the optimization in serializable mode:
>
> +         * 2. When IsolationIsSerializable() we sometimes need to access topxid.
> +         *    This occurs only when SERIALIZABLE is requested by app user.
> +...
> +        if (MyProc->subxidStatus.overflowed ||
> +            IsolationIsSerializable() ||
>
> However, what this is checking is whether *our current transaction*
> is serializable.  If we're not serializable, but other transactions
> in the system are, then this fails to store information that they'll
> need for correctness.  You'd have to have some way of knowing that
> no transaction in the system is using serializable mode, and that
> none will start to do so while this transaction is still in-doubt.

Not true.

src/backend/storage/lmgr/README-SSI   says this...

* Any transaction which is run at a transaction isolation level
other than SERIALIZABLE will not be affected by SSI.  If you want to
enforce business rules through SSI, all transactions should be run at
the SERIALIZABLE transaction isolation level, and that should
probably be set as the default.

If HeapCheckForSerializableConflictOut() cannot find a subxid's parent
then it will not be involved in serialization errors.

So skipping the storage of subxids in subtrans for non-serializable
xacts is valid for both SSI and non-SSI xacts.

Thus the owning transaction can decide to skip the insert into
subtrans if it is not serializable.

> I fear that's already enough to kill the idea; but there's more.
> The subxidStatus.overflowed check quoted above has a similar sort
> of myopia: it's checking whether our current transaction has
> already suboverflowed.  But (a) that doesn't prove it won't suboverflow
> later, and (b) the relevant logic in XidInMVCCSnapshot needs to run
> SubTransGetTopmostTransaction if *any* proc in the snapshot has
> suboverflowed.

Not the way it is coded now.

First, we search the subxid cache in snapshot->subxip.
Then, and only if the snapshot overflowed (i.e. ANY xact overflowed),
do we check subtrans.

Thus, the owning xact knows that anyone else will find the first 64
xids in the subxid cache, so it need not insert them into subtrans,
even if someone else overflowed. When the owning xact overflows, it
knows it must now insert the subxid into subtrans before the xid is
used anywhere in storage, which the patch does. This allows each
owning xact to decide what to do, independent of the actions of
others.

> Lastly, I don't see what the "transaction on same page" business
> has got to do with anything.  The comment is certainly failing
> to make the case that it's safe to skip filling subtrans when that
> is true.

That seems strange, I grant you. It's the same logic that is used in
TransactionIdSetTreeStatus(), in reverse. I understand it 'cos I wrote
it.

TRANSACTION_STATUS_SUB_COMMITTED is only ever used if the topxid and
subxid are on different pages. Therefore TransactionIdDidCommit()
won't ever see a value of TRANSACTION_STATUS_SUB_COMMITTED unless they
are on separate pages. So the owning transaction can predict in
advance whether anyone will ever call SubTransGetParent() for one of
its xids. If they might, then we record the values just in case. If
they NEVER will, then we can skip recording them.


And just to be clear, all 3 of the above preconditions must be true
before the owning xact decides to skip writing a subxid to subtrans.

> I think we could salvage this small idea:
>
> +             * Insert entries into subtrans for this xid, noting that the entry
> +             * points directly to the topxid, not the immediate parent. This is
> +             * done for two reasons:
> +             * (1) so it is faster in a long chain of subxids, because the
> +             * algorithm is then O(1), no matter how many subxids are assigned.
>
> but some work would be needed to update the comments around
> SubTransGetParent and SubTransGetTopmostTransaction to explain that
> they're no longer reliably different.  I think that that is okay for
> the existing use-cases, but they'd better be documented.  In fact,
> couldn't we simplify them down to one function?  Given the restriction
> that we don't look back in pg_subtrans further than TransactionXmin,
> I don't think that updated code would ever need to resolve cases
> written by older code.  So we could remove the recursive checks
> entirely, or at least be confident that they don't recurse more
> than once.

Happy to do so, I'd left it that way to soften the blow of the
absorbing the earlier thoughts.

(Since we know all subxids point directly to parent we know we only
ever need to do one lookup).


I know that if there is a flaw in the above logic then you will find it.

Happy to make any comments changes needed to record the above thoughts
more permanently. I tried, but clearly didn't get everything down
clearly.

Thanks for your detailed thoughts.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Tom Lane
Date:
Simon Riggs <simon.riggs@enterprisedb.com> writes:
> On Tue, 15 Nov 2022 at 21:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The subxidStatus.overflowed check quoted above has a similar sort
>> of myopia: it's checking whether our current transaction has
>> already suboverflowed.  But (a) that doesn't prove it won't suboverflow
>> later, and (b) the relevant logic in XidInMVCCSnapshot needs to run
>> SubTransGetTopmostTransaction if *any* proc in the snapshot has
>> suboverflowed.

> Not the way it is coded now.

> First, we search the subxid cache in snapshot->subxip.
> Then, and only if the snapshot overflowed (i.e. ANY xact overflowed),
> do we check subtrans.

No, that's not what XidInMVCCSnapshot does.  If snapshot->suboverflowed
is set (ie, somebody somewhere/somewhen overflowed), then it does
SubTransGetTopmostTransaction and searches only the xips with the result.
This behavior requires that all live subxids be correctly mapped by
SubTransGetTopmostTransaction, or we'll draw false conclusions.

We could perhaps make it do what you suggest, but that would require
a complete performance analysis to make sure we're not giving up
more than we would gain.

Also, both GetSnapshotData and CopySnapshot assume that the subxips
array is not used if suboverflowed is set, and don't bother
(continuing to) populate it.  So we would need code changes and
additional cycles in those areas too.

I'm not sure about your other claims, but I'm pretty sure this one
point is enough to kill the patch.

            regards, tom lane



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Wed, 16 Nov 2022 at 00:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > On Tue, 15 Nov 2022 at 21:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> The subxidStatus.overflowed check quoted above has a similar sort
> >> of myopia: it's checking whether our current transaction has
> >> already suboverflowed.  But (a) that doesn't prove it won't suboverflow
> >> later, and (b) the relevant logic in XidInMVCCSnapshot needs to run
> >> SubTransGetTopmostTransaction if *any* proc in the snapshot has
> >> suboverflowed.
>
> > Not the way it is coded now.
>
> > First, we search the subxid cache in snapshot->subxip.
> > Then, and only if the snapshot overflowed (i.e. ANY xact overflowed),
> > do we check subtrans.
>
> No, that's not what XidInMVCCSnapshot does. If snapshot->suboverflowed
> is set (ie, somebody somewhere/somewhen overflowed), then it does
> SubTransGetTopmostTransaction and searches only the xips with the result.
> This behavior requires that all live subxids be correctly mapped by
> SubTransGetTopmostTransaction, or we'll draw false conclusions.

Your comments are correct wrt to the existing coding, but not to the
patch, which is coded as described and does not suffer those issues.


> We could perhaps make it do what you suggest,

Already done in the patch since v5.


> but that would require
> a complete performance analysis to make sure we're not giving up
> more than we would gain.

I agree that a full performance analysis is sensible and an objective
analysis has been performed by Julien Tachoires. This is described
here, along with other explanations:
https://docs.google.com/presentation/d/1A7Ar8_LM5EdC2OHL_j3U9J-QwjMiGw9mmXeBLJOmFlg/edit?usp=sharing

It is important to understand the context here: there is already a
well documented LOSS of performance with the current coding. The patch
alleviates that, and I have not been able to find a performance case
where there is any negative impact.

Further tests welcome.


> Also, both GetSnapshotData and CopySnapshot assume that the subxips
> array is not used if suboverflowed is set, and don't bother
> (continuing to) populate it.  So we would need code changes and
> additional cycles in those areas too.

Already done in the patch since v5.

Any additional cycles apply only to the case of snapshot overflow,
which currently performs very badly.


> I'm not sure about your other claims, but I'm pretty sure this one
> point is enough to kill the patch.

Then please look again because there are misunderstandings above.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Dilip Kumar
Date:
On Wed, Nov 16, 2022 at 8:41 AM Simon Riggs
<simon.riggs@enterprisedb.com> wrote:
>
> > No, that's not what XidInMVCCSnapshot does. If snapshot->suboverflowed
> > is set (ie, somebody somewhere/somewhen overflowed), then it does
> > SubTransGetTopmostTransaction and searches only the xips with the result.
> > This behavior requires that all live subxids be correctly mapped by
> > SubTransGetTopmostTransaction, or we'll draw false conclusions.
>
> Your comments are correct wrt to the existing coding, but not to the
> patch, which is coded as described and does not suffer those issues.
>

This will work because of these two changes in patch 1) even though
the snapshot is marked "overflow" we will include all the
subtransactions information in snapshot->subxip. 2) As Simon mentioned
in XidInMVCCSnapshot(), first, we search the subxip cache in
snapshot->subxip, and only if it is not found in that we will look
into the SLRU.  So now because of 1) we will always find any
concurrent subtransaction in "snapshot->subxip".

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Tom Lane
Date:
Simon Riggs <simon.riggs@enterprisedb.com> writes:
> On Wed, 16 Nov 2022 at 00:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> No, that's not what XidInMVCCSnapshot does. If snapshot->suboverflowed
>> is set (ie, somebody somewhere/somewhen overflowed), then it does
>> SubTransGetTopmostTransaction and searches only the xips with the result.
>> This behavior requires that all live subxids be correctly mapped by
>> SubTransGetTopmostTransaction, or we'll draw false conclusions.

> Your comments are correct wrt to the existing coding, but not to the
> patch, which is coded as described and does not suffer those issues.

Ah, OK.

Still ... I really do not like this patch.  It introduces a number of
extremely fragile assumptions, and I think those would come back to
bite us someday, even if they hold now which I'm still unsure about.
It doesn't help that you've chosen to document them only at the place
making them and not at the place(s) likely to break them.

Also, to be blunt, this is not Ready For Committer.  It's more WIP,
because even if the code is okay there are comments all over the system
that you've invalidated.  (At the very least, the file header comments
in subtrans.c and the comments in struct SnapshotData need work; I've
not looked hard but I'm sure there are more places with comments
bearing on these data structures.)

Perhaps it would be a good idea to split up the patch.  The business
about making pg_subtrans flat rather than a tree seems like a good
idea in any event, although as I said it doesn't seem like we've got
a fleshed-out version of that here.  We could push forward on getting
that done and then separately consider the rest of it.

            regards, tom lane



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Wed, 16 Nov 2022 at 15:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Simon Riggs <simon.riggs@enterprisedb.com> writes:
> > On Wed, 16 Nov 2022 at 00:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> No, that's not what XidInMVCCSnapshot does. If snapshot->suboverflowed
> >> is set (ie, somebody somewhere/somewhen overflowed), then it does
> >> SubTransGetTopmostTransaction and searches only the xips with the result.
> >> This behavior requires that all live subxids be correctly mapped by
> >> SubTransGetTopmostTransaction, or we'll draw false conclusions.
>
> > Your comments are correct wrt to the existing coding, but not to the
> > patch, which is coded as described and does not suffer those issues.
>
> Ah, OK.
>
> Still ... I really do not like this patch.  It introduces a number of
> extremely fragile assumptions, and I think those would come back to
> bite us someday, even if they hold now which I'm still unsure about.

Completely understand. It took me months to think this through.

> It doesn't help that you've chosen to document them only at the place
> making them and not at the place(s) likely to break them.

Yes, apologies for that, I focused on the holistic explanation in the slides.

> Also, to be blunt, this is not Ready For Committer.  It's more WIP,
> because even if the code is okay there are comments all over the system
> that you've invalidated.  (At the very least, the file header comments
> in subtrans.c and the comments in struct SnapshotData need work; I've
> not looked hard but I'm sure there are more places with comments
> bearing on these data structures.)

New version with greatly improved comments coming very soon.

> Perhaps it would be a good idea to split up the patch.  The business
> about making pg_subtrans flat rather than a tree seems like a good
> idea in any event, although as I said it doesn't seem like we've got
> a fleshed-out version of that here.  We could push forward on getting
> that done and then separately consider the rest of it.

Yes, I thought you might ask that so, after some thought, have found a
clean way to do that and have split this into two parts.

Julien has agreed to do further perf tests and is working on that now.

I will post new versions soon, earliest tomorrow.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Thu, 17 Nov 2022 at 17:04, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>
> New version with greatly improved comments coming very soon.

> > Perhaps it would be a good idea to split up the patch.  The business
> > about making pg_subtrans flat rather than a tree seems like a good
> > idea in any event, although as I said it doesn't seem like we've got
> > a fleshed-out version of that here.  We could push forward on getting
> > that done and then separately consider the rest of it.
>
> Yes, I thought you might ask that so, after some thought, have found a
> clean way to do that and have split this into two parts.

Attached.

002 includes many comment revisions, as well as flattening the loops
in SubTransGetTopmostTransaction and TransactionIdDidCommit/Abort
003 includes the idea to not-always do SubTransSetParent()

> Julien has agreed to do further perf tests and is working on that now.
>
> I will post new versions soon, earliest tomorrow.

Julien's results show that 002 patch on its own is probably all we
need, but I'm posting 003 also in case that situation changes based on
other later results with different test cases.

Detailed numbers shown here, plus graph derived from them - thanks Julien!

nsubxacts  HEAD (3d0c95)  patched 002-v13  patched 002+003-v13
0          434161         436778           437287
8          432619         434718           435381
16         432856         434710           435092
24         429954         431835           431974
32         434643         436134           436793
40         433939         436121           435622
48         434503         434368           435662
56         432965         434229           436182
64         433672         433951           436192
72          93555         431626           433551
80          66642         431421           434305
88          61349         432776           433664
96          55892         432306           434212
104         52270         432571           434133
112         49166         433655           434754
120         46477         432817           434104
128         43226         432258           432611
(yes, the last line shows x10 performance patched, that is not a typo)

--
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Tomas Vondra
Date:

On 11/17/22 18:29, Simon Riggs wrote:
> On Thu, 17 Nov 2022 at 17:04, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
>>
>> New version with greatly improved comments coming very soon.
> 
>>> Perhaps it would be a good idea to split up the patch.  The business
>>> about making pg_subtrans flat rather than a tree seems like a good
>>> idea in any event, although as I said it doesn't seem like we've got
>>> a fleshed-out version of that here.  We could push forward on getting
>>> that done and then separately consider the rest of it.
>>
>> Yes, I thought you might ask that so, after some thought, have found a
>> clean way to do that and have split this into two parts.
> 
> Attached.
> 
> 002 includes many comment revisions, as well as flattening the loops
> in SubTransGetTopmostTransaction and TransactionIdDidCommit/Abort
> 003 includes the idea to not-always do SubTransSetParent()
> 

I'm a bit confused by the TransactionIdsAreOnSameXactPage naming. Isn't
this really checking clog pages?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Thu, 17 Nov 2022 at 20:29, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 11/17/22 18:29, Simon Riggs wrote:
> > On Thu, 17 Nov 2022 at 17:04, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> >>
> > 003 includes the idea to not-always do SubTransSetParent()
> >
> I'm a bit confused by the TransactionIdsAreOnSameXactPage naming. Isn't
> this really checking clog pages?

Yes, clog page. I named it to match the new name of pg_xact

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Simon Riggs
Date:
On Thu, 17 Nov 2022 at 17:29, Simon Riggs <simon.riggs@enterprisedb.com> wrote:

> (yes, the last line shows x10 performance patched, that is not a typo)

New version of patch, now just a one-line patch!

Results show it's still a good win for XidInMVCCSnapshot().

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Tomas Vondra
Date:
On 11/29/22 13:49, Simon Riggs wrote:
> On Thu, 17 Nov 2022 at 17:29, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> 
>> (yes, the last line shows x10 performance patched, that is not a typo)
> 
> New version of patch, now just a one-line patch!
> 
> Results show it's still a good win for XidInMVCCSnapshot().
> 

I'm a bit confused - for which workload/benchmark are there results?
It's generally a good idea to share the scripts used to run the test and
not just a chart.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Julien Tachoires
Date:
Hi Tomas,

Le mar. 29 nov. 2022 à 14:06, Tomas Vondra
<tomas.vondra@enterprisedb.com> a écrit :
>
> On 11/29/22 13:49, Simon Riggs wrote:
> > On Thu, 17 Nov 2022 at 17:29, Simon Riggs <simon.riggs@enterprisedb.com> wrote:
> >
> >> (yes, the last line shows x10 performance patched, that is not a typo)
> >
> > New version of patch, now just a one-line patch!
> >
> > Results show it's still a good win for XidInMVCCSnapshot().
> >
>
> I'm a bit confused - for which workload/benchmark are there results?
> It's generally a good idea to share the scripts used to run the test and
> not just a chart.

The scripts have been attached to this thread with the initial
performance results.
Anyway, re-sending those (including a minor fix).

--
Julien Tachoires
EDB

Attachment

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Tom Lane
Date:
Simon Riggs <simon.riggs@enterprisedb.com> writes:
> New version of patch, now just a one-line patch!

Of course, there's all the documentation and comments that you falsified.
Also, what of the SubTransSetParent call in ProcessTwoPhaseBuffer?

(The one in ProcArrayApplyXidAssignment is actually okay, though the
comment making excuses for it no longer is.)

Also, if we're going to go over to a one-level structure in pg_subtrans,
we really ought to simplify the code in subtrans.c accordingly, and
get rid of the extra lookup currently done for the top parent's parent.

I still wonder whether we'll regret losing information about the
subtransaction tree structure, as discussed in the other thread [1].
That seems like the main barrier to proceeding with this.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CANbhV-HYfP0ebZRERkpt84ZCDsNX-UYJGYsjfS88jtbYzY%2BKcQ%40mail.gmail.com



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Andres Freund
Date:
Hi,

On 2022-11-29 13:30:02 -0500, Tom Lane wrote:
> I still wonder whether we'll regret losing information about the
> subtransaction tree structure, as discussed in the other thread [1].
> That seems like the main barrier to proceeding with this.

Yea, this has me worried. I suspect that we have a bunch of places relying on
this. I'm not at all convinced that optimizing XidInMVCCSnapshot() is a good
reason for this structural change, given that there's other possible ways to
optimize (e.g. my proposal to add overflowed subxids to the Snapshot during
lookup when there's space).

Greetings,

Andres Freund



Re: SUBTRANS: Minimizing calls to SubTransSetParent()

From
Robert Haas
Date:
On Tue, Nov 29, 2022 at 1:35 PM Andres Freund <andres@anarazel.de> wrote:
> On 2022-11-29 13:30:02 -0500, Tom Lane wrote:
> > I still wonder whether we'll regret losing information about the
> > subtransaction tree structure, as discussed in the other thread [1].
> > That seems like the main barrier to proceeding with this.
>
> Yea, this has me worried.

Me, too.

-- 
Robert Haas
EDB: http://www.enterprisedb.com