Thread: Maybe some more low-hanging fruit in the latestCompletedXid patch.

Maybe some more low-hanging fruit in the latestCompletedXid patch.

From
"Florian G. Pflug"
Date:
Hi

I've already posted this idea, but I feel that I did explain it
rather badly. So here comes a new try.

Currently, we do not assume that either the childXids array, nor
the xid cache in the proc array are sorted by ascending xid order.
I believe that we could simplify the code, further reduce the locking
requirements, and enabled a transaction to de-overflow it's xid cache
if we assume that those arrays are in ascending xid order.

** Sortedness **
Presumably, the existing code *already* guarantees that both the childXids
and the xid cache *are* in fact sorted by ascending xid order - due to
the following reasons:

A transaction (including subtransactions) form a tree. Each xact (or node)
may have any number of children - assume that they are added "left to right"
in the order of their creation. For ever (sub)xact, all subtrees but the
rightmost one is "closed" (meaning that no new nodes will ever be created inside
them). That last assertion follows from the fact that each non-rightmost subtree
is either aborted, or subcommitted.

Since adding a new sibling to a (sub) xact is only possible if all the other 
subtrees are subcommitted or aborted, it will surely get a higher xid than any 
node in any sibling subtree. Since the xids in the subcommitted siblings where 
already added to the original (sub) xact upon subcommit, the childXids of that
xact will have the following structure:
<xids of 1. subcommited child> <xids of 2. subcommited child> ...
Each of the sublists are guaranteed to include only xids larger than those
inside their predecessors - which leads to a completely sorted array in
the whole.

The proc-array cache is surely sorted by ascending xid order, because we
add entries while we generate them (until we overflow).

** Benefits ***
Since we know that both the childXids and the xid cache array are sorted,
removing entries from the cache becomes much easier. Since we can only abort
a rightmost subtree (All others are either already aborted, or subcommitted),
we know that the to-be-aborted xids are the N largest xids in our whole
transactions. To remove them from the proc array we overwrite their xids
with InvalidTransactionId from right-to-left, and afterwards set the new
array length.

This allows as to replace the ExclusiveLock with a ShareLock. Since we
overwrite the xids right-to-left, a snapshot take concurrently will miss
some of the right-most xids. This is exactly the same situation as if
we had aborted those xids one-by-one, instead of all in one go.

(Xmin calculation are completely unaffected - the toplevel xid *not*
removed this way, and that one is surely smaller than any subxact xid)

The rather easy xid-cache remove algorithms will also detect if a previously
overflowed cache will be de-overflowed by the removal. (This needs a little
bit more checking - de-overflowing the xid cache is a bit trickey because
we must not race with TransactionIdIsInProgress. But that seems doable,
it'll just need the right ordering of things - maybe updating the CLOG
*before* updating the proc array is already sufficient even).

If there is interest in doing this, I can come up with a patch.

greetings, Florian Pflug




Re: Maybe some more low-hanging fruit in the latestCompletedXid patch.

From
Tom Lane
Date:
"Florian G. Pflug" <fgp@phlo.org> writes:
> Currently, we do not assume that either the childXids array, nor
> the xid cache in the proc array are sorted by ascending xid order.
> I believe that we could simplify the code, further reduce the locking
> requirements, and enabled a transaction to de-overflow it's xid cache
> if we assume that those arrays are in ascending xid order.

"de-overflowing" the cache sounds completely unsafe, as other backends
need that state to determine whether they need to look into pg_subtrans.
I still don't believe you can avoid taking exclusive lock, either; your
argument here did not address latestCompletedXid.

But the main point remains this: there is no evidence whatsoever that
these code paths are sufficiently performance-critical to be worth
speeding up by making the code more fragile.
        regards, tom lane


Re: Maybe some more low-hanging fruit in the latestCompletedXid patch.

From
"Florian G. Pflug"
Date:
Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> Currently, we do not assume that either the childXids array, nor the xid
>> cache in the proc array are sorted by ascending xid order. I believe that
>> we could simplify the code, further reduce the locking requirements, and
>> enabled a transaction to de-overflow it's xid cache if we assume that those
>> arrays are in ascending xid order.
> 
> "de-overflowing" the cache sounds completely unsafe, as other backends need
> that state to determine whether they need to look into pg_subtrans.

We'd only de-overflow if we abort *all* xids that are missing from the
xid cache. And only after marking them as aborted in the clog. If someone
concurrently checks for an overflow, and already sees the new (non-overflowed)
state, than he'll assume the xid is not running if he hasn't found it in
the array. Which is correct - we just aborted it.

Plus, removing the exclusive lock doesn't depend on de-overflowing. It's
just something that seems rather easy to do once the subxid handling is
in a state that allows concurrent removal of entries. If it turns out that
it's not that easy, than I'll just drop the idea again.

> I still don't believe you can avoid taking exclusive lock, either; your 
> argument here did not address latestCompletedXid.

Sorry, not addressing latestCompletedXid was an oversight :-(.
My point is the we only *need* to advance latestCompletedXid on COMMITS. We do
so for aborts only to avoid running with unnecessarily low xmins after
a transaction ABORT. That corner case can only happen after a toplevel
ABORT, though - aborting subxacts cannot change the xmin, because the
toplevel xact will have a lower xid than any of it's subtransactions anyway.

We can therefore just remember the largest assigned xid for a given transaction,
and update latestCompletedXid to that on toplevel commit or abort. That
prevents that corner-case too, without updating latestCompletedXid during
subxact abort.

> But the main point remains this: there is no evidence whatsoever that these
> code paths are sufficiently performance-critical to be worth speeding up by
> making the code more fragile.

The gain will be less than that of the locking improvements done so far.
It will be a win for heavy users of plpgsql BEGIN/END/EXCEPTION blocks,
though I think.

We'll also save some cycles in TransactionIdIsInProgress, because we can
use a binary search, but that's just an added bonus.

I'm currently trying to code up a patch, since it's easier to judge the
correctness of actual code than that of a mere proposals. I'll do some
benchmarking when the patch is done to see if it brings measurable benefits.

greetings, Florian Pflug



Re: Maybe some more low-hanging fruit in the latestCompletedXid patch.

From
Bruce Momjian
Date:
Is this a TODO?

---------------------------------------------------------------------------

Florian G. Pflug wrote:
> Tom Lane wrote:
> > "Florian G. Pflug" <fgp@phlo.org> writes:
> >> Currently, we do not assume that either the childXids array, nor the xid
> >> cache in the proc array are sorted by ascending xid order. I believe that
> >> we could simplify the code, further reduce the locking requirements, and
> >> enabled a transaction to de-overflow it's xid cache if we assume that those
> >> arrays are in ascending xid order.
> > 
> > "de-overflowing" the cache sounds completely unsafe, as other backends need
> > that state to determine whether they need to look into pg_subtrans.
> 
> We'd only de-overflow if we abort *all* xids that are missing from the
> xid cache. And only after marking them as aborted in the clog. If someone
> concurrently checks for an overflow, and already sees the new (non-overflowed)
> state, than he'll assume the xid is not running if he hasn't found it in
> the array. Which is correct - we just aborted it.
> 
> Plus, removing the exclusive lock doesn't depend on de-overflowing. It's
> just something that seems rather easy to do once the subxid handling is
> in a state that allows concurrent removal of entries. If it turns out that
> it's not that easy, than I'll just drop the idea again.
> 
> > I still don't believe you can avoid taking exclusive lock, either; your 
> > argument here did not address latestCompletedXid.
> 
> Sorry, not addressing latestCompletedXid was an oversight :-(.
> My point is the we only *need* to advance latestCompletedXid on COMMITS. We do
> so for aborts only to avoid running with unnecessarily low xmins after
> a transaction ABORT. That corner case can only happen after a toplevel
> ABORT, though - aborting subxacts cannot change the xmin, because the
> toplevel xact will have a lower xid than any of it's subtransactions anyway.
> 
> We can therefore just remember the largest assigned xid for a given transaction,
> and update latestCompletedXid to that on toplevel commit or abort. That
> prevents that corner-case too, without updating latestCompletedXid during
> subxact abort.
> 
> > But the main point remains this: there is no evidence whatsoever that these
> > code paths are sufficiently performance-critical to be worth speeding up by
> > making the code more fragile.
> 
> The gain will be less than that of the locking improvements done so far.
> It will be a win for heavy users of plpgsql BEGIN/END/EXCEPTION blocks,
> though I think.
> 
> We'll also save some cycles in TransactionIdIsInProgress, because we can
> use a binary search, but that's just an added bonus.
> 
> I'm currently trying to code up a patch, since it's easier to judge the
> correctness of actual code than that of a mere proposals. I'll do some
> benchmarking when the patch is done to see if it brings measurable benefits.
> 
> greetings, Florian Pflug
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Maybe some more low-hanging fruit in the latestCompletedXid patch.

From
"Florian G. Pflug"
Date:
Bruce Momjian wrote:
> Is this a TODO?

It's for from clear that avoing an exclusive ProcArray lock on subxact 
abort will bring a measurable performance benefit, so probably not.

I've actually coded a prototype for this a few months ago, to
check if it would bring any benefit at all, though I ran out of time 
before I had time to benchmark this, and I probably also lack the 
hardware for running high-concurrency tests.

> ---------------------------------------------------------------------------
> Florian G. Pflug wrote:
>> Tom Lane wrote:
>>> "Florian G. Pflug" <fgp@phlo.org> writes:
>>>> Currently, we do not assume that either the childXids array, nor the xid
>>>> cache in the proc array are sorted by ascending xid order. I believe that
>>>> we could simplify the code, further reduce the locking requirements, and
>>>> enabled a transaction to de-overflow it's xid cache if we assume that those
>>>> arrays are in ascending xid order.
>>> "de-overflowing" the cache sounds completely unsafe, as other backends need
>>> that state to determine whether they need to look into pg_subtrans.
>> We'd only de-overflow if we abort *all* xids that are missing from the
>> xid cache. And only after marking them as aborted in the clog. If someone
>> concurrently checks for an overflow, and already sees the new (non-overflowed)
>> state, than he'll assume the xid is not running if he hasn't found it in
>> the array. Which is correct - we just aborted it.
>>
>> Plus, removing the exclusive lock doesn't depend on de-overflowing. It's
>> just something that seems rather easy to do once the subxid handling is
>> in a state that allows concurrent removal of entries. If it turns out that
>> it's not that easy, than I'll just drop the idea again.
>>
>>> I still don't believe you can avoid taking exclusive lock, either; your 
>>> argument here did not address latestCompletedXid.
>> Sorry, not addressing latestCompletedXid was an oversight :-(.
>> My point is the we only *need* to advance latestCompletedXid on COMMITS. We do
>> so for aborts only to avoid running with unnecessarily low xmins after
>> a transaction ABORT. That corner case can only happen after a toplevel
>> ABORT, though - aborting subxacts cannot change the xmin, because the
>> toplevel xact will have a lower xid than any of it's subtransactions anyway.
>>
>> We can therefore just remember the largest assigned xid for a given transaction,
>> and update latestCompletedXid to that on toplevel commit or abort. That
>> prevents that corner-case too, without updating latestCompletedXid during
>> subxact abort.
>>
>>> But the main point remains this: there is no evidence whatsoever that these
>>> code paths are sufficiently performance-critical to be worth speeding up by
>>> making the code more fragile.
>> The gain will be less than that of the locking improvements done so far.
>> It will be a win for heavy users of plpgsql BEGIN/END/EXCEPTION blocks,
>> though I think.
>>
>> We'll also save some cycles in TransactionIdIsInProgress, because we can
>> use a binary search, but that's just an added bonus.
>>
>> I'm currently trying to code up a patch, since it's easier to judge the
>> correctness of actual code than that of a mere proposals. I'll do some
>> benchmarking when the patch is done to see if it brings measurable benefits.


Re: Maybe some more low-hanging fruit in the latestCompletedXid patch.

From
Bruce Momjian
Date:
Thanks for the feedback.

---------------------------------------------------------------------------

Florian G. Pflug wrote:
> Bruce Momjian wrote:
> > Is this a TODO?
> 
> It's for from clear that avoing an exclusive ProcArray lock on subxact 
> abort will bring a measurable performance benefit, so probably not.
> 
> I've actually coded a prototype for this a few months ago, to
> check if it would bring any benefit at all, though I ran out of time 
> before I had time to benchmark this, and I probably also lack the 
> hardware for running high-concurrency tests.
> 
> > ---------------------------------------------------------------------------
> > Florian G. Pflug wrote:
> >> Tom Lane wrote:
> >>> "Florian G. Pflug" <fgp@phlo.org> writes:
> >>>> Currently, we do not assume that either the childXids array, nor the xid
> >>>> cache in the proc array are sorted by ascending xid order. I believe that
> >>>> we could simplify the code, further reduce the locking requirements, and
> >>>> enabled a transaction to de-overflow it's xid cache if we assume that those
> >>>> arrays are in ascending xid order.
> >>> "de-overflowing" the cache sounds completely unsafe, as other backends need
> >>> that state to determine whether they need to look into pg_subtrans.
> >> We'd only de-overflow if we abort *all* xids that are missing from the
> >> xid cache. And only after marking them as aborted in the clog. If someone
> >> concurrently checks for an overflow, and already sees the new (non-overflowed)
> >> state, than he'll assume the xid is not running if he hasn't found it in
> >> the array. Which is correct - we just aborted it.
> >>
> >> Plus, removing the exclusive lock doesn't depend on de-overflowing. It's
> >> just something that seems rather easy to do once the subxid handling is
> >> in a state that allows concurrent removal of entries. If it turns out that
> >> it's not that easy, than I'll just drop the idea again.
> >>
> >>> I still don't believe you can avoid taking exclusive lock, either; your 
> >>> argument here did not address latestCompletedXid.
> >> Sorry, not addressing latestCompletedXid was an oversight :-(.
> >> My point is the we only *need* to advance latestCompletedXid on COMMITS. We do
> >> so for aborts only to avoid running with unnecessarily low xmins after
> >> a transaction ABORT. That corner case can only happen after a toplevel
> >> ABORT, though - aborting subxacts cannot change the xmin, because the
> >> toplevel xact will have a lower xid than any of it's subtransactions anyway.
> >>
> >> We can therefore just remember the largest assigned xid for a given transaction,
> >> and update latestCompletedXid to that on toplevel commit or abort. That
> >> prevents that corner-case too, without updating latestCompletedXid during
> >> subxact abort.
> >>
> >>> But the main point remains this: there is no evidence whatsoever that these
> >>> code paths are sufficiently performance-critical to be worth speeding up by
> >>> making the code more fragile.
> >> The gain will be less than that of the locking improvements done so far.
> >> It will be a win for heavy users of plpgsql BEGIN/END/EXCEPTION blocks,
> >> though I think.
> >>
> >> We'll also save some cycles in TransactionIdIsInProgress, because we can
> >> use a binary search, but that's just an added bonus.
> >>
> >> I'm currently trying to code up a patch, since it's easier to judge the
> >> correctness of actual code than that of a mere proposals. I'll do some
> >> benchmarking when the patch is done to see if it brings measurable benefits.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +