Thread: Maybe some more low-hanging fruit in the latestCompletedXid patch.
Hi I've already posted this idea, but I feel that I did explain it rather badly. So here comes a new try. Currently, we do not assume that either the childXids array, nor the xid cache in the proc array are sorted by ascending xid order. I believe that we could simplify the code, further reduce the locking requirements, and enabled a transaction to de-overflow it's xid cache if we assume that those arrays are in ascending xid order. ** Sortedness ** Presumably, the existing code *already* guarantees that both the childXids and the xid cache *are* in fact sorted by ascending xid order - due to the following reasons: A transaction (including subtransactions) form a tree. Each xact (or node) may have any number of children - assume that they are added "left to right" in the order of their creation. For ever (sub)xact, all subtrees but the rightmost one is "closed" (meaning that no new nodes will ever be created inside them). That last assertion follows from the fact that each non-rightmost subtree is either aborted, or subcommitted. Since adding a new sibling to a (sub) xact is only possible if all the other subtrees are subcommitted or aborted, it will surely get a higher xid than any node in any sibling subtree. Since the xids in the subcommitted siblings where already added to the original (sub) xact upon subcommit, the childXids of that xact will have the following structure: <xids of 1. subcommited child> <xids of 2. subcommited child> ... Each of the sublists are guaranteed to include only xids larger than those inside their predecessors - which leads to a completely sorted array in the whole. The proc-array cache is surely sorted by ascending xid order, because we add entries while we generate them (until we overflow). ** Benefits *** Since we know that both the childXids and the xid cache array are sorted, removing entries from the cache becomes much easier. Since we can only abort a rightmost subtree (All others are either already aborted, or subcommitted), we know that the to-be-aborted xids are the N largest xids in our whole transactions. To remove them from the proc array we overwrite their xids with InvalidTransactionId from right-to-left, and afterwards set the new array length. This allows as to replace the ExclusiveLock with a ShareLock. Since we overwrite the xids right-to-left, a snapshot take concurrently will miss some of the right-most xids. This is exactly the same situation as if we had aborted those xids one-by-one, instead of all in one go. (Xmin calculation are completely unaffected - the toplevel xid *not* removed this way, and that one is surely smaller than any subxact xid) The rather easy xid-cache remove algorithms will also detect if a previously overflowed cache will be de-overflowed by the removal. (This needs a little bit more checking - de-overflowing the xid cache is a bit trickey because we must not race with TransactionIdIsInProgress. But that seems doable, it'll just need the right ordering of things - maybe updating the CLOG *before* updating the proc array is already sufficient even). If there is interest in doing this, I can come up with a patch. greetings, Florian Pflug
"Florian G. Pflug" <fgp@phlo.org> writes: > Currently, we do not assume that either the childXids array, nor > the xid cache in the proc array are sorted by ascending xid order. > I believe that we could simplify the code, further reduce the locking > requirements, and enabled a transaction to de-overflow it's xid cache > if we assume that those arrays are in ascending xid order. "de-overflowing" the cache sounds completely unsafe, as other backends need that state to determine whether they need to look into pg_subtrans. I still don't believe you can avoid taking exclusive lock, either; your argument here did not address latestCompletedXid. But the main point remains this: there is no evidence whatsoever that these code paths are sufficiently performance-critical to be worth speeding up by making the code more fragile. regards, tom lane
Re: Maybe some more low-hanging fruit in the latestCompletedXid patch.
From
"Florian G. Pflug"
Date:
Tom Lane wrote: > "Florian G. Pflug" <fgp@phlo.org> writes: >> Currently, we do not assume that either the childXids array, nor the xid >> cache in the proc array are sorted by ascending xid order. I believe that >> we could simplify the code, further reduce the locking requirements, and >> enabled a transaction to de-overflow it's xid cache if we assume that those >> arrays are in ascending xid order. > > "de-overflowing" the cache sounds completely unsafe, as other backends need > that state to determine whether they need to look into pg_subtrans. We'd only de-overflow if we abort *all* xids that are missing from the xid cache. And only after marking them as aborted in the clog. If someone concurrently checks for an overflow, and already sees the new (non-overflowed) state, than he'll assume the xid is not running if he hasn't found it in the array. Which is correct - we just aborted it. Plus, removing the exclusive lock doesn't depend on de-overflowing. It's just something that seems rather easy to do once the subxid handling is in a state that allows concurrent removal of entries. If it turns out that it's not that easy, than I'll just drop the idea again. > I still don't believe you can avoid taking exclusive lock, either; your > argument here did not address latestCompletedXid. Sorry, not addressing latestCompletedXid was an oversight :-(. My point is the we only *need* to advance latestCompletedXid on COMMITS. We do so for aborts only to avoid running with unnecessarily low xmins after a transaction ABORT. That corner case can only happen after a toplevel ABORT, though - aborting subxacts cannot change the xmin, because the toplevel xact will have a lower xid than any of it's subtransactions anyway. We can therefore just remember the largest assigned xid for a given transaction, and update latestCompletedXid to that on toplevel commit or abort. That prevents that corner-case too, without updating latestCompletedXid during subxact abort. > But the main point remains this: there is no evidence whatsoever that these > code paths are sufficiently performance-critical to be worth speeding up by > making the code more fragile. The gain will be less than that of the locking improvements done so far. It will be a win for heavy users of plpgsql BEGIN/END/EXCEPTION blocks, though I think. We'll also save some cycles in TransactionIdIsInProgress, because we can use a binary search, but that's just an added bonus. I'm currently trying to code up a patch, since it's easier to judge the correctness of actual code than that of a mere proposals. I'll do some benchmarking when the patch is done to see if it brings measurable benefits. greetings, Florian Pflug
Is this a TODO? --------------------------------------------------------------------------- Florian G. Pflug wrote: > Tom Lane wrote: > > "Florian G. Pflug" <fgp@phlo.org> writes: > >> Currently, we do not assume that either the childXids array, nor the xid > >> cache in the proc array are sorted by ascending xid order. I believe that > >> we could simplify the code, further reduce the locking requirements, and > >> enabled a transaction to de-overflow it's xid cache if we assume that those > >> arrays are in ascending xid order. > > > > "de-overflowing" the cache sounds completely unsafe, as other backends need > > that state to determine whether they need to look into pg_subtrans. > > We'd only de-overflow if we abort *all* xids that are missing from the > xid cache. And only after marking them as aborted in the clog. If someone > concurrently checks for an overflow, and already sees the new (non-overflowed) > state, than he'll assume the xid is not running if he hasn't found it in > the array. Which is correct - we just aborted it. > > Plus, removing the exclusive lock doesn't depend on de-overflowing. It's > just something that seems rather easy to do once the subxid handling is > in a state that allows concurrent removal of entries. If it turns out that > it's not that easy, than I'll just drop the idea again. > > > I still don't believe you can avoid taking exclusive lock, either; your > > argument here did not address latestCompletedXid. > > Sorry, not addressing latestCompletedXid was an oversight :-(. > My point is the we only *need* to advance latestCompletedXid on COMMITS. We do > so for aborts only to avoid running with unnecessarily low xmins after > a transaction ABORT. That corner case can only happen after a toplevel > ABORT, though - aborting subxacts cannot change the xmin, because the > toplevel xact will have a lower xid than any of it's subtransactions anyway. > > We can therefore just remember the largest assigned xid for a given transaction, > and update latestCompletedXid to that on toplevel commit or abort. That > prevents that corner-case too, without updating latestCompletedXid during > subxact abort. > > > But the main point remains this: there is no evidence whatsoever that these > > code paths are sufficiently performance-critical to be worth speeding up by > > making the code more fragile. > > The gain will be less than that of the locking improvements done so far. > It will be a win for heavy users of plpgsql BEGIN/END/EXCEPTION blocks, > though I think. > > We'll also save some cycles in TransactionIdIsInProgress, because we can > use a binary search, but that's just an added bonus. > > I'm currently trying to code up a patch, since it's easier to judge the > correctness of actual code than that of a mere proposals. I'll do some > benchmarking when the patch is done to see if it brings measurable benefits. > > greetings, Florian Pflug > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: Maybe some more low-hanging fruit in the latestCompletedXid patch.
From
"Florian G. Pflug"
Date:
Bruce Momjian wrote: > Is this a TODO? It's for from clear that avoing an exclusive ProcArray lock on subxact abort will bring a measurable performance benefit, so probably not. I've actually coded a prototype for this a few months ago, to check if it would bring any benefit at all, though I ran out of time before I had time to benchmark this, and I probably also lack the hardware for running high-concurrency tests. > --------------------------------------------------------------------------- > Florian G. Pflug wrote: >> Tom Lane wrote: >>> "Florian G. Pflug" <fgp@phlo.org> writes: >>>> Currently, we do not assume that either the childXids array, nor the xid >>>> cache in the proc array are sorted by ascending xid order. I believe that >>>> we could simplify the code, further reduce the locking requirements, and >>>> enabled a transaction to de-overflow it's xid cache if we assume that those >>>> arrays are in ascending xid order. >>> "de-overflowing" the cache sounds completely unsafe, as other backends need >>> that state to determine whether they need to look into pg_subtrans. >> We'd only de-overflow if we abort *all* xids that are missing from the >> xid cache. And only after marking them as aborted in the clog. If someone >> concurrently checks for an overflow, and already sees the new (non-overflowed) >> state, than he'll assume the xid is not running if he hasn't found it in >> the array. Which is correct - we just aborted it. >> >> Plus, removing the exclusive lock doesn't depend on de-overflowing. It's >> just something that seems rather easy to do once the subxid handling is >> in a state that allows concurrent removal of entries. If it turns out that >> it's not that easy, than I'll just drop the idea again. >> >>> I still don't believe you can avoid taking exclusive lock, either; your >>> argument here did not address latestCompletedXid. >> Sorry, not addressing latestCompletedXid was an oversight :-(. >> My point is the we only *need* to advance latestCompletedXid on COMMITS. We do >> so for aborts only to avoid running with unnecessarily low xmins after >> a transaction ABORT. That corner case can only happen after a toplevel >> ABORT, though - aborting subxacts cannot change the xmin, because the >> toplevel xact will have a lower xid than any of it's subtransactions anyway. >> >> We can therefore just remember the largest assigned xid for a given transaction, >> and update latestCompletedXid to that on toplevel commit or abort. That >> prevents that corner-case too, without updating latestCompletedXid during >> subxact abort. >> >>> But the main point remains this: there is no evidence whatsoever that these >>> code paths are sufficiently performance-critical to be worth speeding up by >>> making the code more fragile. >> The gain will be less than that of the locking improvements done so far. >> It will be a win for heavy users of plpgsql BEGIN/END/EXCEPTION blocks, >> though I think. >> >> We'll also save some cycles in TransactionIdIsInProgress, because we can >> use a binary search, but that's just an added bonus. >> >> I'm currently trying to code up a patch, since it's easier to judge the >> correctness of actual code than that of a mere proposals. I'll do some >> benchmarking when the patch is done to see if it brings measurable benefits.
Thanks for the feedback. --------------------------------------------------------------------------- Florian G. Pflug wrote: > Bruce Momjian wrote: > > Is this a TODO? > > It's for from clear that avoing an exclusive ProcArray lock on subxact > abort will bring a measurable performance benefit, so probably not. > > I've actually coded a prototype for this a few months ago, to > check if it would bring any benefit at all, though I ran out of time > before I had time to benchmark this, and I probably also lack the > hardware for running high-concurrency tests. > > > --------------------------------------------------------------------------- > > Florian G. Pflug wrote: > >> Tom Lane wrote: > >>> "Florian G. Pflug" <fgp@phlo.org> writes: > >>>> Currently, we do not assume that either the childXids array, nor the xid > >>>> cache in the proc array are sorted by ascending xid order. I believe that > >>>> we could simplify the code, further reduce the locking requirements, and > >>>> enabled a transaction to de-overflow it's xid cache if we assume that those > >>>> arrays are in ascending xid order. > >>> "de-overflowing" the cache sounds completely unsafe, as other backends need > >>> that state to determine whether they need to look into pg_subtrans. > >> We'd only de-overflow if we abort *all* xids that are missing from the > >> xid cache. And only after marking them as aborted in the clog. If someone > >> concurrently checks for an overflow, and already sees the new (non-overflowed) > >> state, than he'll assume the xid is not running if he hasn't found it in > >> the array. Which is correct - we just aborted it. > >> > >> Plus, removing the exclusive lock doesn't depend on de-overflowing. It's > >> just something that seems rather easy to do once the subxid handling is > >> in a state that allows concurrent removal of entries. If it turns out that > >> it's not that easy, than I'll just drop the idea again. > >> > >>> I still don't believe you can avoid taking exclusive lock, either; your > >>> argument here did not address latestCompletedXid. > >> Sorry, not addressing latestCompletedXid was an oversight :-(. > >> My point is the we only *need* to advance latestCompletedXid on COMMITS. We do > >> so for aborts only to avoid running with unnecessarily low xmins after > >> a transaction ABORT. That corner case can only happen after a toplevel > >> ABORT, though - aborting subxacts cannot change the xmin, because the > >> toplevel xact will have a lower xid than any of it's subtransactions anyway. > >> > >> We can therefore just remember the largest assigned xid for a given transaction, > >> and update latestCompletedXid to that on toplevel commit or abort. That > >> prevents that corner-case too, without updating latestCompletedXid during > >> subxact abort. > >> > >>> But the main point remains this: there is no evidence whatsoever that these > >>> code paths are sufficiently performance-critical to be worth speeding up by > >>> making the code more fragile. > >> The gain will be less than that of the locking improvements done so far. > >> It will be a win for heavy users of plpgsql BEGIN/END/EXCEPTION blocks, > >> though I think. > >> > >> We'll also save some cycles in TransactionIdIsInProgress, because we can > >> use a binary search, but that's just an added bonus. > >> > >> I'm currently trying to code up a patch, since it's easier to judge the > >> correctness of actual code than that of a mere proposals. I'll do some > >> benchmarking when the patch is done to see if it brings measurable benefits. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +