Thread: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock
Hackers,

I've attached a small patch which aims to improve the performance of AccessExclusiveLock replay when there are many AccessExclusiveLocks stored.

--
David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Wed, Mar 1, 2017 at 5:32 PM, David Rowley <david.rowley@2ndquadrant.com> wrote:
> Hackers,
>
> I've attached a small patch which aims to improve the performance of
> AccessExclusiveLock when there are many AccessExclusiveLock stored.
>

Your idea is quite straightforward (basically, convert the recovery lock list into a recovery lock hash table bucketed by xid), however, it is not clear under what kind of workload this will be helpful. Do you expect that many concurrent sessions will be trying to acquire different AEL locks?

Few comments on the patch:

1.
+/*
+ * Number of buckets to split RecoveryLockTable into.
+ * This must be a power of two.
+ */
+#define RECOVERYLOCKTABLE_SIZE 1024

On what basis did you decide to keep the lock table size as 1024? If it is just a guess, then adding a comment to indicate that you think this is sufficient for current use cases seems better. If it is based on some data, then that would be more appropriate still.

2.
@@ -607,6 +627,8 @@ StandbyAcquireAccessExclusiveLock(TransactionId xid, Oid dbOid, Oid relOid)
{
 xl_standby_lock *newlock;
 LOCKTAG locktag;
+ size_t pidx;
+

The comment on top of this function needs some change; it still refers to RelationLockList:

 * We keep a single dynamically expandible list of locks in local memory,
 * RelationLockList, so we can keep track of the various entries made by
 * the Startup process's virtual xid in the shared lock table.

3.
+ size_t pidx;
+
+ if (!TransactionIdIsValid(xid))
+ {
+ StandbyReleaseAllLocks();
+ return;
+ }

What is the need for this new code?

4.
In some cases it might slow down performance where you have to traverse the complete hash table, as in StandbyReleaseOldLocks; however, I think those cases will be rare relative to the other lock release calls which happen at every commit, so this is probably okay.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
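[Editor's note: the xid-to-bucket mapping being discussed (a binary AND against a power-of-two table size) can be sketched as below. The names mirror the patch fragment quoted above, but this is an illustrative stand-in, not the actual patch code.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of buckets; must be a power of two so that the AND below
 * behaves as a cheap modulo. 1024 matches the value in the patch.
 */
#define RECOVERYLOCKTABLE_SIZE 1024

/* Map an xid to its 0-based partition number. */
static inline uint32_t
recovery_lock_partition(uint32_t xid)
{
	return xid & (RECOVERYLOCKTABLE_SIZE - 1);
}
```

With this scheme, releasing the locks for one xid only needs to scan the (usually short) list in that xid's bucket rather than every tracked lock.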
On 2 March 2017 at 16:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Mar 1, 2017 at 5:32 PM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> Hackers,
>>
>> I've attached a small patch which aims to improve the performance of
>> AccessExclusiveLock when there are many AccessExclusiveLock stored.
>>
>
> I could see that your idea is quite straightforward to improve
> performance (basically, convert recovery lock list to recovery lock
> hash table based on xid), however, it is not clear under what kind of
> workload this will be helpful? Do you expect that many concurrent
> sessions are trying to acquire different AEL locks?

Many thanks for looking over this.

The problem case is when many AccessExclusiveLocks are in the list: each transaction which commits must look through the entire list for locks belonging to it. This can be a problem when a small number of transactions create a large number of temp tables, perhaps in some batch processing job. Other transactions, even ones which don't create any temp tables, must still look through the list to release any locks belonging to them.

All we need to do to recreate this is have a bunch of transactions creating temp tables, then try to execute some very small and fast transactions at the same time. It's possible to reproduce the problem by running the attached temptable_bench.sql with:

pgbench -f temptable_bench.sql -j 10 -c 10 -T 60 -n postgres

while concurrently running

pgbench -f updatecounter.sql -T 60 -n postgres

after having done:

CREATE TABLE t1 (value int not null);
insert into t1 values(1);

The hot physical standby's lag grows significantly during the run. I saw up to 500MB on a short 1 minute run, and perf record shows that StandbyReleaseLocks() is consuming 90% of CPU time. With the patch, StandbyReleaseLocks() is down at about 2% CPU.

> Few comments on the patch:
> 1.
> +/*
> + * Number of buckets to split RecoveryLockTable into.
> + * This must be a power of two.
> + */
>
> +#define RECOVERYLOCKTABLE_SIZE 1024
>
> On what basis did you decide to keep the lock table size as 1024, is
> it just a guess, if so, then also adding a comment to indicate that
> you think this is sufficient for current use cases seems better.
> However, if it is based on some data, then it would be more
> appropriate.

That may need tweaking. Likely it could be smaller if we had some sort of bloom filter to mark whether the transaction had obtained any AEL locks, so that the lookup could be skipped. Initially I didn't want to make the patch too complex. I had thought that a fairly large hash table would fix the problem well enough, as quite possibly most buckets would be empty and the non-empty buckets would have short lists.

> 2.
> @@ -607,6 +627,8 @@ StandbyAcquireAccessExclusiveLock(TransactionId
> xid, Oid dbOid, Oid relOid)
> {
> xl_standby_lock *newlock;
> LOCKTAG locktag;
> + size_t pidx;
> +
>
> Comment on top of this function needs some change, it still seems to
> refer to RelationLockList.

Will fix that.

> * We keep a single dynamically expandible list of locks in local memory,
> * RelationLockList, so we can keep track of the various entries made by
> * the Startup process's virtual xid in the shared lock table.
>
> 3.
> + size_t pidx;
> +
> + if (!TransactionIdIsValid(xid))
> + {
> + StandbyReleaseAllLocks();
> + return;
> + }
>
> What is the need of this new code?

I was not much of a fan of the old way the function coped with an invalid xid: the test was being performed inside the for loop. My new for loop was not really compatible with that test, so I moved the handling of invalid xids outside the loop. I think doing that in today's code would be an improvement too, as it would mean less work inside the slow loop.

> 4.
> In some cases, it might slow down the performance where you have to
> traverse the complete hash table like StandbyReleaseOldLocks, however,
> I think those cases will be less relative to other lock release calls
> which are called at every commit, so probably this is okay.

Yes, that may be true. Perhaps making the table smaller would help bring that back to what it was, or perhaps the patch needs to be rethought to use bloom filters to help identify whether we need to release any locks.

Opinions on that would be welcome. Meanwhile I'll go off and play with bloom filters to see if I can knock some percentage points off the perf report for StandbyReleaseLocks().

--
David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
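[Editor's note: the reason a plain bloom filter cannot work here, as discovered later in the thread, is that bits cannot be cleared on release: two xids may hash to the same bit, so clearing it for one falsely evicts the other. A toy illustration of that failure mode, unrelated to the actual patch code:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * A one-word, single-hash "bloom filter" over xids, just to show the
 * deletion problem; real filters use more bits and hash functions.
 */
static uint32_t filter = 0;

static uint32_t hash_bit(uint32_t xid)      { return 1u << (xid % 32); }
static void     mark(uint32_t xid)          { filter |= hash_bit(xid); }
static bool     maybe_present(uint32_t xid) { return (filter & hash_bit(xid)) != 0; }

/* Clearing a bit on release is unsafe: another xid may share the bit. */
static void     unsafe_unmark(uint32_t xid) { filter &= ~hash_bit(xid); }
```

For example, xids 5 and 37 map to the same bit; after marking both and unmarking 5, the filter wrongly reports 37 as absent, which in this context would mean leaked locks are never noticed as held.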
On 7 March 2017 at 10:01, David Rowley <david.rowley@2ndquadrant.com> wrote: > On 2 March 2017 at 16:06, Amit Kapila <amit.kapila16@gmail.com> wrote: >> On Wed, Mar 1, 2017 at 5:32 PM, David Rowley >> <david.rowley@2ndquadrant.com> wrote: >>> Hackers, >>> >>> I've attached a small patch which aims to improve the performance of >>> AccessExclusiveLock when there are many AccessExclusiveLock stored. Looks useful. > That may need tweaking. Likely it could be smaller if we had some sort > of bloom filter to mark if the transaction had obtained any AEL locks, > that way it could skip. Initially I really didn't want to make the > patch too complex. I had thought that a fairly large hash table would > fix the problem well enough, as quite possibly most buckets would be > empty and non empty buckets have short lists. ISTM that we should mark each COMMIT if it has an AEL, so we can avoid any overhead in the common case. So an additional sub-record for the commit/abort wal record, via include/access/xact.h >> 4. >> In some cases, it might slow down the performance where you have to >> traverse the complete hash table like StandbyReleaseOldLocks, however, >> I think those cases will be less relative to other lock release calls >> which are called at every commit, so probably this is okay. > > Yes, that may be true. Perhaps making the table smaller would help > bring that back to what it was, or perhaps the patch needs rethought > to use bloom filters to help identify if we need to release any locks. > > Opinions on that would be welcome. Meanwhile I'll go off and play with > bloom filters to see if I can knock some percentage points of the perf > report for StandbyReleaseLocks(). Didn't look at the code closely, but if the common case COMMITs don't scan the list that would be most of the problem solved. If we did need a hash table, we should just use a normal hash table? 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 7 March 2017 at 17:31, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 2 March 2017 at 16:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Few comments on the patch:
>> 1.
>> +/*
>> + * Number of buckets to split RecoveryLockTable into.
>> + * This must be a power of two.
>> + */
>>
>> +#define RECOVERYLOCKTABLE_SIZE 1024
>>
>> On what basis did you decide to keep the lock table size as 1024, is
>> it just a guess, if so, then also adding a comment to indicate that
>> you think this is sufficient for current use cases seems better.
>> However, if it is based on some data, then it would be more
>> appropriate.

I've performed some benchmarking using the test case from my previous email and performed a perf record -g on the startup process during each test. I used various different sized hash tables, in the range of 32 to 8192 elements. The results are below:

@32    + 28.55% 25.20%  postgres  postgres  [.] StandbyReleaseLocks
@64    + 16.07% 14.65%  postgres  postgres  [.] StandbyReleaseLocks
@128   +  9.15%  8.07%  postgres  postgres  [.] StandbyReleaseLocks
@256   +  5.80%  5.00%  postgres  postgres  [.] StandbyReleaseLocks
@512      2.99%  2.63%  postgres  postgres  [.] StandbyReleaseLocks
@1024     1.74%  1.58%  postgres  postgres  [.] StandbyReleaseLocks
@2048     1.26%  1.14%  postgres  postgres  [.] StandbyReleaseLocks
@4096     0.89%  0.83%  postgres  postgres  [.] StandbyReleaseLocks
@8192     0.99%  0.93%  postgres  postgres  [.] StandbyReleaseLocks

There seem to be diminishing returns after 1024 elements, so perhaps 512 or 1024 is in fact the best size. Of course, I'm only performing this on my i7 laptop; these numbers may vary on larger hardware with more transactions running.

>> 2.
>> @@ -607,6 +627,8 @@ StandbyAcquireAccessExclusiveLock(TransactionId
>> xid, Oid dbOid, Oid relOid)
>> {
>> xl_standby_lock *newlock;
>> LOCKTAG locktag;
>> + size_t pidx;
>> +
>>
>> Comment on top of this function needs some change, it still seems to
>> refer to RelationLockList.
>
> Will fix that.

Fixed in the attached.

>> 4.
>> In some cases, it might slow down the performance where you have to
>> traverse the complete hash table like StandbyReleaseOldLocks, however,
>> I think those cases will be less relative to other lock release calls
>> which are called at every commit, so probably this is okay.
>
> Yes, that may be true. Perhaps making the table smaller would help
> bring that back to what it was, or perhaps the patch needs rethought
> to use bloom filters to help identify if we need to release any locks.
>
> Opinions on that would be welcome. Meanwhile I'll go off and play with
> bloom filters to see if I can knock some percentage points of the perf
> report for StandbyReleaseLocks().

I quickly realised that bloom filters are impossible here: there's no way to unmark a bit when locks are released, as we cannot know whether that bit is also marking another transaction's locks. I gave up on that idea and instead added a RecoveryLockCount variable that keeps track of the number of locks. This may also help to speed up cases where a transaction has many subtransactions and no AEL locks are held at all, as we can simply fast-path out in StandbyReleaseLockTree() without looping over each sub-xid trying to release locks which don't exist.

I've attached an updated patch.

--
David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
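[Editor's note: the counter-based fast path described above can be sketched as follows. The types and function names below are simplified stand-ins chosen for illustration; the real function walks the recovery lock table.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Total number of AELs currently tracked on the standby; incremented
 * when a lock is acquired, decremented when one is released.
 */
static int RecoveryLockCount = 0;

static void
note_lock_acquired(void)
{
	RecoveryLockCount++;
}

static void
note_lock_released(void)
{
	assert(RecoveryLockCount > 0);
	RecoveryLockCount--;
}

/*
 * Fast path: when no AELs are tracked at all, a commit with many
 * subxids can skip the per-xid release loop entirely. Returns true
 * when the whole release pass can be skipped.
 */
static bool
standby_release_lock_tree_fastpath(TransactionId xid,
								   int nsubxids,
								   const TransactionId *subxids)
{
	(void) xid;
	(void) nsubxids;
	(void) subxids;
	return RecoveryLockCount == 0;
}
```

The win comes from replay of commit records for transactions with many subtransactions: instead of one list (or table) probe per sub-xid, a single integer comparison decides there is nothing to do.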
On 7 March 2017 at 23:20, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 7 March 2017 at 10:01, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> On 2 March 2017 at 16:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Wed, Mar 1, 2017 at 5:32 PM, David Rowley
>>> <david.rowley@2ndquadrant.com> wrote:
>>>> Hackers,
>>>>
>>>> I've attached a small patch which aims to improve the performance of
>>>> AccessExclusiveLock when there are many AccessExclusiveLock stored.
>
> Looks useful.

Thanks.

>> That may need tweaking. Likely it could be smaller if we had some sort
>> of bloom filter to mark if the transaction had obtained any AEL locks,
>> that way it could skip. Initially I really didn't want to make the
>> patch too complex. I had thought that a fairly large hash table would
>> fix the problem well enough, as quite possibly most buckets would be
>> empty and non empty buckets have short lists.
>
> ISTM that we should mark each COMMIT if it has an AEL, so we can avoid
> any overhead in the common case.
>
> So an additional sub-record for the commit/abort wal record, via
> include/access/xact.h

That would be ideal if we could do it, but tracking that for so many possible transaction IDs seems impractical. I had previously thought I could do this in a lossy way with bloom filters for the hash table, but I'd failed to realise at the time that I'd be unable to unmark the bloom filter bit when releasing the locks, as that bit may be in use by another xid.

>>> 4.
>>> In some cases, it might slow down the performance where you have to
>>> traverse the complete hash table like StandbyReleaseOldLocks, however,
>>> I think those cases will be less relative to other lock release calls
>>> which are called at every commit, so probably this is okay.
>>
>> Yes, that may be true. Perhaps making the table smaller would help
>> bring that back to what it was, or perhaps the patch needs rethought
>> to use bloom filters to help identify if we need to release any locks.
>>
>> Opinions on that would be welcome. Meanwhile I'll go off and play with
>> bloom filters to see if I can knock some percentage points of the perf
>> report for StandbyReleaseLocks().
>
> Didn't look at the code closely, but if the common case COMMITs don't
> scan the list that would be most of the problem solved. If we did need
> a hash table, we should just use a normal hash table?

I've tried to address that to some degree by adding code which skips the lookups when no AEL locks are held at all. There's also a possibility of speeding up replay of the COMMIT of a transaction with many subtransactions where no AEL locks are held at all. See StandbyReleaseLockTree().

--
David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 7 March 2017 at 19:22, David Rowley <david.rowley@2ndquadrant.com> wrote:
>>> That may need tweaking. Likely it could be smaller if we had some sort
>>> of bloom filter to mark if the transaction had obtained any AEL locks,
>>> that way it could skip. Initially I really didn't want to make the
>>> patch too complex. I had thought that a fairly large hash table would
>>> fix the problem well enough, as quite possibly most buckets would be
>>> empty and non empty buckets have short lists.
>>
>> ISTM that we should mark each COMMIT if it has an AEL, so we can avoid
>> any overhead in the common case.
>>
>> So an additional sub-record for the commit/abort wal record, via
>> include/access/xact.h
>
> That would be ideal if we could do that, but doing that for so many
> possible transaction IDs seems impractical.

Don't understand this. I'm talking about setting a flag on commit/abort WAL records, like the attached. We just need to track info so we can set the flag at EOXact and we're done.

--
Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
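[Editor's note: the shape of Simon's idea is that commit/abort records already carry an xinfo bitmask, so one more bit can say "this transaction logged at least one AEL", and redo can skip the release walk when it is absent. A simplified sketch; the flag value follows the patch context but is illustrative, not verified against include/access/xact.h:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * An xinfo flag bit in the style of xl_xact_xinfo in
 * include/access/xact.h; the exact bit position here is illustrative.
 */
#define XACT_XINFO_HAS_AE_LOCKS (1U << 6)

/* Set while building the commit/abort record ... */
static uint32_t
build_xinfo(bool didLogAELock)
{
	uint32_t xinfo = 0;

	if (didLogAELock)
		xinfo |= XACT_XINFO_HAS_AE_LOCKS;
	return xinfo;
}

/*
 * ... and tested during redo, so the common no-AEL commit skips the
 * standby lock release walk entirely.
 */
static bool
redo_needs_lock_release(uint32_t xinfo)
{
	return (xinfo & XACT_XINFO_HAS_AE_LOCKS) != 0;
}
```

The appeal of this design is that the cost moves to WAL generation time (one flag set, once per transaction) and the common case on the standby becomes a single bit test.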
Hi, On 2017-03-08 00:15:05 +1300, David Rowley wrote: > -static List *RecoveryLockList; > +/* > + * RecoveryLockTable is a poor man's hash table that allows us to partition > + * the stored locks. Which partition a lock is stored in is determined by the > + * xid which the lock belongs to. The hash function is very simplistic and > + * merely performs a binary AND against the final 0-based partition number. > + * Splitting into partitions in this way avoids having to look through all > + * locks to find one specific to a given xid. > + */ > +static List **RecoveryLockTable; Why are we open coding this? That strikes me as a bad plan... Regards, Andres
On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
> Don't understand this. I'm talking about setting a flag on
> commit/abort WAL records, like the attached.

There's nothing setting a flag in the attached. I only see the checking of the flag.

> We just need to track info so we can set the flag at EOXact and we're done.

Do you have an idea where to store that information? I'd assume we'd want to store something during LogAccessExclusiveLocks() and check that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see anything existing which might help with that today. Do you think I should introduce some new global variable for that? Or do you propose calling GetRunningTransactionLocks() again while generating the Commit/Abort record?

--
David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Don't understand this. I'm talking about setting a flag on
>> commit/abort WAL records, like the attached.
>
> There's nothing setting a flag in the attached. I only see the
> checking of the flag.

Yeh, it wasn't a full patch, just a way marker to the summit for you.

>> We just need to track info so we can set the flag at EOXact and we're done.
>
> Do you have an idea where to store that information? I'd assume we'd
> want to store something during LogAccessExclusiveLocks() and check
> that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
> anything existing which might help with that today. Do you think I
> should introduce some new global variable for that? Or do you propose
> calling GetRunningTransactionLocks() again while generating the
> Commit/Abort record?

Seemed easier to write it than explain further. Please see what you think.

--
Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 16 March 2017 at 13:31, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com> wrote:
> > On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> Don't understand this. I'm talking about setting a flag on
> >> commit/abort WAL records, like the attached.
> >
> > There's nothing setting a flag in the attached. I only see the
> > checking of the flag.
> Yeh, it wasn't a full patch, just a way marker to the summit for you.
> >> We just need to track info so we can set the flag at EOXact and we're done.
> >
> > Do you have an idea where to store that information? I'd assume we'd
> > want to store something during LogAccessExclusiveLocks() and check
> > that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
> > anything existing which might help with that today. Do you think I
> > should introduce some new global variable for that? Or do you propose
> > calling GetRunningTransactionLocks() again while generating the
> > Commit/Abort record?
> Seemed easier to write it than explain further. Please see what you think.
Thanks. I had been looking for some struct to store the flag in. I'd not considered just adding a new global variable.
I was in fact just working on this again earlier, and I had come up with the attached.
I was just trying to verify whether it was going to do the correct thing with regard to subtransactions, and I got a little confused by the comments at the top of StandbyAcquireAccessExclusiveLock():
* We record the lock against the top-level xid, rather than individual
* subtransaction xids. This means AccessExclusiveLocks held by aborted
* subtransactions are not released as early as possible on standbys.
If this is true, and it seems to be per LogAccessExclusiveLock(), does StandbyReleaseLockTree() really need to release the locks for subtransactions too?
This code:
for (i = 0; i < nsubxids; i++)
StandbyReleaseLocks(subxids[i]);
FWIW I tested the attached and saw that with my earlier test cases StandbyReleaseLockTree() was down to 0.74% in the perf report. Likely the gain will be even better with more CPUs running small, simple transactions which don't take Access Exclusive locks, so I agree this is a better way forward than my original patch. Your patch will have the same effect, and is much better than the 1.74% I saw in my perf report for the 1024-bucket hash table.
Attachment
On Thu, Mar 16, 2017 at 7:22 AM, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 16 March 2017 at 13:31, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com>
>> wrote:
>> > On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
>> >> Don't understand this. I'm talking about setting a flag on
>> >> commit/abort WAL records, like the attached.
>> >
>> > There's nothing setting a flag in the attached. I only see the
>> > checking of the flag.
>>
>> Yeh, it wasn't a full patch, just a way marker to the summit for you.
>>
>> >> We just need to track info so we can set the flag at EOXact and we're
>> >> done.
>> >
>> > Do you have an idea where to store that information? I'd assume we'd
>> > want to store something during LogAccessExclusiveLocks() and check
>> > that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
>> > anything existing which might help with that today. Do you think I
>> > should introduce some new global variable for that? Or do you propose
>> > calling GetRunningTransactionLocks() again while generating the
>> > Commit/Abort record?
>>
>> Seemed easier to write it than explain further. Please see what you think.
>
> Thanks. I had been looking for some struct to store the flag in. I'd not
> considered just adding a new global variable.
>
> I was in fact just working on this again earlier, and I had come up with the
> attached.
>

I think this will not work if you take an AEL in one of the subtransactions and then commit the transaction. The reason is that you are using CurrentTransactionState to remember AEL locks, and during commit (at the time of XactLogCommitRecord) only the top-level CurrentTransactionState will be available, which is not the same as the subtransaction's CurrentTransactionState.
You can try with a simple test like below:

Create table t1(c1 int);
Begin;
insert into t1 values(1);
savepoint s1;
insert into t1 values(2);
savepoint s2;
insert into t1 values(3);
Lock t1 in Access Exclusive mode;
commit;

Now, we might want to store this info in the top-level transaction state to make such cases work, but I am not sure that it is better than what Simon has proposed. Although I have not evaluated it in detail, Simon's patch seems a better way to solve this problem (it has explicit handling of two-phase commits, which seems more robust).

Few other comments on the patch:

1.
@@ -5325,6 +5341,9 @@ XactLogAbortRecord(TimestampTz abort_time,
 if (xl_xinfo.xinfo & XACT_XINFO_HAS_TWOPHASE)
 XLogRegisterData((char *) (&xl_twophase), sizeof (xl_xact_twophase));

+ if (CurrentTransactionState->didLogAELock)
+ xl_xinfo.xinfo |= XACT_XINFO_HAS_AE_LOCKS;

You need to set xl_xinfo.xinfo before the line below in XactLogAbortRecord() for it to be logged properly:

 if (xl_xinfo.xinfo != 0)
 info |= XLOG_XACT_HAS_INFO;

2.
Explicitly initialize didLogAELock in StartTransaction as we do for didLogXid.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
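[Editor's note: the ordering problem in Amit's point 1, in miniature: the HAS_INFO bit is derived from xinfo, so any xinfo flag set after that derivation is silently lost. A simplified sketch with illustrative constant values, not the real xact.h definitions:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XACT_XINFO_HAS_AE_LOCKS (1U << 6)   /* illustrative */
#define XLOG_XACT_HAS_INFO      0x80        /* illustrative */

/* Correct order: accumulate all xinfo bits first, then derive info. */
static uint8_t
build_info(bool didLogAELock)
{
	uint32_t	xinfo = 0;
	uint8_t		info = 0;

	if (didLogAELock)
		xinfo |= XACT_XINFO_HAS_AE_LOCKS;	/* must happen BEFORE the check */

	if (xinfo != 0)
		info |= XLOG_XACT_HAS_INFO;
	return info;
}
```

If the `|=` of the AE-locks bit were moved after the `if (xinfo != 0)` test, a transaction whose only xinfo content is the AEL flag would be written without XLOG_XACT_HAS_INFO, and redo would never see the flag.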
On Thu, Mar 16, 2017 at 6:01 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com> wrote: >> On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote: >>> Don't understand this. I'm talking about setting a flag on >>> commit/abort WAL records, like the attached. >> >> There's nothing setting a flag in the attached. I only see the >> checking of the flag. > > Yeh, it wasn't a full patch, just a way marker to the summit for you. > >>> We just need to track info so we can set the flag at EOXact and we're done. >> >> Do you have an idea where to store that information? I'd assume we'd >> want to store something during LogAccessExclusiveLocks() and check >> that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see >> anything existing which might help with that today. Do you think I >> should introduce some new global variable for that? Or do you propose >> calling GetRunningTransactionLocks() again while generating the >> Commit/Abort record? > > Seemed easier to write it than explain further. Please see what you think. > @@ -5563,7 +5575,8 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid) /* * Release locks, if any. There are no invalidations to send. */ - StandbyReleaseLockTree(xid, parsed->nsubxacts, parsed->subxacts); + if (parsed->xinfo & XACT_XINFO_HAS_AE_LOCKS) + StandbyReleaseLockTree(xid, parsed->nsubxacts, parsed->subxacts); The patch uses XACT_XINFO_HAS_AE_LOCKS during abort replay, but during normal operation (XactLogAbortRecord()), it doesn't seem to log the information. Is this an oversight or do you have some reason for doing so? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On 16 March 2017 at 09:52, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I was just trying to verify if it was going to do the correct thing in
> regards to subtransactions and I got a little confused at the comments at
> the top of StandbyAcquireAccessExclusiveLock():
>
> * We record the lock against the top-level xid, rather than individual
> * subtransaction xids. This means AccessExclusiveLocks held by aborted
> * subtransactions are not released as early as possible on standbys.
>
> if this is true, and it seems to be per LogAccessExclusiveLock(), does
> StandbyReleaseLockTree() need to release the locks for sub transactions too?
>
> This code:
>
> for (i = 0; i < nsubxids; i++)
> StandbyReleaseLocks(subxids[i]);

Yes, you are correct, good catch. It's not broken, but the code does nothing, slowly.

We have two choices... remove this code or assign locks against subxids.

1. In xact_redo_abort() we can just pass NULL to StandbyReleaseLockTree(), keeping the other code as-is for later fixing by another approach.

2. In LogAccessExclusiveLock() we can use GetCurrentTransactionId() rather than GetTopTransactionId(), so that we assign the lock to the subxid rather than the top xid. That could increase lock traffic, but that seems unlikely to matter much. It also solves the problem of early release when AELs are held by subxids.

(2) looks safe enough, so patch attached.

--
Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 16 March 2017 at 19:08, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Thu, Mar 16, 2017 at 6:01 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com> wrote: >>> On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote: >>>> Don't understand this. I'm talking about setting a flag on >>>> commit/abort WAL records, like the attached. >>> >>> There's nothing setting a flag in the attached. I only see the >>> checking of the flag. >> >> Yeh, it wasn't a full patch, just a way marker to the summit for you. >> >>>> We just need to track info so we can set the flag at EOXact and we're done. >>> >>> Do you have an idea where to store that information? I'd assume we'd >>> want to store something during LogAccessExclusiveLocks() and check >>> that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see >>> anything existing which might help with that today. Do you think I >>> should introduce some new global variable for that? Or do you propose >>> calling GetRunningTransactionLocks() again while generating the >>> Commit/Abort record? >> >> Seemed easier to write it than explain further. Please see what you think. >> > > @@ -5563,7 +5575,8 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, > TransactionId xid) > /* > * Release locks, if any. There are no invalidations to send. > */ > - StandbyReleaseLockTree(xid, parsed->nsubxacts, parsed->subxacts); > + if (parsed->xinfo & XACT_XINFO_HAS_AE_LOCKS) > + StandbyReleaseLockTree(xid, parsed->nsubxacts, parsed->subxacts); > > > The patch uses XACT_XINFO_HAS_AE_LOCKS during abort replay, but during > normal operation (XactLogAbortRecord()), it doesn't seem to log the > information. Is this an oversight or do you have some reason for > doing so? Good catch, thanks. Yes, I changed my mind on whether it was needed during implementation. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 16 March 2017 at 09:52, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> Seemed easier to write it than explain further. Please see what you think.
>
> Thanks. I had been looking for some struct to store the flag in. I'd not
> considered just adding a new global variable.

As Amit says, I don't see the gain from adding that to each xact state. I'd suggest refactoring my patch so that the existing MyXactAccessedTempRel becomes MyXactFlags, and we can just set a flag in the two cases (temp rels and has-AELs). That way we still have one global but it doesn't get any uglier than it already is.

--
Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
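[Editor's note: Simon's suggested refactoring could look roughly like this. The flag names below are guesses in the spirit of the proposal, not the committed API:]

```c
#include <assert.h>
#include <stdint.h>

/*
 * One global bitmask replacing the bool MyXactAccessedTempRel, with
 * room for the new AEL bit (and future flags) without more globals.
 */
#define XACT_FLAGS_ACCESSEDTEMPREL              (1U << 0)
#define XACT_FLAGS_ACQUIREDACCESSEXCLUSIVELOCK  (1U << 1)

static uint32_t MyXactFlags = 0;

/* Set a flag, e.g. from LogAccessExclusiveLocks(). */
static void xact_set_flag(uint32_t f)  { MyXactFlags |= f; }

/* Test a flag, e.g. while building the commit/abort record. */
static int  xact_has_flag(uint32_t f)  { return (MyXactFlags & f) != 0; }

/* Reset at transaction start, as done today for the bool it replaces. */
static void xact_reset_flags(void)     { MyXactFlags = 0; }
```

The design point is that the global-state footprint stays constant: one word, tested and reset in the same places the existing boolean already is.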
On 18 March 2017 at 21:52, Simon Riggs <simon@2ndquadrant.com> wrote:
> 2. In LogAccessExclusiveLock() we can use GetCurrentTransactionId()
> rather than GetTopTransactionId(), so that we assign the lock to the
> subxid rather than the top xid. That could increase lock traffic, but
> less likely. It also solves the problem of early release when AELs
> held by subxids.
>
> (2) looks safe enough, so patch attached.

I might be missing something here, but what's the point in doing this if we're not releasing the lock any earlier? Going by the other patch you posted, we're only recording whether the top-level xact has an AEL, so we'll never try to release the locks at the end of the subxact, since the new flag bit won't be set when the subxact ends.

It seems it'd be better just to remove the dead code from StandbyReleaseLockTree(), as that'll mean one pass over the RecoveryLockList when it comes to releasing locks at the end of the top-level transaction. Probably we'd want to move all of StandbyReleaseLocks()'s code into StandbyReleaseLockTree(), since StandbyReleaseLockTree() would otherwise just end up calling the static function.

Probably I misunderstood this again. Let me know if that's the case.

--
David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 17 March 2017 at 00:04, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Mar 16, 2017 at 7:22 AM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> On 16 March 2017 at 13:31, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>
>>> On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com>
>>> wrote:
>>> > On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> >> Don't understand this. I'm talking about setting a flag on
>>> >> commit/abort WAL records, like the attached.
>>> >
>>> > There's nothing setting a flag in the attached. I only see the
>>> > checking of the flag.
>>>
>>> Yeh, it wasn't a full patch, just a way marker to the summit for you.
>>>
>>> >> We just need to track info so we can set the flag at EOXact and we're
>>> >> done.
>>> >
>>> > Do you have an idea where to store that information? I'd assume we'd
>>> > want to store something during LogAccessExclusiveLocks() and check
>>> > that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
>>> > anything existing which might help with that today. Do you think I
>>> > should introduce some new global variable for that? Or do you propose
>>> > calling GetRunningTransactionLocks() again while generating the
>>> > Commit/Abort record?
>>>
>>> Seemed easier to write it than explain further. Please see what you think.
>>
>> Thanks. I had been looking for some struct to store the flag in. I'd not
>> considered just adding a new global variable.
>>
>> I was in fact just working on this again earlier, and I had come up with
>> the attached.
>
> I think this will not work if you take an AEL in one of the
> subtransactions and then commit the transaction. The reason is that
> you are using CurrentTransactionState to remember AEL locks, and during
> commit (at the time of XactLogCommitRecord) only the top-level
> CurrentTransactionState will be available, which is not the same as the
> subtransaction's CurrentTransactionState.
> You can try with a simple test like below:
>
> Create table t1(c1 int);
> Begin;
> insert into t1 values(1);
> savepoint s1;
> insert into t1 values(2);
> savepoint s2;
> insert into t1 values(3);
> Lock t1 in Access Exclusive mode;
> commit;
>
> Now, we might want to store this info in the top-level transaction state
> to make such cases work, but I am not sure if it is better than what
> Simon has proposed in his approach. Although I have not evaluated it in
> detail, Simon's patch looks like the better way to solve this problem
> (in his patch, there is explicit handling of two-phase commits, which
> seems more robust).
>
> A few other comments on the patch:
>
> 1.
> @@ -5325,6 +5341,9 @@ XactLogAbortRecord(TimestampTz abort_time,
>      if (xl_xinfo.xinfo & XACT_XINFO_HAS_TWOPHASE)
>          XLogRegisterData((char *) (&xl_twophase), sizeof(xl_xact_twophase));
>
> +    if (CurrentTransactionState->didLogAELock)
> +        xl_xinfo.xinfo |= XACT_XINFO_HAS_AE_LOCKS;
>
> You need to set xl_xinfo.xinfo before the lines below in
> XactLogAbortRecord() to get it logged properly:
>
>     if (xl_xinfo.xinfo != 0)
>         info |= XLOG_XACT_HAS_INFO;
>
> 2.
> Explicitly initialize didLogAELock in StartTransaction as we do for didLogXid.

Thanks for looking over this Amit.

I'll drop my version of the patch and make the changes described by Simon about moving MyXactAccessedTempRel and MyXactAcquiredAccessExclusiveLock into a bitmap flags variable.

Thanks for confirming my concerns about the subxacts. I was confused over how it was meant to work due to StandbyReleaseLockTree() having some dead code.

-- 
David Rowley
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
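[Editor's note: Amit's first review point is an order-of-operations bug: the new xinfo bit must be ORed in before the code derives XLOG_XACT_HAS_INFO from it, or the xinfo word never makes it into the WAL record. A simplified model follows; `derive_info` is a stand-in for the real XactLogAbortRecord() logic, and the AE_LOCKS bit position is an assumption.]

```c
#include <stdint.h>

#define XACT_XINFO_HAS_AE_LOCKS  (1U << 6)  /* illustrative bit position */
#define XLOG_XACT_HAS_INFO       0x80

/*
 * Simplified model of the flag handling in XactLogAbortRecord(): the
 * info byte only gets XLOG_XACT_HAS_INFO if xinfo is already non-zero
 * at the moment of this test.  Any xinfo bit set *after* this point is
 * silently dropped from the record, which is the bug being reported.
 */
uint8_t
derive_info(uint32_t xinfo)
{
    uint8_t info = 0;

    if (xinfo != 0)
        info |= XLOG_XACT_HAS_INFO;
    return info;
}
```

With the flag set first, the HAS_INFO marker is raised and the xinfo word is written; with it set afterwards, replay never sees the AE_LOCKS bit at all.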
On 18 March 2017 at 21:59, Simon Riggs <simon@2ndquadrant.com> wrote:
> As Amit says, I don't see the gain from adding that to each xact state.
>
> I'd suggest refactoring my patch so that the existing
> MyXactAccessedTempRel becomes MyXactFlags and we can just set a flag
> in the two cases (temp rels and has-AELs). That way we still have one
> global but it doesn't get any uglier than it already is.

OK, now that I understand things a bit better, I've gone over your patch and changed things around so that it combines the two variables into a flags variable, which will be better suited for future expansion.

I also fixed up the missing code in the abort record path. This is patch 0001.

0002:

I was thinking it might be a good idea to back-patch the patch which removes the bogus code from StandbyReleaseLockTree(). Any users of sub-transactions are currently paying a price that they've no reason to, especially so when there are AELs held by other concurrent transactions. We already know this code is slow; having this mistake in there is not helping any.

0003:

Is intended to be applied atop of 0002 (for master only) and revises this code further to remove the StandbyReleaseLockTree() function. To me it seems better to do this to clear up any confusion about what's going on here. This patch does close the door a little on tracking AELs on sub-xacts, but I really think that if we're going to do that, it won't be that hard to put it back again. Seems better to be consistent here and not leave any code around that causes people to be confused about the same thing I was confused about. This patch does get rid of StandbyReleaseLockTree(), which isn't static. Are we likely to break any 3rd-party code by doing that?

0004:

Again master only; a further small optimization of StandbyReleaseLocks() to further tighten the slow loop over each held lock. I also think that it's better to be more explicit about the case of when the function would release all locks.
Although I'm not all that sure in which case the function will actually be called with an invalid xid.

Patch series is attached.

-- 
David Rowley
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
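[Editor's note: the underlying speedup in this series is the one from the original post: the single RecoveryLockList becomes a table of lists bucketed by xid, with RECOVERYLOCKTABLE_SIZE required to be a power of two. That requirement is what lets the bucket be picked with a mask rather than a division; a minimal sketch, with types simplified from the real code:]

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Must be a power of two, per the comment in the patch. */
#define RECOVERYLOCKTABLE_SIZE 1024

/*
 * Pick the bucket for an xid.  Because the table size is a power of
 * two, (size - 1) is an all-ones bitmask, so this is equivalent to
 * xid % RECOVERYLOCKTABLE_SIZE without doing a division.
 */
size_t
recovery_lock_bucket(TransactionId xid)
{
    return (size_t) (xid & (RECOVERYLOCKTABLE_SIZE - 1));
}
```

Releasing the locks for one xid then only walks that xid's bucket instead of the entire list, which is where the replay-time win for workloads with many stored AELs comes from.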
On 20 March 2017 at 08:31, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 18 March 2017 at 21:59, Simon Riggs <simon@2ndquadrant.com> wrote:
>> As Amit says, I don't see the gain from adding that to each xact state.
>>
>> I'd suggest refactoring my patch so that the existing
>> MyXactAccessedTempRel becomes MyXactFlags and we can just set a flag
>> in the two cases (temp rels and has-AELs). That way we still have one
>> global but it doesn't get any uglier than it already is.
>
> OK, now that I understand things a bit better, I've gone over your
> patch and changed things around so that it combines the two variables
> into a flags variable, which will be better suited for future
> expansion.
>
> I also fixed up the missing code in the abort record path. This is patch 0001.

This looks good to go; will final-check and apply. Thanks.

> 0002:
>
> I was thinking it might be a good idea to back-patch the patch which
> removes the bogus code from StandbyReleaseLockTree(). Any users of
> sub-transactions are currently paying a price that they've no reason
> to, especially so when there are AELs held by other concurrent
> transactions. We already know this code is slow; having this mistake
> in there is not helping any.
>
> 0003:
>
> Is intended to be applied atop of 0002 (for master only) and revises
> this code further to remove the StandbyReleaseLockTree() function. To
> me it seems better to do this to clear up any confusion about what's
> going on here. This patch does close the door a little on tracking
> AELs on sub-xacts, but I really think that if we're going to do that,
> it won't be that hard to put it back again. Seems better to be
> consistent here and not leave any code around that causes people to be
> confused about the same thing I was confused about. This patch does
> get rid of StandbyReleaseLockTree(), which isn't static. Are we likely
> to break any 3rd-party code by doing that?
We've solved the performance problem due to your insight, so ISTM OK now to enable AELs on subxids, since not having them causes locks to be held for too long, which is a problem also. Rearranging the code doesn't gain us any further performance and, as you say, it prevents us from solving the AELs-on-subxids problem.

Given that, do you agree to me applying assign_aels_against_subxids.v1.patch as well?

We can discuss any backpatches after the CF is done.

> 0004:
>
> Again master only; a further small optimization of
> StandbyReleaseLocks() to further tighten the slow loop over each held
> lock. I also think that it's better to be more explicit about the case
> of when the function would release all locks. Although I'm not all
> that sure in which case the function will actually be called with an
> invalid xid.
>
> Patch series is attached.

This does improve things a little more, but we already hit 95% of the gain, so I'd prefer to keep the code as similar as possible.

-- 
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 22 March 2017 at 22:27, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 20 March 2017 at 08:31, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> 0003:
>>
>> Is intended to be applied atop of 0002 (for master only) and revises
>> this code further to remove the StandbyReleaseLockTree() function. To
>> me it seems better to do this to clear up any confusion about what's
>> going on here. This patch does close the door a little on tracking
>> AELs on sub-xacts, but I really think that if we're going to do that,
>> it won't be that hard to put it back again. Seems better to be
>> consistent here and not leave any code around that causes people to be
>> confused about the same thing I was confused about. This patch does
>> get rid of StandbyReleaseLockTree(), which isn't static. Are we likely
>> to break any 3rd-party code by doing that?
>
> We've solved the performance problem due to your insight, so ISTM OK
> now to enable AELs on subxids, since not having them causes locks to
> be held for too long, which is a problem also. Rearranging the code
> doesn't gain us any further performance and, as you say, it prevents
> us from solving the AELs-on-subxids problem.
>
> Given that, do you agree to me applying assign_aels_against_subxids.v1.patch
> as well?

Does applying assign_aels_against_subxids.v1.patch still mean we need to keep the loop to release the subxacts? Won't those locks be gone already after the subxact commit/abort record replays?

This does possibly mean that we perform more loops over the RecoveryLockList even if the subxact does not hold any AELs but its parent xact does. I wonder if this is a good price to pay for releasing the locks earlier?

> We can discuss any backpatches after the CF is done.
>
>> 0004:
>>
>> Again master only; a further small optimization of
>> StandbyReleaseLocks() to further tighten the slow loop over each held
>> lock. I also think that it's better to be more explicit about the case
>> of when the function would release all locks.
>> Although I'm not all that sure in which case the function will
>> actually be called with an invalid xid.
>>
>> Patch series is attached.
>
> This does improve things a little more, but we already hit 95% of the
> gain, so I'd prefer to keep the code as similar as possible.

Seems fair. It's just a blemish that I saw every time I looked at the code, and I was inspired not to see it anymore. I agree that there won't be much gain from it.

-- 
David Rowley
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 22 March 2017 at 13:19, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> Given that, do you agree to me applying assign_aels_against_subxids.v1.patch
>> as well?
>
> Does applying assign_aels_against_subxids.v1.patch still mean we need
> to keep the loop to release the subxacts? Won't those locks be gone
> already after the subxact commit/abort record replays?

No, it is still required, because aborts and commits might have subcommitted subxids.

> This does possibly mean that we perform more loops over the
> RecoveryLockList even if the subxact does not hold any AELs but its
> parent xact does. I wonder if this is a good price to pay for
> releasing the locks earlier?

We'd be performing the same number of loops as we do now; it's just that now they would have a purpose. But we aren't doing any of it unless the top-level xid has at least one AEL, so the bulk of the problem is gone.

-- 
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
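[Editor's note: the end state being agreed here can be summarized as a gate: at commit/abort replay, the xid and subxid release loop runs only when the record carries the has-AELs flag, so transactions without AELs pay nothing. A rough structural sketch, assuming the flag, function, and counter names, which are not the actual replay code:]

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Count of per-xid release calls, so the gating is observable in tests. */
int release_calls = 0;

/* Stand-in for releasing the recovery AELs held by one xid. */
static void
release_locks_for_xid(TransactionId xid)
{
    (void) xid;
    release_calls++;
}

/*
 * Replay-side release at commit/abort: skip everything unless the WAL
 * record says this top-level transaction logged at least one AEL.
 * When it did, release for the top xid and every (possibly
 * subcommitted) subxid, since AELs may now be assigned to subxids.
 */
void
replay_release(bool has_ae_locks, TransactionId xid,
               const TransactionId *subxids, int nsubxids)
{
    if (!has_ae_locks)
        return;                 /* the common, cheap case */

    release_locks_for_xid(xid);
    for (int i = 0; i < nsubxids; i++)
        release_locks_for_xid(subxids[i]);
}
```

This matches Simon's summary: the loop count is unchanged for transactions that did log AELs, but the loop now has a purpose, and the no-AEL majority skips it entirely.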