Re: Excessive CPU usage in StandbyReleaseLocks() - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Excessive CPU usage in StandbyReleaseLocks() |
Date | |
Msg-id | 20180619063027.x2wjgfvqa5rkjnjy@alap3.anarazel.de Whole thread Raw |
In response to | Excessive CPU usage in StandbyReleaseLocks() (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses |
Re: Excessive CPU usage in StandbyReleaseLocks()
|
List | pgsql-hackers |
Hi, On 2018-06-19 17:43:42 +1200, Thomas Munro wrote: > The problem is that StandbyReleaseLocks() does a linear search of all > known AccessExclusiveLocks when a transaction ends. Luckily, since > v10 (commit 9b013dc2) that is skipped for transactions that haven't > taken any AELs and aren't using 2PC, but that doesn't help all users. > It's fine if the AEL list is short, but if you do something that takes > a lot of AELs such as restoring a database with many tables or > truncating a lot of partitions while other transactions are in flight > then we start doing O(txrate * nlocks * nsubxacts) work and that can > hurt. > > The reproducer script I've attached creates one long-lived transaction > that acquires 6,000 AELs and takes a nap, while 48 connections run > trivial 2PC transactions (I was also able to reproduce the effect > without 2PC by creating a throw-away temporary table in every > transaction, but it was unreliable due to contention slowing > everything down). For me, the standby's startup process becomes 100% > pegged, replay_lag begins to climb and perf says something like: > > + 97.88% 96.96% postgres postgres [.] StandbyReleaseLocks > > The attached patch splits the AEL list into one list per xid and > sticks them in a hash table. That makes perf say something like: > + 0.60% 0.00% postgres postgres [.] StandbyReleaseLocks Neato. Results aside, that seems like a better suited datastructure. > /* > * InitRecoveryTransactionEnvironment > @@ -63,6 +73,19 @@ void > InitRecoveryTransactionEnvironment(void) > { > VirtualTransactionId vxid; > + HASHCTL hash_ctl; > + > + /* > + * Initialize the hash table for tracking the list of locks held by each > + * transaction. > + */ > + memset(&hash_ctl, 0, sizeof(hash_ctl)); > + hash_ctl.keysize = sizeof(TransactionId); > + hash_ctl.entrysize = sizeof(RecoveryLockListsEntry); > + RecoveryLockLists = hash_create("RecoveryLockLists", > + 4096, > + &hash_ctl, Isn't that initial size needlessly big? > void > StandbyAcquireAccessExclusiveLock(TransactionId xid, Oid dbOid, Oid relOid) > { > + RecoveryLockListsEntry *entry; > xl_standby_lock *newlock; > LOCKTAG locktag; > + bool found; > > /* Already processed? */ > if (!TransactionIdIsValid(xid) || > @@ -616,11 +641,19 @@ StandbyAcquireAccessExclusiveLock(TransactionId xid, Oid dbOid, Oid relOid) > /* dbOid is InvalidOid when we are locking a shared relation. */ > Assert(OidIsValid(relOid)); > > + /* Create a new list for this xid, if we don't have one already. */ > + entry = hash_search(RecoveryLockLists, &xid, HASH_ENTER, &found); > + if (!found) > + { > + entry->xid = xid; > + entry->locks = NIL; > + } > + > newlock = palloc(sizeof(xl_standby_lock)); > newlock->xid = xid; > newlock->dbOid = dbOid; > newlock->relOid = relOid; > - RecoveryLockList = lappend(RecoveryLockList, newlock); > + entry->locks = lappend(entry->locks, newlock); Gotta be careful with lappend et al - it's really easy to leak memory without explicit context usage. Have you checked that we don't here? > SET_LOCKTAG_RELATION(locktag, newlock->dbOid, newlock->relOid); > > @@ -628,46 +661,45 @@ StandbyAcquireAccessExclusiveLock(TransactionId xid, Oid dbOid, Oid relOid) > } > > static void > -StandbyReleaseLocks(TransactionId xid) > +StandbyReleaseLockList(List *locks) > { > - ListCell *cell, > - *prev, > - *next; > - > - /* > - * Release all matching locks and remove them from list > - */ > - prev = NULL; > - for (cell = list_head(RecoveryLockList); cell; cell = next) > + while (locks) > { > - xl_standby_lock *lock = (xl_standby_lock *) lfirst(cell); > + xl_standby_lock *lock = (xl_standby_lock *) linitial(locks); > + LOCKTAG locktag; > + elog(trace_recovery(DEBUG4), > + "releasing recovery lock: xid %u db %u rel %u", > + lock->xid, lock->dbOid, lock->relOid); > + SET_LOCKTAG_RELATION(locktag, lock->dbOid, lock->relOid); > + if (!LockRelease(&locktag, AccessExclusiveLock, true)) > + elog(LOG, > + "RecoveryLockLists contains entry for lock no longer recorded by lock manager: xid %u database %u relation%u", > + lock->xid, lock->dbOid, lock->relOid); This should be a PANIC imo. Greetings, Andres Freund
pgsql-hackers by date: