Thread: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

[HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

01 March 2017, 15:02:19

Hackers,

I've attached a small patch which aims to improve the performance of
AccessExclusiveLock when there are many AccessExclusiveLock stored.


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

recovery_lock_speedup_v1.patch

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Amit Kapila

Date:

02 March 2017, 06:06:34

On Wed, Mar 1, 2017 at 5:32 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> Hackers,
>
> I've attached a small patch which aims to improve the performance of
> AccessExclusiveLock when there are many AccessExclusiveLock stored.
>

I could see that your idea is quite straightforward to improve
performance (basically, convert recovery lock list to recovery lock
hash table based on xid), however, it is not clear under what kind of
workload this will be helpful?  Do you expect that many concurrent
sessions are trying to acquire different AEL locks?

Few comments on the patch:
1.
+/*
+ * Number of buckets to split RecoveryLockTable into.
+ * This must be a power of two.
+ */

+#define RECOVERYLOCKTABLE_SIZE 1024

On what basis did you decide to keep the lock table size as 1024, is
it just a guess, if so, then also adding a comment to indicate that
you think this is sufficient for current use cases seems better.
However, if it is based on some data, then it would be more
appropriate.

2.
@@ -607,6 +627,8 @@ StandbyAcquireAccessExclusiveLock(TransactionId
xid, Oid dbOid, Oid relOid){ xl_standby_lock *newlock; LOCKTAG locktag;
+ size_t pidx;
+

Comment on top of this function needs some change, it still seems to
refer to RelationLockList.

* We keep a single dynamically expandible list of locks in local memory,* RelationLockList, so we can keep track of the
variousentries made by* the Startup process's virtual xid in the shared lock table.

3.
+ size_t pidx;
+
+ if (!TransactionIdIsValid(xid))
+ {
+ StandbyReleaseAllLocks();
+ return;
+ }

What is the need of this new code?

4.
In some cases, it might slow down the performance where you have to
traverse the complete hash table like StandbyReleaseOldLocks, however,
I think those cases will be less relative to other lock release calls
which are called at every commit, so probably this is okay.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

07 March 2017, 07:31:44

On 2 March 2017 at 16:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Mar 1, 2017 at 5:32 PM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> Hackers,
>>
>> I've attached a small patch which aims to improve the performance of
>> AccessExclusiveLock when there are many AccessExclusiveLock stored.
>>
>
> I could see that your idea is quite straightforward to improve
> performance (basically, convert recovery lock list to recovery lock
> hash table based on xid), however, it is not clear under what kind of
> workload this will be helpful?  Do you expect that many concurrent
> sessions are trying to acquire different AEL locks?

Many thanks for looking over this.

The problem case is when many AccessExclusiveLocks are in the list,
each transaction which commits must look through the entire list for
locks which belong to it. This can be a problem when a small number of
transactions create large amount of temp tables, perhaps some batch
processing job. Other transactions, even ones which don't create any
temp tables must look through the list to release any locks belonging
to them. All we need to do to recreate this is have a bunch of
transactions creating temp tables, then try and execute some very
small and fast transactions at the same time.

It's possible to reproduce the problem by running the attached
temptable_bench.sql with:

pgbench -f temptable_bench.sql -j 10 -c 10 -T 60 -n postgres

while concurrently running

pgbench -f updatecounter.sql -T 60 -n postgres

after having done:

CREATE TABLE t1 (value int not null);
insert into t1 values(1);

The hot physical standby lag grows significantly during the run. I saw
up to 500MB on a short 1 minute run and perf record shows that
StandbyReleaseLocks() is consuming 90% of CPU time.

With the patch StandbyReleaseLocks() is down at about 2% CPU.

> Few comments on the patch:
> 1.
> +/*
> + * Number of buckets to split RecoveryLockTable into.
> + * This must be a power of two.
> + */
>
> +#define RECOVERYLOCKTABLE_SIZE 1024
>
> On what basis did you decide to keep the lock table size as 1024, is
> it just a guess, if so, then also adding a comment to indicate that
> you think this is sufficient for current use cases seems better.
> However, if it is based on some data, then it would be more
> appropriate.

That may need tweaking. Likely it could be smaller if we had some sort
of bloom filter to mark if the transaction had obtained any AEL locks,
that way it could skip. Initially I really didn't want to make the
patch too complex. I had thought that a fairly large hash table would
fix the problem well enough, as quite possibly most buckets would be
empty and non empty buckets have short lists.

> 2.
> @@ -607,6 +627,8 @@ StandbyAcquireAccessExclusiveLock(TransactionId
> xid, Oid dbOid, Oid relOid)
>  {
>   xl_standby_lock *newlock;
>   LOCKTAG locktag;
> + size_t pidx;
> +
>
> Comment on top of this function needs some change, it still seems to
> refer to RelationLockList.

Will fix that.

>
> * We keep a single dynamically expandible list of locks in local memory,
>  * RelationLockList, so we can keep track of the various entries made by
>  * the Startup process's virtual xid in the shared lock table.
>
> 3.
> + size_t pidx;
> +
> + if (!TransactionIdIsValid(xid))
> + {
> + StandbyReleaseAllLocks();
> + return;
> + }
>
> What is the need of this new code?

I was not much of a fan of the old way that the function coped with an
invalid xid. The test was being performed inside of the for loop. My
new for loop was not really compatible with that test, so I moved
handling of invalid xids to outside the loop. I think doing that with
today's code would be an improvement as it would mean less work inside
the slow loop.

> 4.
> In some cases, it might slow down the performance where you have to
> traverse the complete hash table like StandbyReleaseOldLocks, however,
> I think those cases will be less relative to other lock release calls
> which are called at every commit, so probably this is okay.

Yes, that may be true. Perhaps making the table smaller would help
bring that back to what it was, or perhaps the patch needs rethought
to use bloom filters to help identify if we need to release any locks.

Opinions on that would be welcome. Meanwhile I'll go off and play with
bloom filters to see if I can knock some percentage points of the perf
report for StandbyReleaseLocks().

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Simon Riggs

Date:

07 March 2017, 13:20:00

On 7 March 2017 at 10:01, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 2 March 2017 at 16:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Wed, Mar 1, 2017 at 5:32 PM, David Rowley
>> <david.rowley@2ndquadrant.com> wrote:
>>> Hackers,
>>>
>>> I've attached a small patch which aims to improve the performance of
>>> AccessExclusiveLock when there are many AccessExclusiveLock stored.

Looks useful.

> That may need tweaking. Likely it could be smaller if we had some sort
> of bloom filter to mark if the transaction had obtained any AEL locks,
> that way it could skip. Initially I really didn't want to make the
> patch too complex. I had thought that a fairly large hash table would
> fix the problem well enough, as quite possibly most buckets would be
> empty and non empty buckets have short lists.

ISTM that we should mark each COMMIT if it has an AEL, so we can avoid
any overhead in the common case.

So an additional sub-record for the commit/abort wal record, via
include/access/xact.h

>> 4.
>> In some cases, it might slow down the performance where you have to
>> traverse the complete hash table like StandbyReleaseOldLocks, however,
>> I think those cases will be less relative to other lock release calls
>> which are called at every commit, so probably this is okay.
>
> Yes, that may be true. Perhaps making the table smaller would help
> bring that back to what it was, or perhaps the patch needs rethought
> to use bloom filters to help identify if we need to release any locks.
>
> Opinions on that would be welcome. Meanwhile I'll go off and play with
> bloom filters to see if I can knock some percentage points of the perf
> report for StandbyReleaseLocks().

Didn't look at the code closely, but if the common case COMMITs don't
scan the list that would be most of the problem solved. If we did need
a hash table, we should just use a normal hash table?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

07 March 2017, 14:15:05

On 7 March 2017 at 17:31, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 2 March 2017 at 16:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Few comments on the patch:
>> 1.
>> +/*
>> + * Number of buckets to split RecoveryLockTable into.
>> + * This must be a power of two.
>> + */
>>
>> +#define RECOVERYLOCKTABLE_SIZE 1024
>>
>> On what basis did you decide to keep the lock table size as 1024, is
>> it just a guess, if so, then also adding a comment to indicate that
>> you think this is sufficient for current use cases seems better.
>> However, if it is based on some data, then it would be more
>> appropriate.

I've performed some benchmarking using the test case from my previous
email and performed a perf record -g on the startup process during
each test.

I used various different sized hash tables, in the range of 32 to 8192
elements. The results are below:

@32 +   28.55%    25.20%  postgres  postgres           [.] StandbyReleaseLocks
@64 +   16.07%    14.65%  postgres  postgres           [.] StandbyReleaseLocks
@128 +    9.15%     8.07%  postgres  postgres           [.] StandbyReleaseLocks
@256 +    5.80%     5.00%  postgres  postgres           [.] StandbyReleaseLocks
@512      2.99%     2.63%  postgres  postgres           [.] StandbyReleaseLocks
@1024     1.74%     1.58%  postgres  postgres           [.] StandbyReleaseLocks
@2048     1.26%     1.14%  postgres  postgres           [.] StandbyReleaseLocks
@4096     0.89%     0.83%  postgres  postgres           [.] StandbyReleaseLocks
@8192     0.99%     0.93%  postgres  postgres           [.] StandbyReleaseLocks

Seems to be diminishing returns after 1024 elements. So perhaps 512 or
1024 are in fact the best size. Of course, I'm only performing this on
my i7 laptop, these numbers may vary on larger hardware with more
transactions running.

>> 2.
>> @@ -607,6 +627,8 @@ StandbyAcquireAccessExclusiveLock(TransactionId
>> xid, Oid dbOid, Oid relOid)
>>  {
>>   xl_standby_lock *newlock;
>>   LOCKTAG locktag;
>> + size_t pidx;
>> +
>>
>> Comment on top of this function needs some change, it still seems to
>> refer to RelationLockList.
>
> Will fix that.

Fixed in the attached.

>> 4.
>> In some cases, it might slow down the performance where you have to
>> traverse the complete hash table like StandbyReleaseOldLocks, however,
>> I think those cases will be less relative to other lock release calls
>> which are called at every commit, so probably this is okay.
>
> Yes, that may be true. Perhaps making the table smaller would help
> bring that back to what it was, or perhaps the patch needs rethought
> to use bloom filters to help identify if we need to release any locks.
>
> Opinions on that would be welcome. Meanwhile I'll go off and play with
> bloom filters to see if I can knock some percentage points of the perf
> report for StandbyReleaseLocks().

So I quickly realised that bloom filters are impossible here as
there's no way to unmark the bit when locks are released as we cannot
know if the bit is used to mark another transaction's locks. I ended
up giving up on that idea and instead I added a RecoveryLockCount
variable that keeps track on the number of locks. This may also help
to speed up some cases where a transaction has many sub transactions
and there's no AEL locks held, as we can simply fast path out in
StandbyReleaseLockTree() without looping over each sub-xid and trying
to release locks which don't exist.

I've attached an updated patch

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

recovery_lock_v2.patch

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

07 March 2017, 14:22:12

On 7 March 2017 at 23:20, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 7 March 2017 at 10:01, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> On 2 March 2017 at 16:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Wed, Mar 1, 2017 at 5:32 PM, David Rowley
>>> <david.rowley@2ndquadrant.com> wrote:
>>>> Hackers,
>>>>
>>>> I've attached a small patch which aims to improve the performance of
>>>> AccessExclusiveLock when there are many AccessExclusiveLock stored.
>
> Looks useful.

Thanks.

>> That may need tweaking. Likely it could be smaller if we had some sort
>> of bloom filter to mark if the transaction had obtained any AEL locks,
>> that way it could skip. Initially I really didn't want to make the
>> patch too complex. I had thought that a fairly large hash table would
>> fix the problem well enough, as quite possibly most buckets would be
>> empty and non empty buckets have short lists.
>
> ISTM that we should mark each COMMIT if it has an AEL, so we can avoid
> any overhead in the common case.
>
> So an additional sub-record for the commit/abort wal record, via
> include/access/xact.h

That would be ideal if we could do that, but doing that for so many
possible transaction IDs seems impractical. I had previously thought I
could do this in a lossy way with bloom filters for the hash table,
but I'd failed to realise at that time that I'd be unable to unmark
the bloom filter bit when releasing the locks, as that bit may be used
by other xid.

>>> 4.
>>> In some cases, it might slow down the performance where you have to
>>> traverse the complete hash table like StandbyReleaseOldLocks, however,
>>> I think those cases will be less relative to other lock release calls
>>> which are called at every commit, so probably this is okay.
>>
>> Yes, that may be true. Perhaps making the table smaller would help
>> bring that back to what it was, or perhaps the patch needs rethought
>> to use bloom filters to help identify if we need to release any locks.
>>
>> Opinions on that would be welcome. Meanwhile I'll go off and play with
>> bloom filters to see if I can knock some percentage points of the perf
>> report for StandbyReleaseLocks().
>
> Didn't look at the code closely, but if the common case COMMITs don't
> scan the list that would be most of the problem solved. If we did need
> a hash table, we should just use a normal hash table?

I've tried to address that to some degree and added code which skips
the lookups when there's no AEL locks held at all. There's a
possibility for speeding up replay of COMMIT of a transaction with
many sub transactions where no AEL locks are held at all. See
StandbyReleaseLockTree().

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Simon Riggs

Date:

07 March 2017, 15:13:48

On 7 March 2017 at 19:22, David Rowley <david.rowley@2ndquadrant.com> wrote:

>>> That may need tweaking. Likely it could be smaller if we had some sort
>>> of bloom filter to mark if the transaction had obtained any AEL locks,
>>> that way it could skip. Initially I really didn't want to make the
>>> patch too complex. I had thought that a fairly large hash table would
>>> fix the problem well enough, as quite possibly most buckets would be
>>> empty and non empty buckets have short lists.
>>
>> ISTM that we should mark each COMMIT if it has an AEL, so we can avoid
>> any overhead in the common case.
>>
>> So an additional sub-record for the commit/abort wal record, via
>> include/access/xact.h
>
> That would be ideal if we could do that, but doing that for so many
> possible transaction IDs seems impractical.

Don't understand this. I'm talking about setting a flag on
commit/abort WAL records, like the attached.

We just need to track info so we can set the flag at EOXact and we're done.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

partial.patch

Re: [HACKERS] Patch to improve performance of replay ofAccessExclusiveLock

From

Andres Freund

Date:

07 March 2017, 20:04:12

Hi,

On 2017-03-08 00:15:05 +1300, David Rowley wrote:
> -static List *RecoveryLockList;
> +/*
> + * RecoveryLockTable is a poor man's hash table that allows us to partition
> + * the stored locks. Which partition a lock is stored in is determined by the
> + * xid which the lock belongs to. The hash function is very simplistic and
> + * merely performs a binary AND against the final 0-based partition number.
> + * Splitting into partitions in this way avoids having to look through all
> + * locks to find one specific to a given xid.
> + */
> +static List **RecoveryLockTable;

Why are we open coding this? That strikes me as a bad plan...

Regards,

Andres

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

08 March 2017, 05:02:17

On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
> Don't understand this. I'm talking about setting a flag on
> commit/abort WAL records, like the attached.

There's nothing setting a flag in the attached. I only see the
checking of the flag.

> We just need to track info so we can set the flag at EOXact and we're done.

Do you have an idea where to store that information? I'd assume we'd
want to store something during LogAccessExclusiveLocks() and check
that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
anything existing which might help with that today. Do you think I
should introduce some new global variable for that? Or do you propose
calling GetRunningTransactionLocks() again while generating the
Commit/Abort record?

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Simon Riggs

Date:

16 March 2017, 03:31:55

On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Don't understand this. I'm talking about setting a flag on
>> commit/abort WAL records, like the attached.
>
> There's nothing setting a flag in the attached. I only see the
> checking of the flag.

Yeh, it wasn't a full patch, just a way marker to the summit for you.

>> We just need to track info so we can set the flag at EOXact and we're done.
>
> Do you have an idea where to store that information? I'd assume we'd
> want to store something during LogAccessExclusiveLocks() and check
> that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
> anything existing which might help with that today. Do you think I
> should introduce some new global variable for that? Or do you propose
> calling GetRunningTransactionLocks() again while generating the
> Commit/Abort record?

Seemed easier to write it than explain further. Please see what you think.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

hs_skip_overhead_AEL_on_standby.v1.patch

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

16 March 2017, 04:52:03

On 16 March 2017 at 13:31, Simon Riggs <simon@2ndquadrant.com> wrote:

On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Don't understand this. I'm talking about setting a flag on
>> commit/abort WAL records, like the attached.
>
> There's nothing setting a flag in the attached. I only see the
> checking of the flag.

Yeh, it wasn't a full patch, just a way marker to the summit for you.

>> We just need to track info so we can set the flag at EOXact and we're done.
>
> Do you have an idea where to store that information? I'd assume we'd
> want to store something during LogAccessExclusiveLocks() and check
> that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
> anything existing which might help with that today. Do you think I
> should introduce some new global variable for that? Or do you propose
> calling GetRunningTransactionLocks() again while generating the
> Commit/Abort record?

Seemed easier to write it than explain further. Please see what you think.

Thanks. I had been looking for some struct to store the flag in. I'd not considered just adding a new global variable.

I was in fact just working on this again earlier, and I had come up with the attached.

I was just trying to verify if it was going to do the correct thing in regards to subtransactions and I got a little confused at the comments at the top of StandbyAcquireAccessExclusiveLock():

* We record the lock against the top-level xid, rather than individual

* subtransaction xids. This means AccessExclusiveLocks held by aborted

* subtransactions are not released as early as possible on standbys.

if this is true, and it seems to be per LogAccessExclusiveLock(), does StandbyReleaseLockTree() need to release the locks for sub transactions too?

This code:

for (i = 0; i < nsubxids; i++)

StandbyReleaseLocks(subxids[i]);

FWIW I tested the attached and saw that with my test cases I showed earlier that StandbyReleaseLockTree() was down to 0.74% in the perf report. Likely the gain will be even better if I ran a few more CPUs on small simple transactions which are not taking Access Exclusive locks, so I agree this is a better way forward than what I had in my original patch. Your patch will have the same effect, so much better than the 1.74% I saw in my perf report for the 1024 bucket hash table.

David Rowley http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Training & Services

Attachment

mark_xact_ael.patch

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Amit Kapila

Date:

16 March 2017, 14:04:02

On Thu, Mar 16, 2017 at 7:22 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 16 March 2017 at 13:31, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>> On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com>
>> wrote:
>> > On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
>> >> Don't understand this. I'm talking about setting a flag on
>> >> commit/abort WAL records, like the attached.
>> >
>> > There's nothing setting a flag in the attached. I only see the
>> > checking of the flag.
>>
>> Yeh, it wasn't a full patch, just a way marker to the summit for you.
>>
>> >> We just need to track info so we can set the flag at EOXact and we're
>> >> done.
>> >
>> > Do you have an idea where to store that information? I'd assume we'd
>> > want to store something during LogAccessExclusiveLocks() and check
>> > that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
>> > anything existing which might help with that today. Do you think I
>> > should introduce some new global variable for that? Or do you propose
>> > calling GetRunningTransactionLocks() again while generating the
>> > Commit/Abort record?
>>
>> Seemed easier to write it than explain further. Please see what you think.
>
>
> Thanks. I had been looking for some struct to store the flag in. I'd not
> considered just adding a new global variable.
>
> I was in fact just working on this again earlier, and I had come up with the
> attached.
>

I think this will not work if you take AEL in one of the
subtransaction and then commit the transaction.  The reason is that
you are using CurrentTransactionState to remember AEL locks and during
commit (at the time of XactLogCommitRecord) only top level
CurrentTransactionState will be available which is not same as
subtransactions CurrentTransactionState.  You can try with a simple
test like below:

Create table t1(c1 int);
Begin;
insert into t1 values(1);
savepoint s1;
insert into t1 values(2);
savepoint s2;
insert into t1 values(3);
Lock t1 in Access Exclusive mode;
commit;

Now, we might want to store this info in top level transaction state
to make such cases work, but I am not sure if it is better than what
Simon has proposed in his approach.  Although, I have not evaluated in
detail, but it seems Simon's patch looks better way to solve this
problem (In his patch, there is an explicit handling of two-phase
commits which seems more robust).

Few other comments on patch:
1.
@@ -5325,6 +5341,9 @@ XactLogAbortRecord(TimestampTz abort_time, if (xl_xinfo.xinfo &
XACT_XINFO_HAS_TWOPHASE) XLogRegisterData((char *) (&xl_twophase), sizeof
(xl_xact_twophase));

+ if (CurrentTransactionState->didLogAELock)
+ xl_xinfo.xinfo |=
XACT_XINFO_HAS_AE_LOCKS;

You need to set xl_info before the below line in XactLogAbortRecord()
to get it logged properly.
if (xl_xinfo.xinfo != 0)
info |= XLOG_XACT_HAS_INFO;

2.
Explicitly Initialize didLogAELock in StartTransaction as we do for didLogXid.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Amit Kapila

Date:

16 March 2017, 14:08:08

On Thu, Mar 16, 2017 at 6:01 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> Don't understand this. I'm talking about setting a flag on
>>> commit/abort WAL records, like the attached.
>>
>> There's nothing setting a flag in the attached. I only see the
>> checking of the flag.
>
> Yeh, it wasn't a full patch, just a way marker to the summit for you.
>
>>> We just need to track info so we can set the flag at EOXact and we're done.
>>
>> Do you have an idea where to store that information? I'd assume we'd
>> want to store something during LogAccessExclusiveLocks() and check
>> that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
>> anything existing which might help with that today. Do you think I
>> should introduce some new global variable for that? Or do you propose
>> calling GetRunningTransactionLocks() again while generating the
>> Commit/Abort record?
>
> Seemed easier to write it than explain further. Please see what you think.
>

@@ -5563,7 +5575,8 @@ xact_redo_abort(xl_xact_parsed_abort *parsed,
TransactionId xid) /* * Release locks, if any. There are no invalidations to send. */
- StandbyReleaseLockTree(xid, parsed->nsubxacts, parsed->subxacts);
+ if (parsed->xinfo & XACT_XINFO_HAS_AE_LOCKS)
+ StandbyReleaseLockTree(xid, parsed->nsubxacts, parsed->subxacts);


The patch uses XACT_XINFO_HAS_AE_LOCKS during abort replay, but during
normal operation (XactLogAbortRecord()), it doesn't seem to log the
information.  Is this an oversight or do you have some reason for
doing so?




-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Simon Riggs

Date:

18 March 2017, 11:52:14

On 16 March 2017 at 09:52, David Rowley <david.rowley@2ndquadrant.com> wrote:

> I was just trying to verify if it was going to do the correct thing in
> regards to subtransactions and I got a little confused at the comments at
> the top of StandbyAcquireAccessExclusiveLock():
>
>  * We record the lock against the top-level xid, rather than individual
>  * subtransaction xids. This means AccessExclusiveLocks held by aborted
>  * subtransactions are not released as early as possible on standbys.
>
> if this is true, and it seems to be per LogAccessExclusiveLock(), does
> StandbyReleaseLockTree() need to release the locks for sub transactions too?
>
> This code:
>
> for (i = 0; i < nsubxids; i++)
> StandbyReleaseLocks(subxids[i]);

Yes, you are correct, good catch. It's not broken, but the code does
nothing, slowly.

We have two choices... remove this code or assign locks against subxids.

1. In xact_redo_abort() we can just pass NULL to
StandbyReleaseLockTree(), keeping the other code as is for later
fixing by another approach.

2. In LogAccessExclusiveLock() we can use GetCurrentTransactionId()
rather than GetTopTransactionId(), so that we assign the lock to the
subxid rather than the top xid. That could increase lock traffic, but
less likely. It also solves the problem of early release when AELs
held by subxids.

(2) looks safe enough, so patch attached.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

assign_aels_against_subxids.v1.patch

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Simon Riggs

Date:

18 March 2017, 11:53:40

On 16 March 2017 at 19:08, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Mar 16, 2017 at 6:01 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com> wrote:
>>> On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>> Don't understand this. I'm talking about setting a flag on
>>>> commit/abort WAL records, like the attached.
>>>
>>> There's nothing setting a flag in the attached. I only see the
>>> checking of the flag.
>>
>> Yeh, it wasn't a full patch, just a way marker to the summit for you.
>>
>>>> We just need to track info so we can set the flag at EOXact and we're done.
>>>
>>> Do you have an idea where to store that information? I'd assume we'd
>>> want to store something during LogAccessExclusiveLocks() and check
>>> that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
>>> anything existing which might help with that today. Do you think I
>>> should introduce some new global variable for that? Or do you propose
>>> calling GetRunningTransactionLocks() again while generating the
>>> Commit/Abort record?
>>
>> Seemed easier to write it than explain further. Please see what you think.
>>
>
> @@ -5563,7 +5575,8 @@ xact_redo_abort(xl_xact_parsed_abort *parsed,
> TransactionId xid)
>   /*
>   * Release locks, if any. There are no invalidations to send.
>   */
> - StandbyReleaseLockTree(xid, parsed->nsubxacts, parsed->subxacts);
> + if (parsed->xinfo & XACT_XINFO_HAS_AE_LOCKS)
> + StandbyReleaseLockTree(xid, parsed->nsubxacts, parsed->subxacts);
>
>
> The patch uses XACT_XINFO_HAS_AE_LOCKS during abort replay, but during
> normal operation (XactLogAbortRecord()), it doesn't seem to log the
> information.  Is this an oversight or do you have some reason for
> doing so?

Good catch, thanks. Yes, I changed my mind on whether it was needed
during implementation.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Simon Riggs

Date:

18 March 2017, 11:59:11

On 16 March 2017 at 09:52, David Rowley <david.rowley@2ndquadrant.com> wrote:

>> Seemed easier to write it than explain further. Please see what you think.
>
>
> Thanks. I had been looking for some struct to store the flag in. I'd not
> considered just adding a new global variable.

As Amit says, I don't see the gain from adding that to each xact state.

I'd suggest refactoring my patch so that the existign
MyXactAccessedTempRel becomes MyXactFlags and we can just set a flag
in the two cases (temp rels and has-aels). That way we still have one
Global but it doesn't get any uglier than it already is.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

20 March 2017, 08:47:58

On 18 March 2017 at 21:52, Simon Riggs <simon@2ndquadrant.com> wrote:
> 2. In LogAccessExclusiveLock() we can use GetCurrentTransactionId()
> rather than GetTopTransactionId(), so that we assign the lock to the
> subxid rather than the top xid. That could increase lock traffic, but
> less likely. It also solves the problem of early release when AELs
> held by subxids.
>
> (2) looks safe enough, so patch attached.

I might be missing something here, but what's the point in doing this
if we're not releasing the lock any earlier?

Going by the other patch you posted we're only recording if the top
level xact has an AEL, so we'll never try to release the locks at the
end of the subxact, since the new flag bit won't be set when the
subxact ends.

Seems it'll just be better to remove the dead code from
StandbyReleaseLockTree(), as that'll just mean one pass over the
RecoveryLockList when it comes to releasing it at the end the top
level transaction. Probably we'd want to just move all of
StandbyReleaseLocks()'s code into StandbyReleaseLockTree(), since the
StandbyReleaseLockTree() would otherwise just end up calling the
static function.

Probably I misunderstood this again. Let me know if that's the case.

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

20 March 2017, 08:52:47

On 17 March 2017 at 00:04, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Mar 16, 2017 at 7:22 AM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>> On 16 March 2017 at 13:31, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>
>>> On 8 March 2017 at 10:02, David Rowley <david.rowley@2ndquadrant.com>
>>> wrote:
>>> > On 8 March 2017 at 01:13, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> >> Don't understand this. I'm talking about setting a flag on
>>> >> commit/abort WAL records, like the attached.
>>> >
>>> > There's nothing setting a flag in the attached. I only see the
>>> > checking of the flag.
>>>
>>> Yeh, it wasn't a full patch, just a way marker to the summit for you.
>>>
>>> >> We just need to track info so we can set the flag at EOXact and we're
>>> >> done.
>>> >
>>> > Do you have an idea where to store that information? I'd assume we'd
>>> > want to store something during LogAccessExclusiveLocks() and check
>>> > that in XactLogCommitRecord() and XactLogAbortRecord(). I don't see
>>> > anything existing which might help with that today. Do you think I
>>> > should introduce some new global variable for that? Or do you propose
>>> > calling GetRunningTransactionLocks() again while generating the
>>> > Commit/Abort record?
>>>
>>> Seemed easier to write it than explain further. Please see what you think.
>>
>>
>> Thanks. I had been looking for some struct to store the flag in. I'd not
>> considered just adding a new global variable.
>>
>> I was in fact just working on this again earlier, and I had come up with the
>> attached.
>>
>
> I think this will not work if you take AEL in one of the
> subtransaction and then commit the transaction.  The reason is that
> you are using CurrentTransactionState to remember AEL locks and during
> commit (at the time of XactLogCommitRecord) only top level
> CurrentTransactionState will be available which is not same as
> subtransactions CurrentTransactionState.  You can try with a simple
> test like below:
>
> Create table t1(c1 int);
> Begin;
> insert into t1 values(1);
> savepoint s1;
> insert into t1 values(2);
> savepoint s2;
> insert into t1 values(3);
> Lock t1 in Access Exclusive mode;
> commit;
>
> Now, we might want to store this info in top level transaction state
> to make such cases work, but I am not sure if it is better than what
> Simon has proposed in his approach.  Although, I have not evaluated in
> detail, but it seems Simon's patch looks better way to solve this
> problem (In his patch, there is an explicit handling of two-phase
> commits which seems more robust).
>
> Few other comments on patch:
> 1.
> @@ -5325,6 +5341,9 @@ XactLogAbortRecord(TimestampTz abort_time,
>   if (xl_xinfo.xinfo &
> XACT_XINFO_HAS_TWOPHASE)
>   XLogRegisterData((char *) (&xl_twophase), sizeof
> (xl_xact_twophase));
>
> + if (CurrentTransactionState->didLogAELock)
> + xl_xinfo.xinfo |=
> XACT_XINFO_HAS_AE_LOCKS;
>
> You need to set xl_info before the below line in XactLogAbortRecord()
> to get it logged properly.
> if (xl_xinfo.xinfo != 0)
> info |= XLOG_XACT_HAS_INFO;
>
> 2.
> Explicitly Initialize didLogAELock in StartTransaction as we do for didLogXid.

Thanks for looking over this Amit. I'll drop my version of the patch
and make the changes described by Simon about moving
MyXactAccessedTempRel and MyXactAcquiredAccessExclusiveLock into a
bitmap fiags variable.

Thanks for confirming my concerns about the subxacts. I was confused
over how it was meant to work due to StandbyReleaseLockTree() having
some dead code.

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

20 March 2017, 11:31:31

On 18 March 2017 at 21:59, Simon Riggs <simon@2ndquadrant.com> wrote:
> As Amit says, I don't see the gain from adding that to each xact state.
>
> I'd suggest refactoring my patch so that the existign
> MyXactAccessedTempRel becomes MyXactFlags and we can just set a flag
> in the two cases (temp rels and has-aels). That way we still have one
> Global but it doesn't get any uglier than it already is.

OK, now that I understand things a bit better, I've gone over your
patch and changing things around that it combines the two variables
into a flags variable which will be better suited for future
expansion.

I also fixed up the missing code from the Abort transaction. This is patch 0001

0002:

I was thinking it might be a good idea to back patch the patch which
removes the bogus code from StandbyReleaseLockTree(). Any users of
sub-transactions are currently paying a price that they've no reason
to, especially so when there's AELs held by other concurrent
transactions. We already know this code is slow, having this mistake
in there is not helping any.

0003:

Is intended to be patched atop of 0002 (for master only) and revises
this code further to remove the StandbyReleaseLockTree() function. To
me it seems better to do this to clear up any confusion about what's
going on here. This patch does close the door a little on tracking
AELs on sub-xacts. I really think if we're going to do that, then it
won't be that hard to put it back again. Seems better to be consistent
here and not leave any code around that causes people to be confused
about the same thing as I was confused about.  This patch does get rid
of StandbyReleaseLockTree() which isn't static. Are we likely to break
any 3rd party code by doing that?

0004:

Again master only, is a further small optimization and of
StandbyReleaseLocks() to further tighten the slow loop over each held
lock. I also think that it's better to be more explicit about the case
of when the function would release all locks. Although I'm not all
that sure in which case the function will actually be called with an
invalid xid.

Patch series is attached.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Simon Riggs

Date:

22 March 2017, 14:27:11

On 20 March 2017 at 08:31, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 18 March 2017 at 21:59, Simon Riggs <simon@2ndquadrant.com> wrote:
>> As Amit says, I don't see the gain from adding that to each xact state.
>>
>> I'd suggest refactoring my patch so that the existign
>> MyXactAccessedTempRel becomes MyXactFlags and we can just set a flag
>> in the two cases (temp rels and has-aels). That way we still have one
>> Global but it doesn't get any uglier than it already is.
>
> OK, now that I understand things a bit better, I've gone over your
> patch and changing things around that it combines the two variables
> into a flags variable which will be better suited for future
> expansion.
>
> I also fixed up the missing code from the Abort transaction. This is patch 0001

This looks good to go, will final check and apply. Thanks,

> 0002:
>
> I was thinking it might be a good idea to back patch the patch which
> removes the bogus code from StandbyReleaseLockTree(). Any users of
> sub-transactions are currently paying a price that they've no reason
> to, especially so when there's AELs held by other concurrent
> transactions. We already know this code is slow, having this mistake
> in there is not helping any.
>
> 0003:
>
> Is intended to be patched atop of 0002 (for master only) and revises
> this code further to remove the StandbyReleaseLockTree() function. To
> me it seems better to do this to clear up any confusion about what's
> going on here. This patch does close the door a little on tracking
> AELs on sub-xacts. I really think if we're going to do that, then it
> won't be that hard to put it back again. Seems better to be consistent
> here and not leave any code around that causes people to be confused
> about the same thing as I was confused about.  This patch does get rid
> of StandbyReleaseLockTree() which isn't static. Are we likely to break
> any 3rd party code by doing that?

We've solved the performance problem due to your insight, so ISTM ok
now to enable AELs on subxids, since not having them causes locks to
be held for too long, which is a problem also. Rearranging the code
doesn't gain us any further performance, but as you say it prevents
prevents us solving the AELs on subxids problem.

Given that, do you agree to me applying assign_aels_against_subxids.v1.patch
as well?

We can discuss any backpatches after the CF is done.

> 0004:
>
> Again master only, is a further small optimization and of
> StandbyReleaseLocks() to further tighten the slow loop over each held
> lock. I also think that it's better to be more explicit about the case
> of when the function would release all locks. Although I'm not all
> that sure in which case the function will actually be called with an
> invalid xid.
>
> Patch series is attached.

This does improve things a little more, but we already hit 95% of the
gain, so I'd prefer to keep the code as similar as possible.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

David Rowley

Date:

22 March 2017, 16:19:25

On 22 March 2017 at 22:27, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 20 March 2017 at 08:31, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> 0003:
>>
>> Is intended to be patched atop of 0002 (for master only) and revises
>> this code further to remove the StandbyReleaseLockTree() function. To
>> me it seems better to do this to clear up any confusion about what's
>> going on here. This patch does close the door a little on tracking
>> AELs on sub-xacts. I really think if we're going to do that, then it
>> won't be that hard to put it back again. Seems better to be consistent
>> here and not leave any code around that causes people to be confused
>> about the same thing as I was confused about.  This patch does get rid
>> of StandbyReleaseLockTree() which isn't static. Are we likely to break
>> any 3rd party code by doing that?
>
> We've solved the performance problem due to your insight, so ISTM ok
> now to enable AELs on subxids, since not having them causes locks to
> be held for too long, which is a problem also. Rearranging the code
> doesn't gain us any further performance, but as you say it prevents
> prevents us solving the AELs on subxids problem.
>
> Given that, do you agree to me applying assign_aels_against_subxids.v1.patch
> as well?

Does applying assign_aels_against_subxids.v1.patch still need to keep
the loop to release the subxacts? Won't this be gone already with the
subxact commit/abort record replays?
This does possibly mean that we perform more loops over the
RecoveryLockList even if the subxact does not have an AELs, but its
parent xact does. Wonder if this is a good price to pay for releasing
the locks earlier?

> We can discuss any backpatches after the CF is done.
>
>> 0004:
>>
>> Again master only, is a further small optimization and of
>> StandbyReleaseLocks() to further tighten the slow loop over each held
>> lock. I also think that it's better to be more explicit about the case
>> of when the function would release all locks. Although I'm not all
>> that sure in which case the function will actually be called with an
>> invalid xid.
>>
>> Patch series is attached.
>
> This does improve things a little more, but we already hit 95% of the
> gain, so I'd prefer to keep the code as similar as possible.

Seems fair. Its just a blemish that I saw every time I looked at the
code and was inspired not to see it anymore. I agree that there won't
be much gain from it.


-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Patch to improve performance of replay of AccessExclusiveLock

From

Simon Riggs

Date:

22 March 2017, 17:39:57

On 22 March 2017 at 13:19, David Rowley <david.rowley@2ndquadrant.com> wrote:

>> Given that, do you agree to me applying assign_aels_against_subxids.v1.patch
>> as well?
>
> Does applying assign_aels_against_subxids.v1.patch still need to keep
> the loop to release the subxacts? Won't this be gone already with the
> subxact commit/abort record replays?

No it is still required because aborts and commits might have
subcommitted subxids.

> This does possibly mean that we perform more loops over the
> RecoveryLockList even if the subxact does not have an AELs, but its
> parent xact does. Wonder if this is a good price to pay for releasing
> the locks earlier?

We'd be performing the same number of loops as we do now. It's just
now they would have a purpose.

But we aren't doing it at all unless the top level xid has at least
one AEL, so the bulk of the problem is gone.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services