Home > mailing lists

Thread: Reducing ClogControlLock contention

Reducing ClogControlLock contention

From

Simon Riggs

Date:

30 June 2015, 07:02:37

ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().

Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.

This is safe because people checking visibility of an xid must always run TransactionIdIsInProgress() first to avoid race conditions, which will always return true for the transaction we are currently committing. As a result, we never get concurrent access to the same bits in clog, which would require a barrier.

Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

clog_optimize.v3.patch

Re: Reducing ClogControlLock contention

From

Michael Paquier

Date:

30 June 2015, 07:13:30

On Tue, Jun 30, 2015 at 4:02 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().

Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.

This is safe because people checking visibility of an xid must always run TransactionIdIsInProgress() first to avoid race conditions, which will always return true for the transaction we are currently committing. As a result, we never get concurrent access to the same bits in clog, which would require a barrier.

Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.

Could it be possible to see some performance numbers? For example with a simple pgbench script doing a bunch of tiny transactions, with many concurrent sessions (perhaps hundreds).
--

Michael

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

30 June 2015, 07:25:04

On 30 June 2015 at 08:13, Michael Paquier <michael.paquier@gmail.com> wrote:

On Tue, Jun 30, 2015 at 4:02 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().

Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.

This is safe because people checking visibility of an xid must always run TransactionIdIsInProgress() first to avoid race conditions, which will always return true for the transaction we are currently committing. As a result, we never get concurrent access to the same bits in clog, which would require a barrier.

Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.

Could it be possible to see some performance numbers? For example with a simple pgbench script doing a bunch of tiny transactions, with many concurrent sessions (perhaps hundreds).

I'm more interested to see if people think it is safe.

This contention is masked by contention elsewhere, e.g. ProcArrayLock, so the need for testing here should come once other patches ahead of this are in.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

30 June 2015, 07:31:24

On 30 June 2015 at 08:22, Simon Riggs <simon@2ndquadrant.com> wrote:

This contention is masked by contention elsewhere, e.g. ProcArrayLock, so the need for testing here should come once other patches ahead of this are in.

Let me explain more clearly.

Andres' patch to cache snapshots and reduce ProcArrayLock was interesting, but not initially compelling. We now have a solution that commits in batches, which will reduce the number of times the ProcArray changes - this will heighten the benefit from Andres' snapshot cache patch.

So the order of testing/commit should be

Queued commit patch

ProcArray cache patch

Clog shared commit patch (this one)

I didn't hear recent mention of Robert's chash patch, but IIRC it was effective and so we hope to see it again soon also.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

30 June 2015, 08:31:47

On Tue, Jun 30, 2015 at 12:52 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 30 June 2015 at 08:13, Michael Paquier <michael.paquier@gmail.com> wrote:
>>
>>
>> Could it be possible to see some performance numbers? For example with a simple pgbench script doing a bunch of tiny transactions, with many concurrent sessions (perhaps hundreds).
>
>
> I'm more interested to see if people think it is safe.
>
> This contention is masked by contention elsewhere, e.g. ProcArrayLock, so the need for testing here should come once other patches ahead of this are in.
>

Exactly and other lock that can mask this improvement is WALWriteLock,

but for that we can take the performance data with synchronous_commit

off mode.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

01 July 2015, 08:00:30

On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().
>
> Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.
>

This approach looks good way for avoiding the contention around

ClogControlLock. Few things that occurred to me while looking at

patch are that

a. the semantics of new LWLock (CommitLock) introduced

by patch seems to be different in the sense that it is just taken in

Exclusive mode (and no Shared mode is required) as per your proposal. We

could use existing LWLock APi's, but on the other hand we could even

invent new LWLock API for this kind of locking.

b. SimpleLruReadPage_ReadOnly() - This API's is meant mainly for

read-access of page and the description also says the same, but now

we want to use it for updating page as well. It might be better to invent

similar new API or at the very least modify it's description.

> Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.
>

I think it will be better to partition it or use it in some other way to avoid

two concurrent writers block at it, however if you want to first see the

test results with this, then that is also okay.

Overall the idea seems good to pursue, however I have slight feeling

that using 2 LWLocks (CLOGControlLock in shared mode and new

CommitLock in Exclusive mode) to set the transaction information

is somewhat odd, but I could not see any problem with it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

01 July 2015, 08:10:25

On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().
>
> Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.
>

This approach looks good way for avoiding the contention around
ClogControlLock. Few things that occurred to me while looking at
patch are that

a. the semantics of new LWLock (CommitLock) introduced
by patch seems to be different in the sense that it is just taken in
Exclusive mode (and no Shared mode is required) as per your proposal. We
could use existing LWLock APi's, but on the other hand we could even
invent new LWLock API for this kind of locking.

LWLock API code is already too complex, so -1 for more changes there

b. SimpleLruReadPage_ReadOnly() - This API's is meant mainly for
read-access of page and the description also says the same, but now
we want to use it for updating page as well. It might be better to invent
similar new API or at the very least modify it's description.

Agreed, perhaps SimpleLruReadPage_Optimistic()

> Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.
>

I think it will be better to partition it or use it in some other way to avoid
two concurrent writers block at it, however if you want to first see the
test results with this, then that is also okay.

Many updates would be on same page, so partitioning it would need to be at least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.

Overall the idea seems good to pursue, however I have slight feeling
that using 2 LWLocks (CLOGControlLock in shared mode and new
CommitLock in Exclusive mode) to set the transaction information
is somewhat odd, but I could not see any problem with it.

Perhaps call it the CommitSerializationLock would help. There are already locks that are held only in Exclusive mode.

Locking two separate resources at same time is common in other code.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

01 July 2015, 10:12:04

On Wed, Jul 1, 2015 at 1:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> I think it will be better to partition it or use it in some other way to avoid
>> two concurrent writers block at it, however if you want to first see the
>> test results with this, then that is also okay.
>
>
> Many updates would be on same page, so partitioning it would need to be at least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.

>

Sure, it makes sense to try that way, once you have that ready, I can

try this out along with ProcArrayLock patch to see the impact.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Andres Freund

Date:

01 July 2015, 10:14:20

On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
> On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com>
> > a.  the semantics of new LWLock (CommitLock) introduced
> > by patch seems to be different in the sense that it is just taken in
> > Exclusive mode (and no Shared mode is required) as per your proposal. We
> > could use existing LWLock APi's, but on the other hand we could even
> > invent new LWLock API for this kind of locking.
> >
> 
> LWLock API code is already too complex, so -1 for more changes there

I don't think that's a valid argument. It's better to have the
complexity in one place (lwlock) than have rather similar complexity in
several other places. The clog control lock is far from the only place
that would benefit from tricks along these lines.

Greetings,

Andres Freund

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

01 July 2015, 10:15:54

On 1 July 2015 at 11:11, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 1, 2015 at 1:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> I think it will be better to partition it or use it in some other way to avoid
>> two concurrent writers block at it, however if you want to first see the
>> test results with this, then that is also okay.
>
>
> Many updates would be on same page, so partitioning it would need to be at least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.
>

Sure, it makes sense to try that way, once you have that ready, I can
try this out along with ProcArrayLock patch to see the impact.

Seems sensible to measure what the new point of contention is with both before doing anything further.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

01 July 2015, 10:19:45

On 1 July 2015 at 11:14, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
> On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com>
> > a. the semantics of new LWLock (CommitLock) introduced
> > by patch seems to be different in the sense that it is just taken in
> > Exclusive mode (and no Shared mode is required) as per your proposal. We
> > could use existing LWLock APi's, but on the other hand we could even
> > invent new LWLock API for this kind of locking.
> >
>
> LWLock API code is already too complex, so -1 for more changes there

I don't think that's a valid argument. It's better to have the
complexity in one place (lwlock) than have rather similar complexity in
several other places. The clog control lock is far from the only place
that would benefit from tricks along these lines.

What "tricks" are being used??

Please explain why taking 2 locks is bad here, yet works fine elsewhere.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Andres Freund

Date:

01 July 2015, 10:21:19

On 2015-07-01 11:19:40 +0100, Simon Riggs wrote:
> What "tricks" are being used??
> 
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.

I didn't say anything about 'bad'. It's more complicated than one
lock. Suddenly you have to care about lock ordering and such. The
algorithms for ensuring correctness gets more complicated.

Re: Reducing ClogControlLock contention

From

Robert Haas

Date:

02 July 2015, 18:17:23

On Wed, Jul 1, 2015 at 6:21 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-07-01 11:19:40 +0100, Simon Riggs wrote:
>> What "tricks" are being used??
>>
>> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>
> I didn't say anything about 'bad'. It's more complicated than one
> lock. Suddenly you have to care about lock ordering and such. The
> algorithms for ensuring correctness gets more complicated.

Taking two locks might also be more expensive than just taking one.  I
suppose benchmarking will reveal whether there is an issue there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

11 August 2015, 09:55:29

On Wed, Jul 1, 2015 at 3:49 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 1 July 2015 at 11:14, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
>> > On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>> > > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com>
>> > > a. the semantics of new LWLock (CommitLock) introduced
>> > > by patch seems to be different in the sense that it is just taken in
>> > > Exclusive mode (and no Shared mode is required) as per your proposal. We
>> > > could use existing LWLock APi's, but on the other hand we could even
>> > > invent new LWLock API for this kind of locking.
>> > >
>> >
>> > LWLock API code is already too complex, so -1 for more changes there
>>
>> I don't think that's a valid argument. It's better to have the
>> complexity in one place (lwlock) than have rather similar complexity in
>> several other places. The clog control lock is far from the only place
>> that would benefit from tricks along these lines.
>
>
> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>

One thing that could be risky in this new scheme of locking

is that now in functions TransactionIdSetPageStatus and

TransactionIdSetStatusBit, we modify slru's shared state with Control Lock

in Shared mode whereas I think it is mandated in the code that those

should be modified with ControlLock in Exlusive mode. This could have

some repercussions.

Another thing is that in this flow, with patch there will be three locks

(we take per-buffer locks before doing I/O) that will get involved rather than

two, so one effect of this patch will be that currently while doing I/O,

concurrent committers will be allowed to proceed as we release ControlLock

before doing the same whereas with Patch, they will not be allowed as they

are blocked by CommitLock. Now may be this scenario is less common and

doesn't matter much if the patch improves the more common scenario,

however this is an indication of what Andres tries to highlight that having more

locks for this might make patch more complicated.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

11 August 2015, 10:14:39

On 11 August 2015 at 10:55, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 1, 2015 at 3:49 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 1 July 2015 at 11:14, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
>> > On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>> > > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com>
>> > > a. the semantics of new LWLock (CommitLock) introduced
>> > > by patch seems to be different in the sense that it is just taken in
>> > > Exclusive mode (and no Shared mode is required) as per your proposal. We
>> > > could use existing LWLock APi's, but on the other hand we could even
>> > > invent new LWLock API for this kind of locking.
>> > >
>> >
>> > LWLock API code is already too complex, so -1 for more changes there
>>
>> I don't think that's a valid argument. It's better to have the
>> complexity in one place (lwlock) than have rather similar complexity in
>> several other places. The clog control lock is far from the only place
>> that would benefit from tricks along these lines.
>
>
> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>

One thing that could be risky in this new scheme of locking
is that now in functions TransactionIdSetPageStatus and
TransactionIdSetStatusBit, we modify slru's shared state with Control Lock
in Shared mode whereas I think it is mandated in the code that those
should be modified with ControlLock in Exlusive mode. This could have
some repercussions.

Do you know of any? This is a technical forum, so if we see a problem we say what it is, and if we don't, that's usually classed as a positive point in a code review.

Another thing is that in this flow, with patch there will be three locks
(we take per-buffer locks before doing I/O) that will get involved rather than
two, so one effect of this patch will be that currently while doing I/O,
concurrent committers will be allowed to proceed as we release ControlLock
before doing the same whereas with Patch, they will not be allowed as they
are blocked by CommitLock. Now may be this scenario is less common and
doesn't matter much if the patch improves the more common scenario,
however this is an indication of what Andres tries to highlight that having more
locks for this might make patch more complicated.

It's easy to stripe the CommitLock in that case, if it is a problem.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

11 August 2015, 10:39:56

On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 11 August 2015 at 10:55, Amit Kapila <amit.kapila16@gmail.com> wrote:
> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>

One thing that could be risky in this new scheme of locking
is that now in functions TransactionIdSetPageStatus and
TransactionIdSetStatusBit, we modify slru's shared state with Control Lock
in Shared mode whereas I think it is mandated in the code that those
should be modified with ControlLock in Exlusive mode. This could have
some repercussions.

Do you know of any? This is a technical forum, so if we see a problem we say what it is, and if we don't, that's usually classed as a positive point in a code review.

One of the main reason of saying this is that it is written in File

level comments in slru.c that for accessing (examine or modify)

the shared state, Control lock *must* be held in Exclusive mode

except in function SimpleLruReadPage_ReadOnly(). So, whereas

I agree that I should think more about if there is any breakage due

to patch, but I don't find any explanation either in your e-mail or in

patch why it is safe to modify the state after patch when it was not

before. If you think it is safe, then atleast modify comments in

slru.c.

Another thing is that in this flow, with patch there will be three locks
(we take per-buffer locks before doing I/O) that will get involved rather than
two, so one effect of this patch will be that currently while doing I/O,
concurrent committers will be allowed to proceed as we release ControlLock
before doing the same whereas with Patch, they will not be allowed as they
are blocked by CommitLock. Now may be this scenario is less common and
doesn't matter much if the patch improves the more common scenario,
however this is an indication of what Andres tries to highlight that having more
locks for this might make patch more complicated.

It's easy to stripe the CommitLock in that case, if it is a problem.

Sure, I think other places in code that take both the other locks also

needs to be checked for updation.

With Regards,
Amit Kapila.

EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

11 August 2015, 13:53:35

On Tue, Aug 11, 2015 at 4:09 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>>>
>>> Another thing is that in this flow, with patch there will be three locks
>>> (we take per-buffer locks before doing I/O) that will get involved rather than
>>> two, so one effect of this patch will be that currently while doing I/O,
>>> concurrent committers will be allowed to proceed as we release ControlLock
>>> before doing the same whereas with Patch, they will not be allowed as they
>>> are blocked by CommitLock. Now may be this scenario is less common and
>>> doesn't matter much if the patch improves the more common scenario,
>>> however this is an indication of what Andres tries to highlight that having more
>>> locks for this might make patch more complicated.
>>
>>
>> It's easy to stripe the CommitLock in that case, if it is a problem.
>
>
> Sure, I think other places in code that take both the other locks also
> needs to be checked for updation.
>

One more point here why do we need CommitLock before calling

SimpleLruReadPage_ReadOnly() in the patch and if it is not required,

then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);

instead of CommitLock?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

11 August 2015, 13:57:20

On 11 August 2015 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:

One more point here why do we need CommitLock before calling
SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
instead of CommitLock?

That prevents read only access, not just commits, so that isn't a better suggestion.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

11 August 2015, 14:02:20

On 11 August 2015 at 11:39, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 11 August 2015 at 10:55, Amit Kapila <amit.kapila16@gmail.com> wrote:
> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>

One thing that could be risky in this new scheme of locking
is that now in functions TransactionIdSetPageStatus and
TransactionIdSetStatusBit, we modify slru's shared state with Control Lock
in Shared mode whereas I think it is mandated in the code that those
should be modified with ControlLock in Exlusive mode. This could have
some repercussions.

Do you know of any? This is a technical forum, so if we see a problem we say what it is, and if we don't, that's usually classed as a positive point in a code review.

One of the main reason of saying this is that it is written in File
level comments in slru.c that for accessing (examine or modify)
the shared state, Control lock *must* be held in Exclusive mode
except in function SimpleLruReadPage_ReadOnly(). So, whereas
I agree that I should think more about if there is any breakage due
to patch, but I don't find any explanation either in your e-mail or in
patch why it is safe to modify the state after patch when it was not
before. If you think it is safe, then atleast modify comments in
slru.c.

"except"...

I explained that a reader will never be reading data that is concurrently changed by a writer, so it was safe to break the general rule for this specific case only.

Yes, I will modify comments in the patch.

Another thing is that in this flow, with patch there will be three locks
(we take per-buffer locks before doing I/O) that will get involved rather than
two, so one effect of this patch will be that currently while doing I/O,
concurrent committers will be allowed to proceed as we release ControlLock
before doing the same whereas with Patch, they will not be allowed as they
are blocked by CommitLock. Now may be this scenario is less common and
doesn't matter much if the patch improves the more common scenario,
however this is an indication of what Andres tries to highlight that having more
locks for this might make patch more complicated.

It's easy to stripe the CommitLock in that case, if it is a problem.

Sure, I think other places in code that take both the other locks also
needs to be checked for updation.

Not sure what that means, but there are no other places calling CommitLock

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

12 August 2015, 03:50:00

On Tue, Aug 11, 2015 at 7:27 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 11 August 2015 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>
>> One more point here why do we need CommitLock before calling
>> SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
>> then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
>> instead of CommitLock?
>
>
> That prevents read only access, not just commits, so that isn't a better suggestion.

read only access of what (clog page?)?

Here we are mainly doing three operations read clog page, write transaction status

on clog page and update shared control state. So basically two resources are

involved clog page and shared control state, so which one of those you are talking?

Apart from above, in below code, it is assumed that we have exclusive lock on

clog page which we don't in the proposed patch as some one can read the

same page while we are modifying it. In current code, this assumption is valid

because during Write we take CLogControlLock in Exclusive mode and while

Reading we take the same in Shared mode.

TransactionIdSetStatusBit()
{
..
/* note this assumes exclusive access to the clog page */
byteval = *byteptr;
byteval &= ~(((1 << CLOG_BITS_PER_XACT) - 1) << bshift);
byteval |= (status << bshift);
*byteptr = byteval;
..
}

Now even if this is a problem, I think we can solve it with some more lower

level lock or may be with atomic operation, but I have mentioned it to check

your opinion on the same.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

12 August 2015, 04:14:24

On Tue, Aug 11, 2015 at 7:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 11 August 2015 at 11:39, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>>>
>>>> Another thing is that in this flow, with patch there will be three locks
>>>> (we take per-buffer locks before doing I/O) that will get involved rather than
>>>> two, so one effect of this patch will be that currently while doing I/O,
>>>> concurrent committers will be allowed to proceed as we release ControlLock
>>>> before doing the same whereas with Patch, they will not be allowed as they
>>>> are blocked by CommitLock. Now may be this scenario is less common and
>>>> doesn't matter much if the patch improves the more common scenario,
>>>> however this is an indication of what Andres tries to highlight that having more
>>>> locks for this might make patch more complicated.
>>>
>>>
>>> It's easy to stripe the CommitLock in that case, if it is a problem.
>>
>>
>> Sure, I think other places in code that take both the other locks also
>> needs to be checked for updation.
>
>
> Not sure what that means, but there are no other places calling CommitLock
>

What I mean is that if want to ensure that we don't keep CommitLock while

doing I/O, then we might also want to ensure the same while waiting for I/O.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Andres Freund

Date:

22 August 2015, 14:15:03

Hi,

Amit pinged me about my opinion of this patch. I don't really have
time/energy for an in-depth review right now, but since I'd looked
enough to see some troublesome points, I thought I'd write those.

On 2015-06-30 08:02:25 +0100, Simon Riggs wrote:
> Proposal for improving this is to acquire the ClogControlLock in Shared
> mode, if possible.

I find that rather scary. That requires for all read and write paths in
clog.c and slru.c to only ever read memory locations once. Otherwise
those reads may not get the same results. That'd mean we'd need to add
indirections via volatile to all reads/writes. It also requires that we
never read in 4 byte quantities.

> This is safe because people checking visibility of an xid must always run
> TransactionIdIsInProgress() first to avoid race conditions, which will
> always return true for the transaction we are currently committing. As a
> result, we never get concurrent access to the same bits in clog, which
> would require a barrier.

I don't think that's really sufficient. There's a bunch of callers do
lookups without such checks, e.g. in heapam.c. It might be safe due to
other circumstances, but at the very least this is a significant but
sublte API revision.

> Two concurrent writers might access the same word concurrently, so we
> protect against that with a new CommitLock. We could partition that by
> pageno also, if needed.

To me it seems better to make this more integrated with slru.c. Change
the locking so that the control lock (relevant for page mappings et al)
is different from the locks for reading/writing data.

>       * If we're doing an async commit (ie, lsn is valid), then we must wait
>       * for any active write on the page slot to complete.  Otherwise our
>       * update could reach disk in that write, which will not do since we
> @@ -273,7 +290,9 @@ TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
>       * write-busy, since we don't care if the update reaches disk sooner than
>       * we think.
>       */
> -    slotno = SimpleLruReadPage(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid);
> +    if (!InRecovery)
> +        LWLockAcquire(CommitLock, LW_EXCLUSIVE);
> +    slotno = SimpleLruReadPage_ReadOnly(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid);
>  
>      /*
>       * Set the main transaction id, if any.
> @@ -312,6 +331,9 @@ TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
>      ClogCtl->shared->page_dirty[slotno] = true;
>  
>      LWLockRelease(CLogControlLock);
> +
> +    if (!InRecovery)
> +        LWLockRelease(CommitLock);
>  }

TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.

Maybe write_ok = XLogRecPtrIsInvalid(lsn) is trying to address that? If
so, I don't see how. A page is returned with only the shared lock held
if it's in VALID state before. Even if that were changed, this'd be a
mightily subtle thing to do without a very fat comment.

> @@ -447,7 +447,8 @@ SimpleLruReadPage(SlruCtl ctl, int pageno, bool write_ok,
>   * It is unspecified whether the lock will be shared or exclusive.
>   */
>  int
> -SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid)
> +SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, bool write_ok,
> +                           TransactionId xid)
>  {
>      SlruShared    shared = ctl->shared;
>      int            slotno;
> @@ -460,7 +461,9 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid)
>      {
>          if (shared->page_number[slotno] == pageno &&
>              shared->page_status[slotno] != SLRU_PAGE_EMPTY &&
> -            shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS)
> +            shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS &&
> +            (write_ok ||
> +             shared->page_status[slotno] != SLRU_PAGE_WRITE_IN_PROGRESS))
>          {
>              /* See comments for SlruRecentlyUsed macro */
>              SlruRecentlyUsed(shared, slotno);
> @@ -472,7 +475,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid)
>      LWLockRelease(shared->ControlLock);
>      LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);
>  
> -    return SimpleLruReadPage(ctl, pageno, true, xid);
> +    return SimpleLruReadPage(ctl, pageno, write_ok, xid);
>  }

This function's name would definitely need to be changed, and it'll need
to be documented when/how it's safe to use write_ok = true. Same with
slru.c's header.

A patch like this will need far more changes. Every read and write from
a page has to be guaranteed to only be done once, otherwise you can get
'effectively torn' values.

That is, you can't just do
static void
TransactionIdSetStatusBit(TransactionId xid, XidStatus status, XLogRecPtr lsn, int slotno)
...char       *byteptr;char        byteval;
.../* note this assumes exclusive access to the clog page */byteval = *byteptr;byteval &= ~(((1 << CLOG_BITS_PER_XACT)
-1) << bshift);byteval |= (status << bshift);*byteptr = byteval;

...

the compiler is entirely free to "optimize" away the byteval variable
and do all these masking operations on memory! It can intermittenly
write temporary values to byteval, because e.g. the register pressure is
too high.

To actually rely on single-copy-atomicity you have to enforce that these
reads/writes can only happen. Leavout out some possible macro
indirection that'd have to look like       byteval = (volatile char *) byteptr;       ...       *(volatile char *)
byteptr= byteval;

some for TransactionIdGetStatus(), without the write side obviously.

and stuff likeif (!XLogRecPtrIsInvalid(lsn)){    int            lsnindex = GetLSNIndex(slotno, xid);
    if (ClogCtl->shared->group_lsn[lsnindex] < lsn)        ClogCtl->shared->group_lsn[lsnindex] = lsn;}
just aren't possible at all unless we drop a bunch of platforms.

In essence I think this patch allows us to roughly benchmark whether and
how benefitial fixing the contention around clog is, but it seems pretty
far from something that can actually be applied.

Greetings,

Andres Freund

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

25 August 2015, 08:38:28

On 12 August 2015 at 04:49, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Aug 11, 2015 at 7:27 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 11 August 2015 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>
>> One more point here why do we need CommitLock before calling
>> SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
>> then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
>> instead of CommitLock?
>
>
> That prevents read only access, not just commits, so that isn't a better suggestion.

read only access of what (clog page?)?

Here we are mainly doing three operations read clog page, write transaction status
on clog page and update shared control state. So basically two resources are
involved clog page and shared control state, so which one of those you are talking?

Sorry, your suggestion was good. Using LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE); now seems sufficient.

Apart from above, in below code, it is assumed that we have exclusive lock on
clog page which we don't in the proposed patch as some one can read the
same page while we are modifying it. In current code, this assumption is valid
because during Write we take CLogControlLock in Exclusive mode and while
Reading we take the same in Shared mode.

Not exactly, no. This is not a general case, it is for one important and very specific case only, exactly suited to our transaction manager. I have checked all call paths and we are good.

New patch attached. I will reply to Andres' post separately since this does not yet address all of his detailed points.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

clog_optimize.v4.patch

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

25 August 2015, 08:51:17

On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:

On 2015-06-30 08:02:25 +0100, Simon Riggs wrote:
> Proposal for improving this is to acquire the ClogControlLock in Shared
> mode, if possible.

I find that rather scary. That requires for all read and write paths in
clog.c and slru.c to only ever read memory locations once. Otherwise
those reads may not get the same results. That'd mean we'd need to add
indirections via volatile to all reads/writes. It also requires that we
never read in 4 byte quantities.

There is is a very specific case in which this is possible. We already allow writes to data structures in the transaction manager without locks *in certain cases* and this is all that is proposed here. Nothing scary about doing something we already do, as long as we get it right.

> This is safe because people checking visibility of an xid must always run
> TransactionIdIsInProgress() first to avoid race conditions, which will
> always return true for the transaction we are currently committing. As a
> result, we never get concurrent access to the same bits in clog, which
> would require a barrier.

I don't think that's really sufficient. There's a bunch of callers do
lookups without such checks, e.g. in heapam.c. It might be safe due to
other circumstances, but at the very least this is a significant but
sublte API revision.

I've checked the call sites you mention and they refer to tests made AFTER we know have waited for the xid to complete via the lock manager. So as of now, there are no callers of TransactionIdGetStatus() that have not already confirmed that the xid is no longer active in the lock manager or the procarray. Since we set clog before touching lock manager or procarray we can be certain there is no concurrent reads and writes.

TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.

Agreed, thanks for spotting that.

I see this as the biggest problem to overcome with this patch.

We write WAL in pages, so we don't need to store the low order bits (varies according to WAL page size). We seldom use the higher order bits, since it takes a while to go thru (8192 * INT_MAX) = 32PB of WAL. So it seems like we can have a base LSN for a whole clog page, plus 4-byte LSN offsets. That way we can make the reads and writes atomic on all platforms. All of that can be hidden in clog.c in macros, so low impact, modular code.

A patch like this will need far more changes. Every read and write from
a page has to be guaranteed to only be done once, otherwise you can get
'effectively torn' values.

That is, you can't just do
static void
TransactionIdSetStatusBit(TransactionId xid, XidStatus status, XLogRecPtr lsn, int slotno)
...
char *byteptr;
char byteval;
...
/* note this assumes exclusive access to the clog page */
byteval = *byteptr;
byteval &= ~(((1 << CLOG_BITS_PER_XACT) - 1) << bshift);
byteval |= (status << bshift);
*byteptr = byteval;
...

the compiler is entirely free to "optimize" away the byteval variable
and do all these masking operations on memory! It can intermittenly
write temporary values to byteval, because e.g. the register pressure is
too high.

To actually rely on single-copy-atomicity you have to enforce that these
reads/writes can only happen. Leavout out some possible macro
indirection that'd have to look like
byteval = (volatile char *) byteptr;
...
*(volatile char *) byteptr = byteval;
some for TransactionIdGetStatus(), without the write side obviously.

Seems doable.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

26 August 2015, 10:40:44

On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:

TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.

Agreed, thanks for spotting that.

I see this as the biggest problem to overcome with this patch.

How about using 64bit atomics or spinlock to protect this variable?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

26 August 2015, 10:48:18

On 26 August 2015 at 11:40, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:

TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.

Agreed, thanks for spotting that.

I see this as the biggest problem to overcome with this patch.

How about using 64bit atomics or spinlock to protect this variable?

Spinlock is out IMHO because this happens on every clog lookup. So it must be an atomic read.

I'm wondering if its worth making this work on 32-bit systems at all. The contention problems only occur on higher end servers, so we can just disable this optimization if we aren't on a 64bit server.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

31 August 2015, 11:35:09

On Wed, Aug 26, 2015 at 4:18 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 26 August 2015 at 11:40, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>
>>> On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:
>>
>>
>>>>
>>>> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
>>>> writes an 8 byte variable (the lsn). That's not safe.
>>>
>>>
>>> Agreed, thanks for spotting that.
>>>
>>> I see this as the biggest problem to overcome with this patch.
>>
>>
>> How about using 64bit atomics or spinlock to protect this variable?
>
>
> Spinlock is out IMHO because this happens on every clog lookup. So it must be an atomic read.
>

Agreed, however using atomics is still an option, yet another way could

be before updating group_lsn, check if we already have CLogControlLock

in Exclusive mode then update it, else release the lock, re-acquire in

Exclusive mode and update the variable. The first thing that comes to mind

with this idea is that it will be less performant, yes thats true, but it will be

only done for asynchronous commits (mode which is generally not recommended

for production-use) and that too not on every commit, so may be the impact is

not that high. I have updated the patch (attached with mail) to show you what

I have in mind.

Another point about the latest patch:

+ (write_ok ||

+ shared->page_status[slotno] != SLRU_PAGE_WRITE_IN_PROGRESS))

Do you think that with new locking protocol as proposed in this

patch, it is safe to return if page status is SLRU_PAGE_WRITE_IN_PROGRESS

even if write_ok is true?

I think the case where it can cause problem is during SlruInternalWritePage()

where it performs below actions:

1. first marks the status of page as SLRU_PAGE_WRITE_IN_PROGRESS.

2. then take buffer lock in Exclusive mode.

3. release control lock.

4. perform the write

5. re-acquire the control lock in Exclusive mode.

Now consider another client which has to update the transaction status:

1. Control lock in Shared mode.

2. Get the slot

3. Acquire the buffer lock in Exclusive mode

Now consider client which has to update the transaction status performs

its step-1 after step-3 of writer, if that happens, then that could lead to

deadlock because writer will wait for client to release control lock and

client will wait for writer to release buffer lock.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment

clog_optimize.v5.patch

Re: Reducing ClogControlLock contention

From

Jesper Pedersen

Date:

18 September 2015, 18:04:19

On 08/31/2015 07:34 AM, Amit Kapila wrote:
> I have updated the patch (attached with mail) to show
> you what I have in mind.
>

I havn't been able to get a successful run with _v5 using pgbench.

TransactionIdSetStatusBit assumes an exclusive lock on CLogControlLock 
when called, but that part is removed from TransactionIdSetPageStatus now.

I tried an
  if (!LWLockHeldByMe(CLogControlLock))  {      LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);      mode = LW_EXCLUSIVE;
}

approach, but didn't get further. Plus that probably isn't the best way, 
since we will traverse all LWLock's, and we need to account for cases w/ 
and w/o sub-transactions, e.g. multiple calls to 
TransactionIdSetStatusBit within the CLogControlLock exclusive boundary.

Best regards, Jesper

Re: Reducing ClogControlLock contention

From

Amit Kapila

Date:

19 September 2015, 03:32:31

On Fri, Sep 18, 2015 at 11:31 PM, Jesper Pedersen <jesper.pedersen@redhat.com> wrote:

On 08/31/2015 07:34 AM, Amit Kapila wrote:
I have updated the patch (attached with mail) to show
you what I have in mind.

I havn't been able to get a successful run with _v5 using pgbench.

This patch is still not ready for performance test, as you might

have seen in comments that we are still discussing locking

issues.

TransactionIdSetStatusBit assumes an exclusive lock on CLogControlLock when called, but that part is removed from TransactionIdSetPageStatus now.

It doesn't seem to be a necessary condition, thats why in this patch

a smaller granularity lock is introduced.

I tried an

if (!LWLockHeldByMe(CLogControlLock))
{
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
mode = LW_EXCLUSIVE;
}

approach, but didn't get further.

I suspect the problem is something else.

Plus that probably isn't the best way, since we will traverse all LWLock's,

Yes, but it does in such a way that probably the caller will find it in

very few initial entries in the array.

Thank you very much for doing valuable performance tests with the

CLog patches.

With Regards,
Amit Kapila.

EnterpriseDB: http://www.enterprisedb.com

Re: Reducing ClogControlLock contention

From

Simon Riggs

Date:

16 December 2015, 22:28:22

On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:

TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.

This point was the main sticking point here.

It turns out that we don't need to store the LSN (8 bytes).

WAL is fsynced every time we finish writing a file, so we only actually need to store the byte position within each file, so no more than 16MB. That fits within a 4 byte value, so can be written atomically.

So I have a valid way forward for this approach. Cool.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services