Thread: Reducing ClogControlLock contention
ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().
Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.
This is safe because people checking visibility of an xid must always run TransactionIdIsInProgress() first to avoid race conditions, which will always return true for the transaction we are currently committing. As a result, we never get concurrent access to the same bits in clog, which would require a barrier.
Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Tue, Jun 30, 2015 at 4:02 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.This is safe because people checking visibility of an xid must always run TransactionIdIsInProgress() first to avoid race conditions, which will always return true for the transaction we are currently committing. As a result, we never get concurrent access to the same bits in clog, which would require a barrier.Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.
Could it be possible to see some performance numbers? For example with a simple pgbench script doing a bunch of tiny transactions, with many concurrent sessions (perhaps hundreds).
--
--
Michael
On 30 June 2015 at 08:13, Michael Paquier <michael.paquier@gmail.com> wrote:
--
On Tue, Jun 30, 2015 at 4:02 PM, Simon Riggs <simon@2ndquadrant.com> wrote:ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.This is safe because people checking visibility of an xid must always run TransactionIdIsInProgress() first to avoid race conditions, which will always return true for the transaction we are currently committing. As a result, we never get concurrent access to the same bits in clog, which would require a barrier.Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.Could it be possible to see some performance numbers? For example with a simple pgbench script doing a bunch of tiny transactions, with many concurrent sessions (perhaps hundreds).
I'm more interested to see if people think it is safe.
This contention is masked by contention elsewhere, e.g. ProcArrayLock, so the need for testing here should come once other patches ahead of this are in.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 30 June 2015 at 08:22, Simon Riggs <simon@2ndquadrant.com> wrote:
--
This contention is masked by contention elsewhere, e.g. ProcArrayLock, so the need for testing here should come once other patches ahead of this are in.
Let me explain more clearly.
Andres' patch to cache snapshots and reduce ProcArrayLock was interesting, but not initially compelling. We now have a solution that commits in batches, which will reduce the number of times the ProcArray changes - this will heighten the benefit from Andres' snapshot cache patch.
So the order of testing/commit should be
Queued commit patch
ProcArray cache patch
Clog shared commit patch (this one)
I didn't hear recent mention of Robert's chash patch, but IIRC it was effective and so we hope to see it again soon also.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jun 30, 2015 at 12:52 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 30 June 2015 at 08:13, Michael Paquier <michael.paquier@gmail.com> wrote:
>>
>>
>> Could it be possible to see some performance numbers? For example with a simple pgbench script doing a bunch of tiny transactions, with many concurrent sessions (perhaps hundreds).
>
>
> I'm more interested to see if people think it is safe.
>
> This contention is masked by contention elsewhere, e.g. ProcArrayLock, so the need for testing here should come once other patches ahead of this are in.
>
>
> On 30 June 2015 at 08:13, Michael Paquier <michael.paquier@gmail.com> wrote:
>>
>>
>> Could it be possible to see some performance numbers? For example with a simple pgbench script doing a bunch of tiny transactions, with many concurrent sessions (perhaps hundreds).
>
>
> I'm more interested to see if people think it is safe.
>
> This contention is masked by contention elsewhere, e.g. ProcArrayLock, so the need for testing here should come once other patches ahead of this are in.
>
Exactly and other lock that can mask this improvement is WALWriteLock,
but for that we can take the performance data with synchronous_commit
On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().
>
> Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.
>
> Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.
>
>
> ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().
>
> Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.
>
This approach looks good way for avoiding the contention around
ClogControlLock. Few things that occurred to me while looking at
patch are that
a. the semantics of new LWLock (CommitLock) introduced
by patch seems to be different in the sense that it is just taken in
Exclusive mode (and no Shared mode is required) as per your proposal. We
could use existing LWLock APi's, but on the other hand we could even
invent new LWLock API for this kind of locking.
b. SimpleLruReadPage_ReadOnly() - This API's is meant mainly for
read-access of page and the description also says the same, but now
we want to use it for updating page as well. It might be better to invent
similar new API or at the very least modify it's description.
> Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.
>
I think it will be better to partition it or use it in some other way to avoid
two concurrent writers block at it, however if you want to first see the
test results with this, then that is also okay.
Overall the idea seems good to pursue, however I have slight feeling
that using 2 LWLocks (CLOGControlLock in shared mode and new
CommitLock in Exclusive mode) to set the transaction information
is somewhat odd, but I could not see any problem with it.
On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
--
On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> ClogControlLock contention is high at commit time. This appears to be due to the fact that ClogControlLock is acquired in Exclusive mode prior to marking commit, which then gets starved by backends running TransactionIdGetStatus().
>
> Proposal for improving this is to acquire the ClogControlLock in Shared mode, if possible.
>This approach looks good way for avoiding the contention aroundClogControlLock. Few things that occurred to me while looking atpatch are thata. the semantics of new LWLock (CommitLock) introducedby patch seems to be different in the sense that it is just taken inExclusive mode (and no Shared mode is required) as per your proposal. Wecould use existing LWLock APi's, but on the other hand we could eveninvent new LWLock API for this kind of locking.
LWLock API code is already too complex, so -1 for more changes there
b. SimpleLruReadPage_ReadOnly() - This API's is meant mainly forread-access of page and the description also says the same, but nowwe want to use it for updating page as well. It might be better to inventsimilar new API or at the very least modify it's description.
Agreed, perhaps SimpleLruReadPage_Optimistic()
> Two concurrent writers might access the same word concurrently, so we protect against that with a new CommitLock. We could partition that by pageno also, if needed.
>I think it will be better to partition it or use it in some other way to avoidtwo concurrent writers block at it, however if you want to first see thetest results with this, then that is also okay.
Many updates would be on same page, so partitioning it would need to be at least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.
Overall the idea seems good to pursue, however I have slight feelingthat using 2 LWLocks (CLOGControlLock in shared mode and newCommitLock in Exclusive mode) to set the transaction informationis somewhat odd, but I could not see any problem with it.
Perhaps call it the CommitSerializationLock would help. There are already locks that are held only in Exclusive mode.
Locking two separate resources at same time is common in other code.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jul 1, 2015 at 1:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> I think it will be better to partition it or use it in some other way to avoid
>> two concurrent writers block at it, however if you want to first see the
>> test results with this, then that is also okay.
>
>
> Many updates would be on same page, so partitioning it would need to be at least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.
>
> On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> I think it will be better to partition it or use it in some other way to avoid
>> two concurrent writers block at it, however if you want to first see the
>> test results with this, then that is also okay.
>
>
> Many updates would be on same page, so partitioning it would need to be at least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.
>
Sure, it makes sense to try that way, once you have that ready, I can
try this out along with ProcArrayLock patch to see the impact.
On 2015-07-01 09:08:11 +0100, Simon Riggs wrote: > On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com> > > a. the semantics of new LWLock (CommitLock) introduced > > by patch seems to be different in the sense that it is just taken in > > Exclusive mode (and no Shared mode is required) as per your proposal. We > > could use existing LWLock APi's, but on the other hand we could even > > invent new LWLock API for this kind of locking. > > > > LWLock API code is already too complex, so -1 for more changes there I don't think that's a valid argument. It's better to have the complexity in one place (lwlock) than have rather similar complexity in several other places. The clog control lock is far from the only place that would benefit from tricks along these lines. Greetings, Andres Freund
On 1 July 2015 at 11:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
--
On Wed, Jul 1, 2015 at 1:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> I think it will be better to partition it or use it in some other way to avoid
>> two concurrent writers block at it, however if you want to first see the
>> test results with this, then that is also okay.
>
>
> Many updates would be on same page, so partitioning it would need to be at least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.>Sure, it makes sense to try that way, once you have that ready, I cantry this out along with ProcArrayLock patch to see the impact.
Seems sensible to measure what the new point of contention is with both before doing anything further.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1 July 2015 at 11:14, Andres Freund <andres@anarazel.de> wrote:
--
On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
> On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com>
> > a. the semantics of new LWLock (CommitLock) introduced
> > by patch seems to be different in the sense that it is just taken in
> > Exclusive mode (and no Shared mode is required) as per your proposal. We
> > could use existing LWLock APi's, but on the other hand we could even
> > invent new LWLock API for this kind of locking.
> >
>
> LWLock API code is already too complex, so -1 for more changes there
I don't think that's a valid argument. It's better to have the
complexity in one place (lwlock) than have rather similar complexity in
several other places. The clog control lock is far from the only place
that would benefit from tricks along these lines.
What "tricks" are being used??
Please explain why taking 2 locks is bad here, yet works fine elsewhere.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2015-07-01 11:19:40 +0100, Simon Riggs wrote: > What "tricks" are being used?? > > Please explain why taking 2 locks is bad here, yet works fine elsewhere. I didn't say anything about 'bad'. It's more complicated than one lock. Suddenly you have to care about lock ordering and such. The algorithms for ensuring correctness gets more complicated.
On Wed, Jul 1, 2015 at 6:21 AM, Andres Freund <andres@anarazel.de> wrote: > On 2015-07-01 11:19:40 +0100, Simon Riggs wrote: >> What "tricks" are being used?? >> >> Please explain why taking 2 locks is bad here, yet works fine elsewhere. > > I didn't say anything about 'bad'. It's more complicated than one > lock. Suddenly you have to care about lock ordering and such. The > algorithms for ensuring correctness gets more complicated. Taking two locks might also be more expensive than just taking one. I suppose benchmarking will reveal whether there is an issue there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jul 1, 2015 at 3:49 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 1 July 2015 at 11:14, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
>> > On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>> > > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com>
>> > > a. the semantics of new LWLock (CommitLock) introduced
>> > > by patch seems to be different in the sense that it is just taken in
>> > > Exclusive mode (and no Shared mode is required) as per your proposal. We
>> > > could use existing LWLock APi's, but on the other hand we could even
>> > > invent new LWLock API for this kind of locking.
>> > >
>> >
>> > LWLock API code is already too complex, so -1 for more changes there
>>
>> I don't think that's a valid argument. It's better to have the
>> complexity in one place (lwlock) than have rather similar complexity in
>> several other places. The clog control lock is far from the only place
>> that would benefit from tricks along these lines.
>
>
> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>
>
> On 1 July 2015 at 11:14, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
>> > On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>> > > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com>
>> > > a. the semantics of new LWLock (CommitLock) introduced
>> > > by patch seems to be different in the sense that it is just taken in
>> > > Exclusive mode (and no Shared mode is required) as per your proposal. We
>> > > could use existing LWLock APi's, but on the other hand we could even
>> > > invent new LWLock API for this kind of locking.
>> > >
>> >
>> > LWLock API code is already too complex, so -1 for more changes there
>>
>> I don't think that's a valid argument. It's better to have the
>> complexity in one place (lwlock) than have rather similar complexity in
>> several other places. The clog control lock is far from the only place
>> that would benefit from tricks along these lines.
>
>
> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>
One thing that could be risky in this new scheme of locking
is that now in functions TransactionIdSetPageStatus and
TransactionIdSetStatusBit, we modify slru's shared state with Control Lock
in Shared mode whereas I think it is mandated in the code that those
should be modified with ControlLock in Exlusive mode. This could have
some repercussions.
Another thing is that in this flow, with patch there will be three locks
(we take per-buffer locks before doing I/O) that will get involved rather than
two, so one effect of this patch will be that currently while doing I/O,
concurrent committers will be allowed to proceed as we release ControlLock
before doing the same whereas with Patch, they will not be allowed as they
are blocked by CommitLock. Now may be this scenario is less common and
doesn't matter much if the patch improves the more common scenario,
however this is an indication of what Andres tries to highlight that having more
locks for this might make patch more complicated.
On 11 August 2015 at 10:55, Amit Kapila <amit.kapila16@gmail.com> wrote:
--
On Wed, Jul 1, 2015 at 3:49 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 1 July 2015 at 11:14, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
>> > On 1 July 2015 at 09:00, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>> > > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com>
>> > > a. the semantics of new LWLock (CommitLock) introduced
>> > > by patch seems to be different in the sense that it is just taken in
>> > > Exclusive mode (and no Shared mode is required) as per your proposal. We
>> > > could use existing LWLock APi's, but on the other hand we could even
>> > > invent new LWLock API for this kind of locking.
>> > >
>> >
>> > LWLock API code is already too complex, so -1 for more changes there
>>
>> I don't think that's a valid argument. It's better to have the
>> complexity in one place (lwlock) than have rather similar complexity in
>> several other places. The clog control lock is far from the only place
>> that would benefit from tricks along these lines.
>
>
> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>One thing that could be risky in this new scheme of lockingis that now in functions TransactionIdSetPageStatus andTransactionIdSetStatusBit, we modify slru's shared state with Control Lockin Shared mode whereas I think it is mandated in the code that thoseshould be modified with ControlLock in Exlusive mode. This could havesome repercussions.
Do you know of any? This is a technical forum, so if we see a problem we say what it is, and if we don't, that's usually classed as a positive point in a code review.
Another thing is that in this flow, with patch there will be three locks(we take per-buffer locks before doing I/O) that will get involved rather thantwo, so one effect of this patch will be that currently while doing I/O,concurrent committers will be allowed to proceed as we release ControlLockbefore doing the same whereas with Patch, they will not be allowed as theyare blocked by CommitLock. Now may be this scenario is less common anddoesn't matter much if the patch improves the more common scenario,however this is an indication of what Andres tries to highlight that having morelocks for this might make patch more complicated.
It's easy to stripe the CommitLock in that case, if it is a problem.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
With Regards,
Amit Kapila.
On 11 August 2015 at 10:55, Amit Kapila <amit.kapila16@gmail.com> wrote:> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>One thing that could be risky in this new scheme of lockingis that now in functions TransactionIdSetPageStatus andTransactionIdSetStatusBit, we modify slru's shared state with Control Lockin Shared mode whereas I think it is mandated in the code that thoseshould be modified with ControlLock in Exlusive mode. This could havesome repercussions.Do you know of any? This is a technical forum, so if we see a problem we say what it is, and if we don't, that's usually classed as a positive point in a code review.
One of the main reason of saying this is that it is written in File
level comments in slru.c that for accessing (examine or modify)
the shared state, Control lock *must* be held in Exclusive mode
except in function SimpleLruReadPage_ReadOnly(). So, whereas
I agree that I should think more about if there is any breakage due
to patch, but I don't find any explanation either in your e-mail or in
patch why it is safe to modify the state after patch when it was not
before. If you think it is safe, then atleast modify comments in
slru.c.
Another thing is that in this flow, with patch there will be three locks(we take per-buffer locks before doing I/O) that will get involved rather thantwo, so one effect of this patch will be that currently while doing I/O,concurrent committers will be allowed to proceed as we release ControlLockbefore doing the same whereas with Patch, they will not be allowed as theyare blocked by CommitLock. Now may be this scenario is less common anddoesn't matter much if the patch improves the more common scenario,however this is an indication of what Andres tries to highlight that having morelocks for this might make patch more complicated.It's easy to stripe the CommitLock in that case, if it is a problem.
Sure, I think other places in code that take both the other locks also
needs to be checked for updation.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Tue, Aug 11, 2015 at 4:09 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>>>
>>> Another thing is that in this flow, with patch there will be three locks
>>> (we take per-buffer locks before doing I/O) that will get involved rather than
>>> two, so one effect of this patch will be that currently while doing I/O,
>>> concurrent committers will be allowed to proceed as we release ControlLock
>>> before doing the same whereas with Patch, they will not be allowed as they
>>> are blocked by CommitLock. Now may be this scenario is less common and
>>> doesn't matter much if the patch improves the more common scenario,
>>> however this is an indication of what Andres tries to highlight that having more
>>> locks for this might make patch more complicated.
>>
>>
>> It's easy to stripe the CommitLock in that case, if it is a problem.
>
>
> Sure, I think other places in code that take both the other locks also
> needs to be checked for updation.
>
One more point here why do we need CommitLock before calling
>
> On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>>>
>>> Another thing is that in this flow, with patch there will be three locks
>>> (we take per-buffer locks before doing I/O) that will get involved rather than
>>> two, so one effect of this patch will be that currently while doing I/O,
>>> concurrent committers will be allowed to proceed as we release ControlLock
>>> before doing the same whereas with Patch, they will not be allowed as they
>>> are blocked by CommitLock. Now may be this scenario is less common and
>>> doesn't matter much if the patch improves the more common scenario,
>>> however this is an indication of what Andres tries to highlight that having more
>>> locks for this might make patch more complicated.
>>
>>
>> It's easy to stripe the CommitLock in that case, if it is a problem.
>
>
> Sure, I think other places in code that take both the other locks also
> needs to be checked for updation.
>
One more point here why do we need CommitLock before calling
SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
instead of CommitLock?
On 11 August 2015 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
--
One more point here why do we need CommitLock before callingSimpleLruReadPage_ReadOnly() in the patch and if it is not required,then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);instead of CommitLock?
That prevents read only access, not just commits, so that isn't a better suggestion.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11 August 2015 at 11:39, Amit Kapila <amit.kapila16@gmail.com> wrote:
--
On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:On 11 August 2015 at 10:55, Amit Kapila <amit.kapila16@gmail.com> wrote:> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>One thing that could be risky in this new scheme of lockingis that now in functions TransactionIdSetPageStatus andTransactionIdSetStatusBit, we modify slru's shared state with Control Lockin Shared mode whereas I think it is mandated in the code that thoseshould be modified with ControlLock in Exlusive mode. This could havesome repercussions.Do you know of any? This is a technical forum, so if we see a problem we say what it is, and if we don't, that's usually classed as a positive point in a code review.One of the main reason of saying this is that it is written in Filelevel comments in slru.c that for accessing (examine or modify)the shared state, Control lock *must* be held in Exclusive modeexcept in function SimpleLruReadPage_ReadOnly(). So, whereasI agree that I should think more about if there is any breakage dueto patch, but I don't find any explanation either in your e-mail or inpatch why it is safe to modify the state after patch when it was notbefore. If you think it is safe, then atleast modify comments inslru.c.
"except"...
I explained that a reader will never be reading data that is concurrently changed by a writer, so it was safe to break the general rule for this specific case only.
Yes, I will modify comments in the patch.
Another thing is that in this flow, with patch there will be three locks(we take per-buffer locks before doing I/O) that will get involved rather thantwo, so one effect of this patch will be that currently while doing I/O,concurrent committers will be allowed to proceed as we release ControlLockbefore doing the same whereas with Patch, they will not be allowed as theyare blocked by CommitLock. Now may be this scenario is less common anddoesn't matter much if the patch improves the more common scenario,however this is an indication of what Andres tries to highlight that having morelocks for this might make patch more complicated.It's easy to stripe the CommitLock in that case, if it is a problem.Sure, I think other places in code that take both the other locks alsoneeds to be checked for updation.
Not sure what that means, but there are no other places calling CommitLock
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 11, 2015 at 7:27 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 11 August 2015 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>
>> One more point here why do we need CommitLock before calling
>> SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
>> then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
>> instead of CommitLock?
>
>
> That prevents read only access, not just commits, so that isn't a better suggestion.
TransactionIdSetStatusBit()
{
..
/* note this assumes exclusive access to the clog page */
byteval = *byteptr;
byteval &= ~(((1 << CLOG_BITS_PER_XACT) - 1) << bshift);
byteval |= (status << bshift);
*byteptr = byteval;
..
}
>
> On 11 August 2015 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>
>> One more point here why do we need CommitLock before calling
>> SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
>> then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
>> instead of CommitLock?
>
>
> That prevents read only access, not just commits, so that isn't a better suggestion.
read only access of what (clog page?)?
Here we are mainly doing three operations read clog page, write transaction status
on clog page and update shared control state. So basically two resources are
involved clog page and shared control state, so which one of those you are talking?
Apart from above, in below code, it is assumed that we have exclusive lock on
clog page which we don't in the proposed patch as some one can read the
same page while we are modifying it. In current code, this assumption is valid
because during Write we take CLogControlLock in Exclusive mode and while
Reading we take the same in Shared mode.
{
..
/* note this assumes exclusive access to the clog page */
byteval = *byteptr;
byteval &= ~(((1 << CLOG_BITS_PER_XACT) - 1) << bshift);
byteval |= (status << bshift);
*byteptr = byteval;
..
}
Now even if this is a problem, I think we can solve it with some more lower
level lock or may be with atomic operation, but I have mentioned it to check
your opinion on the same.
On Tue, Aug 11, 2015 at 7:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 11 August 2015 at 11:39, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>>>
>>>> Another thing is that in this flow, with patch there will be three locks
>>>> (we take per-buffer locks before doing I/O) that will get involved rather than
>>>> two, so one effect of this patch will be that currently while doing I/O,
>>>> concurrent committers will be allowed to proceed as we release ControlLock
>>>> before doing the same whereas with Patch, they will not be allowed as they
>>>> are blocked by CommitLock. Now may be this scenario is less common and
>>>> doesn't matter much if the patch improves the more common scenario,
>>>> however this is an indication of what Andres tries to highlight that having more
>>>> locks for this might make patch more complicated.
>>>
>>>
>>> It's easy to stripe the CommitLock in that case, if it is a problem.
>>
>>
>> Sure, I think other places in code that take both the other locks also
>> needs to be checked for updation.
>
>
> Not sure what that means, but there are no other places calling CommitLock
>
>
> On 11 August 2015 at 11:39, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>>>
>>>> Another thing is that in this flow, with patch there will be three locks
>>>> (we take per-buffer locks before doing I/O) that will get involved rather than
>>>> two, so one effect of this patch will be that currently while doing I/O,
>>>> concurrent committers will be allowed to proceed as we release ControlLock
>>>> before doing the same whereas with Patch, they will not be allowed as they
>>>> are blocked by CommitLock. Now may be this scenario is less common and
>>>> doesn't matter much if the patch improves the more common scenario,
>>>> however this is an indication of what Andres tries to highlight that having more
>>>> locks for this might make patch more complicated.
>>>
>>>
>>> It's easy to stripe the CommitLock in that case, if it is a problem.
>>
>>
>> Sure, I think other places in code that take both the other locks also
>> needs to be checked for updation.
>
>
> Not sure what that means, but there are no other places calling CommitLock
>
What I mean is that if want to ensure that we don't keep CommitLock while
doing I/O, then we might also want to ensure the same while waiting for I/O.
Hi, Amit pinged me about my opinion of this patch. I don't really have time/energy for an in-depth review right now, but since I'd looked enough to see some troublesome points, I thought I'd write those. On 2015-06-30 08:02:25 +0100, Simon Riggs wrote: > Proposal for improving this is to acquire the ClogControlLock in Shared > mode, if possible. I find that rather scary. That requires for all read and write paths in clog.c and slru.c to only ever read memory locations once. Otherwise those reads may not get the same results. That'd mean we'd need to add indirections via volatile to all reads/writes. It also requires that we never read in 4 byte quantities. > This is safe because people checking visibility of an xid must always run > TransactionIdIsInProgress() first to avoid race conditions, which will > always return true for the transaction we are currently committing. As a > result, we never get concurrent access to the same bits in clog, which > would require a barrier. I don't think that's really sufficient. There's a bunch of callers do lookups without such checks, e.g. in heapam.c. It might be safe due to other circumstances, but at the very least this is a significant but sublte API revision. > Two concurrent writers might access the same word concurrently, so we > protect against that with a new CommitLock. We could partition that by > pageno also, if needed. To me it seems better to make this more integrated with slru.c. Change the locking so that the control lock (relevant for page mappings et al) is different from the locks for reading/writing data. > * If we're doing an async commit (ie, lsn is valid), then we must wait > * for any active write on the page slot to complete. Otherwise our > * update could reach disk in that write, which will not do since we > @@ -273,7 +290,9 @@ TransactionIdSetPageStatus(TransactionId xid, int nsubxids, > * write-busy, since we don't care if the update reaches disk sooner than > * we think. > */ > - slotno = SimpleLruReadPage(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid); > + if (!InRecovery) > + LWLockAcquire(CommitLock, LW_EXCLUSIVE); > + slotno = SimpleLruReadPage_ReadOnly(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid); > > /* > * Set the main transaction id, if any. > @@ -312,6 +331,9 @@ TransactionIdSetPageStatus(TransactionId xid, int nsubxids, > ClogCtl->shared->page_dirty[slotno] = true; > > LWLockRelease(CLogControlLock); > + > + if (!InRecovery) > + LWLockRelease(CommitLock); > } TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which writes an 8 byte variable (the lsn). That's not safe. Maybe write_ok = XLogRecPtrIsInvalid(lsn) is trying to address that? If so, I don't see how. A page is returned with only the shared lock held if it's in VALID state before. Even if that were changed, this'd be a mightily subtle thing to do without a very fat comment. > @@ -447,7 +447,8 @@ SimpleLruReadPage(SlruCtl ctl, int pageno, bool write_ok, > * It is unspecified whether the lock will be shared or exclusive. > */ > int > -SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid) > +SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, bool write_ok, > + TransactionId xid) > { > SlruShared shared = ctl->shared; > int slotno; > @@ -460,7 +461,9 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid) > { > if (shared->page_number[slotno] == pageno && > shared->page_status[slotno] != SLRU_PAGE_EMPTY && > - shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS) > + shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS && > + (write_ok || > + shared->page_status[slotno] != SLRU_PAGE_WRITE_IN_PROGRESS)) > { > /* See comments for SlruRecentlyUsed macro */ > SlruRecentlyUsed(shared, slotno); > @@ -472,7 +475,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid) > LWLockRelease(shared->ControlLock); > LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE); > > - return SimpleLruReadPage(ctl, pageno, true, xid); > + return SimpleLruReadPage(ctl, pageno, write_ok, xid); > } This function's name would definitely need to be changed, and it'll need to be documented when/how it's safe to use write_ok = true. Same with slru.c's header. A patch like this will need far more changes. Every read and write from a page has to be guaranteed to only be done once, otherwise you can get 'effectively torn' values. That is, you can't just do static void TransactionIdSetStatusBit(TransactionId xid, XidStatus status, XLogRecPtr lsn, int slotno) ...char *byteptr;char byteval; .../* note this assumes exclusive access to the clog page */byteval = *byteptr;byteval &= ~(((1 << CLOG_BITS_PER_XACT) -1) << bshift);byteval |= (status << bshift);*byteptr = byteval; ... the compiler is entirely free to "optimize" away the byteval variable and do all these masking operations on memory! It can intermittenly write temporary values to byteval, because e.g. the register pressure is too high. To actually rely on single-copy-atomicity you have to enforce that these reads/writes can only happen. Leavout out some possible macro indirection that'd have to look like byteval = (volatile char *) byteptr; ... *(volatile char *) byteptr= byteval; some for TransactionIdGetStatus(), without the write side obviously. and stuff likeif (!XLogRecPtrIsInvalid(lsn)){ int lsnindex = GetLSNIndex(slotno, xid); if (ClogCtl->shared->group_lsn[lsnindex] < lsn) ClogCtl->shared->group_lsn[lsnindex] = lsn;} just aren't possible at all unless we drop a bunch of platforms. In essence I think this patch allows us to roughly benchmark whether and how benefitial fixing the contention around clog is, but it seems pretty far from something that can actually be applied. Greetings, Andres Freund
On 12 August 2015 at 04:49, Amit Kapila <amit.kapila16@gmail.com> wrote:
--
On Tue, Aug 11, 2015 at 7:27 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 11 August 2015 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>
>> One more point here why do we need CommitLock before calling
>> SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
>> then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
>> instead of CommitLock?
>
>
> That prevents read only access, not just commits, so that isn't a better suggestion.read only access of what (clog page?)?Here we are mainly doing three operations read clog page, write transaction statuson clog page and update shared control state. So basically two resources areinvolved clog page and shared control state, so which one of those you are talking?
Sorry, your suggestion was good. Using LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE); now seems sufficient.
Apart from above, in below code, it is assumed that we have exclusive lock onclog page which we don't in the proposed patch as some one can read thesame page while we are modifying it. In current code, this assumption is validbecause during Write we take CLogControlLock in Exclusive mode and whileReading we take the same in Shared mode.
Not exactly, no. This is not a general case, it is for one important and very specific case only, exactly suited to our transaction manager. I have checked all call paths and we are good.
New patch attached. I will reply to Andres' post separately since this does not yet address all of his detailed points.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:
--
On 2015-06-30 08:02:25 +0100, Simon Riggs wrote:
> Proposal for improving this is to acquire the ClogControlLock in Shared
> mode, if possible.
I find that rather scary. That requires for all read and write paths in
clog.c and slru.c to only ever read memory locations once. Otherwise
those reads may not get the same results. That'd mean we'd need to add
indirections via volatile to all reads/writes. It also requires that we
never read in 4 byte quantities.
There is is a very specific case in which this is possible. We already allow writes to data structures in the transaction manager without locks *in certain cases* and this is all that is proposed here. Nothing scary about doing something we already do, as long as we get it right.
> This is safe because people checking visibility of an xid must always run
> TransactionIdIsInProgress() first to avoid race conditions, which will
> always return true for the transaction we are currently committing. As a
> result, we never get concurrent access to the same bits in clog, which
> would require a barrier.
I don't think that's really sufficient. There's a bunch of callers do
lookups without such checks, e.g. in heapam.c. It might be safe due to
other circumstances, but at the very least this is a significant but
sublte API revision.
I've checked the call sites you mention and they refer to tests made AFTER we know have waited for the xid to complete via the lock manager. So as of now, there are no callers of TransactionIdGetStatus() that have not already confirmed that the xid is no longer active in the lock manager or the procarray. Since we set clog before touching lock manager or procarray we can be certain there is no concurrent reads and writes.
TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.
Agreed, thanks for spotting that.
I see this as the biggest problem to overcome with this patch.
We write WAL in pages, so we don't need to store the low order bits (varies according to WAL page size). We seldom use the higher order bits, since it takes a while to go thru (8192 * INT_MAX) = 32PB of WAL. So it seems like we can have a base LSN for a whole clog page, plus 4-byte LSN offsets. That way we can make the reads and writes atomic on all platforms. All of that can be hidden in clog.c in macros, so low impact, modular code.
A patch like this will need far more changes. Every read and write from
a page has to be guaranteed to only be done once, otherwise you can get
'effectively torn' values.
That is, you can't just do
static void
TransactionIdSetStatusBit(TransactionId xid, XidStatus status, XLogRecPtr lsn, int slotno)
...
char *byteptr;
char byteval;
...
/* note this assumes exclusive access to the clog page */
byteval = *byteptr;
byteval &= ~(((1 << CLOG_BITS_PER_XACT) - 1) << bshift);
byteval |= (status << bshift);
*byteptr = byteval;
...
the compiler is entirely free to "optimize" away the byteval variable
and do all these masking operations on memory! It can intermittenly
write temporary values to byteval, because e.g. the register pressure is
too high.
To actually rely on single-copy-atomicity you have to enforce that these
reads/writes can only happen. Leavout out some possible macro
indirection that'd have to look like
byteval = (volatile char *) byteptr;
...
*(volatile char *) byteptr = byteval;
some for TransactionIdGetStatus(), without the write side obviously.
Seems doable.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:
TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.Agreed, thanks for spotting that.I see this as the biggest problem to overcome with this patch.
How about using 64bit atomics or spinlock to protect this variable?
On 26 August 2015 at 11:40, Amit Kapila <amit.kapila16@gmail.com> wrote:
I'm wondering if its worth making this work on 32-bit systems at all. The contention problems only occur on higher end servers, so we can just disable this optimization if we aren't on a 64bit server.
--
On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.Agreed, thanks for spotting that.I see this as the biggest problem to overcome with this patch.How about using 64bit atomics or spinlock to protect this variable?
Spinlock is out IMHO because this happens on every clog lookup. So it must be an atomic read.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Aug 26, 2015 at 4:18 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 26 August 2015 at 11:40, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>
>>> On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:
>>
>>
>>>>
>>>> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
>>>> writes an 8 byte variable (the lsn). That's not safe.
>>>
>>>
>>> Agreed, thanks for spotting that.
>>>
>>> I see this as the biggest problem to overcome with this patch.
>>
>>
>> How about using 64bit atomics or spinlock to protect this variable?
>
>
> Spinlock is out IMHO because this happens on every clog lookup. So it must be an atomic read.
>
>
> On 26 August 2015 at 11:40, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>
>>> On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:
>>
>>
>>>>
>>>> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
>>>> writes an 8 byte variable (the lsn). That's not safe.
>>>
>>>
>>> Agreed, thanks for spotting that.
>>>
>>> I see this as the biggest problem to overcome with this patch.
>>
>>
>> How about using 64bit atomics or spinlock to protect this variable?
>
>
> Spinlock is out IMHO because this happens on every clog lookup. So it must be an atomic read.
>
Agreed, however using atomics is still an option, yet another way could
be before updating group_lsn, check if we already have CLogControlLock
in Exclusive mode then update it, else release the lock, re-acquire in
Exclusive mode and update the variable. The first thing that comes to mind
with this idea is that it will be less performant, yes thats true, but it will be
only done for asynchronous commits (mode which is generally not recommended
for production-use) and that too not on every commit, so may be the impact is
not that high. I have updated the patch (attached with mail) to show you what
I have in mind.
Another point about the latest patch:
+ (write_ok ||
+ shared->page_status[slotno] != SLRU_PAGE_WRITE_IN_PROGRESS))
Do you think that with new locking protocol as proposed in this
patch, it is safe to return if page status is SLRU_PAGE_WRITE_IN_PROGRESS
even if write_ok is true?
I think the case where it can cause problem is during SlruInternalWritePage()
where it performs below actions:
1. first marks the status of page as SLRU_PAGE_WRITE_IN_PROGRESS.
2. then take buffer lock in Exclusive mode.
3. release control lock.
4. perform the write
5. re-acquire the control lock in Exclusive mode.
Now consider another client which has to update the transaction status:
1. Control lock in Shared mode.
2. Get the slot
3. Acquire the buffer lock in Exclusive mode
Now consider client which has to update the transaction status performs
its step-1 after step-3 of writer, if that happens, then that could lead to
deadlock because writer will wait for client to release control lock and
client will wait for writer to release buffer lock.
Attachment
On 08/31/2015 07:34 AM, Amit Kapila wrote: > I have updated the patch (attached with mail) to show > you what I have in mind. > I havn't been able to get a successful run with _v5 using pgbench. TransactionIdSetStatusBit assumes an exclusive lock on CLogControlLock when called, but that part is removed from TransactionIdSetPageStatus now. I tried an if (!LWLockHeldByMe(CLogControlLock)) { LWLockAcquire(CLogControlLock, LW_EXCLUSIVE); mode = LW_EXCLUSIVE; } approach, but didn't get further. Plus that probably isn't the best way, since we will traverse all LWLock's, and we need to account for cases w/ and w/o sub-transactions, e.g. multiple calls to TransactionIdSetStatusBit within the CLogControlLock exclusive boundary. Best regards, Jesper
On Fri, Sep 18, 2015 at 11:31 PM, Jesper Pedersen <jesper.pedersen@redhat.com> wrote:
With Regards,
Amit Kapila.
On 08/31/2015 07:34 AM, Amit Kapila wrote:I have updated the patch (attached with mail) to show
you what I have in mind.
I havn't been able to get a successful run with _v5 using pgbench.
This patch is still not ready for performance test, as you might
have seen in comments that we are still discussing locking
issues.
TransactionIdSetStatusBit assumes an exclusive lock on CLogControlLock when called, but that part is removed from TransactionIdSetPageStatus now.
It doesn't seem to be a necessary condition, thats why in this patch
a smaller granularity lock is introduced.
I tried an
if (!LWLockHeldByMe(CLogControlLock))
{
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
mode = LW_EXCLUSIVE;
}
approach, but didn't get further.
I suspect the problem is something else.
Plus that probably isn't the best way, since we will traverse all LWLock's,
Yes, but it does in such a way that probably the caller will find it in
very few initial entries in the array.
Thank you very much for doing valuable performance tests with the
CLog patches.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 22 August 2015 at 15:14, Andres Freund <andres@anarazel.de> wrote:
So I have a valid way forward for this approach. Cool.
--
TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.
This point was the main sticking point here.
It turns out that we don't need to store the LSN (8 bytes).
WAL is fsynced every time we finish writing a file, so we only actually need to store the byte position within each file, so no more than 16MB. That fits within a 4 byte value, so can be written atomically.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services