Re: testing HS/SR - 1 vs 2 performance - Mailing list pgsql-hackers
From: Simon Riggs
Subject: Re: testing HS/SR - 1 vs 2 performance
Date:
Msg-id: 1272199800.4161.1662.camel@ebony
In response to: Re: testing HS/SR - 1 vs 2 performance (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: testing HS/SR - 1 vs 2 performance
List: pgsql-hackers
On Sun, 2010-04-25 at 06:53 -0400, Robert Haas wrote:
> On Sat, Apr 24, 2010 at 5:17 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> Both Heikki and I objected to that patch.
> >
> > Please explain your objection, based upon the patch and my explanations.
>
> Well, we objected to the locking. Having reread the patch a few times
> though, I think I'm starting to wrap my head around it so, I don't
> know, maybe it's OK. Have you tested grabbing the ProcArrayLock in
> exclusive mode instead of having a separate spinlock, to see how that
> performs?

They both perform well. Heikki and I agree that spinlocks perform well.

The point of taking the ProcArrayLock in Shared rather than Exclusive mode is the reduction in contention that gives. Adding a KnownAssignedXid is exactly analogous to starting a transaction. We go to some trouble to avoid taking a lock when we start a transaction, and I want to do a similar thing on the standby. I think it's even more important to reduce contention on the standby than on the primary, because if the Startup process is delayed then it slows down recovery.

> >> And apparently it doesn't
> >> fix the problem, either. So, -1 from me.
> >
> > There is an issue observed in Erik's later tests, but my interpretation
> > of the results so far is that the sorted array patch successfully
> > removes the initially reported loss of performance.
>
> Is it possible the remaining spikes are due to fights over the spinlock?

That one specific spinlock? I considered it, but I think it's doubtful. The spikes all involved a jump from <5ms to much more than that, often as long as a second. That looks like a pure I/O problem, an LWLock held across a slow I/O, or platform-specific issues of some kind. Erik's tests were with one backend, so at most two processes might want the spinlock you mention. The spinlock is only taken at snapshot time, so once per transaction in Erik's tests.
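To illustrate why shared-mode acquisition reduces contention, here is a minimal sketch of a shared/exclusive lock in Python. This is an illustration only, not PostgreSQL's LWLock implementation; the class and method names are hypothetical. Many readers (like concurrent snapshot takers) hold the lock at once, while an exclusive acquirer must wait for all of them:

```python
import threading

class SharedExclusiveLock:
    """Illustrative shared/exclusive lock (hypothetical, not LWLock code):
    shared holders do not block one another; exclusive access requires
    that no shared holders remain."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # number of shared holders
        self._writer = False       # True while held exclusively

    def acquire_shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def try_acquire_exclusive(self):
        # Non-blocking: fails if anyone holds the lock in either mode.
        with self._cond:
            if self._writer or self._readers:
                return False
            self._writer = True
            return True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

lock = SharedExclusiveLock()
lock.acquire_shared()                  # first shared holder
lock.acquire_shared()                  # second proceeds without blocking
print(lock.try_acquire_exclusive())    # False: shared holders present
lock.release_shared()
lock.release_shared()
print(lock.try_acquire_exclusive())    # True: lock is now free
```

Taking ProcArrayLock only in Shared mode on the hot path means snapshot takers behave like the two shared holders above: none of them serializes behind the others, whereas an Exclusive acquisition would force every other process to queue.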
So if the spinlock were the problem, one or other of the processes would need to grab the spinlock, hold it for a very long time, and then release it sometime later. Since the spinlock code has been with us for some time, I doubt there is anything wrong there.

I don't have any evidence as to whether the problem is solely on Erik's system, nor whether it is an issue already in the codebase or one introduced by this patch. It seems unlikely to me that this patch introduces the problem, and even more unlikely that the issue is caused by two processes fighting over that particular spinlock.

The patch is very effective at reducing the overall cost of taking snapshots, so I think it needs to be applied to allow more performance data to be collected. I don't suppose this is the last performance tuning we will need to do.

--
Simon Riggs
www.2ndQuadrant.com