Re: testing HS/SR - 1 vs 2 performance - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: testing HS/SR - 1 vs 2 performance |
Date | |
Msg-id | w2n603c8f071004250837hb6b318cez229e9c6a48a90d51@mail.gmail.com Whole thread Raw |
In response to | Re: testing HS/SR - 1 vs 2 performance (Simon Riggs <simon@2ndQuadrant.com>) |
List | pgsql-hackers |
On Sun, Apr 25, 2010 at 8:50 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Sun, 2010-04-25 at 06:53 -0400, Robert Haas wrote: >> On Sat, Apr 24, 2010 at 5:17 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> >> Both Heikki and I objected to that patch. >> > >> > Please explain your objection, based upon the patch and my explanations. >> >> Well, we objected to the locking. Having reread the patch a few times >> though, I think I'm starting to wrap my head around it so, I don't >> know, maybe it's OK. Have you tested grabbing the ProcArrayLock in >> exclusive mode instead of having a separate spinlock, to see how that >> performs? > > They both perform well. Heikki and I agree that spinlocks perform well. > > The point of taking the ProcArrayLock in Shared rather than Exclusive > mode is the reduction in contention that gives. Adding a > KnownAssignedXid is exactly analogous to starting a transaction. We go > to some trouble to avoid taking a lock when we start a transaction and I > want to do a similar thing on the standby. I think it's even more > important we reduce contention on the standby than it is on the primary > because if the Startup process is delayed then it slows down recovery. > >> >> And apparently it doesn't >> >> fix the problem, either. So, -1 from me. >> > >> > There is an issue observed in Erik's later tests, but my interpretation >> > of the results so far is that the sorted array patch successfully >> > removes the initially reported loss of performance. >> >> Is it possible the remaining spikes are due to fights over the spinlock? > > That one specific spinlock? I considered it. I think its doubtful. > > The spikes all involved a jump from <5ms to much more than that, often > as long as a second. That looks like a pure I/O problem or an LWlock > held across a slow I/O or platform specific issues of some kind. > > Erik's tests were with 1 backend, so that would mean that at most 2 > processes might want the spinlock you mention. The spinlock is only > taken at snapshot time, so once per transaction in Erik's tests. So the > if the spinlock were the problem then one or other of the processes > would need to grab and hold the spinlock for a very long time and then > release it sometime later. Since the spinlock code has been with us for > some time, I doubt there is anything wrong there. > > I don't have any evidence as to whether the problem is solely on Erik's > system, nor whether it is an issue already in the codebase or introduced > by this patch. It seems unlikely to me that this patch introduces the > problem, and even more unlikely that the issue is caused by two > processes fighting over that particular spinlock. > > The patch is very effective at reducing overall cost of taking snapshots > and so I think it needs to be applied to allow more performance data to > be collected. I don't suppose this is the last performance tuning we > will need to do. OK, fair enough. I withdraw my objections. ...Robert
pgsql-hackers by date: