Re: Speed up Clog Access by increasing CLOG buffers - Mailing list pgsql-hackers

From: Tomas Vondra
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Msg-id: ca714e28-81a2-440c-bd92-79fe1fa0fc15@2ndquadrant.com
In response to: Re: Speed up Clog Access by increasing CLOG buffers (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On 11/01/2016 08:13 PM, Robert Haas wrote:
> On Mon, Oct 31, 2016 at 5:48 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Honestly, I have no idea what to think about this ...
>
> I think a lot of the details here depend on OS scheduler behavior.
> For example, here's one of the first scalability graphs I ever did:
>
> http://rhaas.blogspot.com/2011/09/scalability-in-graphical-form-analyzed.html
>
> It's a nice advertisement for fast-path locking, but look at the funny
> shape of the red and green lines between 1 and 32 cores.  The curve is
> oddly bowl-shaped.  As the post discusses, we actually dip WAY under
> linear scalability in the 8-20 core range and then shoot up like a
> rocket afterwards so that at 32 cores we actually achieve super-linear
> scalability. You can't blame this on anything except Linux.  Someone
> shared BSD graphs (I forget which flavor) with me privately and they
> don't exhibit this poor behavior.  (They had different poor behaviors
> instead - performance collapsed at high client counts.  That was a
> long time ago so it's probably fixed now.)
>
> This is why I think it's fundamentally wrong to look at this patch and
> say "well, contention goes down, and in some cases that makes
> performance go up, but because in other cases it decreases performance
> or increases variability we shouldn't commit it".  If we took that
> approach, we wouldn't have fast-path locking today, because the early
> versions of fast-path locking could exhibit *major* regressions
> precisely because of contention shifting to other locks, specifically
> SInvalReadLock and msgNumLock.  (cf. commit
> b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4).  If we say that because the
> contention on those other locks can get worse as a result of
> contention on this lock being reduced, or even worse, if we try to
> take responsibility for what effect reducing lock contention might
> have on the operating system scheduler discipline (which will
> certainly differ from system to system and version to version), we're
> never going to get anywhere, because there's almost always going to be
> some way that reducing contention in one place can bite you someplace
> else.
>

I don't think I've suggested not committing any of the clog patches (or 
other patches in general) because shifting the contention somewhere else 
might cause regressions. At the end of the last CF, however, I stated 
that we need to better understand the impact on various workloads, and I 
think Amit agreed with that conclusion.

We have that understanding now, I believe - also thanks to your idea of 
sampling wait event data.
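
(For the archives: the sampling doesn't need to be anything 
sophisticated - polling pg_stat_activity during the benchmark runs is 
enough. A minimal sketch, assuming a 9.6 server and a placeholder 
database name "bench":

    # sample wait events roughly once a second during a benchmark run
    while true; do
      psql -X -A -t -c "SELECT wait_event_type, wait_event, count(*)
                        FROM pg_stat_activity
                        WHERE wait_event IS NOT NULL
                        GROUP BY 1, 2
                        ORDER BY 3 DESC" bench
      sleep 1
    done

Aggregating the samples over a run gives a rough profile of where the 
backends spend their time waiting, e.g. on CLogControlLock vs. 
WALWriteLock.)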

You're right that we can't fix all the contention points in one patch, 
and that shifting the contention may cause regressions. But we should at 
least understand which workloads might be impacted, how serious the 
regressions may get, and so on - which is why all the testing was done.


> I also believe it's pretty normal for patches that remove lock
> contention to increase variability.  If you run an auto race where
> every car has a speed governor installed that limits it to 80 kph,
> there will be much less variability in the finish times than if you
> remove the governor, but that's a stupid way to run a race.  You won't
> get much innovation around increasing the top speed of the cars under
> those circumstances, either.  Nobody ever bothered optimizing the
> contention around msgNumLock before fast-path locking happened,
> because the heavyweight lock manager burdened the system so heavily
> that you couldn't generate enough contention on it to matter.
> Similarly, we're not going to get much traction around optimizing the
> other locks to which contention would shift if we applied this patch
> unless we apply it.  This is not theoretical: EnterpriseDB staff have
> already done work on trying to optimize WALWriteLock, but it's hard to
> get a benefit.  The more other contention we eliminate, the
> easier it will be to see whether a proposed change to WALWriteLock
> helps.

Sure, I understand that. My main worry was that people will get worse 
performance with the next major version than what they get now (assuming 
we don't manage to address the other contention points). That is 
difficult to explain to users & customers, no matter how reasonable it 
seems to us.

The difference is that both the fast-path locks and msgNumLock went into 
9.2, so end users probably never saw that regression. But we don't know 
whether the same will be true for clog and WAL.

Perhaps you have a working patch addressing the WAL contention, so that 
we could see how that changes the results?

> Of course, we'll also be more at the mercy of operating system
> scheduler discipline, but that's not all a bad thing either.  The
> Linux kernel guys have been known to run PostgreSQL to see whether
> proposed changes help or hurt, but they're not going to try those
> tests after applying patches that we rejected because they expose us
> to existing Linux shortcomings.
>

I might be wrong, but I doubt the kernel guys are running a particularly 
wide set of tests, so how likely is it that they will notice issues with 
specific workloads? Wouldn't it be great if we could tell them there's a 
bug and provide a workload that reproduces it?
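
(A reproducible workload doesn't have to be anything elaborate, either - 
a sketch, with purely illustrative scale and client counts:

    # initialize a scale-300 data set, then hammer it with 64 clients
    pgbench -i -s 300 bench
    pgbench -c 64 -j 64 -M prepared -T 300 bench

together with the kernel version and hardware details, would already 
give them something concrete to look at.)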

I don't see how "it's a Linux issue" makes it someone else's problem. 
The kernel guys can't really test everything (and are not obliged to). 
It's up to us to do more testing in this area, and report issues to the 
kernel guys (which is not happening as much as it should).
>
> I don't want to be perceived as advocating too forcefully for a
> patch that was, after all, written by a colleague. However, I
> sincerely believe it's a mistake to say that a patch which reduces
> lock contention must show a tangible win or at least no loss on every
> piece of hardware, on every kernel, at every client count with no
> increase in variability in any configuration.

I don't think anyone suggested that.
>
> Very few (if any) patches are going to be able to meet that bar, and
> if we make that the bar, people aren't going to write patches to
> reduce lock contention in PostgreSQL. For that to be worth doing, you
> have to be able to get the patch committed in finite time. We've
> spent an entire release cycle dithering over this patch. Several
> alternative patches have been written that are not any better (and
> the people who wrote those patches don't seem especially interested
> in doing further work on them anyway). There is increasing evidence
> that the patch is effective at solving the problem it claims to
> solve, and that any downsides are just the result of poor
> lock-scaling behavior elsewhere which we could be working on fixing
> if we weren't still spending time on this. Is that really not good
> enough?
>

Except that a few days ago, after getting results from the last round of 
tests, I stated that we haven't really found any regressions that would 
matter, and that group_update seems to be performing best (and actually 
significantly improves results for some of the tests). I haven't done 
any code review, though.

The one remaining thing is the strange zig-zag behavior, but that might 
easily be due to scheduling in the kernel, or something else. I don't 
consider it a blocker for any of the patches, though.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


