Re: Speed up Clog Access by increasing CLOG buffers - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Speed up Clog Access by increasing CLOG buffers
Date
Msg-id CA+TgmobJBv0qYEMazPEqsit4zkk_ECvafYdu8X=jAnVei0yaYg@mail.gmail.com
In response to Re: Speed up Clog Access by increasing CLOG buffers  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Speed up Clog Access by increasing CLOG buffers
List pgsql-hackers
On Thu, Oct 20, 2016 at 11:45 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Oct 20, 2016 at 3:36 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> On Thu, Oct 13, 2016 at 12:25 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> I agree with these conclusions.  I had a chance to talk with Andres
>>> this morning at Postgres Vision and based on that conversation I'd
>>> like to suggest a couple of additional tests:
>>>
>>> 1. Repeat this test on x86.  In particular, I think you should test on
>>> the EnterpriseDB server cthulhu, which is an 8-socket x86 server.
>>
>> I have done my test on cthulhu. The basic difference is that on POWER
>> we saw CLogControlLock on top at 96 or more clients with scale factor
>> 300, but on cthulhu at scale factor 300 the transactionid lock is
>> always on top. So I repeated my test with scale factor 1000 as well on
>> cthulhu.
>
> So the upshot appears to be that this problem is a lot worse on power2
> than cthulhu, which suggests that this is architecture-dependent.  I
> guess it could also be kernel-dependent, but it doesn't seem likely,
> because:
>
> power2: Red Hat Enterprise Linux Server release 7.1 (Maipo),
> 3.10.0-229.14.1.ael7b.ppc64le
> cthulhu: CentOS Linux release 7.2.1511 (Core), 3.10.0-229.7.2.el7.x86_64
>
> So here's my theory.  The whole reason why Tomas is having difficulty
> seeing any big effect from these patches is because he's testing on
> x86.  When Dilip tests on x86, he doesn't see a big effect either,
> regardless of workload.  But when Dilip tests on POWER, which I think
> is where he's mostly been testing, he sees a huge effect, because for
> some reason POWER has major problems with this lock that don't exist
> on x86.
>
> If that's so, then we ought to be able to reproduce the big gains on
> hydra, a community POWER server.  In fact, I think I'll go run a quick
> test over there right now...

And ... nope.  I ran a 30-minute pgbench test on unpatched master
using unlogged tables at scale factor 300 with 64 clients and got
these results:
     14  LWLockTranche   | wal_insert
     36  LWLockTranche   | lock_manager
     45  LWLockTranche   | buffer_content
    223  Lock            | tuple
    527  LWLockNamed     | CLogControlLock
    921  Lock            | extend
   1195  LWLockNamed     | XidGenLock
   1248  LWLockNamed     | ProcArrayLock
   3349  Lock            | transactionid
  85957  Client          | ClientRead
 135935                  |

I then started a run at 96 clients which I accidentally killed shortly
before it was scheduled to finish, but the results are not much
different; there is no hint of the runaway CLogControlLock contention
that Dilip sees on power2.
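
To put the 64-client counts above in proportion, here is a minimal
Python sketch that converts them to shares. It assumes each count is
the number of times a backend was observed in that state, and that the
final row with no wait event means "not waiting"; neither assumption is
stated in this message. The names COUNTS, NOT_A_LOCK_WAIT, and shares
are made up for illustration.

```python
# Counts copied from the 64-client run above, keyed by
# (wait_event_type, wait_event). The ("", "") key is the row that
# has no wait event; treating it as "not waiting" is an assumption.
COUNTS = {
    ("LWLockTranche", "wal_insert"): 14,
    ("LWLockTranche", "lock_manager"): 36,
    ("LWLockTranche", "buffer_content"): 45,
    ("Lock", "tuple"): 223,
    ("LWLockNamed", "CLogControlLock"): 527,
    ("Lock", "extend"): 921,
    ("LWLockNamed", "XidGenLock"): 1195,
    ("LWLockNamed", "ProcArrayLock"): 1248,
    ("Lock", "transactionid"): 3349,
    ("Client", "ClientRead"): 85957,
    ("", ""): 135935,
}

# Rows that are not lock waits: idle on the client, or not waiting.
NOT_A_LOCK_WAIT = {("Client", "ClientRead"), ("", "")}


def shares(counts, exclude=frozenset()):
    """Return each remaining row's fraction of the remaining total."""
    kept = {key: n for key, n in counts.items() if key not in exclude}
    total = sum(kept.values())
    return {key: n / total for key, n in kept.items()}


if __name__ == "__main__":
    overall = shares(COUNTS)
    lock_only = shares(COUNTS, exclude=NOT_A_LOCK_WAIT)
    # Print lock waits from largest to smallest share.
    for key, frac in sorted(lock_only.items(), key=lambda kv: -kv[1]):
        name = f"{key[0]} | {key[1]}"
        print(f"{name:<32} {overall[key]:8.3%} of all, "
              f"{frac:7.2%} of lock waits")
```

Under those assumptions, CLogControlLock comes to about 0.2% of all
rows counted and about 7% of the lock waits, well behind transactionid.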

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


