Re: Misaligned BufferDescriptors causing major performance problems on AMD - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Misaligned BufferDescriptors causing major performance problems on AMD
Msg-id 20141229215905.GA5788@momjian.us
In response to Re: Misaligned BufferDescriptors causing major performance problems on AMD  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Misaligned BufferDescriptors causing major performance problems on AMD  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Sat, Dec 27, 2014 at 08:05:42PM -0500, Robert Haas wrote:
> On Wed, Dec 24, 2014 at 11:20 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > I just verified that I can still reproduce the problem:
> >
> > # aligned case (max_connections=401)
> > afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S
> > progress: 1.0 s, 405170.2 tps, lat 0.195 ms stddev 0.928
> > progress: 2.0 s, 467011.1 tps, lat 0.204 ms stddev 0.140
> > progress: 3.0 s, 462832.1 tps, lat 0.205 ms stddev 0.154
> > progress: 4.0 s, 471035.5 tps, lat 0.202 ms stddev 0.154
> > progress: 5.0 s, 500329.0 tps, lat 0.190 ms stddev 0.132
> >
> > BufferDescriptors is at 0x7f63610a6960 (which is 32byte aligned)
> >
> > # unaligned case (max_connections=400)
> > afreund@axle:~$ pgbench -P 1 -h /tmp/ -p5440 postgres -n -M prepared -c 96 -j 96 -T 100 -S
> > progress: 1.0 s, 202271.1 tps, lat 0.448 ms stddev 1.232
> > progress: 2.0 s, 223823.4 tps, lat 0.427 ms stddev 3.007
> > progress: 3.0 s, 227584.5 tps, lat 0.414 ms stddev 4.760
> > progress: 4.0 s, 221095.6 tps, lat 0.410 ms stddev 4.390
> > progress: 5.0 s, 217430.6 tps, lat 0.454 ms stddev 7.913
> > progress: 6.0 s, 210275.9 tps, lat 0.411 ms stddev 0.606
> > BufferDescriptors is at 0x7f1718aeb980 (which is 64byte aligned)
>
> So, should we increase ALIGNOF_BUFFER from 32 to 64?  Seems like
> that's what these results are telling us.

I am glad someone else considers this important.  Andres reported the
above 2x pgbench difference in February, but no action was taken because
everyone felt more performance testing was needed, and that testing
never happened:

    http://www.postgresql.org/message-id/20140202151319.GD32123@awork2.anarazel.de

I have now performance tested this by developing the attached two
patches, both of which increase the Buffer Descriptors allocation by 64
bytes.  The first patch causes each 64-byte Buffer Descriptor struct to
align on a 32-byte boundary but not a 64-byte boundary, while the second
patch aligns it on a 64-byte boundary.
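
As a rough illustration of the difference between the two cases (this
is not the code from the attached patches), the sketch below
over-allocates by 64 bytes and then picks a start address at a chosen
offset from a 64-byte boundary.  The names align_up, is_aligned and
alloc_descriptors, and the use of malloc instead of shared memory, are
assumptions made only for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE 64	/* assumed cache line size, in bytes */
#define DESC_SIZE       64	/* descriptor size stated in this thread */

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + (align - 1)) & ~(align - 1);
}

/* Return nonzero if addr is a multiple of align (a power of two). */
static int
is_aligned(uintptr_t addr, uintptr_t align)
{
	return (addr & (align - 1)) == 0;
}

/*
 * Hypothetical helper: allocate room for ndesc descriptors plus one
 * cache line of padding, and return the first address inside the
 * allocation that lies 'offset' bytes past a 64-byte boundary.
 *
 * offset = 0  gives 64-byte alignment.
 * offset = 32 gives 32-byte but not 64-byte alignment.
 *
 * The pointer to pass to free() is stored in *raw.
 */
static void *
alloc_descriptors(size_t ndesc, uintptr_t offset, void **raw)
{
	uintptr_t	start;

	assert(offset < CACHE_LINE_SIZE);

	*raw = malloc(ndesc * DESC_SIZE + CACHE_LINE_SIZE);
	if (*raw == NULL)
		return NULL;

	/* The result is at most 63 bytes past *raw, so the padding suffices. */
	start = align_up((uintptr_t) *raw - offset, CACHE_LINE_SIZE) + offset;
	return (void *) start;
}
```

Calling alloc_descriptors(n, 32, &raw) corresponds to the first case
and alloc_descriptors(n, 0, &raw) to the second.  The same is_aligned
check can be applied to the addresses quoted above: 0x7f63610a6960 is
a multiple of 32 but not of 64, while 0x7f1718aeb980 is a multiple of
64.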

I tried many tests, including this:

    $ pgbench --initialize --scale 1 pgbench
    $ pgbench --protocol prepared --client 16 --jobs 16 --transactions 100000 --select-only pgbench

I cannot measure any difference on my dual-CPU-socket, 16-vcore server
(http://momjian.us/main/blogs/pgblog/2012.html#January_20_2012).  I
thought this test would cause the most Buffer Descriptor contention
between the two CPUs.  Can anyone else see a difference when testing
these two patches?  (Each patch reports the alignment in the server
logs.)

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
