Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation() - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation()
Date
Msg-id 7d6863ed-cf3f-b144-b6c9-101995ca9bd9@2ndquadrant.com
In response to Re: Changing WAL Header to reduce contention during ReserveXLogInsertLocation() (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On 09.07.18 15:49, Heikki Linnakangas wrote:
> The way 
> forward is to test if we can get the same performance benefit from 
> switching to CMPXCHG16B, and keep the WAL format unchanged.

I have implemented this.  I didn't see any performance benefit using the
benchmark that Alexander Kuzmenkov outlined earlier in the thread.  (But
I also didn't see a great benefit for the originally proposed patch, so
maybe I'm not doing this right or this particular hardware doesn't
benefit from it as much.)

I'm attaching the patches and scripts here if someone else wants to do
more testing.

The first patch adds a zoo of 128-bit atomics support routines.  They are
consistent with (a.k.a. copy-and-pasted from) the existing 32- and 64-bit
set, but it's not the complete set, only as much as was necessary for
this exercise.

The second patch then makes use of that in the WAL code under discussion.
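
For readers who want a concrete picture of the idea, here is a minimal
sketch of reserving an insert position with a single 16-byte
compare-and-swap.  It is not the code from the attached patches: all
demo_* names are hypothetical, the packing of the current and previous
positions into one 128-bit word is an assumption made for illustration,
and a 64-bit gcc or clang target is assumed.  When the compiler does not
offer an inline 16-byte CAS, the sketch falls back to a small
test-and-set lock.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; not taken from the attached patches. */
typedef unsigned __int128 demo_u128;

/* High 64 bits: current byte position.  Low 64 bits: previous position. */
static volatile demo_u128 demo_insert_pos __attribute__((aligned(16)));

/*
 * Compare-and-swap on a 16-byte value.  On failure, *expected is updated
 * to the value actually found.
 */
static bool
demo_cas_u128(volatile demo_u128 *ptr, demo_u128 *expected, demo_u128 newval)
{
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
	/* Inline CMPXCHG16B is available (e.g. -mcx16 or a suitable -march). */
	demo_u128	seen = __sync_val_compare_and_swap(ptr, *expected, newval);
	bool		ok = (seen == *expected);

	*expected = seen;
	return ok;
#else
	/* Fallback: emulate the CAS under a simple test-and-set lock. */
	static volatile int demo_lock;
	bool		ok;

	while (__sync_lock_test_and_set(&demo_lock, 1))
		;						/* spin */
	ok = (*ptr == *expected);
	if (ok)
		*ptr = newval;
	else
		*expected = *ptr;
	__sync_lock_release(&demo_lock);
	return ok;
#endif
}

/*
 * Reserve "size" bytes.  Returns the start of the reserved range and the
 * start of the previous reservation, updating both in one atomic step.
 */
static void
demo_reserve(uint64_t size, uint64_t *startpos, uint64_t *prevpos)
{
	/* A torn initial read is harmless: the CAS fails and refreshes "old". */
	demo_u128	old = demo_insert_pos;
	demo_u128	newval;

	do
	{
		uint64_t	curr = (uint64_t) (old >> 64);

		*startpos = curr;
		*prevpos = (uint64_t) old;
		newval = ((demo_u128) (curr + size) << 64) | curr;
	} while (!demo_cas_u128(&demo_insert_pos, &old, newval));
}
```

The point of the sketch is only that both positions move together in one
atomic operation, so no separate lock is needed on the fast path.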

pgbench invocations were:

pgbench -i -I t bench
pgbench -n -c $N -j $N -M prepared -f pgbench-wal-cas.sql -T 60 bench

for N from 1 to 32.

Note:  With gcc (at least versions 7 and 8) you need to use some
non-default -march setting to get 128-bit atomics to work.  (Otherwise
the configure test fails and the fallback implementation is used.)  I
have found the minimum to be -march=nocona.  But different -march
settings actually affect the benchmark performance, so be sure to test
the baseline with the same -march setting.  Recommended configure
invocation: ./configure ... CC='gcc -march=whatever'

clang appears to work out of the box.

Also, curiously my gcc installations provide 128-bit
__sync_val_compare_and_swap() but not 128-bit
__atomic_compare_exchange_n() in spite of what the documentation indicates.
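
To see what a given compiler and -march combination offers, a small
probe along these lines may help.  It is only a sketch with hypothetical
demo_* function names; it relies on the predefined macro
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 and the builtin
__atomic_always_lock_free(), and assumes a 64-bit gcc or clang target.
It reports what the compiler claims at compile time and does not itself
execute a 16-byte atomic operation.

```c
#include <stdbool.h>

/* Hypothetical helper: is an inline 16-byte __sync CAS advertised? */
static bool
demo_have_sync_cas16(void)
{
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
	return true;
#else
	return false;
#endif
}

/* Hypothetical helper: does the __atomic family claim lock-free 16 bytes? */
static bool
demo_have_atomic_lockfree16(void)
{
	/* The size argument must be a compile-time constant; 0 = typical alignment. */
	return __atomic_always_lock_free(sizeof(unsigned __int128), 0);
}
```

Printing both results for each -march setting under test shows whether
the two builtin families agree on that compiler.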

So independent of whether this approach actually provides any benefit,
the 128-bit atomics support seems a bit wobbly.

(I'm also wondering why we are using __sync_val_compare_and_swap()
rather than __sync_bool_compare_and_swap(), since all we're doing with
the return value is emulate the bool version anyway.)
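
To make the comparison concrete, here is a sketch of the two variants on
a 64-bit value.  The demo_* wrappers are hypothetical and are not copied
from the tree.  One difference the sketch shows: if the wrapper's
contract includes writing the value actually found back into *expected
(as C11-style compare-exchange does), the val form provides that value
directly, while the bool form only reports success or failure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrapper: bool result emulated from the "val" builtin. */
static bool
demo_cas_via_val(volatile uint64_t *ptr, uint64_t *expected, uint64_t newval)
{
	uint64_t	seen = __sync_val_compare_and_swap(ptr, *expected, newval);
	bool		ok = (seen == *expected);

	/* The observed value is available, so it can be handed back. */
	*expected = seen;
	return ok;
}

/* Hypothetical wrapper: the "bool" builtin used directly. */
static bool
demo_cas_via_bool(volatile uint64_t *ptr, uint64_t expected, uint64_t newval)
{
	/* Only success or failure is reported; the observed value is not. */
	return __sync_bool_compare_and_swap(ptr, expected, newval);
}
```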

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
