Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
Msg-id CACjxUsOGm8_Pkvj3zEKYojtV4vcQAgxhwYotEza839UvYKijPA@mail.gmail.com
In response to Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions (DEV_OPS <devops@ww-it.cn>)
Responses Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions (Kevin Grittner <kgrittn@gmail.com>)
List pgsql-hackers
On Tue, Mar 14, 2017 at 6:00 AM, DEV_OPS <devops@ww-it.cn> wrote:
> On 3/14/17 17:34, Mengxing Liu wrote:

>>>>> The worst problems have been
>>>>> seen with 32 or more cores on 4 or more sockets with a large number
>>>>> of active connections.  I don't know whether you have access to a
>>>>> machine capable of putting this kind of stress on it (perhaps at
>>>>> your university?), but if not, the community has access to various
>>>>> resources we should be able to schedule time on.
>>>> There is a NUMA machine (120 cores, 8 sockets) in my lab.
>>> Fantastic!  Can you say a bit more about the architecture and OS?
>>
>> Intel(R) Xeon(R) CPU at 2.3GHz, with 1TB physical DRAM and 1.5 TB
>> SSD, running Ubuntu 14.04, Kernel 3.19.
>> I guess NUMA is disabled in BIOS, but I will check that.

I'm not sure what it would mean to "disable" NUMA -- if the CPU
chips are each functioning as memory controller for a subset of the
RAM, you will have non-uniform memory access speeds from any core to
different RAM locations.  You can switch it to "interleaved" access,
so that the speed of access from a core to any given logical memory
address is effectively random, rather than letting the OS try to
arrange things so that processes do most of their access to memory
that is faster for them.  It is generally better to tune NUMA in the
OS than to randomize things at the BIOS level.
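As a quick sanity check of the topology and the OS-level knobs
(a sketch only, assuming Linux with numactl installed; the data
directory path is made up):

    numactl --hardware           # nodes, cores, and per-node memory
    sysctl vm.zone_reclaim_mode  # 0 is usually right for PostgreSQL

    # If cross-node effects are suspected, interleaving just the
    # server's memory is a finer-grained knob than a BIOS setting:
    numactl --interleave=all pg_ctl -D /data/pgdata start -w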

>> However, there is only one NIC, so network could be the
>> bottleneck if we use too many cores?

Well, if we run the pgbench client on the database server box, the
NIC won't matter at all.  If we move the client side to another box,
I still think that when we hit this problem, it will dwarf any
impact of the NIC throughput.

> The configuration is really cool.  For the SSD, is it a SATA
> interface, an NVMe interface, or PCIe flash?  Different SSD
> interfaces have different performance characteristics.

Yeah, knowing the model of the SSD, as well as which particular
Xeon we're using, would be good.
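The usual ways to pull that information on Linux (the device name
below is hypothetical, and smartctl requires smartmontools):

    lscpu | grep 'Model name'    # exact Xeon model
    lsblk -d -o NAME,MODEL,TRAN  # drive model and transport (sata/nvme)
    smartctl -i /dev/sda         # more detail on a specific drive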

> For NUMA: because the affinity issue will impact PostgreSQL
> performance, please disable it if possible.

I disagree.  (see above)

>> There are several alternative benchmarks. Tony suggested that we
>> should use TPC-E and TPC-DS.

More benchmarks are better, all other things being equal.  Keep in
mind that good benchmarking practice with PostgreSQL generally
requires a lot of setup time (so that we're starting from the exact
same conditions for every run), a lot of run time (so that the
effects of vacuuming, bloat, and page splitting all come into play,
like they would in the real world), and a lot of repetitions of each
run (to account for variation).  In particular, on a NUMA machine it
is not at all unusual to see bifurcated results, with run times
clustering around two distinct values depending on how processes and
memory happen to land relative to the NUMA nodes.
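In rough outline, each repetition might look like this (a sketch
only; the paths, durations, and client counts are made up, and it
assumes a pristine copy of an initialized data directory):

    for run in 1 2 3 4 5; do
        pg_ctl -D /data/pgdata stop -m fast || true  # ok if not running
        rm -rf /data/pgdata
        cp -a /backup/pgdata.base /data/pgdata  # identical start state
        pg_ctl -D /data/pgdata start -w
        pgbench -c 32 -j 32 -T 1800 -P 60 postgres > run_$run.log
    done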

>> Personally, I am more familiar with TPC-C.

Unfortunately, the TPC-C benchmark does not create any cycles in the
transaction dependencies, meaning that it is not a great tool for
benchmarking serializable transactions.  I know there are variations
on TPC-C floating around that add a transaction type to do so, but
on a quick search right now, I was unable to find one.
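For reference, the classic minimal cycle is simple write skew;
something like this (the schema and values are made up) fails under
SERIALIZABLE but commits cleanly under REPEATABLE READ, and it is
exactly the shape of thing TPC-C never generates:

    -- session 1
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM doctors WHERE on_call;  -- reads what 2 writes

    -- session 2
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM doctors WHERE on_call;  -- reads what 1 writes
    UPDATE doctors SET on_call = false WHERE name = 'bob';
    COMMIT;

    -- session 1, continuing; this UPDATE or the COMMIT fails with
    -- "could not serialize access due to read/write dependencies
    -- among transactions"
    UPDATE doctors SET on_call = false WHERE name = 'alice';
    COMMIT;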

>> And pgbench seems to only have the TPC-B benchmark built in.

You can feed it your own custom queries, instead.
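For example (a sketch; the random() function in \set requires
pgbench 9.6 or later -- older versions spell it \setrandom -- and
the script assumes a database initialized with pgbench -i):

    -- custom.sql
    \set aid random(1, 100000)
    BEGIN;
    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    UPDATE pgbench_accounts SET abalance = abalance + 1
        WHERE aid = :aid;
    COMMIT;

Run it with every transaction forced to serializable:

    PGOPTIONS='-c default_transaction_isolation=serializable' \
        pgbench -f custom.sql -c 64 -j 64 -T 600 postgres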

>> Well, I think we can easily find the implementations of these
>> benchmarks for PostgreSQL.

Reportedly, some of the implementations of TPC-C are not very good.
There is DBT-2, but I've known a couple of people to look at that
and find that it needed work before they could use it.  Based on the
PostgreSQL versions mentioned on the Wiki page, it has been
neglected for a while:

https://wiki.postgresql.org/wiki/DBT-2

>> The paper you recommended to me used a special benchmark the
>> authors defined themselves.  But it is quite simple and easy to
>> implement.

It also has the distinct advantage that we *know* they created a
scenario where the code we want to tune was using most of the CPU on
the machine.

>> For me, the challenge is profiling the execution.  Are there any
>> tools in PostgreSQL to analyze where the CPU cycles are consumed?
>> Or do I have to instrument and time it myself?

Generally oprofile or perf is used if you want to know where the
time is going.  This creates a slight dilemma -- if you configure
your build with --enable-cassert, you get the best stack traces and
you can more easily break down execution profiles.  That is
especially true if you disable optimization and don't omit frame
pointers.  But
all of those things distort the benchmarks -- adding a lot of time,
and not necessarily in proportion to where the time goes with an
optimized build.
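One middle path (a sketch; the 60-second window and the exact
configure invocation are just illustrative) is to keep optimization
but retain debug symbols and frame pointers, then sample system-wide
while the benchmark runs:

    ./configure --enable-debug CFLAGS='-O2 -fno-omit-frame-pointer'
    make && make install

    # while pgbench is running:
    perf record -g -a sleep 60   # 60s of whole-system call graphs
    perf report                  # drill down by symbol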

--
Kevin Grittner


