Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
Msg-id CACjxUsO16g4kZzikRxMXmYhMMQUa93ZEBTEXHiKDvR2mjonfDw@mail.gmail.com
In response to Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions  ("Mengxing Liu" <liu-mx15@mails.tsinghua.edu.cn>)
List pgsql-hackers
On Wed, Mar 15, 2017 at 11:35 AM, Mengxing Liu
<liu-mx15@mails.tsinghua.edu.cn> wrote:

>> On a NUMA machine it is not at all unusual to see bifurcated results
>> -- with each run coming in very close to one number or a second
>> number, often at about a 50/50 rate, with no numbers falling
>> anywhere else.  This seems to be based on where the processes and
>> memory allocations happen to land.
>>
>
> Do you mean that on a NUMA machine there are usually two
> different performance results?
> Just two? Neither three nor four?

In my personal experience, I have often seen two timings, with each
run randomly matching one or the other; I have neither seen nor
heard of more, but that doesn't mean it can't happen.  ;-)
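
As a rough way to check whether a set of benchmark runs shows this
kind of two-valued pattern, one could split the sorted timings at
their largest gap and look at the two groups.  This is only a sketch:
the function names and the sample numbers are made up for
illustration, and a largest-gap split is a simple heuristic, not a
statistical test.

```python
from statistics import mean


def split_bimodal(timings):
    """Split run timings into two groups at the largest gap between sorted values."""
    ordered = sorted(timings)
    if len(ordered) < 2:
        raise ValueError("need at least two timings")
    # Gap between each pair of neighbouring values in sorted order.
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


def summarize(timings):
    """Report the mean of each group and the share of runs in the faster group."""
    low, high = split_bimodal(timings)
    return {
        "low_mean": mean(low),
        "high_mean": mean(high),
        "low_share": len(low) / len(timings),
    }


if __name__ == "__main__":
    # Hypothetical run times in seconds, invented for this example.
    runs = [10.1, 14.9, 10.0, 15.1, 9.9, 15.0]
    print(summarize(runs))
```

If the two group means are far apart relative to the spread inside
each group, and the share is somewhere near one half, the runs are
probably landing in two different placements rather than just being
noisy.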

> At first, I will compile and install PostgreSQL by myself and try
> the profile tools (perf or oprofile).

perf is newer, and generally better if you can use it.  Don't try to
use either on HP hardware -- the BIOS uses some of the same hardware
registers that other manufacturers leave free for profilers; an HP
machine is likely to freeze or reboot if you try to run either of
those profilers under load.

> Then I will run one or two benchmarks using different config,
> where I may need your help to ensure that my tests are close to the
> practical situation.

Yeah, we should talk about OS and PostgreSQL configuration before
you run any benchmarks.  Neither tends to come configured as I would
run a production system.

> PS: Disabling NUMA in the BIOS means that each CPU can use its own
> memory controller when accessing local memory, to reduce hops.

NUMA means that each CPU chip directly controls some of the RAM
(possibly with other, non-CPU controllers for some RAM).  The
question is whether the BIOS or the OS controls the memory
allocation.  The OS looks at what processes are on what cores and
tries to use "nearby" memory for allocations.  This can be pessimal
if the amount of RAM that is under contention is less than the size
of one memory segment, since all CPU chips need to ask the one
managing that RAM for each access.  In such a case, you actually get
best performance using a cpuset which just uses one CPU package and
the memory segments directly managed by that CPU package.  Without
the cpuset you may actually see better performance for this workload
by letting the BIOS interleave allocations, which spreads the RAM
allocations around to memory managed by all CPUs, and no one CPU
becomes the bottleneck.  The access is still not uniform, but you
dodge a bottleneck -- albeit less efficiently than using a custom
cpuset.
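
For a quick experiment along these lines, a process can be restricted
to the cores of one NUMA node before the benchmark is started from
it.  The sketch below assumes Linux, where the cores of node N are
listed in /sys/devices/system/node/nodeN/cpulist; the function names
are made up for illustration.  Note that it only sets CPU affinity.
It approximates a cpuset only in that the kernel's default policy
allocates memory near the running core; a real cpuset also restricts
which memory nodes may be used.

```python
import os


def parse_cpulist(text):
    """Parse a Linux cpulist string such as "0-3,8-11" into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def pin_to_node(node=0):
    """Restrict the calling process to the CPUs of one NUMA node (Linux only)."""
    path = f"/sys/devices/system/node/node{node}/cpulist"
    with open(path) as f:
        cpus = parse_cpulist(f.read())
    # pid 0 means "the calling process"; child processes inherit the mask.
    os.sched_setaffinity(0, cpus)
    return cpus
```

Calling pin_to_node(0) in a launcher script before it starts the
server would keep all of the child processes on that one CPU package.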

> On the contrary, "enable" means UMA.

No.  Think about this: regardless of this BIOS setting each RAM
package is directly connected to one CPU package, which functions as
its memory controller.  Most of the RAM used by PostgreSQL is for
disk buffers -- shared buffers in shared memory and OS cache.
Picture processes running on different CPU packages accessing a
single particular shared buffer.  Also picture processes running on
different CPU packages using the same spinlock at the same time.  No
BIOS setting can avoid the communications and data transfer among
the 8 CPU packages, and the locking of the cache lines.

> Therefore, I think Tony is right, we should disable this setting.
>
> I got the information from here.
> http://frankdenneman.nl/2010/12/28/node-interleaving-enable-or-disable/

Ah, that explains it.  There is no such thing as "disabling NUMA" --
you can have the BIOS force interleaving, or you can have the BIOS
leave the NUMA memory assignment to the OS.  I was assuming that by
"disabling NUMA" you meant to have the BIOS control it through
interleaving.  You meant the opposite -- disabling the BIOS override
of OS NUMA control.  I agree that we should leave NUMA scheduling to
the OS.  There are, however, some non-standard OS configuration
options that allow NUMA to behave better with PostgreSQL than the
defaults allow.  We will need to tune a little.

The author of that article seems to be assuming that the usage will
be with applications like word processing, spreadsheets, or browsers
-- where the OS can place all the related processes on a single CPU
package and all (or nearly all) memory allocations can be made from
associated memory -- yielding a fairly uniform and fast access when
the BIOS override is disabled.  For a database product which wants to
use all the cores and almost all of the memory, with heavy
contention on shared memory, the situation is very different.
Shared resource access time is going to be non-uniform no matter
what you do.  The difference is that the OS can still make *process
local* allocations from nearby memory segments, while the BIOS cannot.

--
Kevin Grittner


