Re: [HACKERS] kqueue - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: [HACKERS] kqueue |
Date | |
Msg-id | CAEepm=1YhBEH9FV_76k5GqzZcK4G+PF7_EqGc4eiMKswFtOYRg@mail.gmail.com |
In response to | Re: [HACKERS] kqueue (Matteo Beccati <php@beccati.com>) |
Responses | Re: [HACKERS] kqueue |
List | pgsql-hackers |
On Sun, Sep 30, 2018 at 9:49 PM Matteo Beccati <php@beccati.com> wrote:
> On 30/09/2018 04:36, Thomas Munro wrote:
> > On Sat, Sep 29, 2018 at 7:51 PM Matteo Beccati <php@beccati.com> wrote:
> >> Out of curiosity, I've installed FreeBSD on an identically specced VM,
> >> and the select benchmark was ~75k tps for kqueue vs ~90k tps on
> >> unpatched master, so maybe there's something wrong I'm doing when
> >> benchmarking. Could you please provide proper instructions?
> >
> > Ouch. What kind of virtualisation is this? Which version of FreeBSD?
> > Not sure if it's relevant, but do you happen to see gettimeofday()
> > showing up as a syscall, if you truss a backend running pgbench?
>
> I downloaded 11.2 as VHD file in order to run on MS Hyper-V / Win10 Pro.
>
> Yes, I saw plenty of gettimeofday calls when running truss:
>
> > gettimeofday({ 1538297117.071344 },0x0) = 0 (0x0)
> > gettimeofday({ 1538297117.071743 },0x0) = 0 (0x0)
> > gettimeofday({ 1538297117.072021 },0x0) = 0 (0x0)

Ok. Those syscalls show up depending on your
kern.timecounter.hardware setting and virtualised hardware: just like
on Linux, gettimeofday() can be a cheap userspace operation (vDSO)
that avoids the syscall path, or not. I'm not seeing any reason to
think that's relevant here.

> > getpid() = 766 (0x2fe)
> > __sysctl(0x7fffffffce90,0x4,0x0,0x0,0x801891000,0x2b) = 0 (0x0)
> > gettimeofday({ 1538297117.072944 },0x0) = 0 (0x0)
> > getpid() = 766 (0x2fe)
> > __sysctl(0x7fffffffce90,0x4,0x0,0x0,0x801891000,0x29) = 0 (0x0)

That's setproctitle(). Those syscalls go away if you use FreeBSD 12
(which has setproctitle_fast()). If you fix both of those problems,
you are left with just:

> > sendto(9,"2\0\0\0\^DT\0\0\0!\0\^Aabalance"...,71,0,NULL,0) = 71 (0x47)
> > recvfrom(9,"B\0\0\0\^\\0P0_1\0\0\0\0\^A\0\0"...,8192,0,NULL,0x0) = 51 (0x33)

These are the only syscalls I see for each pgbench -S transaction on
my bare metal machine: just the network round trip. The funny thing
is ...
there are almost no kevent() calls.

I managed to reproduce the regression (~70k -> ~50k) using a prewarmed
scale 10 select-only pgbench with 2GB of shared_buffers (so it all
fits), with -j 96 -c 96 on an 8 vCPU AWS t2.2xlarge running FreeBSD 12
ALPHA8. Here is what truss -c says, capturing data from one backend
for about 10 seconds:

syscall                     seconds   calls  errors
sendto                  0.396840146    3452       0
recvfrom                0.415802029    3443       6
kevent                  0.000626393       6       0
gettimeofday            2.723923249   24053       0
                      ------------- ------- -------
                        3.537191817   30954       6

(There's no regression with -j 8 -c 8, the problem is when
significantly overloaded, the same circumstances under which Matheusz
reported a great improvement). So... it's very rarely accessing the
kqueue directly... but its existence somehow slows things down.
Curiously, when using poll() it's actually calling poll() ~90/sec for
me:

syscall                     seconds   calls  errors
sendto                  0.352784808    3226       0
recvfrom                0.614855254    4125     916
poll                    0.319396480     916       0
gettimeofday            2.659035352   22456       0
                      ------------- ------- -------
                        3.946071894   30723     916

I don't know what's going on here. Based on the reports so far, we
know that kqueue gives a speedup when using bare metal with pgbench
running on a different machine, but a slowdown when using
virtualisation and pgbench running on the same machine (and I just
checked that that's observable with both Unix sockets and TCP
sockets). That gave me the idea of looking at pgbench itself:

Unpatched:

syscall                     seconds   calls  errors
ppoll                   0.004869268       1       0
sendto                 16.489416911    7033       0
recvfrom               21.137606238    7049       0
                      ------------- ------- -------
                       37.631892417   14083       0

Patched:

syscall                     seconds   calls  errors
ppoll                   0.002773195       1       0
sendto                 16.597880468    7217       0
recvfrom               25.646406008    7238       0
                      ------------- ------- -------
                       42.247059671   14456       0

I don't know why the existence of the kqueue should make recvfrom()
slower on the pgbench side. That's probably something to look into
off-line with some FreeBSD guru help.
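For a rough sense of scale, the two pgbench-side summaries can be reduced to an average time per call. The following is only a sketch: the figures are copied from the truss -c output in this message, and the names mean_ms_per_call and relative_change are made up for illustration.

```python
# Figures copied from the pgbench-side truss -c summaries in this message:
# syscall -> (total seconds, number of calls).
PGBENCH_UNPATCHED = {
    "ppoll": (0.004869268, 1),
    "sendto": (16.489416911, 7033),
    "recvfrom": (21.137606238, 7049),
}
PGBENCH_PATCHED = {
    "ppoll": (0.002773195, 1),
    "sendto": (16.597880468, 7217),
    "recvfrom": (25.646406008, 7238),
}


def mean_ms_per_call(summary):
    """Return {syscall: average milliseconds per call} (hypothetical helper)."""
    return {
        name: 1000.0 * seconds / calls
        for name, (seconds, calls) in summary.items()
    }


def relative_change(before, after):
    """Fractional change in per-call time for syscalls present in both runs."""
    b = mean_ms_per_call(before)
    a = mean_ms_per_call(after)
    return {name: (a[name] - b[name]) / b[name] for name in b if name in a}


if __name__ == "__main__":
    unpatched = mean_ms_per_call(PGBENCH_UNPATCHED)
    patched = mean_ms_per_call(PGBENCH_PATCHED)
    change = relative_change(PGBENCH_UNPATCHED, PGBENCH_PATCHED)
    for name in ("sendto", "recvfrom"):
        print(
            f"{name:10s} {unpatched[name]:8.3f} ms -> {patched[name]:8.3f} ms"
            f"  ({change[name]:+.1%})"
        )
```

On these numbers sendto() stays within a few percent per call, while recvfrom() goes from roughly 3.0 ms to roughly 3.5 ms per call. This only restates the measurements; it does not explain them.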
Degraded performance for clients on the same machine does seem to be a
show stopper for this patch for now. Thanks for testing!

--
Thomas Munro
http://www.enterprisedb.com