Re: Let's make PostgreSQL multi-threaded - Mailing list pgsql-hackers
From | Merlin Moncure
Subject | Re: Let's make PostgreSQL multi-threaded
Date |
Msg-id | CAHyXU0z_miQ8QiE+bOKJSd=0OPSsrAboJov6WbVyJ_V_B1RJWg@mail.gmail.com
In response to | Re: Let's make PostgreSQL multi-threaded (David Geier <geidav.pg@gmail.com>)
Responses | Re: Let's make PostgreSQL multi-threaded
List | pgsql-hackers
On Thu, Jul 27, 2023 at 8:28 AM David Geier <geidav.pg@gmail.com> wrote:
> Hi,
>
> On 6/7/23 23:37, Andres Freund wrote:
>> I think we're starting to hit quite a few limits related to the process model,
>> particularly on bigger machines. The overhead of cross-process context
>> switches is inherently higher than switching between threads in the same
>> process - and my suspicion is that that overhead will continue to
>> increase. Once you have a significant number of connections we end up spending
>> a *lot* of time in TLB misses, and that's inherent to the process model,
>> because you can't share the TLB across processes.
>
> Another problem I haven't seen mentioned yet is the excessive kernel
> memory usage because every process has its own set of page table entries
> (PTEs). Without huge pages the amount of wasted memory can be huge if
> shared buffers are big.
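To put rough numbers on David's point, here's a quick back-of-envelope sketch. The figures are made up for illustration; it assumes 4 KB pages, 8-byte PTEs (x86-64), and that every backend has faulted in all of shared buffers:

# Back-of-envelope page table overhead without huge pages.
# Assumptions (illustrative only): 4 KB pages, 8-byte PTEs (x86-64),
# shared_buffers = 64 GB, 500 backends, every backend has touched all of
# shared buffers. Only leaf-level PTEs are counted.

SHARED_BUFFERS_GB = 64
BACKENDS = 500
PAGE_SIZE = 4 * 1024          # bytes per base page
PTE_SIZE = 8                  # bytes per page table entry

pte_bytes_per_backend = SHARED_BUFFERS_GB * 1024**3 // PAGE_SIZE * PTE_SIZE
total_bytes = pte_bytes_per_backend * BACKENDS

print(f"PTE memory per backend: {pte_bytes_per_backend / 1024**2:.0f} MiB")
print(f"Across {BACKENDS} backends: {total_bytes / 1024**3:.1f} GiB of kernel memory")

With those (made-up) numbers the page tables alone would eat roughly 128 MiB per backend, or ~62 GiB across 500 connections, none of which is attributed to any one process.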
Hm, I noted this upthread, but asking again: would this improve interactions with the operating system and make OOM-kill situations less likely? These are the bane of my existence, and I'm having a hard time finding a solution that prevents them other than running pgbouncer and lowering max_connections, which adds complexity. I suspect I'm not the only one dealing with this. What's really scary about these situations is that they come without warning. Here's a pretty typical example per sar -r:
time kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
14:20:02 461612 15803476 97.16 0 11120280 12346980 60.35 10017820 4806356 220
14:30:01 378244 15886844 97.67 0 11239012 12296276 60.10 10003540 4909180 240
14:40:01 308632 15956456 98.10 0 11329516 12295892 60.10 10015044 4981784 200
14:50:01 458956 15806132 97.18 0 11383484 12101652 59.15 9853612 5019916 112
15:00:01 10592736 5672352 34.87 0 4446852 8378324 40.95 1602532 3473020 264 <-- reboot!
15:10:01 9151160 7113928 43.74 0 5298184 8968316 43.83 2714936 3725092 124
15:20:01 8629464 7635624 46.94 0 6016936 8777028 42.90 2881044 4102888 148
15:30:01 8467884 7797204 47.94 0 6285856 8653908 42.30 2830572 4323292 436
15:40:02 8077480 8187608 50.34 0 6828240 8482972 41.46 2885416 4671620 320
15:50:01 7683504 8581584 52.76 0 7226132 8511932 41.60 2998752 4958880 308
16:00:01 7239068 9026020 55.49 0 7649948 8496764 41.53 3032140 5358388 232
16:10:01 7030208 9234880 56.78 0 7899512 8461588 41.36 3108692 5492296 216
The triggering query was heavy (maybe even a runaway), but server load was otherwise minimal:
time CPU %user %nice %system %iowait %steal %idle
14:30:01 all 9.55 0.00 0.63 0.02 0.00 89.81
14:40:01 all 9.95 0.00 0.69 0.02 0.00 89.33
14:50:01 all 10.22 0.00 0.83 0.02 0.00 88.93
15:00:01 all 10.62 0.00 1.63 0.76 0.00 86.99
15:10:01 all 8.55 0.00 0.72 0.12 0.00 90.61
The conjecture here is that lots of idle connections leave the server with less memory actually available than it appears to have, so a sudden transient demand can destabilize it.
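If it would help to test that, here's a minimal sketch of how one could measure it on a live box (Linux only; /proc/<pid>/status, /proc/<pid>/comm and /proc/meminfo are real interfaces, but matching backends by comm == "postgres" and the choice of fields are my assumptions). It sums the page-table memory (VmPTE) of all postgres processes and reads Committed_AS, which is roughly what the kbcommit/%commit columns above are reporting:

import os, re

def status_field_kb(pid, field):
    # Read a "<field>: <value> kB" line from /proc/<pid>/status; 0 if absent.
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith(field + ":"):
                    return int(line.split()[1])
    except OSError:
        return 0
    return 0

# Find processes whose comm is "postgres" (assumption: stock process naming).
pids = []
for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        with open(f"/proc/{pid}/comm") as f:
            if f.read().strip() == "postgres":
                pids.append(pid)
    except OSError:
        continue

vmpte_kb = sum(status_field_kb(pid, "VmPTE") for pid in pids)

with open("/proc/meminfo") as f:
    meminfo = dict(re.findall(r"^(\w+):\s+(\d+) kB", f.read(), re.M))

print(f"{len(pids)} postgres processes, VmPTE total: {vmpte_kb / 1024:.0f} MiB")
print(f"Committed_AS: {int(meminfo['Committed_AS']) / 1024:.0f} MiB")

If the VmPTE total turns out to be a meaningful fraction of what free(1) reports as used, that would be fairly direct evidence for the conjecture.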
Just throwing it out there: if this can be shown to help, it may support moving forward with something like it, either instead of, or along with, O_DIRECT or other internalized database memory management strategies. Fewer context switches, faster page access, etc. are of course nice, but they would not be a game changer for the workloads we see, which are pretty varied (OLTP, analytics), although we don't see extremely high transaction rates.
merlin