Re: terminate called after throwing an instance of 'std::bad_alloc' - Mailing list pgsql-hackers
From:           Andres Freund
Subject:        Re: terminate called after throwing an instance of 'std::bad_alloc'
Date:
Msg-id:         20201003230149.mtd7fjsjwgii3jv7@alap3.anarazel.de
In response to: Re: terminate called after throwing an instance of 'std::bad_alloc' (Justin Pryzby <pryzby@telsasoft.com>)
Responses:      Re: terminate called after throwing an instance of 'std::bad_alloc'
List:           pgsql-hackers
Hi,

On 2020-10-03 17:02:36 -0500, Justin Pryzby wrote:
> We didn't use JIT with v12, but I've been setting jit=on along with v13
> upgrades, so I was trying to convince myself that there's a change in v13,
> and not just our config.
>
> I've seen repeated OOMs on a small number of reports. It looks like JIT
> optimization/inlining used a good deal of RAM in v12, and even more in v13.
> Maybe that's what's expected; it looks like this scales with the number of
> partitions and number of functions, so it could be arbitrarily large, and
> not respectful of work_mem (below, at the 4MB default).
>
> In v13:
> With optimize_above=-1, inline_above=-1: 333472 kB max resident size
> With optimize_above=1, inline_above=1:   654036 kB max resident size
>
> In v12:
> With optimize_above=-1, inline_above=-1: 330988 kB max resident size
> With optimize_above=1, inline_above=1:   556844 kB max resident size

That's probably to some degree just the shared library being loaded /
faulted in (see below for more detail). I don't think there's any
difference between 12 and 13 in what inlining / optimization does, so I
don't quite understand the difference.

> The OOM itself could be due to our JIT settings and maybe not a bug. My
> thinking is that when we hit OOM, that sometimes manifests itself as the
> C++ exception, which somehow trickles up into a SIGABRT in postgres.

C++ code throws an exception when memory allocation fails, unless a
different error handler is set. For JITing we set a different error handler
in LLVM (one which doesn't throw, but FATALs), but that doesn't appear to
be done for all the PostGIS code - nor does it catch the exception, as it
really should. An unhandled exception is treated as an abort(), which in
turn raises SIGABRT (a small standalone illustration of this failure mode
is sketched at the end of this mail).

> For example, look at this huge backend, which is connected to the process
> which has been experiencing OOM. The backend is 4.2GB while idle, so I'd
> have hoped for it to have released everything it could. I'm aware that
> free() usually cannot return small allocations to the kernel, but that
> should not lead to continued growth over time.

The below turned out to be somewhat obsoleted by what you wrote further
down, which I unfortunately hadn't yet read - but I think it's still good
information, so I'm not going to delete it:

RES includes all the shared buffers that a backend has touched, unless you
use huge pages. At least on Linux, RES is unfortunately pretty much useless
for postgres. You can gain a bit more information by looking at
/proc/$pid/status and /proc/$pid/smaps_rollup.

E.g. here is a backend that hasn't run any JIT (all of these outputs are
pruned to the relevant lines):

andres@awork3:~$ grep ^Rss /proc/2810440/status
RssAnon:      2324 kB
RssFile:      6932 kB
RssShmem:     4868 kB
andres@awork3:~$ grep ^Pss /proc/2810440/smaps_rollup
Pss:          3912 kB
Pss_Anon:     1177 kB
Pss_File:     1004 kB
Pss_Shmem:    1730 kB

and after loading the llvmjit library (LOAD 'llvmjit'), but not actually
JITing anything:

andres@awork3:~$ grep ^Rss /proc/2810440/status
RssAnon:      7528 kB
RssFile:    104844 kB
RssShmem:     5204 kB
andres@awork3:~$ grep ^Pss /proc/2810440/smaps_rollup
Pss:         57773 kB
Pss_Anon:     6436 kB
Pss_File:    49606 kB
Pss_Shmem:    1730 kB

Note how RSS has increased by ~100MB - but all of that is file-backed. It's
just the shared library being loaded (and not even all of it is required to
be in memory). And if you look at the 'Pss' values (the p stands for
proportional), you can see that they are lower, because the mapping is
shared between all the processes that have the JIT library loaded - in my
case here that was just two sessions, hence halved.
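Since plain RSS lumps all of this together, here is a minimal standalone
sketch of pulling out exactly the fields grepped above for a given pid. It
assumes Linux with a kernel new enough to expose /proc/<pid>/smaps_rollup
(4.14+); the file name and build line are just for illustration, this is
not part of postgres.

// rss_breakdown.cc - print the anonymous / file-backed / shared-memory
// resident split (Rss*) and the proportional view (Pss*) for a pid.
// Build: g++ -std=c++17 -o rss_breakdown rss_breakdown.cc
#include <fstream>
#include <iostream>
#include <string>

// Print every line of `path` that starts with `prefix`.
static void print_matching(const std::string &path, const std::string &prefix)
{
    std::ifstream in(path);
    if (!in)
    {
        std::cerr << "could not open " << path << '\n';
        return;
    }
    std::string line;
    while (std::getline(in, line))
        if (line.compare(0, prefix.size(), prefix) == 0)
            std::cout << line << '\n';
}

int main(int argc, char **argv)
{
    // Default to our own pid; pass a backend pid to inspect that instead.
    const std::string pid = argc > 1 ? argv[1] : "self";

    // RssAnon / RssFile / RssShmem: what kind of memory is resident.
    print_matching("/proc/" + pid + "/status", "Rss");
    // Pss*: the same, but with shared pages divided by the number of sharers.
    print_matching("/proc/" + pid + "/smaps_rollup", "Pss");
    return 0;
}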
The actually allocated memory differs between the two, but it's ~2.3MB vs.
~7.5MB. (You can also see that a postgres backend doesn't actually weigh
2MB, because a lot of the memory allocations are shared between postmaster
and backends.)

And if you additionally utilize buffers in shared buffers, RSS will include
those as well. E.g.:

postgres[2810440][1]=# SHOW shared_buffers;
┌────────────────┐
│ shared_buffers │
├────────────────┤
│ 2GB            │
└────────────────┘
(1 row)

postgres[2810325][1]=# \dt+ large
                                     List of relations
┌────────┬───────┬───────┬────────┬─────────────┬───────────────┬────────┬─────────────┐
│ Schema │ Name  │ Type  │ Owner  │ Persistence │ Access Method │ Size   │ Description │
├────────┼───────┼───────┼────────┼─────────────┼───────────────┼────────┼─────────────┤
│ public │ large │ table │ andres │ permanent   │ heap          │ 501 MB │             │
└────────┴───────┴───────┴────────┴─────────────┴───────────────┴────────┴─────────────┘
(1 row)

andres@awork3:~$ grep ^Rss /proc/2810440/status
RssAnon:      8284 kB
RssFile:    107236 kB
RssShmem:    19184 kB
andres@awork3:~$ grep ^Pss /proc/2810440/smaps_rollup
Pss:         63282 kB
Pss_Anon:     7413 kB
Pss_File:    50034 kB
Pss_Shmem:    5834 kB

postgres[2810440][1]=# SELECT count(generate_series), count(random) FROM large;
...

andres@awork3:~$ grep ^Rss /proc/2810440/status
RssAnon:      8592 kB
RssFile:    107464 kB
RssShmem:   390328 kB
andres@awork3:~$ grep ^Pss /proc/2810440/smaps_rollup
Pss:        248889 kB
Pss_Anon:     7560 kB
Pss_File:    50069 kB
Pss_Shmem:  191259 kB

So now, despite only ~8.5MB of allocations, ps's RSS shows a considerably
larger memory usage:

andres@awork3:~$ ps -u --pid 2810440
USER         PID %CPU %MEM     VSZ    RSS TTY   STAT START   TIME COMMAND
andres   2810440  0.2  0.2 2420584 506384 ?     Ss   15:19   0:02 postgres: dev assert: andres postgres [local] idle

> The postgres memory contexts don't show anything nearly that large.
> MessageContext: 25236472 total in 20 blocks; 2990656 free (4 chunks); 22245816 used
> CacheMemoryContext: 25892160 total in 42 blocks; 22770984 free (28285 chunks); 3121176 used
> Grand total: 70879896 bytes in 1466 blocks; 29518440 free (29097 chunks); 41361456 used

I wonder if it'd be worth trying to redirect all LLVM allocations into a
memory context. Not to be able to free them, but to be able to see them
more clearly. But it looks like it'd be too fragile to do so.

> I was able to make a small, apparent leak like so. This applies to
> postgis3.0/postgres12/LLVM5/centos7, and
> postgis3.1alpha/postgres13/LLVM9/centos8.
>
> Inlining is essential to see the leak, optimization is not. Showing every
> 9th query:
>
> $ for a in `seq 1 99`; do echo "SET jit_inline_above_cost=1; SET jit_optimize_above_cost=-1; SET jit_above_cost=0; SET jit_expressions=on; SET log_executor_stats=on; SET client_min_messages=debug; SET jit=on; explain analyze SELECT st_x(site_geo) FROM sites WHERE site_geo IS NOT NULL;"; done |psql ts 2>&1 |grep 'resident' |sed -n '1~9p'
> !       46024 kB max resident size
> !       46492 kB max resident size

Huh, that's certainly not good. Looking.

Greetings,

Andres Freund
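As a footnote to the exception discussion above, here is a minimal,
standalone C++ sketch of that failure mode. The oversized request and the
helper function are purely illustrative - nothing here is PostGIS, LLVM, or
postgres code. Caught at the language boundary, the allocation failure is
an ordinary, reportable error; uncaught, it ends in the "terminate called
after throwing an instance of 'std::bad_alloc'" message and a SIGABRT.

// bad_alloc_demo.cc - illustrative only.
// Build: g++ -std=c++17 -o bad_alloc_demo bad_alloc_demo.cc
#include <cstddef>
#include <iostream>
#include <new>

static void allocate_too_much()
{
    // On essentially any system this request cannot be satisfied, so
    // operator new throws std::bad_alloc.
    void *p = ::operator new(std::size_t{1} << 62);
    ::operator delete(p);   // not reached
}

int main()
{
    // A C-callable wrapper around C++ code should catch the exception at
    // the language boundary and turn it into an ordinary error...
    try
    {
        allocate_too_much();
    }
    catch (const std::bad_alloc &)
    {
        std::cerr << "allocation failed, handled gracefully\n";
    }

    // ...whereas letting it escape reaches std::terminate(), which prints
    // "terminate called after throwing an instance of 'std::bad_alloc'"
    // and then calls abort(), i.e. the process dies with SIGABRT.
    allocate_too_much();

    return 0;   // never reached
}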