Thread: POC: GUC option for skipping shared buffers in core dumps
Hi! In Postgres Pro we have complaints about too large core dumps. The possible way to reduce code dump size is to skip some information. Frequently shared buffers is most long memory segment in core dump. For sure, contents of shared buffers is required for discovering many of bugs. But short core dump without shared buffers might be still useful. If system appears to be not capable to capture full core dump, short core dump appears to be valuable option. Attached POC patch implements core_dump_no_shared_buffers GUC, which does madvise(MADV_DONTDUMP) for shared buffers. Any thoughts? ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Hi, On 2020-02-10 22:07:13 +0300, Alexander Korotkov wrote: > In Postgres Pro we have complaints about too large core dumps. I've seen those too, and I've had them myself. It's pretty frustrating if a core dump makes the machine unusable for half an hour while said coredump is being written out... > The possible way to reduce code dump size is to skip some information. > Frequently shared buffers is most long memory segment in core dump. > For sure, contents of shared buffers is required for discovering many > of bugs. But short core dump without shared buffers might be still > useful. If system appears to be not capable to capture full core > dump, short core dump appears to be valuable option. It's possibly interesting, in the interim at least, that enabling huge pages on linux has the effect that pages aren't included in core dumps by default. > Attached POC patch implements core_dump_no_shared_buffers GUC, which > does madvise(MADV_DONTDUMP) for shared buffers. Any thoughts? Hm. Not really convinced by this. The rest of shared memory is still pretty large, and this can't be tuned at runtime. Have you considered postmaster (or even just the GUC processing in each process) adjusting /proc/self/coredump_filter instead? From the man page: The value in the file is a bit mask of memory mapping types (see mmap(2)). If a bit is set in the mask, then memorymappings of the corresponding type are dumped; otherwise they are not dumped. The bits in this file have the following meanings: bit 0 Dump anonymous private mappings. bit 1 Dump anonymous shared mappings. bit 2 Dump file-backed private mappings. bit 3 Dump file-backed shared mappings. bit 4 (since Linux 2.6.24) Dump ELF headers. bit 5 (since Linux 2.6.28) Dump private huge pages. bit 6 (since Linux 2.6.28) Dump shared huge pages. bit 7 (since Linux 4.4) Dump private DAX pages. bit 8 (since Linux 4.4) Dump shared DAX pages. You can also incorporate this into the start script for postgres today. > +static Size ShmemPageSize = FPM_PAGE_SIZE; I am somewhat confused by the use of FPM_PAGE_SIZE? What does this have to do with any of this? Is it just because it's set to 4kb by default? > /* > diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c > index 8228e1f3903..c578528b0bb 100644 > --- a/src/backend/utils/misc/guc.c > +++ b/src/backend/utils/misc/guc.c > @@ -2037,6 +2037,19 @@ static struct config_bool ConfigureNamesBool[] = > NULL, NULL, NULL > }, > > +#if HAVE_DECL_MADV_DONTDUMP > + { > + {"core_dump_no_shared_buffers", PGC_POSTMASTER, DEVELOPER_OPTIONS, > + gettext_noop("Exclude shared buffers from core dumps."), > + NULL, > + GUC_NOT_IN_SAMPLE > + }, > + &core_dump_no_shared_buffers, > + false, > + NULL, NULL, NULL > + }, > +#endif IMO it's better to have GUCs always present, but don't allow them to be enabled if prerequisites aren't fulfilled. Greetings, Andres Freund
On 2020-Feb-10, Andres Freund wrote: > Have you considered postmaster (or even just the GUC processing in each > process) adjusting /proc/self/coredump_filter instead? > > From the man page: > > The value in the file is a bit mask of memory mapping types (see mmap(2)). If a bit is set in the mask, thenmemory mappings of the corresponding > type are dumped; otherwise they are not dumped. The bits in this file have the following meanings: > > bit 0 Dump anonymous private mappings. > bit 1 Dump anonymous shared mappings. > bit 2 Dump file-backed private mappings. > bit 3 Dump file-backed shared mappings. > bit 4 (since Linux 2.6.24) > Dump ELF headers. > bit 5 (since Linux 2.6.28) > Dump private huge pages. > bit 6 (since Linux 2.6.28) > Dump shared huge pages. > bit 7 (since Linux 4.4) > Dump private DAX pages. > bit 8 (since Linux 4.4) > Dump shared DAX pages. > > You can also incorporate this into the start script for postgres today. Yeah. Maybe we should file bug reports against downstream packages to include a corefilter tweak. My development helper script uses this runpg_corefilter() { pid=$(head -1 $PGDATADIR/postmaster.pid) if [ ! -z "$pid" ]; then echo 0x01 > /proc/$pid/coredump_filter fi } I don't know how easy is it to teach systemd to do this on its service files. FWIW I've heard that some people like to have shmem in core files to improve debuggability, but it's *very* infrequent. But maybe we should have a way to disable the corefiltering. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2020-02-10 17:31:47 -0300, Alvaro Herrera wrote: > Yeah. Maybe we should file bug reports against downstream packages to > include a corefilter tweak. Hm, I'm not sure that's a reasonable way to scale things. Nor am I really sure that's the right granularity. > My development helper script uses this > > runpg_corefilter() { > pid=$(head -1 $PGDATADIR/postmaster.pid) > if [ ! -z "$pid" ]; then > echo 0x01 > /proc/$pid/coredump_filter > fi > } > > I don't know how easy is it to teach systemd to do this on its service > files. Well, you could just make it part of the command that starts the server. Not aware of anything else. > FWIW I've heard that some people like to have shmem in core files to > improve debuggability, but it's *very* infrequent. Oh, I pretty regularly want that. If you're debugging anthying that includes locks, page accesses, etc, it's pretty hard to succeed without? > But maybe we should have a way to disable the corefiltering. There should, imo. That's why I was wondering about making this a GUC (presumably suset). Greetings, Andres Freund
On 2020-Feb-10, Andres Freund wrote: > Hi, > > On 2020-02-10 17:31:47 -0300, Alvaro Herrera wrote: > > Yeah. Maybe we should file bug reports against downstream packages to > > include a corefilter tweak. > > Hm, I'm not sure that's a reasonable way to scale things. Nor am I > really sure that's the right granularity. Hah. This argument boils down to saying our packagers suck :-) > > I don't know how easy is it to teach systemd to do this on its service > > files. > > Well, you could just make it part of the command that starts the > server. Not aware of anything else. I tried to do that, but couldn't figure out a clean way, because you have to do it after the fact (not in the process itself). Maybe it's possible to have pg_ctl do it once postmaster is running. > > FWIW I've heard that some people like to have shmem in core files to > > improve debuggability, but it's *very* infrequent. > > Oh, I pretty regularly want that. If you're debugging anthying that > includes locks, page accesses, etc, it's pretty hard to succeed without? yyyyeah kinda, I guess -- I don't remember cases when I've wanted to do that in production systems. > > But maybe we should have a way to disable the corefiltering. > > There should, imo. That's why I was wondering about making this a GUC > (presumably suset). Not really sure about suset ... AFAIR that means superuser can SET it; but what you really care about is more like ALTER SYSTEM, which is SIGHUP unless I misremember. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2020-02-10 18:21:30 -0300, Alvaro Herrera wrote: > On 2020-Feb-10, Andres Freund wrote: > > > Hi, > > > > On 2020-02-10 17:31:47 -0300, Alvaro Herrera wrote: > > > Yeah. Maybe we should file bug reports against downstream packages to > > > include a corefilter tweak. > > > > Hm, I'm not sure that's a reasonable way to scale things. Nor am I > > really sure that's the right granularity. > > Hah. This argument boils down to saying our packagers suck :-) Hm? I'd say it's a sign of respect to not have each of them do the same work. Especially when they can't address it to the same degree core PG can. So maybe I'm saying we shouldn't be lazy ;) > > > I don't know how easy is it to teach systemd to do this on its service > > > files. > > > > Well, you could just make it part of the command that starts the > > server. Not aware of anything else. > > I tried to do that, but couldn't figure out a clean way, because you > have to do it after the fact (not in the process itself). Maybe it's > possible to have pg_ctl do it once postmaster is running. Shouldn't it be sufficient to just do it to /proc/self/coredump_filter? It's inherited IIUC? Yep: A child process created via fork(2) inherits its parent's coredump_filter value; the coredump_filter value is preservedacross an execve(2). > > > But maybe we should have a way to disable the corefiltering. > > > > There should, imo. That's why I was wondering about making this a GUC > > (presumably suset). > > Not really sure about suset ... AFAIR that means superuser can SET it; > but what you really care about is more like ALTER SYSTEM, which is > SIGHUP unless I misremember. I really want both. Sometimes it's annoying to get followup coredumps by other processes, even if I just want to get a corefile from one process doing something more specific. It seems nice to alter that session / user to have large coredumps, but not the rest? Greetings, Andres Freund
On Tue, 11 Feb 2020 at 03:07, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote: > > Hi! > > In Postgres Pro we have complaints about too large core dumps. The > possible way to reduce code dump size is to skip some information. > Frequently shared buffers is most long memory segment in core dump. > For sure, contents of shared buffers is required for discovering many > of bugs. But short core dump without shared buffers might be still > useful. If system appears to be not capable to capture full core > dump, short core dump appears to be valuable option. > > Attached POC patch implements core_dump_no_shared_buffers GUC, which > does madvise(MADV_DONTDUMP) for shared buffers. Any thoughts? I'd like this a lot. In fact I'd like it so much I kinda hope it'd be considered backpatchable because coredump_filter is much too crude and coarse grained. Now that Pg has parallel query we all rely on shm_mq, DSM/DSA, etc. It's increasingly problematic to have these areas left out of core dumps in order to avoid bloating them with shared_buffers contents. Doubly so if, like me, you work with extensions that make very heavy use of shared memory areas for their own IPC. Currently my options are "dump all shmem including shared_buffers" or "dump no shmem". But I usually want "dump all shmem except shared_buffers". It's tolerable to just dump s_b on a test system with a small s_b, but if enabling coredumps to track down some hard-to-repro crash on a production system I really don't want 20GB coredumps... Please, please apply. Please backpatch, if you can possibly stand to do so. -- Craig Ringer http://www.2ndQuadrant.com/ 2ndQuadrant - PostgreSQL Solutions for the Enterprise
RE: POC: GUC option for skipping shared buffers in core dumps
From
"tsunakawa.takay@fujitsu.com"
Date:
From: Craig Ringer <craig@2ndquadrant.com> > Currently my options are "dump all shmem including shared_buffers" or > "dump no shmem". But I usually want "dump all shmem except > shared_buffers". It's tolerable to just dump s_b on a test system with > a small s_b, but if enabling coredumps to track down some > hard-to-repro crash on a production system I really don't want 20GB > coredumps... We have a simple implementation that allows to exclude shared memory. That's been working for years. [postgresql.conf] core_directory = 'location of core dumps' core_contents = '{none | minimum | full}' # none = doesn't dump core, minimum excludes shared memory, and full dumps all I can provide it. But it simply changes the current directory and detaches shared memory when postgres receives signalsthat dump core. I made this GUC because Windows also had to be dealt with. From: Andres Freund <andres@anarazel.de> > > Hah. This argument boils down to saying our packagers suck :-) > > Hm? I'd say it's a sign of respect to not have each of them do the same > work. Especially when they can't address it to the same degree core PG > can. So maybe I'm saying we shouldn't be lazy ;) Maybe we should add options to pg_ctl just like -c which is available now, so that OS packagers can easily use in their startscripts. Or, can they just use pg_ctl's -o to specify new GUC parameters? Regards Takayuki Tsunakawa
On Wed, Feb 12, 2020, at 7:55 AM, tsunakawa.takay@fujitsu.com wrote: > From: Craig Ringer <craig@2ndquadrant.com> >> Currently my options are "dump all shmem including shared_buffers" or >> "dump no shmem". But I usually want "dump all shmem except >> shared_buffers". It's tolerable to just dump s_b on a test system with >> a small s_b, but if enabling coredumps to track down some >> hard-to-repro crash on a production system I really don't want 20GB >> coredumps... > > We have a simple implementation that allows to exclude shared memory. > That's been working for years. > > [postgresql.conf] > core_directory = 'location of core dumps' > core_contents = '{none | minimum | full}' > # none = doesn't dump core, minimum excludes shared memory, and full dumps all > > I can provide it. But it simply changes the current directory and > detaches shared memory when postgres receives signals that dump core. > > I made this GUC because Windows also had to be dealt with. If it's still possible, share your patch here. I don't know what about the core, but during development, especially the bug-fixingprocess, it is really dull to wait for the core generation process every time, even if you debug a planner issueand are not interested in shared memory blocks ... -- Regards, Andrei Lepikhov
Hi, The current approach could be better because we want to use it on Windows/MacOS and other systems. So, I've tried to develop another strategy - detaching shmem and DSM blocks before executing the abort() routine. As I can see, it helps and reduces the size of the core file. -- regards, Andrey Lepikhov Postgres Professional