Thread: Size estimation of postgres core files
I am trying to determine the upper size limit of a core file generated for any given cluster. Is it feasible that it could actually be the entire size of the system memory + shared buffers (i.e. really huge)?
I've done a little testing of this myself, but I want to be sure I understand it so I can plan enough free space for postgres core files in case of a crash.
Thanks!
Jeremy
On 2019-Feb-15, Jeremy Finzel wrote:
> I am trying to determine the upper size limit of a core file generated for
> any given cluster. Is it feasible that it could actually be the entire
> size of the system memory + shared buffers (i.e. really huge)?

In Linux, yes. Not sure about other OSes.

You can turn off the dumping of shared memory with some unusably
unfriendly bitwise arithmetic using the "coredump_filter" file in /proc
for the process. (It's inherited by children, so you can just set it
once for postmaster at server start time).

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
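As a rough sketch of the "set it once for postmaster" approach (the bit values should be verified against core(5) on your kernel: the default filter is 0x33, and bit 1 covers anonymous shared mappings, which is where the main shared memory segment lives on PostgreSQL 9.3 and later):

    # Run in the shell that starts the postmaster; coredump_filter is
    # inherited across fork/exec, so the postmaster and every backend get it.
    echo 0x31 > /proc/self/coredump_filter   # default 0x33 with bit 1
                                             # (anonymous shared mappings) cleared
    exec pg_ctl start -D "$PGDATA"

If huge pages are in use, bit 6 (shared huge pages) may also need to be cleared; again, check core(5) before relying on any particular value.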
Yes Linux. This is very helpful, thanks. A follow-up question - will it take postgres a really long time to crash (and hopefully recover) if I have say 1T of RAM because it has to write that all out to a core file first?
Thanks,
Jeremy
>>>>> "Jeremy" == Jeremy Finzel <finzelj@gmail.com> writes:

 Jeremy> Yes Linux. This is very helpful, thanks. A follow-up question -
 Jeremy> will it take postgres a really long time to crash (and
 Jeremy> hopefully recover) if I have say 1T of RAM because it has to
 Jeremy> write that all out to a core file first?

It doesn't write out all of RAM, only the amount in use by the
particular backend that crashed (plus all the shared segments attached
by that backend, including the main shared_buffers, unless you disable
that as previously mentioned).

And yes, it can take a long time to generate a large core file.

--
Andrew (irc:RhodiumToad)
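A rough way to see what "the amount in use by the particular backend" comes to in practice (assuming Linux; the PID below is a placeholder for a backend PID, e.g. one taken from pg_stat_activity) is to look at the process's address-space size, which is an upper bound on the core size:

    PID=12345                          # placeholder: a backend's PID
    grep -E 'VmSize|VmRSS' /proc/$PID/status
    # VmSize is the total mapped address space (including attached shared
    # memory); the core cannot be larger than the mapped regions that
    # coredump_filter allows to be dumped.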
Based on Alvaro's response, I thought it is reasonably possible that it *could* include nearly all of RAM, because that was my original question. If shared_buffers is, say, 50GB and the system has 1TB of RAM, shared_buffers is a small portion of that. But really my question is what we should reasonably assume is possible - meaning how much space I should provision for a volume so that it can contain a core dump in case of a crash? The time to write the core file would definitely be a concern if it could indeed be that large.
Could someone provide more information on exactly how to set that coredump_filter?
We are looking to enable core dumps to help diagnose unexpected crashes, and are wondering whether there are any general recommendations for balancing the costs and benefits of enabling them.
Thank you!
Jeremy
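A minimal sketch of what enabling core dumps usually involves on Linux (the path and pattern below are examples only; changing core_pattern requires root, and the %e/%p specifiers are documented in core(5)):

    # Allow unlimited core size for the process tree that starts the postmaster.
    ulimit -c unlimited
    # Send cores to a dedicated volume sized for them (example path).
    sysctl -w kernel.core_pattern='/var/crash/core.%e.%p'
    exec pg_ctl start -D "$PGDATA"

If the postmaster is started by systemd, LimitCORE=infinity in the service unit plays the role of the ulimit line.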
On 2019-02-15 13:01:50 -0600, Jeremy Finzel wrote:
> > It doesn't write out all of RAM, only the amount in use by the
> > particular backend that crashed (plus all the shared segments attached
> > by that backend, including the main shared_buffers, unless you disable
> > that as previously mentioned).
>
> Based on Alvaro's response, I thought it is reasonably possible that it
> *could* include nearly all of RAM, because that was my original question.
> If shared_buffers is, say, 50GB and the system has 1TB of RAM,
> shared_buffers is a small portion of that. But really my question is what
> we should reasonably assume is possible

The size of the core dump will be roughly the same as the VM used by the
process - so that will be the initial size of the process plus shared
buffers plus a (usually small) multiple of work_mem or
maintenance_work_mem plus whatever memory the process allocates.

The big unknown is that "(usually small) multiple of work_mem". I've
seen a process use 8 times work_mem for a moderately complex query, so
depending on what you do it might be worse.

The extra memory allocated by processes is usually small (after all, if
some data structure were expected to be potentially large it would
probably be limited by work_mem), but if there is a bug (and that's what
you are looking for) it might really grow without bound.

If you know some upper bound for a reasonable size of your processes you
could set the address-space limit - not only will this limit the core
dump size, but it will also prevent a single process from consuming all
RAM and triggering the OOM killer. You probably don't want to limit the
core dump size directly (another process limit you can set), as that
will result in a truncated (and possibly useless) core dump. For similar
reasons I'm not convinced that omitting the shared memory is a good
idea.

        hp

--
Peter J. Holzer | hjp@hjp.at | http://www.hjp.at/
"we build much bigger, better disasters now because we have much more
sophisticated management tools." -- Ross Anderson <https://www.edge.org/>
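To make the rule of thumb concrete, and to show the address-space limit rather than a direct core-size limit (all figures below are invented for illustration; the 8x multiple comes from the observation above):

    # Back-of-envelope upper bound for one backend's core:
    #   shared_buffers (50GB) + 8 * work_mem (8 * 1GB) + ~1GB process ~= 59GB
    # Cap the per-process address space (and therefore the core) at, say, 64GB,
    # then start the postmaster from this shell so the limit is inherited.
    ulimit -v $((64 * 1024 * 1024))    # bash ulimit -v takes kilobytes

Under systemd the equivalent knob is LimitAS= in the service unit. A core truncated by a direct size limit, by contrast, is often unusable, which is the point made above.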