Re: POC: GUC option for skipping shared buffers in core dumps - Mailing list pgsql-hackers

From Andres Freund
Subject Re: POC: GUC option for skipping shared buffers in core dumps
Msg-id 20200210195659.vx6slnxmoymp5yyo@alap3.anarazel.de
In response to POC: GUC option for skipping shared buffers in core dumps  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: POC: GUC option for skipping shared buffers in core dumps
List pgsql-hackers
Hi,

On 2020-02-10 22:07:13 +0300, Alexander Korotkov wrote:
> In Postgres Pro we have complaints about too large core dumps.

I've seen those too, and I've had them myself. It's pretty frustrating
if a core dump makes the machine unusable for half an hour while said
coredump is being written out...


> The possible way to reduce core dump size is to skip some information.
> Frequently shared buffers is most long memory segment in core dump.
> For sure, contents of shared buffers is required for discovering many
> of bugs.  But short core dump without shared buffers might be still
> useful.  If system appears to be not capable to capture full core
> dump, short core dump appears to be valuable option.

It's possibly interesting, in the interim at least, that enabling huge
pages on Linux has the effect that those pages aren't included in core
dumps by default (the default filter doesn't dump shared huge pages).


> Attached POC patch implements core_dump_no_shared_buffers GUC, which
> does madvise(MADV_DONTDUMP) for shared buffers.  Any thoughts?

Hm. Not really convinced by this. The rest of shared memory is still
pretty large, and this can't be tuned at runtime.

Have you considered postmaster (or even just the GUC processing in each
process) adjusting /proc/self/coredump_filter instead?

From the man page:

       The value in the file is a bit mask of memory mapping types (see
       mmap(2)).  If a bit is set in the mask, then memory mappings of the
       corresponding type are dumped; otherwise they are not dumped.  The
       bits in this file have the following meanings:

           bit 0  Dump anonymous private mappings.
           bit 1  Dump anonymous shared mappings.
           bit 2  Dump file-backed private mappings.
           bit 3  Dump file-backed shared mappings.
           bit 4 (since Linux 2.6.24)
                  Dump ELF headers.
           bit 5 (since Linux 2.6.28)
                  Dump private huge pages.
           bit 6 (since Linux 2.6.28)
                  Dump shared huge pages.
           bit 7 (since Linux 4.4)
                  Dump private DAX pages.
           bit 8 (since Linux 4.4)
                  Dump shared DAX pages.

You can also incorporate this into the start script for postgres today.
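
For the in-process variant, here is a rough sketch of what adjusting the
filter could look like. It is not taken from any patch; the function names
and the choice of which bits to clear (bit 1 for anonymous shared mappings,
bit 6 for shared huge pages) are my assumptions for illustration. It reads
the current mask, clears those two bits, and writes the result back.

```c
#include <stdio.h>

/* Bit positions as documented in core(5) for coredump_filter. */
#define COREDUMP_ANON_SHARED	(1UL << 1)	/* anonymous shared mappings */
#define COREDUMP_HUGE_SHARED	(1UL << 6)	/* shared huge pages */

/*
 * Hypothetical helper: return the mask with the two "shared" bits cleared,
 * leaving every other bit as it was.
 */
unsigned long
filter_without_shared(unsigned long mask)
{
	return mask & ~(COREDUMP_ANON_SHARED | COREDUMP_HUGE_SHARED);
}

/*
 * Hypothetical helper: rewrite the filter file at 'path' (normally
 * "/proc/self/coredump_filter") so that shared mappings are skipped.
 * Returns 0 on success, -1 if the file cannot be read or written.
 */
int
exclude_shared_from_core(const char *path)
{
	unsigned long mask;
	FILE	   *f;

	/* The kernel reports the current mask as a hexadecimal number. */
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fscanf(f, "%lx", &mask) != 1)
	{
		fclose(f);
		return -1;
	}
	fclose(f);

	f = fopen(path, "w");
	if (f == NULL)
		return -1;
	if (fprintf(f, "0x%lx\n", filter_without_shared(mask)) < 0)
	{
		fclose(f);
		return -1;
	}
	/* fclose() flushes, so a failed write shows up here at the latest. */
	return (fclose(f) == 0) ? 0 : -1;
}
```

Calling exclude_shared_from_core("/proc/self/coredump_filter") early in
process startup would be the obvious use. Since the setting is inherited
across fork(), doing it once in the parent should be enough for children
started afterwards.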


> +static Size ShmemPageSize = FPM_PAGE_SIZE;

I am somewhat confused by the use of FPM_PAGE_SIZE. What does this have
to do with any of this? Is it just because it's set to 4kB by default?


>  /*
> diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
> index 8228e1f3903..c578528b0bb 100644
> --- a/src/backend/utils/misc/guc.c
> +++ b/src/backend/utils/misc/guc.c
> @@ -2037,6 +2037,19 @@ static struct config_bool ConfigureNamesBool[] =
>          NULL, NULL, NULL
>      },
>  
> +#if HAVE_DECL_MADV_DONTDUMP
> +    {
> +        {"core_dump_no_shared_buffers", PGC_POSTMASTER, DEVELOPER_OPTIONS,
> +            gettext_noop("Exclude shared buffers from core dumps."),
> +            NULL,
> +            GUC_NOT_IN_SAMPLE
> +        },
> +        &core_dump_no_shared_buffers,
> +        false,
> +        NULL, NULL, NULL
> +    },
> +#endif

IMO it's better to have GUCs always present, but not allow them to be
enabled if the prerequisites aren't fulfilled.
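
To illustrate what I mean, a standalone sketch (not PostgreSQL code): the
setting is always defined, and a check function rejects "on" when
MADV_DONTDUMP is not available at compile time. The function names are
hypothetical. In the real tree this would be a GUC check hook with the
usual (bool *newval, void **extra, GucSource source) signature, reporting
through GUC_check_errdetail() rather than stderr.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: report whether this build can honour the setting.
 * MADV_DONTDUMP is Linux-specific, so availability is decided at compile
 * time from the headers.
 */
bool
core_dump_exclusion_supported(void)
{
#ifdef MADV_DONTDUMP
	return true;
#else
	return false;
#endif
}

/*
 * Hypothetical stand-in for a GUC check hook: turning the option off is
 * always accepted, turning it on is accepted only where it is supported.
 */
bool
check_core_dump_no_shared_buffers(bool *newval)
{
	if (*newval && !core_dump_exclusion_supported())
	{
		fprintf(stderr,
				"core_dump_no_shared_buffers is not supported on this platform\n");
		return false;
	}
	return true;
}
```

That way the option shows up everywhere, and a user on an unsupported
platform gets a clear error instead of an "unrecognized configuration
parameter" complaint.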

Greetings,

Andres Freund


