Re: Large writable variables - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Large writable variables
Date
Msg-id 20181016045051.wr35ubb6h7h3ee2b@alap3.anarazel.de
Whole thread Raw
In response to Large writable variables  (Andres Freund <andres@anarazel.de>)
Responses Re: Large writable variables  (Andres Freund <andres@anarazel.de>)
Re: Large writable variables  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

After the last few changes (including a proposed one), we now have in
the .data segment (i.e. mutable initialized variables):

$ objdump -j .data -t ~/build/postgres/dev-optimize/vpath/src/backend/postgres|awk '{print $4, $5, $6}'|sort
-r|lessGreetings,
.data 0000000000004380 ConfigureNamesInt
.data 0000000000003570 ConfigureNamesBool
.data 0000000000002140 ConfigureNamesString
.data 00000000000010e0 ConfigureNamesEnum
.data 0000000000000d20 ConfigureNamesReal

These are by far the largest chunk of the .data segment (like 47152
bytes out of 51168 bytes).  It's not entirely trivial to fix. While we
can slap consts onto large parts of the functions taking a struct
config_generic*, there's a few places where *do* modify them.

ISTM, to actually fix, we'd have to split
struct config_generic
{
    /* constant fields, must be set correctly in initial value: */
    const char *name;            /* name of variable - MUST BE FIRST */
    GucContext    context;        /* context required to set the variable */
    enum config_group group;    /* to help organize variables by function */
    const char *short_desc;        /* short desc. of this variable's purpose */
    const char *long_desc;        /* long desc. of this variable's purpose */
    int            flags;            /* flag bits, see guc.h */
    /* variable fields, initialized at runtime: */
    enum config_type vartype;    /* type of variable (set only at startup) */
    int            status;            /* status bits, see below */
    GucSource    source;            /* source of the current actual value */
    GucSource    reset_source;    /* source of the reset_value */
    GucContext    scontext;        /* context that set the current value */
    GucContext    reset_scontext; /* context that set the reset value */
    GucStack   *stack;            /* stacked prior values */
    void       *extra;            /* "extra" pointer for current actual value */
    char       *sourcefile;        /* file current setting is from (NULL if not
                                 * set in config file) */
    int            sourceline;        /* line in source file */
};

into two different arrays. Namely the already denoted 'constant fields'
and 'variable fields.  But it's a bit more complicated than that: The
size doesn't come just from the base config_struct, but also from the
config_{bool,int,real,string,enum} arrays.  Where we again have separate
'constant fields' and 'variable fields'.

So we'd have to refactor more heavily than just splitting the above
array into two.

I kind of wonder, if we shouldn't remove the separate ConfigureNamesInt*
int arrays and change things so we have one

struct config_def
{
    const char *name;                       /* name of variable - MUST BE FIRST */
    GucContext      context;                /* context required to set the variable */
    enum config_group group;        /* to help organize variables by function */
    const char *short_desc;         /* short desc. of this variable's purpose */
    const char *long_desc;          /* long desc. of this variable's purpose */
    int                     flags;                  /* flag bits, see guc.h */
    enum config_type vartype;       /* type of variable (set only at startup) */

    union
    {
        struct
        {
            bool       *variable;
            bool            boot_val;
            GucBoolCheckHook check_hook;
            GucBoolAssignHook assign_hook;
            GucShowHook show_hook;
        } config_bool;

        struct
        {
            int           *variable;
            int            boot_val;
            int            min;
            int            max;
            GucIntCheckHook check_hook;
            GucIntAssignHook assign_hook;
            GucShowHook show_hook;
        } config_int

        ...
    } pertype;
}

and then a corresponding struct config_val with a similar union.

Besides reducing modified memory, it'd also have the benefit of making
the grouping within the various arrays much more sensible, because
related fields could be grouped together.


.data 0000000000000420 intRelOpts
.data 00000000000001c0 realRelOpts
.data 0000000000000118 boolRelOpts
.data 00000000000000a8 stringRelOpts

these should be readonly, but the code doesn't make that easy. There's
pending refactorings, I don't know how they effect this.


.data 0000000000000068 SnapshotSelfData
.data 0000000000000068 SnapshotAnyData
.data 0000000000000068 SecondarySnapshotData
.data 0000000000000068 CurrentSnapshotData
.data 0000000000000068 CatalogSnapshotData

The obviously actually are modifyable.


.data 0000000000000028 spi_printtupDR
.data 0000000000000028 printsimpleDR
.data 0000000000000028 donothingDR
.data 0000000000000028 debugtupDR

These we could actually make constant, but CreateDestReceiver() as an
API makes that inconvenient. They also are pretty darn small...  There's
a security benefit in making them constant and casting the constness
away - I think that might not be insane.


- Andres


pgsql-hackers by date:

Previous
From: Ashutosh Sharma
Date:
Subject: Re: Multi-insert into a partitioned table with before insert rowtrigger causes server crash on latest HEAD
Next
From: Andres Freund
Date:
Subject: Re: Large writable variables