Re: [PATCH] Make configuration file "include" directive handling morerobust - Mailing list pgsql-hackers

From Ian Barwick
Subject Re: [PATCH] Make configuration file "include" directive handling morerobust
Date
Msg-id 05b22e26-adb5-8193-cec5-a8ae1de0b8dc@2ndquadrant.com
Whole thread Raw
In response to Re: [PATCH] Make configuration file "include" directive handlingmore robust  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: [PATCH] Make configuration file "include" directive handling more robust  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 7/17/19 5:34 PM, Kyotaro Horiguchi wrote:> Hello.
 >
 > At Wed, 17 Jul 2019 12:29:43 +0900, Ian Barwick <ian.barwick@2ndquadrant.com> wrote in
<8c8bcbca-3bd9-dc6e-8986-04a5abdef142@2ndquadrant.com>
 >> Hi
 >>
 >> While poking about with [1], I noticed a few potential issues with the
 >> inclusion handling for configuration files; another issue is
 >> demonstrated in [2].
 >>
 >> [1]
 >> https://www.postgresql.org/message-id/aed6cc9f-98f3-2693-ac81-52bb0052307e%402ndquadrant.com
 >>     ("Stop ALTER SYSTEM from making bad assumptions")
 >> [2]
 >>
https://www.postgresql.org/message-id/CY4PR1301MB22001D3FAAB3499C5D41DE23A9E50%40CY4PR1301MB2200.namprd13.prod.outlook.com
 >>     ("First Time Starting Up PostgreSQL and Having Problems")
 >
 > Yeah! That's annoying..
 >
 >> Specifically these are:
 >>
 >> 1) Provision of empty include directives
 >> =========================================
 >>
 >> The default postgresql.conf file includes these thusly:
 >>
 >>      #include_dir = '' # include files ending in '.conf' from
 >>                                              # a directory, e.g., 'conf.d'
 >>      #include_if_exists = '' # include file only if it exists
 >>      #include = ''                           # include file
 >>
 >> Currently, uncommenting them but leaving the value empty (as happened
 >> in [2] above) can
 >> result in unexpected behaviour.
 >>
 >> For "include" and "include_if_exists", it's a not critical issue as
 >> non-existent
 >> files are, well, non-existent, however this will leave behind the
 >> cryptic
 >> message "input in flex scanner failed" in pg_file_settings's "error"
 >> column, e.g.:
 >>
 >>      postgres=# SELECT sourceline, seqno, name, setting, applied, error
 >>                   FROM pg_file_settings
 >>                  WHERE error IS NOT NULL;
 >>       sourceline | seqno | name | setting | applied |            error
 >>      ------------+-------+------+---------+---------+------------------------------
 >>                1 | 45 | | | f | input in flex scanner failed
 >>                1 | 46 | | | f | input in flex scanner failed
 >>      (2 rows)
 >>
 >> However, an empty value for "include_dir" will result in the current
 >> configuration
 >> file's directory being read, which can result in circular inclusion
 >> and triggering
 >> of the nesting depth check.
 >>
 >> Patch {1} makes provision of an empty value for any of these
 >> directives cause
 >> configuration file processing to report an approprate error, e.g.:
 >>
 >>      postgres=# SELECT sourceline, seqno, name, setting, applied, error
 >>                   FROM pg_file_settings
 >>                  WHERE error IS NOT NULL;
 >>       sourceline | seqno | name | setting | applied |                 error
 >>      ------------+-------+------+---------+---------+---------------------------------------
 >>              757 | 45 | | | f | "include" must not be empty
 >>              758 | 46 | | | f | "include_if_exists" must not be empty
 >>              759 | 47 | | | f | "include_dir" must not be empty
 >
 > The patch 1 looks somewhat superficial. All the problems are
 > reduced to creating unexpected filepath for
 > inclusion. AbsoluteConfigLocation does the core work, and it can
 > issue generic error message covering all the cases like:
 >
 > invalid parameter "<param>" at <calling_file>:<calling_lineno>
 >
 > which seems sufficient. (The function needs some additional
 > parameters.)

That seems unnecessarily complex to me, as we'd be overloading a
function with a single purpose (to manipulate a path) with some
of the parsing logic/control.

 >
 >> 2) Circular inclusion of configuration files
 >> ============================================
 >>
 >> Currently there is a simple maximum nesting threshold (currently 10)
 >> which
 >> will stop runaway circular inclusions. However, if triggered, it's not
 >> always obvious what triggered it, and sometimes resource exhaustion
 >> might kick in beforehand (as appeared to be the case in [2] above).
 >>
 >> Patch {2} attempts to handle this situation by keeping track of which
 >> files have already been included (based on their absolute, canonical
 >> path) and reporting an error if they were encountered again.
 >
 > This seems to me to be overkill. The issue [2] is prevented by
 > the patch 1's amendment.

Yes, that particular issue is prevented, but this patch is intended to
provide better protection against explicit circular inclusions, e.g.
if someone adds "include 'postgresql.conf'" at the end of "postgresql.conf"
(or more realistically has a complex setup with multiple included
configuration files where something gets mixed up).

Currently the nesting threshold stops it becoming a runaway
problem, but is not very user-friendly. E.g. with "include 'postgresql.conf'"
added to the end of "postgresql.conf", without patch on startup you get:

     LOG:  could not open configuration file "postgresql.conf": maximum nesting depth exceeded
     FATAL:  configuration file "/path/to/postgresql.conf" contains errors

(cue panicking user with production server down: "OMG the file can't be opened,
is my filesystem corrupted?" etc.)

With the patch:

     LOG:  configuration file "/path/to/postgresql.conf" was previously parsed
     FATAL:  configuration file "/path/to/postgresql.conf" contains errors

(actually maybe we could add a bit more detail such as line number there).

 > (I don't think it's not worth donig to
 > add protection from explicit inclusion of pg_hba.conf from
 > postgresql.conf or itself or such like.)

I thought about that, but came to the same conclusion.

 >> On server startup:
 >>
 >>     2019-07-11 09:13:25.610 GMT [71140] LOG: configuration file
 >>     "/var/lib/pgsql/data/postgresql.conf" was previously parsed
 >>     2019-07-11 09:13:25.610 GMT [71140] FATAL: configuration file
 >>     "/var/lib/pgsql/data/postgresql.conf" contains errors
 >>
 >> After sending SIGHUP:
 >>
 >>      postgres=# SELECT sourceline, seqno, name, setting, applied, error
 >>      FROM pg_file_settings WHERE error IS NOT NULL;
 >>       sourceline | seqno | name | setting | applied | error
 >>
------------+-------+------+---------+---------+--------------------------------------------------------------------------------
 >>              757 | 45 | | | f | configuration file
 >>              "/var/lib/pgsql/data/postgresql.conf" was previously parsed
 >>      (1 row)
 >>
 >> 3) "include" directives in postgresql.auto.conf and extension control
 >> files
 >> ===========================================================================
 >>
 >> Currently these are parsed and acted on, even though it makes no sense
 >> for further
 >> config files to be included in either case.
 >
 > Anyway manual edit is explicitly prohibited for auto.conf.

Indeed, but there are many things we tell people not to do, such as
removing tablespace directories, but they still do them...

 > And even if it is added, the 10-depth limitation would protect from
 > infinite loop.

It's not just about protecting againt infinite loops - if you do something
like "include 'postgresql.conf'", as-is the code will happily slurp in
all the items from "postgresql.conf" into "postgresql.auto.conf", which
is going to cause some headscratching if it ever happens.

Like I said in the original mail these are extremely unlikely corner cases;
but if patch {2} is in place, it's trivial to prevent them ever becoming a
problem.

 >> With "postgresql.auto.conf", if a file is successfully included, its
 >> contents
 >> will then be written to "postgresql.auto.conf" and the include
 >> directive will be
 >> removed, which seems like a recipe for confusion.
 >>
 >> These are admittedly unlikely corner cases, but it's easy enough to
 >> stop this
 >> happening on the offchance someone tries to use this to solve some
 >> problem in
 >> completely the wrong way.
 >>
 >> Patch {3} implements this (note that this patch depends on patch {2}).
 >>
 >> Extension example:
 >>
 >>     postgres=# CREATE EXTENSION repmgr;
 >>     ERROR: "include" not permitted in file
 >>     "/home/barwick/devel/postgres/builds/HEAD/share/extension/repmgr.control"
 >>     line 8
 >>     postgres=# CREATE EXTENSION repmgr;
 >>     ERROR: "include_dir" not permitted in file
 >>     "/home/barwick/devel/postgres/builds/HEAD/share/extension/repmgr.control"
 >>     line 9
 >>     postgres=# CREATE EXTENSION repmgr;
 >>     ERROR: "include_if_exists" not permitted in file
 >>     "/home/barwick/devel/postgres/builds/HEAD/share/extension/repmgr.control"
 >>     line 10
 >>
 >> pg.auto.conf example:
 >>
 >>     postgres=# ALTER SYSTEM SET default_tablespace ='pg_default';
 >>     ERROR:  could not parse contents of file "postgresql.auto.conf"
 >>      postgres=# SELECT regexp_replace(sourcefile, '^/.+/','') AS
 >>      sourcefile,
 >>                        seqno, name, setting, applied, error
 >>                   FROM pg_file_settings WHERE error IS NOT NULL;
 >>            sourcefile      | seqno | name | setting | applied |          error
 >>      ----------------------+-------+------+---------+---------+-------------------------
 >>       postgresql.auto.conf | 45 | | | f | "include" not permitted
 >>      (1 row)
 >>
 >> The patch also has the side-effect that "include" directives are no
 >> longer
 >> (silently) removed from "postgresql.auto.conf"; as the only way they
 >> can get
 >> into the file in the first place is by manually editing it, I feel
 >> it's
 >> reasonable for the user to be made aware that they're not valid and
 >> have to
 >> manually remove them.
 >>
 >>
 >> Patches
 >> =======
 >>
 >> Code:
 >>
 >> {1} disallow-empty-include-directives.v1.patch
 >> {2} track-included-files.v1.patch
 >> {3} prevent-disallowed-includes.v1.patch
 >>
 >> TAP tests:
 >> {1} tap-test-configuration.v1.patch
 >> {2} tap-test-disallow-empty-include-directives.v1.patch
 >> {3} tap-test-track-included-files.v1.patch
 >> {4} tap-test-prevent-disallowed-includes.v1.patch
 >>
 >> Patches apply cleanly to REL_12_STABLE/HEAD, they could be modfied for
 >> all supported versions if required. I can consolidate the patches
 >> if preferred.
 >
 > I don't think this is new to 12.

No, though I'm not sure how much this would be seen as a bugfix
and how far back it would be sensible to patch.

Regards

Ian Barwick


--
  Ian Barwick                   https://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Jeevan Chalke
Date:
Subject: Re: block-level incremental backup
Next
From: Tom Lane
Date:
Subject: Re: POC: converting Lists into arrays