Allow file inclusion in pg_hba and pg_ident files - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Allow file inclusion in pg_hba and pg_ident files
Date
Msg-id 20220223045959.35ipdsvbxcstrhya@jrouhaud
Whole thread Raw
Responses Re: Allow file inclusion in pg_hba and pg_ident files
List pgsql-hackers
Hi,

I recently had to work on some internal tool which, among other things, has to
merge pg_hba.conf and pg_ident.conf files between an existing (possibly
already updated) instance and some external repository.

My biggest concern is that any issue in either the external repository content
or the tool could leave the instance entirely unreachable.  There are some
protection attemps to avoid that, but really there isn't any practical
possibility if one wants to make sure that some of the entries are left alone
or not hidden no matter what.

To address that, I'd like to propose the possibility to include files in hba
and ident configuration files.  This was already discussed in the past, and in
my understanding this is mostly wanted, while some people expressed concerned
on a use case that wouldn't rely on thousands of entries.

In my case, there shouldn't be more than a dozen generated pg_hba/pg_ident
lines.  All I want is to have my main hba (and ident) files do something like:

include hba_forbid_non_ssl.conf
include hba_superuser.conf
include hba_replication.conf
include hba_application_generated.conf

So that the tool has a way to make sure that it won't prevent dba login or
replication, or allow unsecure connections, and can simply rewrite its target
file rather than merging it, sorting it with weird rules and deciding which
lines are ok to be removed and which one aren't.

I'm attaching a patchset that implements $SUBJECT:

0001 adds a new pg_ident_file_mappings view, which is basically the same as
pg_hba_file_rules view but for mappings.  It's probably already useful, for
instance if you need to tweak some regexp.

0002 adds a new "include" directive to both auth files, which will include the
given file at this exact position.  I chose to do this is in the tokenization
rather than the parsing, as it makes things simpler in general, and also allows
a single code path for both files.  It also adds 2 new columns to the
pg_hba_file_rules and pg_ident_file_mappings views: file_name and
(rule|mapping)_number, so it's possible to identify where a rule is coming from
and more importantly which has the priority over the over.  Note that I only
added "include", but maybe people would want something like include_dir or
include_if_exists too.

Both patches have updated documentation, but no test yet.  If there's an
agreement on the feature, I plan to add some TAP test for it.

Finally I also added 0003, which is a POC for a new pg_hba_matches() function,
that can help DBA to understand why their configuration isn't working as they
expect.  This only to start the discussion on that topic, the code is for now
really hackish, as I don't know how much this is wanted and/or if some other
behavior would be better, and there's also no documentation or test.  The
function for now only takes an optional inet (null means unix socket), the
target role and an optional ssl flag and returns the file, line and raw line
matching if any, or null.  For instance:

=# select * from pg_hba_matches('127.0.0.1'::inet, 'postgres');
-[ RECORD 1 ]-----------------------------------------------------------------
file_name | /tmp/pgbuild/toto.conf
line_num  | 5
raw_line  | host    all            all            127.0.0.1/32           trust

I'm wondering for instance if it would be better to (optionally?) keep
iterating over the lines and print all lines that would have matched and in
which order, or if the function should return the parsed line information
rather than the raw line, although it could make the output worse if using huge
secondary auth files.  I'm also not sure if the various possible flags (ssl,
gss) should be explicitly set or if the function should try various
permutations on its own.

Note that this function only works against the current configuration files, not
the one currently active.  As far as I can see PostgresMain gets rid of
PostmasterContext, which is where they currently live, so they don't exist
anymore in the backends.  To make it work we would have to always keep those in
each backend, but also reload them along with the main config file on SIGHUP.
Except on Windows (and -DEXEC_BACKEND), as the current version of auth files
are always used anyway.  So would it make sense to add the possibility to use
the loaded files instead, given the required extra cost and the fact that it
wouldn't make sense on Windows?  If yes, I guess that we should also allow that
in the pg_hba / pg_ident views.

While at it, I noticed this comment added in de16ab72388:

> The <filename>pg_hba.conf</filename> file is read on start-up and when
> the main server process receives a
> <systemitem>SIGHUP</systemitem><indexterm><primary>SIGHUP</primary></indexterm>
> signal. If you edit the file on an
> active system, you will need to signal the postmaster
> (using <literal>pg_ctl reload</literal>, calling the SQL function
> <function>pg_reload_conf()</function>, or using <literal>kill
> -HUP</literal>) to make it re-read the file.
> [...]
> The preceding statement is not true on Microsoft Windows: there, any
> changes in the <filename>pg_hba.conf</filename> file are immediately
> applied by subsequent new connections.

I also find this comment a bit misleading.  Wouldn't it be better to have a
similar statement for pg_ident, or at least a note that it applies to both
files?

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: pg_stat_get_replication_slot and pg_stat_get_subscription_worker incorrectly marked as proretset
Next
From: Peter Smith
Date:
Subject: Re: Design of pg_stat_subscription_workers vs pgstats