Thread: Allow file inclusion in pg_hba and pg_ident files

Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

23 February 2022, 04:59:59

Hi,

I recently had to work on some internal tool which, among other things, has to
merge pg_hba.conf and pg_ident.conf files between an existing (possibly
already updated) instance and some external repository.

My biggest concern is that any issue in either the external repository content
or the tool could leave the instance entirely unreachable.  There are some
protection attemps to avoid that, but really there isn't any practical
possibility if one wants to make sure that some of the entries are left alone
or not hidden no matter what.

To address that, I'd like to propose the possibility to include files in hba
and ident configuration files.  This was already discussed in the past, and in
my understanding this is mostly wanted, while some people expressed concerned
on a use case that wouldn't rely on thousands of entries.

In my case, there shouldn't be more than a dozen generated pg_hba/pg_ident
lines.  All I want is to have my main hba (and ident) files do something like:

include hba_forbid_non_ssl.conf
include hba_superuser.conf
include hba_replication.conf
include hba_application_generated.conf

So that the tool has a way to make sure that it won't prevent dba login or
replication, or allow unsecure connections, and can simply rewrite its target
file rather than merging it, sorting it with weird rules and deciding which
lines are ok to be removed and which one aren't.

I'm attaching a patchset that implements $SUBJECT:

0001 adds a new pg_ident_file_mappings view, which is basically the same as
pg_hba_file_rules view but for mappings.  It's probably already useful, for
instance if you need to tweak some regexp.

0002 adds a new "include" directive to both auth files, which will include the
given file at this exact position.  I chose to do this is in the tokenization
rather than the parsing, as it makes things simpler in general, and also allows
a single code path for both files.  It also adds 2 new columns to the
pg_hba_file_rules and pg_ident_file_mappings views: file_name and
(rule|mapping)_number, so it's possible to identify where a rule is coming from
and more importantly which has the priority over the over.  Note that I only
added "include", but maybe people would want something like include_dir or
include_if_exists too.

Both patches have updated documentation, but no test yet.  If there's an
agreement on the feature, I plan to add some TAP test for it.

Finally I also added 0003, which is a POC for a new pg_hba_matches() function,
that can help DBA to understand why their configuration isn't working as they
expect.  This only to start the discussion on that topic, the code is for now
really hackish, as I don't know how much this is wanted and/or if some other
behavior would be better, and there's also no documentation or test.  The
function for now only takes an optional inet (null means unix socket), the
target role and an optional ssl flag and returns the file, line and raw line
matching if any, or null.  For instance:

=# select * from pg_hba_matches('127.0.0.1'::inet, 'postgres');
-[ RECORD 1 ]-----------------------------------------------------------------
file_name | /tmp/pgbuild/toto.conf
line_num  | 5
raw_line  | host    all            all            127.0.0.1/32           trust

I'm wondering for instance if it would be better to (optionally?) keep
iterating over the lines and print all lines that would have matched and in
which order, or if the function should return the parsed line information
rather than the raw line, although it could make the output worse if using huge
secondary auth files.  I'm also not sure if the various possible flags (ssl,
gss) should be explicitly set or if the function should try various
permutations on its own.

Note that this function only works against the current configuration files, not
the one currently active.  As far as I can see PostgresMain gets rid of
PostmasterContext, which is where they currently live, so they don't exist
anymore in the backends.  To make it work we would have to always keep those in
each backend, but also reload them along with the main config file on SIGHUP.
Except on Windows (and -DEXEC_BACKEND), as the current version of auth files
are always used anyway.  So would it make sense to add the possibility to use
the loaded files instead, given the required extra cost and the fact that it
wouldn't make sense on Windows?  If yes, I guess that we should also allow that
in the pg_hba / pg_ident views.

While at it, I noticed this comment added in de16ab72388:

> The <filename>pg_hba.conf</filename> file is read on start-up and when
> the main server process receives a
> <systemitem>SIGHUP</systemitem><indexterm><primary>SIGHUP</primary></indexterm>
> signal. If you edit the file on an
> active system, you will need to signal the postmaster
> (using <literal>pg_ctl reload</literal>, calling the SQL function
> <function>pg_reload_conf()</function>, or using <literal>kill
> -HUP</literal>) to make it re-read the file.
> [...]
> The preceding statement is not true on Microsoft Windows: there, any
> changes in the <filename>pg_hba.conf</filename> file are immediately
> applied by subsequent new connections.

I also find this comment a bit misleading.  Wouldn't it be better to have a
similar statement for pg_ident, or at least a note that it applies to both
files?

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Nathan Bossart

Date:

23 February 2022, 17:44:58

On Wed, Feb 23, 2022 at 12:59:59PM +0800, Julien Rouhaud wrote:
> To address that, I'd like to propose the possibility to include files in hba
> and ident configuration files.  This was already discussed in the past, and in
> my understanding this is mostly wanted, while some people expressed concerned
> on a use case that wouldn't rely on thousands of entries.

+1, I think this would be very useful.

> 0001 adds a new pg_ident_file_mappings view, which is basically the same as
> pg_hba_file_rules view but for mappings.  It's probably already useful, for
> instance if you need to tweak some regexp.

This seems reasonable.

> Finally I also added 0003, which is a POC for a new pg_hba_matches() function,
> that can help DBA to understand why their configuration isn't working as they
> expect.  This only to start the discussion on that topic, the code is for now
> really hackish, as I don't know how much this is wanted and/or if some other
> behavior would be better, and there's also no documentation or test.  The
> function for now only takes an optional inet (null means unix socket), the
> target role and an optional ssl flag and returns the file, line and raw line
> matching if any, or null.  For instance:

I think another use-case for this is testing updates to your configuration
files.  For example, I could ensure that hba_forbid_non_ssl.conf wasn't
accidentally reverted as part of an unrelated change.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

26 February 2022, 06:04:43

On Wed, Feb 23, 2022 at 09:44:58AM -0800, Nathan Bossart wrote:
> On Wed, Feb 23, 2022 at 12:59:59PM +0800, Julien Rouhaud wrote:
>> 0001 adds a new pg_ident_file_mappings view, which is basically the same as
>> pg_hba_file_rules view but for mappings.  It's probably already useful, for
>> instance if you need to tweak some regexp.
>
> This seems reasonable.

Interesting.  One can note that hba.c is already large, and this makes
the file larger.  I'd like to think that it would be better to move
all the code related to the SQL functions for pg_hba.conf and such to
a new hbafuncs.c under adt/.  Would that make sense?
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

26 February 2022, 06:27:15

Hi,

On Sat, Feb 26, 2022 at 03:04:43PM +0900, Michael Paquier wrote:
> On Wed, Feb 23, 2022 at 09:44:58AM -0800, Nathan Bossart wrote:
> > On Wed, Feb 23, 2022 at 12:59:59PM +0800, Julien Rouhaud wrote:
> >> 0001 adds a new pg_ident_file_mappings view, which is basically the same as
> >> pg_hba_file_rules view but for mappings.  It's probably already useful, for
> >> instance if you need to tweak some regexp.
> > 
> > This seems reasonable.
> 
> Interesting.  One can note that hba.c is already large, and this makes
> the file larger.  I'd like to think that it would be better to move
> all the code related to the SQL functions for pg_hba.conf and such to
> a new hbafuncs.c under adt/.  Would that make sense?

I'm fine with it.  Assuming that you meant to move also the underlying
functions that goes with it (fill_hba_line and such), that would end up
removing about 15% of hba.c (after applying 0001, 0002 and 0003).

Note that in order to do so we would need to expose quite a lot more about hba
internals, like tokenize_file() and parse_hba_line(), along with structs
HbaToken and TokenizedLine.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

26 February 2022, 06:36:19

On Sat, Feb 26, 2022 at 02:27:15PM +0800, Julien Rouhaud wrote:
> I'm fine with it.  Assuming that you meant to move also the underlying
> functions that goes with it (fill_hba_line and such), that would end up
> removing about 15% of hba.c (after applying 0001, 0002 and 0003).

Cool.  Thanks for the number.

> Note that in order to do so we would need to expose quite a lot more about hba
> internals, like tokenize_file() and parse_hba_line(), along with structs
> HbaToken and TokenizedLine.

I'd be rather fine with exposing all that in the shape of a clean
split, renaming some of those structures and/or function with an
Hba-like prefix, for consistency.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

26 February 2022, 06:50:33

On Sat, Feb 26, 2022 at 03:36:19PM +0900, Michael Paquier wrote:
> On Sat, Feb 26, 2022 at 02:27:15PM +0800, Julien Rouhaud wrote:
> 
> > Note that in order to do so we would need to expose quite a lot more about hba
> > internals, like tokenize_file() and parse_hba_line(), along with structs
> > HbaToken and TokenizedLine.
> 
> I'd be rather fine with exposing all that in the shape of a clean
> split, renaming some of those structures and/or function with an
> Hba-like prefix, for consistency.

Of course.  I was thinking using "auth" for something that's common to pg_hba
and pg_ident (like e.g. TokenizeAuthFile()), and otherwise keep the current
hba/ident prefix.

Unless someone object or suggest better naming in the next few days I will take
care of that.

I would also welcome some opinion on the points I mentioned about 0002.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

28 February 2022, 08:01:07

On Sat, Feb 26, 2022 at 02:50:33PM +0800, Julien Rouhaud wrote:
> Of course.  I was thinking using "auth" for something that's common to pg_hba
> and pg_ident (like e.g. TokenizeAuthFile()), and otherwise keep the current
> hba/ident prefix.

Okay, thanks.

> Unless someone object or suggest better naming in the next few days I will take
> care of that.

I don't have an opinion to share about 0002 and 0003 yet, but 0001
seems like a good idea on its own.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

28 February 2022, 11:39:37

Hi,

On Wed, Feb 23, 2022 at 09:44:58AM -0800, Nathan Bossart wrote:
>
> > Finally I also added 0003, which is a POC for a new pg_hba_matches() function,
> > that can help DBA to understand why their configuration isn't working as they
> > expect.  This only to start the discussion on that topic, the code is for now
> > really hackish, as I don't know how much this is wanted and/or if some other
> > behavior would be better, and there's also no documentation or test.  The
> > function for now only takes an optional inet (null means unix socket), the
> > target role and an optional ssl flag and returns the file, line and raw line
> > matching if any, or null.  For instance:
>
> I think another use-case for this is testing updates to your configuration
> files.  For example, I could ensure that hba_forbid_non_ssl.conf wasn't
> accidentally reverted as part of an unrelated change.

Indeed, that function could really be helpful in many scenario.  Note that this
isn't my idea but Magnus idea, which he mentioned quite a long time ago.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

28 February 2022, 11:42:17

Hi,

On Mon, Feb 28, 2022 at 05:01:07PM +0900, Michael Paquier wrote:
> On Sat, Feb 26, 2022 at 02:50:33PM +0800, Julien Rouhaud wrote:
> > Of course.  I was thinking using "auth" for something that's common to pg_hba
> > and pg_ident (like e.g. TokenizeAuthFile()), and otherwise keep the current
> > hba/ident prefix.
> 
> Okay, thanks.

Done in attached v2.  I did the split in a separate commit, as the diff is
otherwise unreadable.  While at it I also fixed a few minor issues (I missed a
MemoryContextDelete, and now avoid relying on inet_net_pton which apparently
doesn't exist in cygwin).

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

01 March 2022, 07:45:48

On Mon, Feb 28, 2022 at 07:42:17PM +0800, Julien Rouhaud wrote:
> Done in attached v2.  I did the split in a separate commit, as the diff is
> otherwise unreadable.  While at it I also fixed a few minor issues (I missed a
> MemoryContextDelete, and now avoid relying on inet_net_pton which apparently
> doesn't exist in cygwin).

Hmm.  The diffs of 0001 are really hard to read.  Do you know why this
is happening?  Is that because some code has been moved around?  I
have been doing a comparison of all the routines showing up in the
diffs, to note that the contents of load_hba(), load_ident(),
hba_getauthmethod() & friends are actually the same, but this makes
the change history harder to follow.  Moving around fill_hba_line()
and fill_hba_view() should be enough, indeed.

+#include "utils/guc.h"
+//#include "utils/tuplestore.h"

Ditto.

+   /* Build a tuple descriptor for our result type */
+   if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+       elog(ERROR, "return type must be a row type");

Worth noting that I was planning to apply a patch from Melanie
Plageman to simplify the creation of tupledesc and tuplestores for
set-returning functions like this one, so this would cut a bit of code
here.  This is not directly related to your patch, though, that's my
business :)

Well, as of 0002, one thing that makes things harder to follow is that
parse_ident_line() is moved at a different place in hba.c, one slight
difference being err_msg to store the error message in the token
line..  But shouldn't the extension of parse_ident_line() with its
elevel be included in 0001?  Or, well, it could just be done in its
own patch to make for a cleaner history, so as 0002 could be shaped as
two commits itself.

Also, it seems to me that we'd better have some TAP tests for that to
make sure of its contents?  One place would be src/test/auth/.
Another place where we make use of user mapping is the krb5 tests but
these are run in a limited fashion in the buildfarm.  We also set some
mappings for SSPI on Windows all the time, so we'd better be careful
about that and paint some $windows_os in the tests when looking at the
output of the view.

+-- We expect no user mapping in this test
+select count(*) = 0 as ok from pg_ident_file_mappings;

It could be possible to do installcheck on an instance that has user
mappings meaning that this had better be ">= 0", no?  Does this pass
on Windows where pg_regress sets some mappings for SSPI when creating
one or more roles?
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

01 March 2022, 09:19:50

Hi,

On Tue, Mar 01, 2022 at 04:45:48PM +0900, Michael Paquier wrote:
>
> Hmm.  The diffs of 0001 are really hard to read.  Do you know why this
> is happening?  Is that because some code has been moved around?

Yes, I followed the file convention to put the static functions first and then
the exposed functions, and git-diff makes a terrible mess out of it :(

> I
> have been doing a comparison of all the routines showing up in the
> diffs, to note that the contents of load_hba(), load_ident(),
> hba_getauthmethod() & friends are actually the same, but this makes
> the change history harder to follow.  Moving around fill_hba_line()
> and fill_hba_view() should be enough, indeed.

There's no functional change apart from exposing some functions and moving some
in another file, so I though it's still ok to keep some consistency.  There
isn't much changes backpatched in that file, so it shouldn't create more
maintenance burden than simply splitting the file.

If you prefer to interleave static and non static function I can change it.

> +#include "utils/guc.h"
> +//#include "utils/tuplestore.h"

Yes I noticed this one this morning.  I didn't want to send a new patch version
just for that, but I already fixed it locally.

> +   /* Build a tuple descriptor for our result type */
> +   if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
> +       elog(ERROR, "return type must be a row type");
> 
> Worth noting that I was planning to apply a patch from Melanie
> Plageman to simplify the creation of tupledesc and tuplestores for
> set-returning functions like this one, so this would cut a bit of code
> here.  This is not directly related to your patch, though, that's my
> business :)

Yes I'm aware of that thread.  I will be happy to change the patch to use
MakeFuncResultTuplestore() as soon as it lands.  Thanks for the notice though.

> Well, as of 0002, one thing that makes things harder to follow is that
> parse_ident_line() is moved at a different place in hba.c, one slight
> difference being err_msg to store the error message in the token
> line..  But shouldn't the extension of parse_ident_line() with its
> elevel be included in 0001?

No, because 0001 is unrelated.  The changes you mentioned (exposing the
function and adding the error reporting) are only needed for the new view
introduced in 0002, thus not part of 0001.

> Or, well, it could just be done in its own patch to make for a cleaner
> history, so as 0002 could be shaped as two commits itself.

It seems strange to me to add a commit (or include in a commit) just to make a
single function exposed while nothing needs it, but I can do it this way if you
prefer.

> Also, it seems to me that we'd better have some TAP tests for that to
> make sure of its contents?

As I mentioned in my initial email, I intentionally didn't add any test in the
patchset yet, except the exact same coverage for the new view as there's for
pg_hba_file_rules.  Ideally I'd like to add tests only once, to cover both 002
and 0003.  But I don't want to waste time for that right now, especially since
no one seems to be interested in 0003.

> +-- We expect no user mapping in this test
> +select count(*) = 0 as ok from pg_ident_file_mappings;
> 
> It could be possible to do installcheck on an instance that has user
> mappings meaning that this had better be ">= 0", no?

I thought about it, and supposed it would bring a bit more value with the test
like that.  I can change it if you prefer.

> Does this pass
> on Windows where pg_regress sets some mappings for SSPI when creating
> one or more roles?

According to CI and cfbot yes.  E.g.
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/37/3558.
Note that the failed runs are the warning I mentioned for mingw32 and the POC
0004, which is now fixed.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

01 March 2022, 11:28:09

On Tue, Mar 01, 2022 at 05:19:50PM +0800, Julien Rouhaud wrote:
> On Tue, Mar 01, 2022 at 04:45:48PM +0900, Michael Paquier wrote:
>> Hmm.  The diffs of 0001 are really hard to read.  Do you know why this
>> is happening?  Is that because some code has been moved around?
>
> Yes, I followed the file convention to put the static functions first and then
> the exposed functions, and git-diff makes a terrible mess out of it :(

I'd like to think that not doing such a thing would be more helpful in
this case.  As the diffs show, anyone is going to have a hard time to
figure out if there are any differences in any of those routines, and
if these are the origin of a different problem.  A second thing is
that this is going to make back-patching unnecessarily harder.

> There's no functional change apart from exposing some functions and moving some
> in another file, so I though it's still ok to keep some consistency.  There
> isn't much changes backpatched in that file, so it shouldn't create more
> maintenance burden than simply splitting the file.

A lot of files do that already.  History clarity matters most IMO.

> As I mentioned in my initial email, I intentionally didn't add any test in the
> patchset yet, except the exact same coverage for the new view as there's for
> pg_hba_file_rules.  Ideally I'd like to add tests only once, to cover both 002
> and 0003.  But I don't want to waste time for that right now, especially since
> no one seems to be interested in 0003.

But that would be helpful for 0002.  I think that we should have a bit
more coverage in this area.  pg_hba_file_rules could gain in coverage,
additionally, but this is unrelated to what you are proposing here..

>> Does this pass
>> on Windows where pg_regress sets some mappings for SSPI when creating
>> one or more roles?
>
> According to CI and cfbot yes.  E.g.
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/37/3558.
> Note that the failed runs are the warning I mentioned for mingw32 and the POC
> 0004, which is now fixed.

Interesting, I would not have expected that.  I may poke at that more
seriously.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

16 March 2022, 03:53:25

Hi,

The cfbot says that the patch doesn't apply anymore, so here's a v3 with the
changes mentioned below.

On Tue, Mar 01, 2022 at 05:19:50PM +0800, Julien Rouhaud wrote:
>
> If you prefer to interleave static and non static function I can change it.

Change the split to not reorder functions.

> > +#include "utils/guc.h"
> > +//#include "utils/tuplestore.h"
>
> Yes I noticed this one this morning.  I didn't want to send a new patch version
> just for that, but I already fixed it locally.

Included.

> Yes I'm aware of that thread.  I will be happy to change the patch to use
> MakeFuncResultTuplestore() as soon as it lands.  Thanks for the notice though.

Done, with the new SetSingleFuncCall().

> > It could be possible to do installcheck on an instance that has user
> > mappings meaning that this had better be ">= 0", no?
>
> I thought about it, and supposed it would bring a bit more value with the test
> like that.  I can change it if you prefer.

Change this way.

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Aleksander Alekseev

Date:

22 March 2022, 12:21:20

Hi hackers,

> The cfbot says that the patch doesn't apply anymore, so here's a v3 with the
> changes mentioned below.

I came across this thread while looking for the patches that need review.

The v3-0001 patch LGTM.

Since v3-0002 adds a new view and alters pg_proc.dat shouldn't it also increase CATALOG_VERSION_NO? Not sure if we generally do this in the patches or expect the committer to make the change manually.

Same question regarding v3-0003. Other than that the patch looks OK, but doesn't seem to add any tests for the new functionality. Do you think it would be possible to test-cover the file inclusion? Personally I don't think it's that critical to have these particular tests, but if they can be added, I think we should do so.

I didn't review v3-0004 since it's marked as PoC and seems to be a separate feature that is not targeting PG15. I suggest excluding it from the patchset in order to keep the focus. Consider creating a new thread and a new CF entry after we deal with v3-0001...v3-0003.

All in all, the patchset seems to be in good shape and I don't have anything but some little nitpicks. It passes `make installcheck` and I verified manually that the file inclusion 1) works 2) write proper error messages to the logfile when the included file doesn't exist or has wrong permissions.

--
Best regards,
Aleksander Alekseev

Re: Allow file inclusion in pg_hba and pg_ident files

From

Aleksander Alekseev

Date:

22 March 2022, 12:32:12

Hi hackers,

> It passes `make installcheck` ...

`make installcheck-world`, of course. Sorry for the confusion.

Best regards,

Aleksander Alekseev

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

22 March 2022, 13:38:00

Hi,

On Tue, Mar 22, 2022 at 03:21:20PM +0300, Aleksander Alekseev wrote:
>
> The v3-0001 patch LGTM.
>
> Since v3-0002 adds a new view and alters pg_proc.dat shouldn't it also
> increase CATALOG_VERSION_NO? Not sure if we generally do this in the
> patches or expect the committer to make the change manually.

It's better to no include any catversion bump, otherwise the patches will
rot very fast.  Author can mention the need for a catversion bump in the
commit message, and I usually do so but I apparently forgot.

> Same question regarding v3-0003. Other than that the patch looks OK, but
> doesn't seem to add any tests for the new functionality. Do you think it
> would be possible to test-cover the file inclusion? Personally I don't
> think it's that critical to have these particular tests, but if they can be
> added, I think we should do so.

Yes, as I mentioned in the first email I'm willing to add test coverage.  But
this will require TAP tests, and it's likely going to be a bit annoying to make
sure it has decent coverage and works on all platforms, so I'd rather do it
once I know there's at least some basic agreement on the feature and/or the
approach.  Unfortunately, until now there was no feedback on that part despite
the activity on the thread.  In my experience it's a good sign that this will
get rejected soon, so I still didn't write tests.

> I didn't review v3-0004 since it's marked as PoC and seems to be a separate
> feature that is not targeting PG15. I suggest excluding it from the
> patchset in order to keep the focus. Consider creating a new thread and a
> new CF entry after we deal with v3-0001...v3-0003.

This feature was discussed in some old thread when the file inclusion was first
discussed almost a decade ago, and in my understanding it was part of the "must
have" (along with 0002) in order to accept the feature.  That's why I added
it, since I'm also willing to work of that if needed, whether before or after
the file inclusion thing.

> All in all, the patchset seems to be in good shape and I don't have
> anything but some little nitpicks. It passes `make installcheck` and I
> verified manually that the file inclusion 1) works 2) write proper error
> messages to the logfile when the included file doesn't exist or has wrong
> permissions.

Thanks!

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

23 March 2022, 02:03:46

On Tue, Mar 22, 2022 at 09:38:00PM +0800, Julien Rouhaud wrote:
> On Tue, Mar 22, 2022 at 03:21:20PM +0300, Aleksander Alekseev wrote:
>> Since v3-0002 adds a new view and alters pg_proc.dat shouldn't it also
>> increase CATALOG_VERSION_NO? Not sure if we generally do this in the
>> patches or expect the committer to make the change manually.
>
> It's better to no include any catversion bump, otherwise the patches will
> rot very fast.  Author can mention the need for a catversion bump in the
> commit message, and I usually do so but I apparently forgot.

Yeah, committers take care of that.  You would just expose yourself to
more noise in the CF bot for no gain, as a catversion bump is useful
after a patch has been merged so as as users are able to know when a
cluster needs to be pg_upgrade'd or initdb'd because the catalog
created and run are incompatible.

>> All in all, the patchset seems to be in good shape and I don't have
>> anything but some little nitpicks. It passes `make installcheck` and I
>> verified manually that the file inclusion 1) works 2) write proper error
>> messages to the logfile when the included file doesn't exist or has wrong
>> permissions.
>
> Thanks!

Pushing forward with 0001 by the end of the CF is the part that has no
controversy IMO, and I have no objections to it.  Now, after looking
at this part, I found a few things, as of:
- HbaToken, the set of elements in the lists of TokenizedAuthLine, is
a weird to use as this layer gets used by both pg_hba.conf and
pg_indent.conf before transforming them into each HbaLine and
IdentLine.  While making this part of the internals exposed, I think
that we'd better rename that to AuthToken at least.  This impacts the
names of some routines internal to hba.c to copy and create
AuthTokens.
- s/gethba_options/get_hba_options/, to be consistent with
fill_hba_view() and other things.
- The comment at the top of tokenize_auth_file() needed a refresh.

That's mostly cosmetic, and the rest of the code moved is identical.
So at the end this part looks rather commitable to me.

I have not been able to test 0002 in details, but it looks rather
rather sane to me at quick glance, and it is simple.  The argument
about more TAP tests applies to it, though, even if there is one SQL
test to check the function execution.  It is probably better to not
consider 0003 and 0004 for this CF.
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

23 March 2022, 02:16:34

Hi,

On Wed, Mar 23, 2022 at 11:03:46AM +0900, Michael Paquier wrote:
> 
> Pushing forward with 0001 by the end of the CF is the part that has no
> controversy IMO, and I have no objections to it.  Now, after looking
> at this part, I found a few things, as of:
> - HbaToken, the set of elements in the lists of TokenizedAuthLine, is
> a weird to use as this layer gets used by both pg_hba.conf and
> pg_indent.conf before transforming them into each HbaLine and
> IdentLine.  While making this part of the internals exposed, I think
> that we'd better rename that to AuthToken at least.  This impacts the
> names of some routines internal to hba.c to copy and create
> AuthTokens.

Yeah, I thought about it but didn't rename it given your concerns about git
history.  I'm fine either way.

> - s/gethba_options/get_hba_options/, to be consistent with
> fill_hba_view() and other things.
> - The comment at the top of tokenize_auth_file() needed a refresh.
> 
> That's mostly cosmetic, and the rest of the code moved is identical.
> So at the end this part looks rather commitable to me.

Looks good to me, thanks.

> I have not been able to test 0002 in details, but it looks rather
> rather sane to me at quick glance, and it is simple.  The argument
> about more TAP tests applies to it, though, even if there is one SQL
> test to check the function execution.  It is probably better to not
> consider 0003 and 0004 for this CF.

No objection to moving 0003 and 0004 to the next commitfest.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

23 March 2022, 04:14:02

On Wed, Mar 23, 2022 at 10:16:34AM +0800, Julien Rouhaud wrote:
> Yeah, I thought about it but didn't rename it given your concerns about git
> history.  I'm fine either way.

Oh, OK.  The amount of diffs created by 0001 is still fine to grab
even with the struct rename, so let's stick with it.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

24 March 2022, 07:50:31

On Wed, Mar 23, 2022 at 01:14:02PM +0900, Michael Paquier wrote:
> On Wed, Mar 23, 2022 at 10:16:34AM +0800, Julien Rouhaud wrote:
>> Yeah, I thought about it but didn't rename it given your concerns about git
>> history.  I'm fine either way.
>
> Oh, OK.  The amount of diffs created by 0001 is still fine to grab
> even with the struct rename, so let's stick with it.

And so, this first part has been applied as of d4781d8.  I'll try to
look at 0002 shortly.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

24 March 2022, 08:08:38

On Thu, Mar 24, 2022 at 04:50:31PM +0900, Michael Paquier wrote:
> On Wed, Mar 23, 2022 at 01:14:02PM +0900, Michael Paquier wrote:
> > On Wed, Mar 23, 2022 at 10:16:34AM +0800, Julien Rouhaud wrote:
> >> Yeah, I thought about it but didn't rename it given your concerns about git
> >> history.  I'm fine either way.
> > 
> > Oh, OK.  The amount of diffs created by 0001 is still fine to grab
> > even with the struct rename, so let's stick with it.
> 
> And so, this first part has been applied as of d4781d8.  I'll try to
> look at 0002 shortly.

Thanks!

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

25 March 2022, 11:18:31

On Thu, Mar 24, 2022 at 04:08:38PM +0800, Julien Rouhaud wrote:
> On Thu, Mar 24, 2022 at 04:50:31PM +0900, Michael Paquier wrote:
>> And so, this first part has been applied as of d4781d8.  I'll try to
>> look at 0002 shortly.
>
> Thanks!

Now looking at 0002.  The changes in hba.c are straight-forward,
that's a nice read.

     if (!field) { \
-        ereport(LOG, \
+        ereport(elevel, \
                 (errcode(ERRCODE_CONFIG_FILE_ERROR), \
                  errmsg("missing entry in file \"%s\" at end of line %d", \
                         IdentFileName, line_num))); \
+        *err_msg = psprintf("missing entry at end of line"); \
         return NULL; \
     } \
I think that we'd better add to err_msg the line number and the file
name.  This would become particularly important once the facility to
include files gets added.  We won't use IdentFileName for this
purpose, but at least we would know which areas to change.  Also, even
if the the view proposes line_number, there is an argument in favor of
consistency here.

+select count(*) >= 0 as ok from pg_ident_file_mappings;

I'd really like to see more tests for this stuff (pg_hba_file_rules is
not a great example in this area), where we could rely on something
for *nix platforms.  Windows would provide some valid coverage as we
enable it by default for SSPI in pg_regress but that's not great for
the average Joe.  One thing I thought about first is that
src/test/authentication/ does not test peer authentication at all,
which would map nicely with pg_ident.conf and some user mappings.  So
we could have a new test there, that depends on how the backend reacts
when calling getpeereid(), making the results conditional.  This would
be nice in the long-term.

As of a set of tests, I think that for now I would add two things, for
a total of four tests:
- Stick some queries on pg_ident_file_mappings only for Windows in
some of the tests of src/test/authentication/, say 001_password.pl.
One test should test for some valid fields.  Another idea I have here
is to add some junk to pg_ident.conf, reload and check that an error
is generated (the missing field case on a given line).
- Do the same for src/test/kerberos/, with one positive and one
negative test with some junk in the ident conf file.

A last idea is to abuse of the fact that pg_ident is loaded even if we
don't use it: aka we could add some right and junky contents in
pg_ident.conf, then check its validity.  Using one of the existing
tests may not be right, particularly if we finish by extending it, so
I would move that to a new fresh test script.

+    a.pg_usernamee,
[...]
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>pg_username</structfield> <type>text</type>

pg_usernamee sounds a bit weird as attribute name for the view.  We
could stick to a simple postgres_user_name, or postgres_name, for
example.  Perhaps that's just a typo in the function output and you
intended to use pg_username?

+   /* Free parse_hba_line memory */
+   MemoryContextSwitchTo(oldcxt);
+   MemoryContextDelete(identcxt);
Incorrect comment, this should be parse_ident_line.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

27 March 2022, 09:52:22

Hi,

On Fri, Mar 25, 2022 at 08:18:31PM +0900, Michael Paquier wrote:
>
> Now looking at 0002.  The changes in hba.c are straight-forward,
> that's a nice read.

Thanks!

>      if (!field) { \
> -        ereport(LOG, \
> +        ereport(elevel, \
>                  (errcode(ERRCODE_CONFIG_FILE_ERROR), \
>                   errmsg("missing entry in file \"%s\" at end of line %d", \
>                          IdentFileName, line_num))); \
> +        *err_msg = psprintf("missing entry at end of line"); \
>          return NULL; \
>      } \
> I think that we'd better add to err_msg the line number and the file
> name.  This would become particularly important once the facility to
> include files gets added.  We won't use IdentFileName for this
> purpose, but at least we would know which areas to change.  Also, even
> if the the view proposes line_number, there is an argument in favor of
> consistency here.

I don't really like it.  The file inclusion patch adds a file_name column in
both views so that you have a direct access to the information, whether the
line is in error or not.  Having the file name and line number in error message
doesn't add any value as it would be redundant, and just make the view output
bigger (on top of making testing more difficult).  I kept the err_msg as-is
(and fixed the ereport filename in the file inclusion patch that I indeed
missed).
>
> +select count(*) >= 0 as ok from pg_ident_file_mappings;
>
> I'd really like to see more tests for this stuff

I didn't like the various suggestions, as it would mean to scatter the tests
all over the place.  The whole point of those views is indeed to check the
current content of a file without applying the configuration change (not on
Windows or EXEC_BACKEND, but there's nothing we can do there), so let's use
this way.  I added a naive src/test/authentication/003_hba_ident_views.pl test
that validates that specific new valid and invalid lines in both files are
correctly reported.  Note that I didn't update those tests for the file
inclusion.

Note that those tests fail on Windows (and I'm assuming on EXEC_BACKEND
builds), as they're testing invalid files which by definition prevent any
further connection attempt.  I'm not sure what would be best to do here, apart
from bypassing the invalid config tests on such platforms.  I don't think that
validating that trying to connect on such platforms when an invalid
pg_hba/pg_ident file brings anything.

> +    a.pg_usernamee,
> [...]
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>pg_username</structfield> <type>text</type>
>
> Perhaps that's just a typo in the function output and you
> intended to use pg_username?

Yes that was a typo :)  It's correctly documented in catalogs.sgml, so I just
fixed pg_proc.dat and rules.out.

> +   /* Free parse_hba_line memory */
> +   MemoryContextSwitchTo(oldcxt);
> +   MemoryContextDelete(identcxt);
> Incorrect comment, this should be parse_ident_line.

Indeed.  I actually fixed it before but lost the change when rebasing after the
2nd hbafuncs.c refactoring.  I also fixed an incorrect comment about
pg_hba_file_mappings.

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

28 March 2022, 07:20:07

On Sun, Mar 27, 2022 at 05:52:22PM +0800, Julien Rouhaud wrote:
> I didn't like the various suggestions, as it would mean to scatter the tests
> all over the place.  The whole point of those views is indeed to check the
> current content of a file without applying the configuration change (not on
> Windows or EXEC_BACKEND, but there's nothing we can do there), so let's use
> this way.  I added a naive src/test/authentication/003_hba_ident_views.pl test
> that validates that specific new valid and invalid lines in both files are
> correctly reported.  Note that I didn't update those tests for the file
> inclusion.

I see.  The tests ought to be more extensive though, as hba.c checks
for multiple or missing fields for a varying number of expecting
parameters.  Here are the patterns that would cover most of the
failures of the backend.  For hba.conf:
+host
+host incorrect_db
+host incorrect_db incorrect_user
+host incorrect_db incorrect_user incorrect_host

For pg_ident.conf:
+# Error with incomplete lines.
+incorrect_map
+incorrect_map os_user
+# Errors with lines that have multiple values, for each field.
+incorrect_map_1,incorrect_map_2
+incorrect_map os_user_1,os_user_2
+incorrect_map os_user pg_role_1,pg_role_2

Then I was thinking about doing full scan of each view and could the
expected errors with some GROUP BY magic.  See the attached, for
reference, but it would fail with EXEC_BACKEND on WIN32.

> Note that those tests fail on Windows (and I'm assuming on EXEC_BACKEND
> builds), as they're testing invalid files which by definition prevent any
> further connection attempt.  I'm not sure what would be best to do here, apart
> from bypassing the invalid config tests on such platforms.  I don't think that
> validating that trying to connect on such platforms when an invalid
> pg_hba/pg_ident file brings anything.

Hmm, indeed.  I have been looking at that and that's annoying.  As you
mentioned off-list, in order to know if a build has an emulation of
fork() that would avoid failures with new connection attempts after
including some dummy entries in pg_hba.conf or pg_ident.conf, we'd
need to look at EXEC_BACKEND (well, mostly), as of:
- CFLAGS in pg_config, or a query on pg_config() (does not work with
MSVC as CFLAGS is not set there).
- A potential check on pg_config{_manual}.h.
- Perhaps an extra dependency on $windows_os, discarding incorrectly
cygwin that should be able to work.

We could use a failure path for each psql command rather than a SKIP
block, as you told me, if the psql fails and check that we get some
error strings related to the loading of auth files.  However, I am
scared of this design in the long-term as it could cause the tests to
pass with a failure triggered on platforms and/or configurations where
we should have a success.  So, I am tempted to drop the ball for now
with the TAP part.

The patch still has value for the end-user.  I have checked the
backend part, and I did not notice any obvious issue.  There is one
thing that I am wondering though: should we change the two queries in
sysviews.sql so as we check that there are zero errors in the two
views when the files are parsed?  This simple change would avoid
mistakes for users running installcheck on a production installation.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

28 March 2022, 07:22:32

On Mon, Mar 28, 2022 at 04:20:07PM +0900, Michael Paquier wrote:
> See the attached, for reference, but it would fail with EXEC_BACKEND
> on WIN32.

Ditto.
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

28 March 2022, 07:43:41

Hi,

On Mon, Mar 28, 2022 at 04:20:07PM +0900, Michael Paquier wrote:
>
> > Note that those tests fail on Windows (and I'm assuming on EXEC_BACKEND
> > builds), as they're testing invalid files which by definition prevent any
> > further connection attempt.  I'm not sure what would be best to do here, apart
> > from bypassing the invalid config tests on such platforms.  I don't think that
> > validating that trying to connect on such platforms when an invalid
> > pg_hba/pg_ident file brings anything.
>
> Hmm, indeed.  I have been looking at that and that's annoying.  As you
> mentioned off-list, in order to know if a build has an emulation of
> fork() that would avoid failures with new connection attempts after
> including some dummy entries in pg_hba.conf or pg_ident.conf, we'd
> need to look at EXEC_BACKEND (well, mostly), as of:
> - CFLAGS in pg_config, or a query on pg_config() (does not work with
> MSVC as CFLAGS is not set there).
> - A potential check on pg_config{_manual}.h.
> - Perhaps an extra dependency on $windows_os, discarding incorrectly
> cygwin that should be able to work.

Yeah, detecting !cygwin-Windows or EXEC_BACKEND in the TAP tests is quite hard.

> We could use a failure path for each psql command rather than a SKIP
> block, as you told me, if the psql fails and check that we get some
> error strings related to the loading of auth files.  However, I am
> scared of this design in the long-term as it could cause the tests to
> pass with a failure triggered on platforms and/or configurations where
> we should have a success.  So, I am tempted to drop the ball for now
> with the TAP part.

Ok.  We could still keep the tests for the valid lines part though?

> The patch still has value for the end-user.  I have checked the
> backend part, and I did not notice any obvious issue.  There is one
> thing that I am wondering though: should we change the two queries in
> sysviews.sql so as we check that there are zero errors in the two
> views when the files are parsed?  This simple change would avoid
> mistakes for users running installcheck on a production installation.

Do you mean something like

SELECT count(*) > 0 AS ok,
       count(*) FILTER (WHERE error IS NOT NULL) = 0 AS has_no_error
FROM pg_hba_file_rules ;

and similar for pg_ident_rule_mappings?

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

28 March 2022, 08:02:14

On Mon, Mar 28, 2022 at 03:43:41PM +0800, Julien Rouhaud wrote:
> On Mon, Mar 28, 2022 at 04:20:07PM +0900, Michael Paquier wrote:
>> We could use a failure path for each psql command rather than a SKIP
>> block, as you told me, if the psql fails and check that we get some
>> error strings related to the loading of auth files.  However, I am
>> scared of this design in the long-term as it could cause the tests to
>> pass with a failure triggered on platforms and/or configurations where
>> we should have a success.  So, I am tempted to drop the ball for now
>> with the TAP part.
>
> Ok.  We could still keep the tests for the valid lines part though?

With the SQLs modified as below, this part is less interesting.

>> The patch still has value for the end-user.  I have checked the
>> backend part, and I did not notice any obvious issue.  There is one
>> thing that I am wondering though: should we change the two queries in
>> sysviews.sql so as we check that there are zero errors in the two
>> views when the files are parsed?  This simple change would avoid
>> mistakes for users running installcheck on a production installation.
>
> Do you mean something like
>
> SELECT count(*) > 0 AS ok,
>        count(*) FILTER (WHERE error IS NOT NULL) = 0 AS has_no_error
> FROM pg_hba_file_rules ;
>
> and similar for pg_ident_rule_mappings?

Something like that, yes.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

28 March 2022, 08:33:30

On Mon, Mar 28, 2022 at 05:02:14PM +0900, Michael Paquier wrote:
> On Mon, Mar 28, 2022 at 03:43:41PM +0800, Julien Rouhaud wrote:
> >
> > Ok.  We could still keep the tests for the valid lines part though?
>
> With the SQLs modified as below, this part is less interesting.

Ok.

> > Do you mean something like
> >
> > SELECT count(*) > 0 AS ok,
> >        count(*) FILTER (WHERE error IS NOT NULL) = 0 AS has_no_error
> > FROM pg_hba_file_rules ;
> >
> > and similar for pg_ident_rule_mappings?
>
> Something like that, yes.

Ok, v5 attached without the TAP tests and updated sysviews tests.

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

29 March 2022, 01:21:36

On Mon, Mar 28, 2022 at 04:33:30PM +0800, Julien Rouhaud wrote:
> Ok, v5 attached without the TAP tests and updated sysviews tests.

The update of the query related to pg_hba_file_rules in the regression
tests was independant, so I have split and applied that first, as of
091a971.

Now, for the rest, I have found one place in the docs that had an
incorrect link, two incorrect comments (aka
s/pg_hba.conf/pg_ident.conf/), and the indentation was a bit off.
Anyway, all that was cosmetic, so applied after adjusting all those
things.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

29 March 2022, 08:14:54

Hi,

On Tue, Mar 29, 2022 at 10:21:36AM +0900, Michael Paquier wrote:
> On Mon, Mar 28, 2022 at 04:33:30PM +0800, Julien Rouhaud wrote:
> > Ok, v5 attached without the TAP tests and updated sysviews tests.
> 
> The update of the query related to pg_hba_file_rules in the regression
> tests was independant, so I have split and applied that first, as of
> 091a971.

Ok.

> Now, for the rest, I have found one place in the docs that had an
> incorrect link, two incorrect comments (aka
> s/pg_hba.conf/pg_ident.conf/), and the indentation was a bit off.
> Anyway, all that was cosmetic, so applied after adjusting all those
> things.

Thanks!

I'm attaching a rebase of the rest of the patchset.  Note that I also added
support for an "include_dir" directive, as it's also something that would be
quite helpful (and very welcome for my $work use case).

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

18 May 2022, 07:12:45

Hi,

On Tue, Mar 29, 2022 at 04:14:54PM +0800, Julien Rouhaud wrote:
> 
> I'm attaching a rebase of the rest of the patchset.  Note that I also added
> support for an "include_dir" directive, as it's also something that would be
> quite helpful (and very welcome for my $work use case).

The cfbot reports that the patch doesn't apply anymore, rebased v7 attached.

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

23 May 2022, 06:53:06

On Wed, May 18, 2022 at 03:12:45PM +0800, Julien Rouhaud wrote:
> The cfbot reports that the patch doesn't apply anymore, rebased v7 attached.

+       switch (kind)
+       {
+           case SecondaryAuthFile:
+               msglog = "could not open secondary authentication file \"@%s\" as \"%s\": %m";
+               msgview = "could not open secondary authentication file \"@%s\" as \"%s\": %s";
+               break;
+           case IncludedAuthFile:
+               msglog = "could not open included authentication file \"%s\" as \"%s\": %m";
+               msgview = "could not open included authentication file \"%s\" as \"%s\": %s";
+               break;
+           default:
+               elog(ERROR, "unknown HbaIncludeKind: %d", kind);
+               break;
+       }
I don't think that HbaIncludeKind is necessary, considering that we
could rely on the file name to provide enough context about the type
involved in the error string generated.  The "default" clause in the
switch could just be removed, btw, to generate warnings if a new value
is added in the kind enum.

+       /* relative path is relative to dir of calling file */
There are many relatives here.  It seems to me that this is the kind
of behavior we should document precisely (and test!).  I am
understanding the following for cascading configurations:
- pg_hba.conf has "include_dir path1".
- path1/ has file1.conf that has "include file2.conf".  This means
that file2.conf has to be also included in path1/.

postmaster.c and postinit.c do that:
    /*
     * Load configuration files for client authentication.
     */
    if (!load_hba())
    {
        /*
         * It makes no sense to continue if we fail to load the HBA file,
         * since there is no way to connect to the database in this case.
         */
        ereport(FATAL,
                (errmsg("could not load pg_hba.conf")));
    }
This could be confusing as a different file may fail to load, while
pg_hba.conf was able to do its work outside an include clause.

How does include_dir work with the ordering of the HBA entries across
multiple files?  The set of HBA rules are very sensitive to their
ordering, and ReadDir() depends on readdir(), which provides a list
of files depending on its FS implementation.  That does not seem like
a sane concept to me.  I can this this order (tracked by rule_number)
being strictly enforced when it comes to the loading of files, though,
so I would recommend to focus on the implementation of "include"
rather than "include_dir".

+      <para>
+       Rule number, in priority order, of this rule if the rule is valid,
+       otherwise null
+      </para></entry>
This is a very important field.  I think that this explanation should
be extended and explain why relying on this number counts (aka this is
the order of the rules loaded across files).  Btw, this could be added
as a separate patch, even if this maps to the line number when it
comes to the ordering.

+# include       FILE
+# include_dir   FILE
You mean s/FILE/DIRECTORY/ for include_dir, I guess?

+                   /*
+                    * Only parse files with names ending in ".conf".
+                    * Explicitly reject files starting with ".". This
+                    * excludes things like "." and "..", as well as typical
+                    * hidden files, backup files, and editor debris.
+                    */
I don't think that there is any need to restrict that to files ending
with .conf.  We don't do that for postgresql.conf's include, for one.

In 0002, pg_hba_matches() had better have some documentation,
explaining for which purpose this function can be used with a short
example (aka for an address and a role, find the matching set of HBA
rules and report their line and file)?

I am not sure to be a huge fan of this implementation, actually.  The
function is shaped so as the user provides in input the arguments to
fill hbaPort with, passing it down to check_hba().  This could bite
easily in the future if some of the internal fields filled in by the
HBA load and used by the HBA check change over time, particularly if
this stuff has no tests to provide some validation, though we
discussed that a couple of months ago.  Perhaps we should think
harder on this point.

+       char        tmp[sizeof("xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:255.255.255.255/128")];
Okay.  This is the same way of doing things as network.c or
inet_cidr_ntop.c.  Shouldn't we centralize the definition of such a
maximum size instead?

+   if (!load_hba())
+       ereport(ERROR,
+               (errcode(ERRCODE_CONFIG_FILE_ERROR),
+                errmsg("Invalidation auth configuration file")));
This error message sounds wrong to me.  It would be more consistent to
write that as "could not load authentication file" or such.
postinit.c and postmaster.c do that (these error strings become
partially confusing with the possibility to include extra auth files,
actually, on a separate note).

+   if (PG_ARGISNULL(1))
+       ereport(ERROR,
+               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("parameter role is mandatory")));
+   port->user_name = text_to_cstring(PG_GETARG_TEXT_PP(1));
This could be moved at the beginning of the function, before
processing the first argument of the function (address) in details.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

23 May 2022, 11:43:08

Hi,

On Mon, May 23, 2022 at 03:53:06PM +0900, Michael Paquier wrote:
>
> +       switch (kind)
> +       {
> +           case SecondaryAuthFile:
> +               msglog = "could not open secondary authentication file \"@%s\" as \"%s\": %m";
> +               msgview = "could not open secondary authentication file \"@%s\" as \"%s\": %s";
> +               break;
> +           case IncludedAuthFile:
> +               msglog = "could not open included authentication file \"%s\" as \"%s\": %m";
> +               msgview = "could not open included authentication file \"%s\" as \"%s\": %s";
> +               break;
> +           default:
> +               elog(ERROR, "unknown HbaIncludeKind: %d", kind);
> +               break;
> +       }
> I don't think that HbaIncludeKind is necessary, considering that we
> could rely on the file name to provide enough context about the type
> involved in the error string generated.

I'm not sure it would always be the case.  For instance, why wouldn't someone
use something like

peer all @app_users ident

rather than

@include app_users

or a mix of both?

That being said, I'd gladly drop that enum and only handle a single error
message, as the rest of the error context (including the owning file name and
line) should provide enough information to users.

If so, should I use "included authentication file" everywhere, or use something
else?

> The "default" clause in the
> switch could just be removed, btw, to generate warnings if a new value
> is added in the kind enum.

Right.

> +       /* relative path is relative to dir of calling file */
> There are many relatives here.  It seems to me that this is the kind
> of behavior we should document precisely

Agreed.  I will try to mimic how it's done for
https://www.postgresql.org/docs/current/config-setting.html#CONFIG-INCLUDES.

> (and test!).

Sure.  As I said previously I'm waiting for some agreement on the feature (and
the syntax), before adding tests.  After that I will definitely include full
test coverage!

> I am
> understanding the following for cascading configurations:
> - pg_hba.conf has "include_dir path1".
> - path1/ has file1.conf that has "include file2.conf".  This means
> that file2.conf has to be also included in path1/.

Right, same rules as for postgresql.conf's include_dir.
>
> postmaster.c and postinit.c do that:
>     /*
>      * Load configuration files for client authentication.
>      */
>     if (!load_hba())
>     {
>         /*
>          * It makes no sense to continue if we fail to load the HBA file,
>          * since there is no way to connect to the database in this case.
>          */
>         ereport(FATAL,
>                 (errmsg("could not load pg_hba.conf")));
>     }
> This could be confusing as a different file may fail to load, while
> pg_hba.conf was able to do its work outside an include clause.

Well, even if included they are still part of the configuration and lead to a
failure to load the main file.

The detail should be provided in the logs so it should disambiguate the
situation.  While you're mentioning this message, AFAICS it can already be
entirely wrong as-is, as it doesn't take into account the hba_file
configuration:

ALTER SYSTEM SET hba_file = '/tmp/myfile';

After a restart:
2022-05-23 18:52:08.081 CST [41890] LOG:  could not open configuration file
"/tmp/myfile": No such file or directory
2022-05-23 18:52:08.081 CST [41890] FATAL:  could not load pg_hba.conf

> How does include_dir work with the ordering of the HBA entries across
> multiple files?

Bugs apart, same as for postgresql.conf's include_dir.

> The set of HBA rules are very sensitive to their
> ordering, and ReadDir() depends on readdir(), which provides a list
> of files depending on its FS implementation.  That does not seem like
> a sane concept to me.  I can this this order (tracked by rule_number)
> being strictly enforced when it comes to the loading of files, though,
> so I would recommend to focus on the implementation of "include"
> rather than "include_dir".

But the same problem already exists for the postgresql.conf's include_dir.

Having an hba rule masking another isn't really better than having a GUC
value overloaded by another one.  Usually people rely on a sane naming (like
01_file.conf and so on) to get a predictable behavior, and we even document
that very thoroughly for the postgresql.conf part.  Improving the
documentation, as you already pointed out a bit above, should be enough?

> +      <para>
> +       Rule number, in priority order, of this rule if the rule is valid,
> +       otherwise null
> +      </para></entry>
> This is a very important field.  I think that this explanation should
> be extended and explain why relying on this number counts (aka this is
> the order of the rules loaded across files).  Btw, this could be added
> as a separate patch, even if this maps to the line number when it
> comes to the ordering.

Agreed, I will improve the documentation to outline the importance of that
information.

Do you mean a separate patch to ease review or to eventually commit both
separately?  FWIW I don't think that adding any form of inclusion in the hba
files should be done without a way for users to check the results.  Any test
facility would also probably rely on this field.
>
> +# include       FILE
> +# include_dir   FILE
> You mean s/FILE/DIRECTORY/ for include_dir, I guess?

Oops yes.
>
> +                   /*
> +                    * Only parse files with names ending in ".conf".
> +                    * Explicitly reject files starting with ".". This
> +                    * excludes things like "." and "..", as well as typical
> +                    * hidden files, backup files, and editor debris.
> +                    */
> I don't think that there is any need to restrict that to files ending
> with .conf.  We don't do that for postgresql.conf's include, for one.

I'm confused.  Are you talking about postgresql.conf's include or include_dir
option?  I'm pretty sure that I borrowed this logic from
src/backend/utils/misc/guc-file.l / ParseConfigDirectory(), which implements
the include_dir logic for the postgresql.conf file.

> In 0002, pg_hba_matches() had better have some documentation,
> explaining for which purpose this function can be used with a short
> example (aka for an address and a role, find the matching set of HBA
> rules and report their line and file)?
>
> I am not sure to be a huge fan of this implementation, actually.  The
> function is shaped so as the user provides in input the arguments to
> fill hbaPort with, passing it down to check_hba().  This could bite
> easily in the future if some of the internal fields filled in by the
> HBA load and used by the HBA check change over time, particularly if
> this stuff has no tests to provide some validation, though we
> discussed that a couple of months ago.  Perhaps we should think
> harder on this point.

First, it's for now only a POC to demonstrate a way to diagnose problems in
those files (thus no doc and obviously no tests).  I added some details in my
initial email:

> Finally I also added 0003, which is a POC for a new pg_hba_matches()
> function, that can help DBA to understand why their configuration isn't
> working as they expect.  This only to start the discussion on that topic, the
> code is for now really hackish, as I don't know how much this is wanted
> and/or if some other behavior would be better, and there's also no
> documentation or test.  The function for now only takes an optional inet
> (null means unix socket), the target role and an optional ssl flag and
> returns the file, line and raw line matching if any, or null.  For instance:
>
> =# select * from pg_hba_matches('127.0.0.1'::inet, 'postgres');
> -[ RECORD 1  ]-----------------------------------------------------------------
> file_name | /tmp/pgbuild/toto.conf
> line_num  | 5
> raw_line  | host    all            all            127.0.0.1/32           trust

I will put more details and this example in the commit message if you want, but
if no one is interested in that feature I'm ok with discarding it.

I can of course change the implementation, but unless I'm missing something
users will always have to provide all the relevant info to check the behavior
for a specific connection origin?

> +       char        tmp[sizeof("xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:255.255.255.255/128")];
> Okay.  This is the same way of doing things as network.c or
> inet_cidr_ntop.c.  Shouldn't we centralize the definition of such a
> maximum size instead?
>
> +   if (!load_hba())
> +       ereport(ERROR,
> +               (errcode(ERRCODE_CONFIG_FILE_ERROR),
> +                errmsg("Invalidation auth configuration file")));
> This error message sounds wrong to me.  It would be more consistent to
> write that as "could not load authentication file" or such.
> postinit.c and postmaster.c do that (these error strings become
> partially confusing with the possibility to include extra auth files,
> actually, on a separate note).

Oops, I meant "Invalid".  I'll Cleanup the error messages and refactor stuff if
this goes beyond the POC stage.

> +   if (PG_ARGISNULL(1))
> +       ereport(ERROR,
> +               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                errmsg("parameter role is mandatory")));
> +   port->user_name = text_to_cstring(PG_GETARG_TEXT_PP(1));
> This could be moved at the beginning of the function, before
> processing the first argument of the function (address) in details.

I thought it'd be better to handle each argument sequentially for readability,
but I could move all the sanity checks first if you prefer.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

24 May 2022, 01:44:05

On Mon, May 23, 2022 at 07:43:08PM +0800, Julien Rouhaud wrote:
> That being said, I'd gladly drop that enum and only handle a single error
> message, as the rest of the error context (including the owning file name and
> line) should provide enough information to users.
>
> If so, should I use "included authentication file" everywhere, or use something
> else?

With the file name provided the context, "authentication file"?

> The detail should be provided in the logs so it should disambiguate the
> situation.  While you're mentioning this message, AFAICS it can already be
> entirely wrong as-is, as it doesn't take into account the hba_file
> configuration:
>
> ALTER SYSTEM SET hba_file = '/tmp/myfile';

Hmm, indeed.  And we load the GUCs before the HBA rules, obviously.
This could be made less confusing.  Do you think that this can be
improved independently of the main patch, building the include
structure on top of it?

>> The set of HBA rules are very sensitive to their
>> ordering, and ReadDir() depends on readdir(), which provides a list
>> of files depending on its FS implementation.  That does not seem like
>> a sane concept to me.  I can this this order (tracked by rule_number)
>> being strictly enforced when it comes to the loading of files, though,
>> so I would recommend to focus on the implementation of "include"
>> rather than "include_dir".
>
> But the same problem already exists for the postgresql.conf's include_dir.
>
> Having an hba rule masking another isn't really better than having a GUC
> value overloaded by another one.  Usually people rely on a sane naming (like
> 01_file.conf and so on) to get a predictable behavior, and we even document
> that very thoroughly for the postgresql.conf part.  Improving the
> documentation, as you already pointed out a bit above, should be enough?

Ah.  ParseConfigDirectory() does a qsort() before processing each
file included in a folder.  Unless I am missing something, your patch
reads the entries in the directory, but does not sort the files by
name to ensure a strict ordering of them.

While on it, it looks like you should have sanity checks similar to
ParseConfigDirectory() for CRLFs & friends, as of:
if (strspn(includedir, " \t\r\n") == strlen(includedir))

All this logic is very similar, so this could be perhaps refactored.

>> +      <para>
>> +       Rule number, in priority order, of this rule if the rule is valid,
>> +       otherwise null
>> +      </para></entry>
>> This is a very important field.  I think that this explanation should
>> be extended and explain why relying on this number counts (aka this is
>> the order of the rules loaded across files).  Btw, this could be added
>> as a separate patch, even if this maps to the line number when it
>> comes to the ordering.
>
> Agreed, I will improve the documentation to outline the importance of that
> information.
>
> Do you mean a separate patch to ease review or to eventually commit both
> separately?  FWIW I don't think that adding any form of inclusion in the hba
> files should be done without a way for users to check the results.  Any test
> facility would also probably rely on this field.

Yeah, agreed that rule_number is important for the sake of the
inclusions.  Still, I was wondering if rule_number makes sense on its
own, meaning that we could add it before working on the inclusion
logic.  Having it without the inclusion is less interesting if you
have the line numbers to order the HBA rules, of course, as this is
enough to guess the priority of the HBA entries.

>> +                   /*
>> +                    * Only parse files with names ending in ".conf".
>> +                    * Explicitly reject files starting with ".". This
>> +                    * excludes things like "." and "..", as well as typical
>> +                    * hidden files, backup files, and editor debris.
>> +                    */
>> I don't think that there is any need to restrict that to files ending
>> with .conf.  We don't do that for postgresql.conf's include, for one.
>
> I'm confused.  Are you talking about postgresql.conf's include or include_dir
> option?  I'm pretty sure that I borrowed this logic from
> src/backend/utils/misc/guc-file.l / ParseConfigDirectory(), which implements
> the include_dir logic for the postgresql.conf file.

Ah, ParseConfigDirectory() does that for the sake of avoiding any
debris files.  The reasoning (2a0c81a, aka
https://www.postgresql.org/message-id/4EC341E1.1060908%402ndQuadrant.com)
relates to debris files like the ones created by editors and such.  So
you are right to do so.

>> Finally I also added 0003, which is a POC for a new pg_hba_matches()
>> function, that can help DBA to understand why their configuration isn't
>> working as they expect.  This only to start the discussion on that topic, the
>> code is for now really hackish, as I don't know how much this is wanted
>> and/or if some other behavior would be better, and there's also no
>> documentation or test.  The function for now only takes an optional inet
>> (null means unix socket), the target role and an optional ssl flag and
>> returns the file, line and raw line matching if any, or null.  For instance:
>>
>> =# select * from pg_hba_matches('127.0.0.1'::inet, 'postgres');
>> -[ RECORD 1  ]-----------------------------------------------------------------
>> file_name | /tmp/pgbuild/toto.conf
>> line_num  | 5
>> raw_line  | host    all            all            127.0.0.1/32           trust
>
> I will put more details and this example in the commit message if you want, but
> if no one is interested in that feature I'm ok with discarding it.

Oh, OK, I was not really following, then.  I'd be fine to revisit this
part if necessary.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

26 May 2022, 07:26:57

On Tue, May 24, 2022 at 10:44:05AM +0900, Michael Paquier wrote:
> On Mon, May 23, 2022 at 07:43:08PM +0800, Julien Rouhaud wrote:
> > That being said, I'd gladly drop that enum and only handle a single error
> > message, as the rest of the error context (including the owning file name and
> > line) should provide enough information to users.
> > 
> > If so, should I use "included authentication file" everywhere, or use something
> > else?
> 
> With the file name provided the context, "authentication file"?

So you mean having an error message like that (having an "include myconf"
in the HBA file):

LOG:  could not open authentication file "myconf" as "/path/to/myconf": No such file or directory
LOG:  pg_hba.conf was not reloaded

This would be somewhat consistent with how it's done for the postgresql.conf,
assuming that we do fix that hardcoded "pg_hba.conf":

LOG:  could not open configuration file "/path/to/toto": No such file or directory
LOG:  configuration file "/path/to/postgresql.conf" contains errors; no changes were applied

With this message it's clear that the file that couldn't be opened is an
included file.

However those two cases are slightly different, and technically the "secondary
file" is not an authentication file, just a list of tokens, so I'm still a bit
worried about ambiguity here.

> > The detail should be provided in the logs so it should disambiguate the
> > situation.  While you're mentioning this message, AFAICS it can already be
> > entirely wrong as-is, as it doesn't take into account the hba_file
> > configuration:
> > 
> > ALTER SYSTEM SET hba_file = '/tmp/myfile';
> 
> Hmm, indeed.  And we load the GUCs before the HBA rules, obviously.
> This could be made less confusing.  Do you think that this can be
> improved independently of the main patch, building the include
> structure on top of it?

Sure, I can split that in another commit.

After a bit more digging, I think that this comes from the fact that there's no
"official" name for this file.  Even the documentation just says "the
pg_hba.conf file" [1].  So using pg_hba.conf can either means explicitly
$PGDATA/pg_hba.conf or the instance's HBA file in general, whatever its
location.

I think it would be good to improve this, including in the doc, but I'm
assuming it's entirely for HEAD only, including the error messages?

If so, should I also change the doc to replace "pg_hba.conf" with something
else when it's not referring to the file default name?

I'm thinking of using "HBA file" to replace pg_hba.conf, and using
"authentication file" when it can be either the "HBA file" and the "User Name
Maps file", would that be ok?

> > But the same problem already exists for the postgresql.conf's include_dir.
> > 
> > Having an hba rule masking another isn't really better than having a GUC
> > value overloaded by another one.  Usually people rely on a sane naming (like
> > 01_file.conf and so on) to get a predictable behavior, and we even document
> > that very thoroughly for the postgresql.conf part.  Improving the
> > documentation, as you already pointed out a bit above, should be enough?
> 
> Ah.  ParseConfigDirectory() does a qsort() before processing each
> file included in a folder.  Unless I am missing something, your patch
> reads the entries in the directory, but does not sort the files by
> name to ensure a strict ordering of them.
> 
> While on it, it looks like you should have sanity checks similar to
> ParseConfigDirectory() for CRLFs & friends, as of:
> if (strspn(includedir, " \t\r\n") == strlen(includedir))
> 
> All this logic is very similar, so this could be perhaps refactored.

Indeed, I will move that in a common function to be consistent.

> >> +      <para>
> >> +       Rule number, in priority order, of this rule if the rule is valid,
> >> +       otherwise null
> >> +      </para></entry>
> >> This is a very important field.  I think that this explanation should
> >> be extended and explain why relying on this number counts (aka this is
> >> the order of the rules loaded across files).  Btw, this could be added
> >> as a separate patch, even if this maps to the line number when it
> >> comes to the ordering.
> > 
> > Agreed, I will improve the documentation to outline the importance of that
> > information.
> > 
> > Do you mean a separate patch to ease review or to eventually commit both
> > separately?  FWIW I don't think that adding any form of inclusion in the hba
> > files should be done without a way for users to check the results.  Any test
> > facility would also probably rely on this field.
> 
> Yeah, agreed that rule_number is important for the sake of the
> inclusions.  Still, I was wondering if rule_number makes sense on its
> own, meaning that we could add it before working on the inclusion
> logic.  Having it without the inclusion is less interesting if you
> have the line numbers to order the HBA rules, of course, as this is
> enough to guess the priority of the HBA entries.

Ah I see.  Sure I could split it in another commit too, but this isn't
backpatchable (and pg15 isn't branched yet anyway) so I'm not sure how useful
that is.  I will go head and do that anyway, it will be easy to merge the
commits if needed.


[1] https://www.postgresql.org/docs/current/auth-pg-hba-conf.html

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

02 June 2022, 01:08:15

On Thu, May 26, 2022 at 03:26:57PM +0800, Julien Rouhaud wrote:
> So you mean having an error message like that (having an "include myconf"
> in the HBA file):
>
> LOG:  could not open authentication file "myconf" as "/path/to/myconf": No such file or directory
> LOG:  pg_hba.conf was not reloaded
>
> This would be somewhat consistent with how it's done for the postgresql.conf,
> assuming that we do fix that hardcoded "pg_hba.conf":
>
> LOG:  could not open configuration file "/path/to/toto": No such file or directory
> LOG:  configuration file "/path/to/postgresql.conf" contains errors; no changes were applied
>
> With this message it's clear that the file that couldn't be opened is an
> included file.

Yes, consistency would be good here.

> However those two cases are slightly different, and technically the "secondary
> file" is not an authentication file, just a list of tokens, so I'm still a bit
> worried about ambiguity here.

Hmm.  Yes, using only this term could lead to some confusion.
Thinking more about that, perhaps you are right to use fully separated
terms for the ident and HBA parts, particularly once the inclusion is
in place as the file names could be anything, so we would lose context
about where a file for used for what.

> After a bit more digging, I think that this comes from the fact that there's no
> "official" name for this file.  Even the documentation just says "the
> pg_hba.conf file" [1].  So using pg_hba.conf can either means explicitly
> $PGDATA/pg_hba.conf or the instance's HBA file in general, whatever its
> location.
>
> I think it would be good to improve this, including in the doc, but I'm
> assuming it's entirely for HEAD only, including the error messages?

Yes, that would be a set of changes only for HEAD, once 16~ opens for
business.  FWIW, the acronym "HBA" is defined as "Host-Based
Authentication", so we could use that as a base for the description of
the file, using simply HBA in the follow-up paragraphs for simplicity,
telling that pg_hba.conf is the default.

> If so, should I also change the doc to replace "pg_hba.conf" with something
> else when it's not referring to the file default name?
>
> I'm thinking of using "HBA file" to replace pg_hba.conf, and using
> "authentication file" when it can be either the "HBA file" and the "User Name
> Maps file", would that be ok?

Using "HBA file" in the docs is fine by me, knowing that the acronym
is already defined.  The modified parts of the docs should perhaps
mention once something like "Host-Based Authentication file (or HBA
file)" for clarity.  For the error message, I think that we tend to
avoid those acronyms, don't we?

> Ah I see.  Sure I could split it in another commit too, but this isn't
> backpatchable (and pg15 isn't branched yet anyway) so I'm not sure how useful
> that is.  I will go head and do that anyway, it will be easy to merge the
> commits if needed.

The rule_number shines with the inclusion logic in place, still
splitting that into a separate commit makes the review and the commit
history cleaner IMO.  Perhaps that's just me being overly-pedantic in
keeping a clean commit history :)
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

08 July 2022, 06:57:21

Hi,

On Thu, Jun 02, 2022 at 10:08:15AM +0900, Michael Paquier wrote:
> On Thu, May 26, 2022 at 03:26:57PM +0800, Julien Rouhaud wrote:
>
> > After a bit more digging, I think that this comes from the fact that there's no
> > "official" name for this file.  Even the documentation just says "the
> > pg_hba.conf file" [1].  So using pg_hba.conf can either means explicitly
> > $PGDATA/pg_hba.conf or the instance's HBA file in general, whatever its
> > location.
> >
> > I think it would be good to improve this, including in the doc, but I'm
> > assuming it's entirely for HEAD only, including the error messages?
>
> Yes, that would be a set of changes only for HEAD, once 16~ opens for
> business.  FWIW, the acronym "HBA" is defined as "Host-Based
> Authentication", so we could use that as a base for the description of
> the file, using simply HBA in the follow-up paragraphs for simplicity,
> telling that pg_hba.conf is the default.

Ok.

> > If so, should I also change the doc to replace "pg_hba.conf" with something
> > else when it's not referring to the file default name?
> >
> > I'm thinking of using "HBA file" to replace pg_hba.conf, and using
> > "authentication file" when it can be either the "HBA file" and the "User Name
> > Maps file", would that be ok?
>
> Using "HBA file" in the docs is fine by me, knowing that the acronym
> is already defined.  The modified parts of the docs should perhaps
> mention once something like "Host-Based Authentication file (or HBA
> file)" for clarity.  For the error message, I think that we tend to
> avoid those acronyms, don't we?

I don't have an extensive knowledge of all the user facing error messages, but
after a quick grep I see multiple usage of OID, PID, GIN and other defined
acronyms.  I also see multiple occurrences of "only heap AM is supported",
while AM isn't even a defined acronym.

It doesn't seem that my proposal would be inconsistent with existing messages
and will help to reduce the message length, but if you prefer to keep the full
name I'm fine with it.  Those should be very rare and specialized errors
anyway.

While on the bikeshedding part, are you ok with the proposed keywords (include
and include_dir), behaving exactly like for postgresql.conf, and to also add
include_if_exists, so that we have the exact same possibilities with
postgresql.conf, pg_hba.conf and pg_ident.conf?

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

11 July 2022, 01:16:44

On Fri, Jul 08, 2022 at 02:57:21PM +0800, Julien Rouhaud wrote:

My apologies for the late reply.

> I don't have an extensive knowledge of all the user facing error messages, but
> after a quick grep I see multiple usage of OID, PID, GIN and other defined
> acronyms.  I also see multiple occurrences of "only heap AM is supported",
> while AM isn't even a defined acronym.

A lot depends on the context of the code, it seems.

> It doesn't seem that my proposal would be inconsistent with existing messages
> and will help to reduce the message length, but if you prefer to keep the full
> name I'm fine with it.  Those should be very rare and specialized errors
> anyway.

So you mean to use "HBA file" instead of pg_hba.conf and
"authentication file" when it can be either one of an HBA file or a
mapping file?  That would be okay by me.  We would have a full cycle
to tune them depending on the feedback we'd get afterwards.

> While on the bikeshedding part, are you ok with the proposed keywords (include
> and include_dir), behaving exactly like for postgresql.conf, and to also add
> include_if_exists, so that we have the exact same possibilities with
> postgresql.conf, pg_hba.conf and pg_ident.conf?

Okay, agreed for consistency.  With include_dir being careful about
the ordering of the entries and ignoring anything else than a .conf
file (that's something you mentioned already upthread).
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

18 July 2022, 07:11:51

Hi,

On Mon, Jul 11, 2022 at 10:16:44AM +0900, Michael Paquier wrote:
> On Fri, Jul 08, 2022 at 02:57:21PM +0800, Julien Rouhaud wrote:
> 
> My apologies for the late reply.
> 
> > I don't have an extensive knowledge of all the user facing error messages, but
> > after a quick grep I see multiple usage of OID, PID, GIN and other defined
> > acronyms.  I also see multiple occurrences of "only heap AM is supported",
> > while AM isn't even a defined acronym.
> 
> A lot depends on the context of the code, it seems.
> 
> > It doesn't seem that my proposal would be inconsistent with existing messages
> > and will help to reduce the message length, but if you prefer to keep the full
> > name I'm fine with it.  Those should be very rare and specialized errors
> > anyway.
> 
> So you mean to use "HBA file" instead of pg_hba.conf and
> "authentication file" when it can be either one of an HBA file or a
> mapping file?  That would be okay by me.

Yes, it seems to me like a good compromise for not being overly verbose and
still being understandable.

> We would have a full cycle
> to tune them depending on the feedback we'd get afterwards.

Agreed.

> > While on the bikeshedding part, are you ok with the proposed keywords (include
> > and include_dir), behaving exactly like for postgresql.conf, and to also add
> > include_if_exists, so that we have the exact same possibilities with
> > postgresql.conf, pg_hba.conf and pg_ident.conf?
> 
> Okay, agreed for consistency.  With include_dir being careful about
> the ordering of the entries and ignoring anything else than a .conf
> file (that's something you mentioned already upthread).

Ok!  All those that should be covered by new regression test so it should be
clear what is being implemented.

While on the regression tests topic, I started to implement those and faced
some problems quite fast when trying to workaround the problem we previously
discussed (1).

So first, even if we can test 99% of the features with just testing the views
output, I think it's should use the TAP framework since the tests will have to
mess with the pg_ident/pg_hba files.  It's way easier to modify the auth files,
and it uses a dedicated instance so we don't have to worry about breaking other
test that would run concurrently.

Also, if we want to test the views error reporting we have to use a persistent
connection (as in interactive_psql()), otherwise tests will immediately fail on
Windows / EXEC_BACKEND builds.  Adding the ability to run queries and wait for
completion on top of interactive_psql() doesn't seem to cause any problem, but
interpreting the results does.

Since it's just manipulating the psql's stdin/stdout, we retrieve the prompt
and executed query too.  So for instance, if you just want to test "SELECT 1",
you will have to cleanup something like

dbname=# SELECT 1;
1
dbname#

That may still be workable by splitting the output per newline (and possibly
removing the first prompt before sending the query text), and remove the first
and last entry (assuming you want to test somewhat sane data, and not e.g. run
the regression test on a database containing a newline), but then you have to
also account for possible escape sequences, for instance if you use
enable-bracketed-paste.  In that case, the output becomes something like

dbname=# SELECT 1;
[?2004l
1
[?2004hpostgres=#

It could probably be handled with some regexp to remove escape sequences and
remove empty lines, but it seems like really fragile, and thus a very bad idea.

I'm not really sure what should be done here.  The best compromise I can think
of is to split the tests in 3 parts:

1) view reporting with various inclusions using safe_psql()
2) log error reporting
3) view reporting with various inclusions errors, using safe_psql()

And when testing 3), detect first if we can still connect after introducing
errors.  If not, assume this is Windows / EXEC_BACKEND and give up here without
reporting an error.  Otherwise continue, and fail the test if we later can't
connect anymore.  As discussed previously, detecting that the build is using
the fork emulation code path doesn't seem worthwhile so guessing it from the
psql error may be a better approach.

Do you have any better idea, or do you have comments on this approach?

[1]: https://www.postgresql.org/message-id/YkFhpydhyeNNo3Xl@paquier.xyz

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

19 July 2022, 06:13:12

On Mon, Jul 18, 2022 at 03:11:51PM +0800, Julien Rouhaud wrote:
> So first, even if we can test 99% of the features with just testing the views
> output, I think it's should use the TAP framework since the tests will have to
> mess with the pg_ident/pg_hba files.  It's way easier to modify the auth files,
> and it uses a dedicated instance so we don't have to worry about breaking other
> test that would run concurrently.

Agreed.

> That may still be workable by splitting the output per newline (and possibly
> removing the first prompt before sending the query text), and remove the first
> and last entry (assuming you want to test somewhat sane data, and not e.g. run
> the regression test on a database containing a newline), but then you have to
> also account for possible escape sequences, for instance if you use
> enable-bracketed-paste.  In that case, the output becomes something like
>
> dbname=# SELECT 1;
> [?2004l
> 1
> [?2004hpostgres=#
>
> It could probably be handled with some regexp to remove escape sequences and
> remove empty lines, but it seems like really fragile, and thus a very bad idea.

Hmm.  Indeed, that sounds fragile.  And I don't really think that we
should make more testing infrastructure a requirement for this patch
as being able to maintain a connection while running the tests is
something we've bumped on for a bit of time.

> I'm not really sure what should be done here.  The best compromise I can think
> of is to split the tests in 3 parts:
>
> 1) view reporting with various inclusions using safe_psql()

You mean in the case where the HBA and indent files can be loaded,
so as it is possible to peek at the system views without the
EXEC_BACKEND problem, right?

> 2) log error reporting

This one should be reliable and stable enough by parsing the logs of
the backend, thanks to connect_ok() and connect_fails().

> 3) view reporting with various inclusions errors, using safe_psql()
>
> And when testing 3), detect first if we can still connect after introducing
> errors.  If not, assume this is Windows / EXEC_BACKEND and give up here without
> reporting an error.  Otherwise continue, and fail the test if we later can't
> connect anymore.  As discussed previously, detecting that the build is using
> the fork emulation code path doesn't seem worthwhile so guessing it from the
> psql error may be a better approach.

Yeah, we could do that.  Now we may also fail on other patterns, so we
would need to make sure that a set of expected error outputs are the
ones generated?  I'd be fine to give up testing the error output
generated in the system views at the end.  Without a persistent
connection state with the same kind of APIs as any of the drivers able
to do so, that's going to be a PITA to maintain.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

26 July 2022, 05:04:02

Hi,

On Mon, Mar 28, 2022 at 04:22:32PM +0900, Michael Paquier wrote:
> On Mon, Mar 28, 2022 at 04:20:07PM +0900, Michael Paquier wrote:
> > See the attached, for reference, but it would fail with EXEC_BACKEND
> > on WIN32.
> 
> Ditto.

While working on the full regression test coverage for the file inclusion
thing, I discovered an embarrassing typo in the pg_ident_file_mapping
infrastructure, which was using the hba file name rather than the ident file
name in one of the calls.

It doesn't have much impact most of the time.  The filename is reported if
there's an IO error while reading the already opened correct file.  The real
problem is if the hba_file and ident_file are stored in different directory,
any secondary file (@filename) in the pg_ident.conf would be searched in the
wrong directory.  With the pending file inclusion patchset, the problem is
immediately visible as the view is reporting the wrong file name.

Simple fix attached.  I'll add a v15 open item shortly.

Attachment

v1-0001-Fix-fill_ident_view-incorrect-usage-of-HbaFileNam.patch

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

26 July 2022, 05:14:48

On Tue, Jul 26, 2022 at 01:04:02PM +0800, Julien Rouhaud wrote:
> It doesn't have much impact most of the time.  The filename is reported if
> there's an IO error while reading the already opened correct file.  The real
> problem is if the hba_file and ident_file are stored in different directory,
> any secondary file (@filename) in the pg_ident.conf would be searched in the
> wrong directory.  With the pending file inclusion patchset, the problem is
> immediately visible as the view is reporting the wrong file name.

Oops, obviously.  I'll go and fix that as that's on me.

> Simple fix attached.  I'll add a v15 open item shortly.

Thanks.  Better not to lose track of it.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

30 July 2022, 08:09:36

Hi,

On Tue, Jul 19, 2022 at 03:13:12PM +0900, Michael Paquier wrote:
> On Mon, Jul 18, 2022 at 03:11:51PM +0800, Julien Rouhaud wrote:
>
> > I'm not really sure what should be done here.  The best compromise I can think
> > of is to split the tests in 3 parts:
> >
> > 1) view reporting with various inclusions using safe_psql()
>
> You mean in the case where the HBA and indent files can be loaded,
> so as it is possible to peek at the system views without the
> EXEC_BACKEND problem, right?

Yes, testing the general behavior when there are no errors in the auth files.

> > 2) log error reporting
>
> This one should be reliable and stable enough by parsing the logs of
> the backend, thanks to connect_ok() and connect_fails().

I meant testing the postgres logs reporting, something like generating entirely
bogus files, restarting, checking that the start failed and verify that the
logs are expected.  With a variation that if only the pg_ident.conf is bogus,
the server will (and should) still start, so both needs to be tested
separately.

> > 3) view reporting with various inclusions errors, using safe_psql()
> >
> > And when testing 3), detect first if we can still connect after introducing
> > errors.  If not, assume this is Windows / EXEC_BACKEND and give up here without
> > reporting an error.  Otherwise continue, and fail the test if we later can't
> > connect anymore.  As discussed previously, detecting that the build is using
> > the fork emulation code path doesn't seem worthwhile so guessing it from the
> > psql error may be a better approach.
>
> Yeah, we could do that.  Now we may also fail on other patterns, so we
> would need to make sure that a set of expected error outputs are the
> ones generated?  I'd be fine to give up testing the error output
> generated in the system views at the end.  Without a persistent
> connection state with the same kind of APIs as any of the drivers able
> to do so, that's going to be a PITA to maintain.

Yes, or just checking that the error is expected on the psql side as the server
will report it.

I've been working on all of that and came up with the attached v8.

- 0001: I modified things as discussed previously to report the real auth file
  names rather than the hardcoded "pg_ident.conf" and "pg_hba.conf" in the
  various log messages
- 0002 and 0003 are minor fixes to error logs more consistent
- 0004: the rule_number / mapping_number addition in the views in a separate
  commit
- 0005: the main file inclusion patch.  Only a few minor bugfix since
  previous version discovered thanks to the tests (a bit more about it after),
  and documentation tweaks based on previous discussions
- 0006: the pg_hba_matches() POC, no changes

About the regression tests:
I added a new 003_file_inclusion.pl in src/test/authentication TAP tests to do
things as previously described, ie test both extisting behavior and new
features added, for working and non working scenarios.  The view error
reporting are bypassed for Windows / EXEC_BACKEND build by detecting a
connection failure with the expected error reported by the server.  If the
error doesn't match, the rest of the tests are still skipped but overall test
will fail due to the mismatch in the reported error.

It's a bit troublesome to write such tests, as you need to keep in sync the
various rule and line number, format things differently depending on whether
you want to check the logs or the view with each new tests added and so on.

To make the tests easier to write (and to maintain) I added some wrapper
functions to add lines in the wanted files that allow to automatically
generates the wanted regex (for the logs) or output (for the view).  There are
still things to be careful about and a bit of duplication, but this way I can
easily modify any test, or add new ones, without the need to modify everything
around.  As written, it also uses the same included files for the log error
reporting and the view error reporting, so there's no need to define everything
twice.

The tests work as expected locally on a normal build and an EXEC_BACKEND build,
and also on the CI.

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

01 August 2022, 01:12:52

On Tue, Jul 26, 2022 at 1:14 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jul 26, 2022 at 01:04:02PM +0800, Julien Rouhaud wrote:
> > It doesn't have much impact most of the time.  The filename is reported if
> > there's an IO error while reading the already opened correct file.  The real
> > problem is if the hba_file and ident_file are stored in different directory,
> > any secondary file (@filename) in the pg_ident.conf would be searched in the
> > wrong directory.  With the pending file inclusion patchset, the problem is
> > immediately visible as the view is reporting the wrong file name.
>
> Oops, obviously.  I'll go and fix that as that's on me.
>
> > Simple fix attached.  I'll add a v15 open item shortly.
>
> Thanks.  Better not to lose track of it.

For the archives's sake, this was fixed shortly after as of
27e0ee57f68d27af68967759a2ff61a581f501dc on master and the open item
is closed.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

02 August 2022, 10:32:54

On Sat, Jul 30, 2022 at 04:09:36PM +0800, Julien Rouhaud wrote:
> I've been working on all of that and came up with the attached v8.
>
> - 0001: I modified things as discussed previously to report the real auth file
>   names rather than the hardcoded "pg_ident.conf" and "pg_hba.conf" in the
>   various log messages
> - 0002 and 0003 are minor fixes to error logs more consistent

As a quick update from my side, I intend to look and apply 0001~0003
(not double-checked yet) shortly.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

05 August 2022, 00:56:29

On Tue, Aug 02, 2022 at 07:32:54PM +0900, Michael Paquier wrote:
> As a quick update from my side, I intend to look and apply 0001~0003
> (not double-checked yet) shortly.

And a couple of days later, these look fine so done as of 47ab1ac and
718fe0a.  0002 and 0003 have been merged together.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

06 August 2022, 09:19:35

Hi,

On Fri, Aug 05, 2022 at 09:56:29AM +0900, Michael Paquier wrote:
> On Tue, Aug 02, 2022 at 07:32:54PM +0900, Michael Paquier wrote:
> > As a quick update from my side, I intend to look and apply 0001~0003
> > (not double-checked yet) shortly.
>
> And a couple of days later, these look fine so done as of 47ab1ac and
> 718fe0a.  0002 and 0003 have been merged together.

Thanks!

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

16 August 2022, 06:10:30

Hi,

On Sat, Jul 30, 2022 at 04:09:36PM +0800, Julien Rouhaud wrote:

>
> - 0001: the rule_number / mapping_number addition in the views in a separate
>   commit
> - 0002: the main file inclusion patch.  Only a few minor bugfix since
>   previous version discovered thanks to the tests (a bit more about it after),
>   and documentation tweaks based on previous discussions
> - 0003: the pg_hba_matches() POC, no changes

Attached v9, simple rebase after multiple conflicts and part of the patchset
applied.

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

17 September 2022, 17:06:12

On Tue, Aug 16, 2022 at 02:10:30PM +0800, Julien Rouhaud wrote:
> Hi,
> 
> On Sat, Jul 30, 2022 at 04:09:36PM +0800, Julien Rouhaud wrote:
> 
> >
> > - 0001: the rule_number / mapping_number addition in the views in a separate
> >   commit
> > - 0002: the main file inclusion patch.  Only a few minor bugfix since
> >   previous version discovered thanks to the tests (a bit more about it after),
> >   and documentation tweaks based on previous discussions
> > - 0003: the pg_hba_matches() POC, no changes

v10 attached to fix a few conflict with recent refactoring commits, no other
change.

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

10 October 2022, 02:51:32

Hi,

On Sun, Sep 18, 2022 at 01:06:12AM +0800, Julien Rouhaud wrote:
> On Tue, Aug 16, 2022 at 02:10:30PM +0800, Julien Rouhaud wrote:
> >
> > On Sat, Jul 30, 2022 at 04:09:36PM +0800, Julien Rouhaud wrote:
> >
> > >
> > > - 0001: the rule_number / mapping_number addition in the views in a separate
> > >   commit
> > > - 0002: the main file inclusion patch.  Only a few minor bugfix since
> > >   previous version discovered thanks to the tests (a bit more about it after),
> > >   and documentation tweaks based on previous discussions
> > > - 0003: the pg_hba_matches() POC, no changes

v11 attached, fixing a shadowed variable warning.

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

24 October 2022, 05:33:12

On Mon, Oct 10, 2022 at 10:51:32AM +0800, Julien Rouhaud wrote:
> On Sun, Sep 18, 2022 at 01:06:12AM +0800, Julien Rouhaud wrote:
> > On Tue, Aug 16, 2022 at 02:10:30PM +0800, Julien Rouhaud wrote:
> > >
> > > On Sat, Jul 30, 2022 at 04:09:36PM +0800, Julien Rouhaud wrote:
> > >
> > > >
> > > > - 0001: the rule_number / mapping_number addition in the views in a separate
> > > >   commit
> > > > - 0002: the main file inclusion patch.  Only a few minor bugfix since
> > > >   previous version discovered thanks to the tests (a bit more about it after),
> > > >   and documentation tweaks based on previous discussions
> > > > - 0003: the pg_hba_matches() POC, no changes

v12 attached, fixing multiple conflicts with recent activity.

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

24 October 2022, 07:13:51

On Mon, Oct 24, 2022 at 01:33:12PM +0800, Julien Rouhaud wrote:
> v12 attached, fixing multiple conflicts with recent activity.

 typedef struct TokenizedAuthLine
 {
    List       *fields;         /* List of lists of AuthTokens */
+   char       *file_name;      /* File name *

Hmm.  While putting my eyes on this patch set for the first time in a
few months (sorry!), the addition of file_name to TokenizedAuthLine in
0002 stands out.  This impacts all the macros used for the validation
of the ident and HBA lines, leading as well to a lot of bloat in the
patch with patterns like that in 20~25 places:
                 errcontext("line %d of configuration file \"%s\"", \
-                           line_num, IdentFileName))); \
+                           line_num, file_name))); \
[...]
                         errcontext("line %d of configuration file \"%s\"",
-                                   line_num, HbaFileName)));
+                                   line_num, file_name)));

Do you think that it would make sense to split that into its own
patch?  That would move the code toward less references to HbaFileName
and IdentFileName, which is one step toward what we want for this
thread.  This tokenization of HBA and ident files is already shared,
so moving this knowledge into TokenizedAuthLine looks helpful
independently of the rest.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

24 October 2022, 08:03:03

Hi,

On Mon, Oct 24, 2022 at 04:13:51PM +0900, Michael Paquier wrote:
> On Mon, Oct 24, 2022 at 01:33:12PM +0800, Julien Rouhaud wrote:
> > v12 attached, fixing multiple conflicts with recent activity.
>
>  typedef struct TokenizedAuthLine
>  {
>     List       *fields;         /* List of lists of AuthTokens */
> +   char       *file_name;      /* File name *
>
> Hmm.  While putting my eyes on this patch set for the first time in a
> few months (sorry!), the addition of file_name to TokenizedAuthLine in
> 0002 stands out.  This impacts all the macros used for the validation
> of the ident and HBA lines, leading as well to a lot of bloat in the
> patch with patterns like that in 20~25 places:
>                  errcontext("line %d of configuration file \"%s\"", \
> -                           line_num, IdentFileName))); \
> +                           line_num, file_name))); \
> [...]
>                          errcontext("line %d of configuration file \"%s\"",
> -                                   line_num, HbaFileName)));
> +                                   line_num, file_name)));
>
> Do you think that it would make sense to split that into its own
> patch?  That would move the code toward less references to HbaFileName
> and IdentFileName, which is one step toward what we want for this
> thread.  This tokenization of HBA and ident files is already shared,
> so moving this knowledge into TokenizedAuthLine looks helpful
> independently of the rest.

It would also require to bring HbaLine->sourcefile.  I'm afraid it would be
weird to introduce such a refactoring in a separate commit just to pass a
constant down multiple level of indirection, as all the macro will remain
specific to either hba or ident anyway.

I agree that there are quite a lot of s/XXXFileName/file_name/, but those
aren't complicated, and keeping them in the same commit makes it easy to
validate that none has been forgotten since the regression tests covering those
messages are in that commit too.

And of course while double checking that none was forgotten I realize that I
missed the new regcomp_auth_token() which introduced a couple new usage of
HbaFileName.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

25 October 2022, 06:43:21

On Mon, Oct 24, 2022 at 04:03:03PM +0800, Julien Rouhaud wrote:
> It would also require to bring HbaLine->sourcefile.  I'm afraid it would be
> weird to introduce such a refactoring in a separate commit just to pass a
> constant down multiple level of indirection, as all the macro will remain
> specific to either hba or ident anyway.

Putting my hands on it, I am not really afraid of doing that
independently.  From what I can see, this is cutting 23kB worth of
diffs from 0002, reducing it from 94K to 71kB.

> I agree that there are quite a lot of s/XXXFileName/file_name/, but those
> aren't complicated, and keeping them in the same commit makes it easy to
> validate that none has been forgotten since the regression tests covering those
> messages are in that commit too.

Another advantage is that it minimizes the presence of the hardcoded
HbaFileName and IdentFileName in hba.c, which is one thing we are
trying to achieve here for the inclusion of more files.  I found a bit
strange that IdentLine had no sourcefile, actually.  We track the file
number but use it nowhere, and it seems to me that having more
symmetry between both would be a good thing.

So, the key of the logic is how we are going to organize the
tokenization of the HBA and ident lines through all the inclusions..
As far as I get it, tokenize_auth_file() is the root call and
tokenize_file_with_context() with its depth is able to work on each
individual file, and it can optionally recurse depending on what's
included.  Why do you need to switch to the old context in
tokenize_file_with_context()?  Could it be simpler to switch once to
linecxt outside of the internal routine?

It looks like GetDirConfFiles() is another piece that can be
refactored and reviewed on its own, as we use it in
ParseConfigDirectory()@guc.c.
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

25 October 2022, 07:08:59

On Tue, Oct 25, 2022 at 03:43:21PM +0900, Michael Paquier wrote:
>
> Another advantage is that it minimizes the presence of the hardcoded
> HbaFileName and IdentFileName in hba.c, which is one thing we are
> trying to achieve here for the inclusion of more files.  I found a bit
> strange that IdentLine had no sourcefile, actually.  We track the file
> number but use it nowhere, and it seems to me that having more
> symmetry between both would be a good thing.

If IdentLine->linenumber is useless, why not get rid of it rather than tracking
another useless info?  Hba is much more structured so we need a more
specialized struct, but I don't think ident will ever go that way.

> So, the key of the logic is how we are going to organize the
> tokenization of the HBA and ident lines through all the inclusions..
> As far as I get it, tokenize_auth_file() is the root call and
> tokenize_file_with_context() with its depth is able to work on each
> individual file, and it can optionally recurse depending on what's
> included.  Why do you need to switch to the old context in
> tokenize_file_with_context()?  Could it be simpler to switch once to
> linecxt outside of the internal routine?

It just seemed the cleanest way to go.  We could do without but then we would
have to duplicate MemoryContextSwitchTo calls all over the place, and probably
handling an optional memory context creation in the function.

> It looks like GetDirConfFiles() is another piece that can be
> refactored and reviewed on its own, as we use it in
> ParseConfigDirectory()@guc.c.

I'm fine with it.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

25 October 2022, 11:59:57

On Tue, Oct 25, 2022 at 03:08:59PM +0800, Julien Rouhaud wrote:
> On Tue, Oct 25, 2022 at 03:43:21PM +0900, Michael Paquier wrote:
>> Another advantage is that it minimizes the presence of the hardcoded
>> HbaFileName and IdentFileName in hba.c, which is one thing we are
>> trying to achieve here for the inclusion of more files.  I found a bit
>> strange that IdentLine had no sourcefile, actually.  We track the file
>> number but use it nowhere, and it seems to me that having more
>> symmetry between both would be a good thing.
>
> If IdentLine->linenumber is useless, why not get rid of it rather than tracking
> another useless info?  Hba is much more structured so we need a more
> specialized struct, but I don't think ident will ever go that way.

Hmm.  I would be tempted to keep track of the file name and the line
number as well in IdentLine.  One reason is that this can become
useful for debugging.  A second is that this can reduce a bit the
arguments of fill_ident_line() and fill_hba_line() in hbafuncs.c once
we track these in HbaLine and IdentLine.  And HEAD is slightly
overdoing it in its interface for the line number, actually, as we
pass the line number twice: from {Ident,Hba}Line and the respective
field from TokenizedAuthLine.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

26 October 2022, 03:19:48

On Tue, Oct 25, 2022 at 08:59:57PM +0900, Michael Paquier wrote:
>
> Hmm.  I would be tempted to keep track of the file name and the line
> number as well in IdentLine.  One reason is that this can become
> useful for debugging.  A second is that this can reduce a bit the
> arguments of fill_ident_line() and fill_hba_line() in hbafuncs.c once
> we track these in HbaLine and IdentLine.

Ok, I guess something like the attached v14 is what you want.

> And HEAD is slightly
> overdoing it in its interface for the line number, actually, as we
> pass the line number twice: from {Ident,Hba}Line and the respective
> field from TokenizedAuthLine.

That wouldn't be overdoing anymore if we remove the line number / filename from
the fill_*_line prototypes right?

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

26 October 2022, 04:15:05

On Wed, Oct 26, 2022 at 11:19:48AM +0800, Julien Rouhaud wrote:
> That wouldn't be overdoing anymore if we remove the line number / filename from
> the fill_*_line prototypes right?

Yeah, but there is a twist: HbaLine or IdentLine can be passed as
NULL when entering in fill_hba_line() or fill_ident_line() in the
event of an error when tokenizing the line or when failing the
creation of their Line, so attempting to read the line number from
either of them when filling in a tuple for their respective view would
cause a crash.

So, for now, I have taken the minimalistic approach with the addition
of the source file name in HbaFile and in TokenizedAuthLine.  The
former is actually necessary anyway as auth.c wants it in two
locations (improved auth_failed() and set_authn_id()).

It is still true that the line number in IdentLine remains unused.
Hence, do you think that the addition of the source file name and its
line number could be useful for the error reporting done in
check_ident_usermap()?  The contents of the HbaLines are used in
auth.c for its own reporting magic, and it looks like this would be
the only location where what's in the IdentLines is useful, aka
provide more details about what is happening.  Once users are able to
include ident files, that would likely help in debugging issues.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

26 October 2022, 06:56:07

On Wed, Oct 26, 2022 at 11:19:48AM +0800, Julien Rouhaud wrote:
> That wouldn't be overdoing anymore if we remove the line number / filename from
> the fill_*_line prototypes right?

So, I have spent a good portion of today looking at what you have
here, applying 0001 and 0003 while fixing, rebasing and testing the
whole, discarding 0002 (we could do more for the line number and
source file in terms of the LOGs reported for a regexec failure).

Now remains 0004, which is the core of the proposal, and while it
needs a rebase, I have not been able to spend much time looking at its
internals.  In order to help with the follow-up steps, I have spotted
a few areas that could be done first:
- The diffs in guc.h/c for the introduction of GetDirConfFiles() to
get a set of files in a directory (got to think a bit more about
ParseConfigFp() when it comes to the keywords, but that comes to play
with the parsing logic which is different).
- The TAP test, which is half the size of the patch in line number.
Could it be possible to make it more edible, introducing a basic
infrastructure to check a set of rules in pg_hba.conf and
pg_ident.conf without the inclusion logic?  Checks for error patterns
(that I agree we strongly lack tests for) look like something we'd
want to tackle independently of the inclusion logic, and it should be
built on top of a basic test infra, at least.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

26 October 2022, 15:32:14

On Wed, Oct 26, 2022 at 03:56:07PM +0900, Michael Paquier wrote:
>
> So, I have spent a good portion of today looking at what you have
> here, applying 0001 and 0003 while fixing, rebasing and testing the
> whole, discarding 0002 (we could do more for the line number and
> source file in terms of the LOGs reported for a regexec failure).

Thanks!

> Now remains 0004, which is the core of the proposal, and while it
> needs a rebase,

Have you already done a rebase while working on the patch or are you intending
to take care of it, or should I?  Let's no both do the work :)

> - The TAP test, which is half the size of the patch in line number.
> Could it be possible to make it more edible, introducing a basic
> infrastructure to check a set of rules in pg_hba.conf and
> pg_ident.conf without the inclusion logic?  Checks for error patterns
> (that I agree we strongly lack tests for) look like something we'd
> want to tackle independently of the inclusion logic, and it should be
> built on top of a basic test infra, at least.

I don't mind taking care of that, but before doing so I'd like to have some
feedback on whether you're ok with my approach (per my initial email about it
at [1]) or if you had some different
ideas on how to do it.

[1] https://www.postgresql.org/message-id/20220730080936.atyxodvwlmf2wnoc@jrouhaud

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

27 October 2022, 03:08:31

On Wed, Oct 26, 2022 at 11:32:14PM +0800, Julien Rouhaud wrote:
> I don't mind taking care of that, but before doing so I'd like to have some
> feedback on whether you're ok with my approach (per my initial email about it
> at [1]) or if you had some different
> ideas on how to do it.

Putting things afresh, there are two different things here (sorry I
need to see that typed ;p):
1) How do we want to check reliably the loading of the HBA and ident
files on errors?  EXEC_BACKEND would reload an entire new thing for
each connection, hence we need some loops to go through that.
2) How to check the contents of pg_hba_file_rules and
pg_ident_file_mappings?

There is a dependency between 1) and 2) once we try to check for error
patterns in pg_hba_file_rules, because connections would just not
happen.  This is not the case for pg_ident_file_mappings though, so we
could still test for buggy patterns in pg_ident.conf (or any of its
included parts) with some expected content of
pg_ident_file_mappings.error after a successful connection.

Hmm.  And what if we just gave up on the checks for error patterns in
pg_hba_file_rules?  One thing that we could do for this part is to
include all the buggy patterns we want to check at once in pg_hba.conf
in its included portions, then scan for all the logs produced after
attempting to start a server as the loading of pg_hba.conf would
produce one LOG line with its CONTEXT for each buggy entry.  The patch
checks for error patterns with generate_log_err_rows(), but it looks
like it would make the part 3 of the new test cleaner and easier to
maintain in the long-term.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

27 October 2022, 03:10:09

On Wed, Oct 26, 2022 at 11:32:14PM +0800, Julien Rouhaud wrote:
> Have you already done a rebase while working on the patch or are you intending
> to take care of it, or should I?  Let's no both do the work :)

Spoiler alert: I have not done a rebase yet ;)
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

27 October 2022, 04:26:25

Hi,

On Thu, Oct 27, 2022 at 12:08:31PM +0900, Michael Paquier wrote:
>
> Putting things afresh, there are two different things here (sorry I
> need to see that typed ;p):
> 1) How do we want to check reliably the loading of the HBA and ident
> files on errors?

I guess you meant the failure to load HBA / ident files containing invalid
data?

> EXEC_BACKEND would reload an entire new thing for
> each connection, hence we need some loops to go through that.
> 2) How to check the contents of pg_hba_file_rules and
> pg_ident_file_mappings?
>
> There is a dependency between 1) and 2) once we try to check for error
> patterns in pg_hba_file_rules, because connections would just not
> happen.  This is not the case for pg_ident_file_mappings though, so we
> could still test for buggy patterns in pg_ident.conf (or any of its
> included parts) with some expected content of
> pg_ident_file_mappings.error after a successful connection.
>
> Hmm.  And what if we just gave up on the checks for error patterns in
> pg_hba_file_rules?

We discussed this problem in the past (1), and my understanding was that
detecting a -DEXEC_BACKEND/Win32 build and skipping those tests in that case
would be an acceptable solution to make sure there's at least some coverage.
The proposed patch adds such an approach, making sure that the failure is due
to an invalid HBA file.  If you changed you mind I can remove that part, but
again I'd like to be sure of what you exactly want before starting to rewrite
stuff.

> One thing that we could do for this part is to
> include all the buggy patterns we want to check at once in pg_hba.conf
> in its included portions, then scan for all the logs produced after
> attempting to start a server as the loading of pg_hba.conf would
> produce one LOG line with its CONTEXT for each buggy entry.  The patch
> checks for error patterns with generate_log_err_rows(), but it looks
> like it would make the part 3 of the new test cleaner and easier to
> maintain in the long-term.

I'm not sure what you mean here.  The patch does check for all the errors
looking at LOG lines and CONTEXT lines, but to make the regexp easier it
doesn't try to make sure that each CONTEXT line is immediately following the
expected LOG line.

That's why the errors are divided in 2 steps: a first step with a single error
using some inclusion, so we can validate that the CONTEXT line is entirely
correct (wrt. line number and such), and then every possible error pattern
where we assume that the CONTEXT line are still following their LOG entry if
they're found.  It also has the knowledge of which errors adds a CONTEXT line
and which don't.  And that's done twice, for HBA and ident.

The part 3 is just concatenating everything back, for HBA and ident.  So
long-term maintenance shouldn't get any harder as there won't be any need for
more steps.  We can just keep appending stuff in the 2nd step and all the tests
should run as expected.

[1] https://www.postgresql.org/message-id/YtZLeJPlMutL9heh%40paquier.xyz

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

28 October 2022, 01:24:23

On Thu, Oct 27, 2022 at 12:26:25PM +0800, Julien Rouhaud wrote:
> On Thu, Oct 27, 2022 at 12:08:31PM +0900, Michael Paquier wrote:
>>
>> Putting things afresh, there are two different things here (sorry I
>> need to see that typed ;p):
>> 1) How do we want to check reliably the loading of the HBA and ident
>> files on errors?
>
> I guess you meant the failure to load HBA / ident files containing invalid
> data?

Yeah.

>> Hmm.  And what if we just gave up on the checks for error patterns in
>> pg_hba_file_rules?

One part that I was thinking about when typing this part yesterday is
that an EXEC_BACKEND build should work in non-WIN32 in TAP even if
pg_ident.conf cannot be loaded, but I forgot entirely about the part
where we need a user mapping for the SSPI authentication on WIN32, as
set by pg_regress.

> We discussed this problem in the past (1), and my understanding was that
> detecting a -DEXEC_BACKEND/Win32 build and skipping those tests in that case
> would be an acceptable solution to make sure there's at least some coverage.
> The proposed patch adds such an approach, making sure that the failure is due
> to an invalid HBA file.  If you changed you mind I can remove that part, but
> again I'd like to be sure of what you exactly want before starting to rewrite
> stuff.

I am still not completely sure what's the best way to do things here,
so let's do the following: let's keep the patch the way you think is
better for now (I may change my opinion on that but I'll hack that by
myself anyway).  Using what you have as a base, could you split the
test and have it in its simplest to ease irs review?  It would be able
to stress the buildfarm with a first version of the test and see how
it goes from there, and it is useful by itself IMO as HEAD has zero
coverage for this area.

> I'm not sure what you mean here.  The patch does check for all the errors
> looking at LOG lines and CONTEXT lines, but to make the regexp easier it
> doesn't try to make sure that each CONTEXT line is immediately following the
> expected LOG line.

Hmm.  Perhaps we'd better make sure that the LOG/CONTEXT link is
checked?  The context includes the line number while a generic
sentence, and the LOG provides all the details of the error
happening.

> That's why the errors are divided in 2 steps: a first step with a single error
> using some inclusion, so we can validate that the CONTEXT line is entirely
> correct (wrt. line number and such), and then every possible error pattern
> where we assume that the CONTEXT line are still following their LOG entry if
> they're found.  It also has the knowledge of which errors adds a CONTEXT line
> and which don't.  And that's done twice, for HBA and ident.

Okay, so you do check the relationship between both, after all.

> The part 3 is just concatenating everything back, for HBA and ident.  So
> long-term maintenance shouldn't get any harder as there won't be any need for
> more steps.  We can just keep appending stuff in the 2nd step and all the tests
> should run as expected.

Hmm.  Okay.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

28 October 2022, 03:49:54

On Fri, Oct 28, 2022 at 10:24:23AM +0900, Michael Paquier wrote:
> On Thu, Oct 27, 2022 at 12:26:25PM +0800, Julien Rouhaud wrote:
>
> I am still not completely sure what's the best way to do things here,
> so let's do the following: let's keep the patch the way you think is
> better for now (I may change my opinion on that but I'll hack that by
> myself anyway).  Using what you have as a base, could you split the
> test and have it in its simplest to ease irs review?  It would be able
> to stress the buildfarm with a first version of the test and see how
> it goes from there, and it is useful by itself IMO as HEAD has zero
> coverage for this area.

To be honest I'd rather not to.  It's excessively annoying to work on those
tests (I spent multiple days trying to make it as clean and readable as
possible), and splitting it to only test the current infrastructure will need
some substantial efforts.

But more importantly, the next commit that will add tests for file inclusion
will then be totally unmaintainable and unreadable, so that's IMO even worse.
I think it will probably either be the current file overwritten or a new one
written from scratch if some changes are done in the simplified test, and I'm
not volunteering to do that.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

02 November 2022, 07:46:48

On Fri, Oct 28, 2022 at 11:49:54AM +0800, Julien Rouhaud wrote:
> To be honest I'd rather not to.  It's excessively annoying to work on those
> tests (I spent multiple days trying to make it as clean and readable as
> possible), and splitting it to only test the current infrastructure will need
> some substantial efforts.

Well, I'd really like to split things as much as possible, beginning
with some solid basics before extending its capabilities.  This
reduces the odds of introducing issues in-between, particularly in
areas as sensible as authentication that involves not-yet-logged-in
users.

> But more importantly, the next commit that will add tests for file inclusion
> will then be totally unmaintainable and unreadable, so that's IMO even worse.
> I think it will probably either be the current file overwritten or a new one
> written from scratch if some changes are done in the simplified test, and I'm
> not volunteering to do that.

Not sure about that either.

Anyway, the last patch posted on this thread does not apply and the CF
bot fails, so it needed a rebase.  I have first noticed that your
patch included some doc fixes independent of this thread, so I have
applied that as e765028 and backpatched it down to 15.  The TAP test
needs to be renamed to 0004, and it was missing from meson.build,
hence the CI was not testing it.

I have spent some time starting at the logic today of the whole, and
GetConfFilesInDir() is really the first thing that stood out.  I am
not sure that it makes much sense to keep that in guc-file.c, TBH,
once we feed it into hba.c.  Perhaps we could put the refactored
routine (and AbsoluteConfigLocation as a side effect) into a new file
in misc/?

As of HEAD, tokenize_inc_file() is the place where we handle a list of
tokens included with an '@' file, appending the existing set of
AuthTokens into the list we are building by grabbing a copy of these
before deleting the line memory context.

Your patch proposes a different alternative, which is to pass down the
memory context created in tokenize_auth_file() down to the callers
with tokenize_file_with_context() dealing with all the internals.
process_included_auth_file() is different extension of that.
This may come down to a matter of taste, but could be be cleaner to
take an approach similar to tokenize_inc_file() and just create a copy
of the AuthToken list coming from a full file and append it to the
result rather than passing around the memory context for the lines?
This would lead to some simplifications, it seems, at least with the
number of arguments passed down across the layers.

The addition of a check for the depth in two places seems unnecessary,
and it looks like we should do this kind of check in only one place.
I have not tried yet, but if we actually move the AllocateFile() and
FreeFile() calls within tokenize_auth_file(), it looks like we may be
able to get to a simpler logic without the need of the with_context()
flavor (even no process_included_auth_file required)?  That could make
the interface easier to follow as a whole, while limiting the presence
of AllocateFile() and FreeFile() to a single code path, impacting
open_inc_file() that relies on what the patch uses for
SecondaryAuthFile and IncludedAuthFile (we could attempt to use the
same error message everywhere as well, as one could expect that
expanded and included files have different names which is enough to
guess which one is an inclusion and which one is a secondary).

Attached are two patches: the first one is a rebase of what you have
posted, and the second one are some changes I did while playing with
the logic.  In the second one, except for conffiles.{h.c}, the changes
are just a POC but that's to show the areas that I am planning to
rework, and your tests pass with it.  I still need to think about all
that and reconsider the design of the interface that would fit with
the tokenization of the inclusions, without that many subroutines to
do the work as it makes the code harder to follow.  Well, that's to
say that I am not staying idle :)
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

02 November 2022, 13:06:02

On Wed, Nov 02, 2022 at 04:46:48PM +0900, Michael Paquier wrote:
> On Fri, Oct 28, 2022 at 11:49:54AM +0800, Julien Rouhaud wrote:
> > To be honest I'd rather not to.  It's excessively annoying to work on those
> > tests (I spent multiple days trying to make it as clean and readable as
> > possible), and splitting it to only test the current infrastructure will need
> > some substantial efforts.
>
> Well, I'd really like to split things as much as possible, beginning
> with some solid basics before extending its capabilities.  This
> reduces the odds of introducing issues in-between, particularly in
> areas as sensible as authentication that involves not-yet-logged-in
> users.
>
> > But more importantly, the next commit that will add tests for file inclusion
> > will then be totally unmaintainable and unreadable, so that's IMO even worse.
> > I think it will probably either be the current file overwritten or a new one
> > written from scratch if some changes are done in the simplified test, and I'm
> > not volunteering to do that.
>
> Not sure about that either.

Maybe one alternative approach would be to keep modifying the current test, and
only do the split when committing the first part?  The split itself should be
easy and mechanical: just remove everything unneeded (different file names
including the file name argument from the add_(hba|ident)_line, inclusion
directive and so on).  Even if the diff is then difficult to read it's not
really a problem as it should have already been reviewed.

> Anyway, the last patch posted on this thread does not apply and the CF
> bot fails, so it needed a rebase.  I have first noticed that your
> patch included some doc fixes independent of this thread, so I have
> applied that as e765028 and backpatched it down to 15.

Yes I saw that this morning, thanks!

> The TAP test
> needs to be renamed to 0004, and it was missing from meson.build,
> hence the CI was not testing it.

Ah right.  This is quite annoying that we have to explicitly name every single
test file there.  For the source code it's hard to not notice a missed file and
you will get an error in the CI, but a missed test is just entirely transparent
:(

> I have spent some time starting at the logic today of the whole, and
> GetConfFilesInDir() is really the first thing that stood out.  I am
> not sure that it makes much sense to keep that in guc-file.c, TBH,
> once we feed it into hba.c.  Perhaps we could put the refactored
> routine (and AbsoluteConfigLocation as a side effect) into a new file
> in misc/?

Yes I was wondering about it too.  A new fine in misc/ looks sensible.

> As of HEAD, tokenize_inc_file() is the place where we handle a list of
> tokens included with an '@' file, appending the existing set of
> AuthTokens into the list we are building by grabbing a copy of these
> before deleting the line memory context.
>
> Your patch proposes a different alternative, which is to pass down the
> memory context created in tokenize_auth_file() down to the callers
> with tokenize_file_with_context() dealing with all the internals.
> process_included_auth_file() is different extension of that.
> This may come down to a matter of taste, but could be be cleaner to
> take an approach similar to tokenize_inc_file() and just create a copy
> of the AuthToken list coming from a full file and append it to the
> result rather than passing around the memory context for the lines?
> This would lead to some simplifications, it seems, at least with the
> number of arguments passed down across the layers.

I guess it goes in the same direction as 8fea86830e1 where infrastructure to
copy AuthTokens was added, I'm fine either way.

> The addition of a check for the depth in two places seems unnecessary,
> and it looks like we should do this kind of check in only one place.

I usually prefer to add a maybe unnecessary depth check since it's basically
free rather than risking an unfriendly stack size error (including in possible
later refactoring), but no objection to get rid of it.

> I have not tried yet, but if we actually move the AllocateFile() and
> FreeFile() calls within tokenize_auth_file(), it looks like we may be
> able to get to a simpler logic without the need of the with_context()
> flavor (even no process_included_auth_file required)?  That could make
> the interface easier to follow as a whole, while limiting the presence
> of AllocateFile() and FreeFile() to a single code path, impacting
> open_inc_file() that relies on what the patch uses for
> SecondaryAuthFile and IncludedAuthFile (we could attempt to use the
> same error message everywhere as well, as one could expect that
> expanded and included files have different names which is enough to
> guess which one is an inclusion and which one is a secondary).

You meant tokenize_auth_file_internal?  Yes possibly, in general I tried
to avoid changing too much the existing code to ease the patch acceptance, but
I agree it would make things simpler.

> Attached are two patches: the first one is a rebase of what you have
> posted, and the second one are some changes I did while playing with
> the logic.  In the second one, except for conffiles.{h.c}, the changes
> are just a POC but that's to show the areas that I am planning to
> rework, and your tests pass with it.  I still need to think about all
> that and reconsider the design of the interface that would fit with
> the tokenization of the inclusions, without that many subroutines to
> do the work as it makes the code harder to follow.  Well, that's to
> say that I am not staying idle :)

Thanks!

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

03 November 2022, 11:55:05

On Wed, Nov 02, 2022 at 09:06:02PM +0800, Julien Rouhaud wrote:
> Maybe one alternative approach would be to keep modifying the current test, and
> only do the split when committing the first part?  The split itself should be
> easy and mechanical: just remove everything unneeded (different file names
> including the file name argument from the add_(hba|ident)_line, inclusion
> directive and so on).  Even if the diff is then difficult to read it's not
> really a problem as it should have already been reviewed.

I have not reviewed the test part much yet, TBH.  The path
manipulations because of WIN32 are a bit annoying.  I was wondering if
we could avoid all that by using the basenames, as one option.

>> I have spent some time starting at the logic today of the whole, and
>> GetConfFilesInDir() is really the first thing that stood out.  I am
>> not sure that it makes much sense to keep that in guc-file.c, TBH,
>> once we feed it into hba.c.  Perhaps we could put the refactored
>> routine (and AbsoluteConfigLocation as a side effect) into a new file
>> in misc/?
>
> Yes I was wondering about it too.  A new fine in misc/ looks sensible.

conffiles.c is the best thing I could come up with, conf.c and
config.c being too generic and these are routines that work on
configuration files, so..

>> Your patch proposes a different alternative, which is to pass down the
>> memory context created in tokenize_auth_file() down to the callers
>> with tokenize_file_with_context() dealing with all the internals.
>> process_included_auth_file() is different extension of that.
>> This may come down to a matter of taste, but could be be cleaner to
>> take an approach similar to tokenize_inc_file() and just create a copy
>> of the AuthToken list coming from a full file and append it to the
>> result rather than passing around the memory context for the lines?
>> This would lead to some simplifications, it seems, at least with the
>> number of arguments passed down across the layers.
>
> I guess it goes in the same direction as 8fea86830e1 where infrastructure to
> copy AuthTokens was added, I'm fine either way.

I won't hide that I am trying to make the changes a maximum
incremental for this thread so as the final part is easy-ish.

>> The addition of a check for the depth in two places seems unnecessary,
>> and it looks like we should do this kind of check in only one place.
>
> I usually prefer to add a maybe unnecessary depth check since it's basically
> free rather than risking an unfriendly stack size error (including in possible
> later refactoring), but no objection to get rid of it.

The depth check is a good idea, though I guess that there is an
argument for having it in only one code path, not two.

>> I have not tried yet, but if we actually move the AllocateFile() and
>> FreeFile() calls within tokenize_auth_file(), it looks like we may be
>> able to get to a simpler logic without the need of the with_context()
>> flavor (even no process_included_auth_file required)?  That could make
>> the interface easier to follow as a whole, while limiting the presence
>> of AllocateFile() and FreeFile() to a single code path, impacting
>> open_inc_file() that relies on what the patch uses for
>> SecondaryAuthFile and IncludedAuthFile (we could attempt to use the
>> same error message everywhere as well, as one could expect that
>> expanded and included files have different names which is enough to
>> guess which one is an inclusion and which one is a secondary).
>
> You meant tokenize_auth_file_internal?  Yes possibly, in general I tried
> to avoid changing too much the existing code to ease the patch acceptance, but
> I agree it would make things simpler.

I *guess* that my mind implied tokenize_auth_file() as of
yesterday's.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

07 November 2022, 06:07:15

On Thu, Nov 03, 2022 at 08:55:05PM +0900, Michael Paquier wrote:
> On Wed, Nov 02, 2022 at 09:06:02PM +0800, Julien Rouhaud wrote:
>>> The addition of a check for the depth in two places seems unnecessary,
>>> and it looks like we should do this kind of check in only one place.
>>
>> I usually prefer to add a maybe unnecessary depth check since it's basically
>> free rather than risking an unfriendly stack size error (including in possible
>> later refactoring), but no objection to get rid of it.
>
> The depth check is a good idea, though I guess that there is an
> argument for having it in only one code path, not two.

So, I have looked at this specific part, and I think that we should
try to solve this issue on its own as the error message leading to a
failure at startup is confusing now, giving:
FATAL:  exceeded maxAllocatedDescs (166) while trying to open file \"blah\".

It took me some time to ponder about what kind of interface we should
use for this job, but I think that it comes down to three things:
- The depth ought to be checked just before doing AllocateFile() for a
file.
- We should have one code path calling AllocateFile().
- The same interface should be used for hba.c and hbafuncs.c.

I have settled down to the following in the attached patch 0002:
open_auth_file(const char *filename, int elevel, int depth, char
**err_msg)

More checks could be added in this code path (guc-file.l does more
with immediate recursions, for example), but the depth level should be
enough anyway.  I am aware of the fact that this reduces a bit the
information provided for what's called a "secondary" file in your
patch set, but I tend toward moving into something that's
minimalistic.  Currently, we provide the file and its '@' shortcut,
and we don't mention at all the file where it comes from.  This is a
minimal problem now because we only have pg_hba.conf to worry about
for most users, still I am wondering whether it would be better in the
long run to complete that with an error context that mentions the
file(s) where this comes from and report a full chain of the events
that led to this opening failure.  At the end, I think that 0002 moves
us closer toward the goal of this thread.  Not been a fan of
HbaIncludeKind since the patch was proposed, TBH, as well.

Another thing is that this removes the need for open_inc_file(), my
opinion is that it is overcomplicated, and actually it seems like
under the non-strict mode (aka include_if_exists), we would generate a
LOG to mention that a configuration file is skipped, but there would
be nothing in err_msg for the TokenizedAuthLine, which is a mistake I
guess?

>>> I have not tried yet, but if we actually move the AllocateFile() and
>>> FreeFile() calls within tokenize_auth_file(), it looks like we may be
>>> able to get to a simpler logic without the need of the with_context()
>>> flavor (even no process_included_auth_file required)?  That could make
>>> the interface easier to follow as a whole, while limiting the presence
>>> of AllocateFile() and FreeFile() to a single code path, impacting
>>> open_inc_file() that relies on what the patch uses for
>>> SecondaryAuthFile and IncludedAuthFile (we could attempt to use the
>>> same error message everywhere as well, as one could expect that
>>> expanded and included files have different names which is enough to
>>> guess which one is an inclusion and which one is a secondary).
>>
>> You meant tokenize_auth_file_internal?  Yes possibly, in general I tried
>> to avoid changing too much the existing code to ease the patch acceptance, but
>> I agree it would make things simpler.
>
> I *guess* that my mind implied tokenize_auth_file() as of
> yesterday's.

While thinking more about the logic, I saw that your approach to use
the same MemoryContext "linecxt" and pass it down to the various
layers with tokenize_file_with_context() has the advantage to remove
the need to copy a list of TokenizedAuthLine coming from other
included files and/or dirs, which could easily lead to bugs especially
for the sourcefile if someone is not careful enough.  There is no need
to touch the existing tokenize_inc_file() that copies a set of
AuthTokens to an existing list in TokenizedAuthLine, of course.

Should we do without tokenize_file_with_context() though?  There is an
argument for making everything go through tokenize_auth_file(), having
a MemoryContext as argument (aka if NULL create it, if not use it),
while still returning the memory context used?  Similarly, it looks
like we should have no need for process_included_authfile() at the
end.

I have applied as a1a7bb8 the refactoring of the directory logic for
configuration files, and noticed that this could also be used for
tokenize_inc_file().  This reduces the whole patch by ~15%.

Attached is a set of three patches:
- 0001 changes tokenize_inc_file() to use AbsoluteConfigLocation().
AbsoluteConfigLocation() uses a static buffer and a MAXPGPATH, but
we'd rather change it to use a palloc()+strcpy() instead and remove
the static restriction?  What do you think?  The same applies for the
case where we use DataDir, actually, and it seems like there is no
point in this path-length restriction in this code path.
- 0002 invents the interface to open auth files and check for their
depths, simplifying the main patch a bit as there is no need to track
the depth level here and there anymore.
- 0003 is the rebased patch, simplified after the other changes.  The
bulk of the patch is in its TAP test.
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

08 November 2022, 01:04:16

On Mon, Nov 07, 2022 at 03:07:15PM +0900, Michael Paquier wrote:
> Attached is a set of three patches:
> - 0001 changes tokenize_inc_file() to use AbsoluteConfigLocation().
> AbsoluteConfigLocation() uses a static buffer and a MAXPGPATH, but
> we'd rather change it to use a palloc()+strcpy() instead and remove
> the static restriction?  What do you think?  The same applies for the
> case where we use DataDir, actually, and it seems like there is no
> point in this path-length restriction in this code path.
> - 0002 invents the interface to open auth files and check for their
> depths, simplifying the main patch a bit as there is no need to track
> the depth level here and there anymore.
> - 0003 is the rebased patch, simplified after the other changes.  The
> bulk of the patch is in its TAP test.

CF bot unhappy as I have messed up with rules.out.  Rebased.  I have
removed the restriction on MAXPGPATH in AbsoluteConfigLocation() in
0001, while on it.  The absolute paths built on GUC or ident
inclusions are the same.
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

09 November 2022, 00:51:17

On Tue, Nov 08, 2022 at 10:04:16AM +0900, Michael Paquier wrote:
> CF bot unhappy as I have messed up with rules.out.  Rebased.  I have
> removed the restriction on MAXPGPATH in AbsoluteConfigLocation() in
> 0001, while on it.  The absolute paths built on GUC or ident
> inclusions are the same.

Rebased after 6bbd8b7g that is an equivalent of the previous 0001.
Julien, please note that this is waiting on author for now.  What do
you think about the now-named v18-0001 and the addition of an
ErrorContextCallback to provide more information about the list of
included files on error?
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

09 November 2022, 04:09:01

Hi,

On Wed, Nov 09, 2022 at 09:51:17AM +0900, Michael Paquier wrote:
> On Tue, Nov 08, 2022 at 10:04:16AM +0900, Michael Paquier wrote:
> > CF bot unhappy as I have messed up with rules.out.  Rebased.  I have
> > removed the restriction on MAXPGPATH in AbsoluteConfigLocation() in
> > 0001, while on it.  The absolute paths built on GUC or ident
> > inclusions are the same.
>
> Rebased after 6bbd8b7g that is an equivalent of the previous 0001.

Thanks a lot!

> Julien, please note that this is waiting on author for now.  What do
> you think about the now-named v18-0001 and the addition of an
> ErrorContextCallback to provide more information about the list of
> included files on error?

Yes, I'm unfortunately fully aware that it's waiting on me.  I've been a bit
busy this week with $work but I will try to go back to it as soon as I can,
hopefully this week!

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

10 November 2022, 01:29:40

On Wed, Nov 09, 2022 at 12:09:01PM +0800, Julien Rouhaud wrote:
> On Wed, Nov 09, 2022 at 09:51:17AM +0900, Michael Paquier wrote:
>> Julien, please note that this is waiting on author for now.  What do
>> you think about the now-named v18-0001 and the addition of an
>> ErrorContextCallback to provide more information about the list of
>> included files on error?
>
> Yes, I'm unfortunately fully aware that it's waiting on me.  I've been a bit
> busy this week with $work but I will try to go back to it as soon as I can,
> hopefully this week!

FWIW, I have been playing with the addition of a ErrorContextCallback
in tokenize_auth_file(), and this addition leads to a really nice
result.  With this method, it is possible to know the full chain of
events leading to a failure when tokenizing included files, which is
not available now in the logs when reloading the server.

We could extend it to have more verbose information by passing more
arguments to tokenize_auth_file(), still I'd like to think that just
knowing the line number and the full path to the file is more than
enough once you know the full chain of events.  0001 and 0002 ought to
be merged together, but I am keeping these separate to show how simple
the addition of the ErrorContextCallback is.
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

12 November 2022, 08:13:53

On Thu, Nov 10, 2022 at 10:29:40AM +0900, Michael Paquier wrote:
>
> FWIW, I have been playing with the addition of a ErrorContextCallback
> in tokenize_auth_file(), and this addition leads to a really nice
> result.  With this method, it is possible to know the full chain of
> events leading to a failure when tokenizing included files, which is
> not available now in the logs when reloading the server.
>
> We could extend it to have more verbose information by passing more
> arguments to tokenize_auth_file(), still I'd like to think that just
> knowing the line number and the full path to the file is more than
> enough once you know the full chain of events.  0001 and 0002 ought to
> be merged together, but I am keeping these separate to show how simple
> the addition of the ErrorContextCallback is.

It's looks good to me.  I agree that file name and line number should be enough
to diagnose any unexpected error.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

14 November 2022, 05:40:37

On Sat, Nov 12, 2022 at 04:13:53PM +0800, Julien Rouhaud wrote:
> It's looks good to me.  I agree that file name and line number should be enough
> to diagnose any unexpected error.

Thanks for checking.  I have looked at 0001 and 0002 again with a
fresh mind, and applied both of them this morning.

This makes the remaining bits of the patch much easier to follow in
hba.c.  Here are more comments after a closer review of the whole for
the C logic.

-MemoryContext
-tokenize_auth_file(const char *filename, FILE *file, List **tok_lines,
-                  int elevel, int depth)
+static void
+tokenize_file_with_context(MemoryContext linecxt, const char *filename,

I really tend to prefer having one routine rather than two for the
tokenization entry point.  Switching to the line context after setting
up the callback is better, and tokenize_file_with_context() does so.
Anyway, what do you think about having one API that gains a
"MemoryContext *" argument, as of the following:
void tokenize_auth_file(const char *filename, FILE *file,
                        List **tok_lines,
                        int depth, int elevel, MemoryContext *linectx)

If the caller passes NULL for *linectx as the initial line context,
just create it as we do now.  If *linectx is not NULL, just reuse it.
That may be cleaner than returning the created MemoryContext as
returned result from tokenize_auth_file().

+    /* Cumulate errors if any. */
+    if (err_msg)
+    {
+        if (err_buf.len > 0)
+            appendStringInfoChar(&err_buf, '\n');
+        appendStringInfoString(&err_buf, err_msg);
+    }

This aggregates all the error messages for all the files included in a
given repository.  As the patch stands, it seems to me that we would
get errors related to an include_dir clause for two cases:
- The specified path does not exist, in which case we have only one
err_msg to consume and report back.
- Multiple failures in opening and/or reading included files.
In the second case, aggregating the reports would provide a full set
of information, but that's not something a user would be able to act
on directly as this is system-related.  Or there is a case to know a
full list of files in the case of multiple files that cannot be read
because of permission issues?  We may be fine with just the first
system error here.  Note that in the case where files can be read and
opened, these would have their own TokenizedAuthLines for each line
parsed, meaning one line in the SQL views once translated to an
HbaLine or an IdentLine.

This line of thoughts brings an interesting point, actually: there is
an inconsistency between "include_if_exists" and "include" compared to
the GUC processing.  As of the patch, if we use "include" on a file
that does not exist, the tokenization logic would jump over it and
continue processing the follow-up entries anyway.  This is a different
policy than the GUCs, where we would immediately stop looking at
parameters after an "include" if it fails because its file does not
exist, working as a immediate stop in the processing.  The difference
that the patch brings between "include_if_exists" and "include" is
that we report an error in one case but not the other, still skip the
files in both cases and move on with the rest.  Hence my question,
shouldn't we do like the GUC processing for the hba and ident files,
aka stop immediately when we fail to find a file on an "include"
clause?  This would be equivalent to doing a "break" in
tokenize_file_with_context() after failing an include file.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

14 November 2022, 07:47:27

Hi,

On Mon, Nov 14, 2022 at 02:40:37PM +0900, Michael Paquier wrote:
> On Sat, Nov 12, 2022 at 04:13:53PM +0800, Julien Rouhaud wrote:
> > It's looks good to me.  I agree that file name and line number should be enough
> > to diagnose any unexpected error.
>
> Thanks for checking.  I have looked at 0001 and 0002 again with a
> fresh mind, and applied both of them this morning.

Thanks a lot!

> This makes the remaining bits of the patch much easier to follow in
> hba.c.  Here are more comments after a closer review of the whole for
> the C logic.

Agreed.

> -MemoryContext
> -tokenize_auth_file(const char *filename, FILE *file, List **tok_lines,
> -                  int elevel, int depth)
> +static void
> +tokenize_file_with_context(MemoryContext linecxt, const char *filename,
>
> I really tend to prefer having one routine rather than two for the
> tokenization entry point.  Switching to the line context after setting
> up the callback is better, and tokenize_file_with_context() does so.
> Anyway, what do you think about having one API that gains a
> "MemoryContext *" argument, as of the following:
> void tokenize_auth_file(const char *filename, FILE *file,
>                         List **tok_lines,
>                         int depth, int elevel, MemoryContext *linectx)
>
> If the caller passes NULL for *linectx as the initial line context,
> just create it as we do now.  If *linectx is not NULL, just reuse it.
> That may be cleaner than returning the created MemoryContext as
> returned result from tokenize_auth_file().

I originally used two functions as it's a common pattern (e.g. ReadBuffer /
ReadBufferExtended or all the *_internal versions of functions) and to avoid
unnecessary branch, but I agree that here having an extra branch is unlikely to
make any measurable difference.  It would only matter on -DEXEC_BACKEND /
win32, and in that case the extra overhead (over an already expensive new
backend mechanism) is way more than that, so +1 to keep things simple here.

> +    /* Cumulate errors if any. */
> +    if (err_msg)
> +    {
> +        if (err_buf.len > 0)
> +            appendStringInfoChar(&err_buf, '\n');
> +        appendStringInfoString(&err_buf, err_msg);
> +    }
>
> This aggregates all the error messages for all the files included in a
> given repository.  As the patch stands, it seems to me that we would
> get errors related to an include_dir clause for two cases:
> - The specified path does not exist, in which case we have only one
> err_msg to consume and report back.
> - Multiple failures in opening and/or reading included files.
> In the second case, aggregating the reports would provide a full set
> of information, but that's not something a user would be able to act
> on directly as this is system-related.  Or there is a case to know a
> full list of files in the case of multiple files that cannot be read
> because of permission issues?  We may be fine with just the first
> system error here.  Note that in the case where files can be read and
> opened, these would have their own TokenizedAuthLines for each line
> parsed, meaning one line in the SQL views once translated to an
> HbaLine or an IdentLine.

If you have an include_dir directive and multiple files have wrong permission
(or maybe broken symlink or something like that), you will get multiple errors
when trying to process that single directive.  I think it's friendlier to
report as much detail as we can, so users can make sure they fix everything
rather than iterating over the first error.  That's especially helpful if the
fix is done in some external tooling (puppet or whatever) rather than directly
on the server.
>
> This line of thoughts brings an interesting point, actually: there is
> an inconsistency between "include_if_exists" and "include" compared to
> the GUC processing.  As of the patch, if we use "include" on a file
> that does not exist, the tokenization logic would jump over it and
> continue processing the follow-up entries anyway.  This is a different
> policy than the GUCs, where we would immediately stop looking at
> parameters after an "include" if it fails because its file does not
> exist, working as a immediate stop in the processing.  The difference
> that the patch brings between "include_if_exists" and "include" is
> that we report an error in one case but not the other, still skip the
> files in both cases and move on with the rest.  Hence my question,
> shouldn't we do like the GUC processing for the hba and ident files,
> aka stop immediately when we fail to find a file on an "include"
> clause?  This would be equivalent to doing a "break" in
> tokenize_file_with_context() after failing an include file.

I think that the problem is that we have the same interface for processing the
files on a startup/reload and for filling the views, so if we want the views to
be helpful and report all errors we have to also allow a bogus "include" to
continue in the reload case too.  The same problem doesn't exists for GUCs, so
a slightly different behavior there might be acceptable.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

14 November 2022, 23:46:55

On Mon, Nov 14, 2022 at 03:47:27PM +0800, Julien Rouhaud wrote:
> On Mon, Nov 14, 2022 at 02:40:37PM +0900, Michael Paquier wrote:
>> If the caller passes NULL for *linectx as the initial line context,
>> just create it as we do now.  If *linectx is not NULL, just reuse it.
>> That may be cleaner than returning the created MemoryContext as
>> returned result from tokenize_auth_file().
>
> I originally used two functions as it's a common pattern (e.g. ReadBuffer /
> ReadBufferExtended or all the *_internal versions of functions) and to avoid
> unnecessary branch, but I agree that here having an extra branch is unlikely to
> make any measurable difference.  It would only matter on -DEXEC_BACKEND /
> win32, and in that case the extra overhead (over an already expensive new
> backend mechanism) is way more than that, so +1 to keep things simple here.

Oh, Okay.  So that was your intention here.

>> This aggregates all the error messages for all the files included in a
>> given repository.  As the patch stands, it seems to me that we would
>> get errors related to an include_dir clause for two cases:
>> - The specified path does not exist, in which case we have only one
>> err_msg to consume and report back.
>> - Multiple failures in opening and/or reading included files.
>> In the second case, aggregating the reports would provide a full set
>> of information, but that's not something a user would be able to act
>> on directly as this is system-related.  Or there is a case to know a
>> full list of files in the case of multiple files that cannot be read
>> because of permission issues?  We may be fine with just the first
>> system error here.  Note that in the case where files can be read and
>> opened, these would have their own TokenizedAuthLines for each line
>> parsed, meaning one line in the SQL views once translated to an
>> HbaLine or an IdentLine.
>
> If you have an include_dir directive and multiple files have wrong permission
> (or maybe broken symlink or something like that), you will get multiple errors
> when trying to process that single directive.  I think it's friendlier to
> report as much detail as we can, so users can make sure they fix everything
> rather than iterating over the first error.  That's especially helpful if the
> fix is done in some external tooling (puppet or whatever) rather than directly
> on the server.

Hmm, Okay.  Well, this would include only errors on I/O or permission
problems for existing files..  I have not seen deployments that have
dozens of sub-files for GUCs with an included dir, but perhaps I lack
of experience on user histories.

> I think that the problem is that we have the same interface for processing the
> files on a startup/reload and for filling the views, so if we want the views to
> be helpful and report all errors we have to also allow a bogus "include" to
> continue in the reload case too.  The same problem doesn't exists for GUCs, so
> a slightly different behavior there might be acceptable.

Well, there is as well the point that we don't have yet a view for
GUCs that does the equivalent of HBA and ident, but that does not
mean, it seems to me, that there should be an inconsistency in the way
we process those commands because one has implemented a feature but
not the other.  On the contrary, I'd rather try to make them
consistent.  As things are in the patch, the only difference between
"include_if_exists" and "include" is that the latter would report some
information if the file goes missing, the former generates a LOG entry
about the file being skipped without something in the system view.
Now, wouldn't it be useful for the end-user to report that a file is
skipped as an effect of "include_if_exists" in the system views?  If
not, then what's the point of having this clause to begin with?  My
opinion is that both clauses are useful, still on the ground of
consistency both clauses should work the same as for GUCs.  Both
should report something in err_msg even if that's just about a file
being skipped, though I agree that this could be considered confusing
as well for an if_exists clause (does not look like a big deal to me
based on the debuggability gain).
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

16 November 2022, 02:53:02

On Tue, Nov 15, 2022 at 08:46:55AM +0900, Michael Paquier wrote:
> On Mon, Nov 14, 2022 at 03:47:27PM +0800, Julien Rouhaud wrote:
> >
> > If you have an include_dir directive and multiple files have wrong permission
> > (or maybe broken symlink or something like that), you will get multiple errors
> > when trying to process that single directive.  I think it's friendlier to
> > report as much detail as we can, so users can make sure they fix everything
> > rather than iterating over the first error.  That's especially helpful if the
> > fix is done in some external tooling (puppet or whatever) rather than directly
> > on the server.
>
> Hmm, Okay.  Well, this would include only errors on I/O or permission
> problems for existing files..  I have not seen deployments that have
> dozens of sub-files for GUCs with an included dir, but perhaps I lack
> of experience on user histories.

While being the same inclusion infrastructure, it's likely that people will
have different usage.  I'm assuming that for GUCs the main usage is to have
your automation tool put one of your template conf for instance (small_vm.conf,
big_vm.conf or something like that) in an included directory, so you don't
really need a lot of them.  You can also rely on ALTER SYSTEM to avoid manually
handling configuration files entirely.

For authentication there are probably very different pattern.  The use case I
had when writing this patch is some complex application that relies on many
services, each service having dedicated role and authentication, and services
can be enabled or disabled dynamically pretty much anytime.  I had to write
code to merge possibly new entries with existing pg_hba/pg_ident configuration
files.  With this feature it would be much easier (and robust) to simply have a
main pg_hba/pg_ident that includes some directory, and have the service
enable/disable simply creates/removes a dedicated file for each service, and we
then would usually have at least a dozen files there.

I'm assuming people doing multi-tenant can also have similar usage.

> > I think that the problem is that we have the same interface for processing the
> > files on a startup/reload and for filling the views, so if we want the views to
> > be helpful and report all errors we have to also allow a bogus "include" to
> > continue in the reload case too.  The same problem doesn't exists for GUCs, so
> > a slightly different behavior there might be acceptable.
>
> Well, there is as well the point that we don't have yet a view for
> GUCs that does the equivalent of HBA and ident,

Yes that's what I meant by "the same problem doesn't exists for GUCs".

> but that does not
> mean, it seems to me, that there should be an inconsistency in the way
> we process those commands because one has implemented a feature but
> not the other.  On the contrary, I'd rather try to make them
> consistent.

You mean stopping at the first error, even if it's only for the view reporting?
That will make the reload consistent, but the view will be a bit useless then.

> As things are in the patch, the only difference between
> "include_if_exists" and "include" is that the latter would report some
> information if the file goes missing, the former generates a LOG entry
> about the file being skipped without something in the system view.
> Now, wouldn't it be useful for the end-user to report that a file is
> skipped as an effect of "include_if_exists" in the system views?  If
> not, then what's the point of having this clause to begin with?

I don't really get the argument "the proposed monitoring view doesn't give this
information, so the feature isn't needed".  Also, unless I'm missing something
the other difference between "include" and "include_if_exists" is that the
first one will refuse to reload the conf (or start the server) if the file is
missing, while the other one will?

> My
> opinion is that both clauses are useful, still on the ground of
> consistency both clauses should work the same as for GUCs.  Both
> should report something in err_msg even if that's just about a file
> being skipped, though I agree that this could be considered confusing
> as well for an if_exists clause (does not look like a big deal to me
> based on the debuggability gain).

It would be nice to have the information that an "include_if_exists" file
didn't exist, but having a log-level message in the "error" column is a clear
POLA violation.  People will probably just do something like

SELECT file_name, line_number, error FROM pg_hba_file_rules WHERE error IS NOT
NULL;

and report an error if any row is found.  Having to parse the error field to
know if that's really an error or not is going to be a huge foot-gun.  Maybe we
could indeed report the problem in err_msg but for include_if_exists display it
in some other column of the view?

Re: Allow file inclusion in pg_hba and pg_ident files

From

Ian Lawrence Barwick

Date:

16 November 2022, 05:01:06

2022年11月14日(月) 14:41 Michael Paquier <michael@paquier.xyz>:
>
> On Sat, Nov 12, 2022 at 04:13:53PM +0800, Julien Rouhaud wrote:
> > It's looks good to me.  I agree that file name and line number should be enough
> > to diagnose any unexpected error.
>
> Thanks for checking.  I have looked at 0001 and 0002 again with a
> fresh mind, and applied both of them this morning.

Hi

Just a quick note to mention that 0003 adds a new test ("004_file_inclusion.pl")
but doesn't seem to include it in the "meson.build" file.

Regards

Ian Barwick

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

16 November 2022, 05:30:39

Le mer. 16 nov. 2022 à 13:01, Ian Lawrence Barwick <barwick@gmail.com> a écrit :

2022年11月14日(月) 14:41 Michael Paquier <michael@paquier.xyz>:
>
> On Sat, Nov 12, 2022 at 04:13:53PM +0800, Julien Rouhaud wrote:
> > It's looks good to me. I agree that file name and line number should be enough
> > to diagnose any unexpected error.
>
> Thanks for checking. I have looked at 0001 and 0002 again with a
> fresh mind, and applied both of them this morning.

Hi

Just a quick note to mention that 0003 adds a new test ("004_file_inclusion.pl")
but doesn't seem to include it in the "meson.build" file.

ah thanks! Michael also mentioned that previously but as we were focusing on preliminary refactoring patches I didn't check if if was fixed in the patches sent recently, but I will definitely take care of it for the next round

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

17 November 2022, 02:33:05

On Wed, Nov 16, 2022 at 10:53:02AM +0800, Julien Rouhaud wrote:
> While being the same inclusion infrastructure, it's likely that people will
> have different usage.  I'm assuming that for GUCs the main usage is to have
> your automation tool put one of your template conf for instance (small_vm.conf,
> big_vm.conf or something like that) in an included directory, so you don't
> really need a lot of them.  You can also rely on ALTER SYSTEM to avoid manually
> handling configuration files entirely.
>
> For authentication there are probably very different pattern.  The use case I
> had when writing this patch is some complex application that relies on many
> services, each service having dedicated role and authentication, and services
> can be enabled or disabled dynamically pretty much anytime.  I had to write
> code to merge possibly new entries with existing pg_hba/pg_ident configuration
> files.  With this feature it would be much easier (and robust) to simply have a
> main pg_hba/pg_ident that includes some directory, and have the service
> enable/disable simply creates/removes a dedicated file for each service, and we
> then would usually have at least a dozen files there.
>
> I'm assuming people doing multi-tenant can also have similar usage.

So you main application for HBA/ident is include_dir/..  For GUCs, we
would just skip the directory if it has no files, but the directory
has to exist.  Your patch is behaving the same.

>> but that does not
>> mean, it seems to me, that there should be an inconsistency in the way
>> we process those commands because one has implemented a feature but
>> not the other.  On the contrary, I'd rather try to make them
>> consistent.
>
> You mean stopping at the first error, even if it's only for the view reporting?
> That will make the reload consistent, but the view will be a bit useless then.

Yep, I meant to stop the generation of the TokenizedAuthLines at the
first inclusion error by letting tokenize_auth_file() return a status
to be able to stop the recursions.  But after some second-thoughts
pondering about this, I see why I am wrong and why you are right.  As
you say, stopping the generation of the TokenizedAuthLines would just
limit the amount of data reported at once in the view, the golden rule
for HBA/ident being that we would reload nothing as long as there is
at least one error reported when parsing things.  So giving more
control to tokenize_auth_file() to stop a recursion, say by making it
return a boolean status, makes little sense.

> It would be nice to have the information that an "include_if_exists" file
> didn't exist, but having a log-level message in the "error" column is a clear
> POLA violation.  People will probably just do something like
>
> SELECT file_name, line_number, error FROM pg_hba_file_rules WHERE error IS NOT
> NULL;
>
> and report an error if any row is found.  Having to parse the error field to
> know if that's really an error or not is going to be a huge foot-gun.  Maybe we
> could indeed report the problem in err_msg but for include_if_exists display it
> in some other column of the view?

Hmm.  One possibility would be to add a direct mention to the
"include", "include_dir" and "include_if_exists" clauses through
pg_hba_file_rules.type?  I don't see how to do that without making the
patch much more invasive as per the existing separation between
tokenization and Hba/IdentLine filling, though, and perhaps the error
provided with the full path to the file would provide enough context
for one to know if the failure is happening on an included file for
database/user lists or a full file from an "include" clause.  It also
means that include_if_exists remains transparent all the time.
Simpler may be for the best here, at the end.

By the way, I am wondering whether process_included_authfile() is
the most intuitive interface here.  The only thing that prevents a
common routine to process the include commands is the failure on
GetConfFilesInDir(), where we need to build a TokenizedAuthLine when
the others have already done so tokenize_file_with_context().  Could
it be cleaner to have a small routine like makeTokenizedAuthLine()
that gets reused when we fail scanning a directory to build a
TokenizedAuthLine, in combination with a small-ish routine working on
a directory like ParseConfigDirectory() but for HBA/ident?  Or we
could just drop process_included_authfile() entirely?  On failure,
this would make the code do a next_line all the time for all the
include clauses.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

22 November 2022, 08:20:01

On Thu, Nov 17, 2022 at 11:33:05AM +0900, Michael Paquier wrote:
> By the way, I am wondering whether process_included_authfile() is
> the most intuitive interface here.  The only thing that prevents a
> common routine to process the include commands is the failure on
> GetConfFilesInDir(), where we need to build a TokenizedAuthLine when
> the others have already done so tokenize_file_with_context().  Could
> it be cleaner to have a small routine like makeTokenizedAuthLine()
> that gets reused when we fail scanning a directory to build a
> TokenizedAuthLine, in combination with a small-ish routine working on
> a directory like ParseConfigDirectory() but for HBA/ident?  Or we
> could just drop process_included_authfile() entirely?  On failure,
> this would make the code do a next_line all the time for all the
> include clauses.

I have been waiting for your reply for some time, so I have taken some
to look at this patch by myself and hacked on it.

At the end, the thing I was not really happy about is the
MemoryContext used to store the set of TokenizedAuthLines.  I have
looked at a couple of approaches, like passing around the context as
you do, but at the end there is something I found annoying: we may
tokenize a file in the line context of a different file, storing in
much more data than just the TokenizedAuthLines.  Then I got a few
steps back and began using a static memory context that only stored
the TokenizedAuthLines, switching to it in one place as of the end of
tokenize_auth_file() when inserting an item.  This makes hbafuncs.c a
bit less aware of the memory context, but I think that we could live
with that with a cleaner tokenization interface.  That's close to what
GUCs do, in some way.  Note that this may be better as an independent
patch, actually, as it impacts the current @ inclusions.

I have noticed a bug in the logic of include_if_exists: we should
ignore only files on ENOENT, but complain for other errnos after
opening the file.

The docs and the sample files have been tweaked a bit, giving a
cleaner separation between the main record types and the inclusion
ones.

+     /* XXX: this should stick to elevel for some cases? */
+     ereport(LOG,
+             (errmsg("skipping missing authentication file \"%s\"",
+                     inc_fullname)));
Should we always issue a LOG here?  In some cases we use an elevel of
DEBUG3.

So, what do you think about something like the attached?  I have begun
a lookup at the tests, but I don't have enough material for an actual
review of this part yet.  Note that I have removed temporarily
process_included_authfile(), as I was looking at all the code branches
in details.  The final result ought to include AbsoluteConfigLocation(),
open_auth_file() and tokenize_auth_file() with an extra missing_ok
argument, I guess ("strict" as proposed originally is not the usual
PG-way for ENOENT-ish problems).  process_line should be used only
when we have no err_msg, meaning that these have been consumed in some
TokenizedAuthLines already.
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

23 November 2022, 06:05:18

On Tue, Nov 22, 2022 at 05:20:01PM +0900, Michael Paquier wrote:
> +     /* XXX: this should stick to elevel for some cases? */
> +     ereport(LOG,
> +             (errmsg("skipping missing authentication file \"%s\"",
> +                     inc_fullname)));
> Should we always issue a LOG here?  In some cases we use an elevel of
> DEBUG3.

And here I think that we need to use elevel.  In hba.c, this would
generate a LOG while hbafuncs.c uses DEBUG3, leading to less noise in
the log files.

> So, what do you think about something like the attached?  I have begun
> a lookup at the tests, but I don't have enough material for an actual
> review of this part yet.  Note that I have removed temporarily
> process_included_authfile(), as I was looking at all the code branches
> in details.  The final result ought to include AbsoluteConfigLocation(),
> open_auth_file() and tokenize_auth_file() with an extra missing_ok
> argument, I guess ("strict" as proposed originally is not the usual
> PG-way for ENOENT-ish problems).  process_line should be used only
> when we have no err_msg, meaning that these have been consumed in some
> TokenizedAuthLines already.

This, however, was still brittle in terms of memory handling.
Reloading the server a few hundred times proved that this was leaking
memory in the tokenization.  This becomes quite simple once you switch
to an approach where tokenize_auth_file() uses its own memory context,
while we store all the TokenizedAuthLines in the static context.  It
also occurred to me that we can create and drop the static context
holding the tokens when opening the top-root auth file, aka with a
depth at 0.  It may be a bit simpler to switch to a single-context
approach for all the allocations of tokenize_auth_file(), though I
find the use of a static context much cleaner for the inclusions with
@ files when we expand an existing list.

The patch can be split, again, into more pieces.  Attached 0001
reworks the memory allocation contexts for the tokenization and 0002
is the main feature.  As of things are, I am quite happy with the
shapes of 0001 and 0002.  I have tested quite a bit its robustness,
with valgrind for example, to make sure that there are no leaks in the
postmaster (with[out] -DEXEC_BACKEND).  The inclusion logic is
refactored to be in a state close to your previous patches, see
tokenize_inclusion_file().

Note that the tests are failing for some of the Windows CIs, actually,
due to a difference in some of the paths generated vs the file paths
(aka c:\cirrus vs c:/cirrus, as far as I saw).
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

23 November 2022, 07:56:50

Hi,

Sorry for the very late answer, I had quite a lot of other things going on
recently.  And thanks for taking care of the patchset!

On Wed, Nov 23, 2022 at 03:05:18PM +0900, Michael Paquier wrote:
> On Tue, Nov 22, 2022 at 05:20:01PM +0900, Michael Paquier wrote:
> > +     /* XXX: this should stick to elevel for some cases? */
> > +     ereport(LOG,
> > +             (errmsg("skipping missing authentication file \"%s\"",
> > +                     inc_fullname)));
> > Should we always issue a LOG here?  In some cases we use an elevel of
> > DEBUG3.
>
> And here I think that we need to use elevel.  In hba.c, this would
> generate a LOG while hbafuncs.c uses DEBUG3, leading to less noise in
> the log files.

Yeah I agree the LOG message is only interesting if you're reloading the conf.
I actually thought that this is what I did sorry about that.

> > So, what do you think about something like the attached?  I have begun
> > a lookup at the tests, but I don't have enough material for an actual
> > review of this part yet.  Note that I have removed temporarily
> > process_included_authfile(), as I was looking at all the code branches
> > in details.  The final result ought to include AbsoluteConfigLocation(),
> > open_auth_file() and tokenize_auth_file() with an extra missing_ok
> > argument, I guess ("strict" as proposed originally is not the usual
> > PG-way for ENOENT-ish problems).  process_line should be used only
> > when we have no err_msg, meaning that these have been consumed in some
> > TokenizedAuthLines already.
>
> This, however, was still brittle in terms of memory handling.
> Reloading the server a few hundred times proved that this was leaking
> memory in the tokenization.

Oh, nice catch.

> This becomes quite simple once you switch
> to an approach where tokenize_auth_file() uses its own memory context,
> while we store all the TokenizedAuthLines in the static context.  It
> also occurred to me that we can create and drop the static context
> holding the tokens when opening the top-root auth file, aka with a
> depth at 0.  It may be a bit simpler to switch to a single-context
> approach for all the allocations of tokenize_auth_file(), though I
> find the use of a static context much cleaner for the inclusions with
> @ files when we expand an existing list.

Agreed.

The depth 0 is getting used quite a lot now, maybe we should have a define for
it to make it easier to grep, like TOP_LEVEL_AUTH_FILE or something like that?
And also add a define for the magical 10 for the max inclusion depth, for both
auth files and GUC files while at it?

> The patch can be split, again, into more pieces.  Attached 0001
> reworks the memory allocation contexts for the tokenization and 0002
> is the main feature.  As of things are, I am quite happy with the
> shapes of 0001 and 0002.  I have tested quite a bit its robustness,
> with valgrind for example, to make sure that there are no leaks in the
> postmaster (with[out] -DEXEC_BACKEND).  The inclusion logic is
> refactored to be in a state close to your previous patches, see
> tokenize_inclusion_file().

Yep I saw that.  I don't have much to say about the patch, it looks good to me.
The only nitpicking I have is:

+/*
+ * Memory context holding the list of TokenizedAuthLines when parsing
+ * HBA or ident config files.  This is created when loading the top HBA
+ + or ident file (depth of 0).
+ */
+static MemoryContext tokenize_context = NULL;

The comment seems a bit ambiguous as with "loading the top..." you probably
meant something like "loading the file in memory" rather than "(re)loading the
configuration".  Maybe s/loading/opening/?
>
> Note that the tests are failing for some of the Windows CIs, actually,
> due to a difference in some of the paths generated vs the file paths
> (aka c:\cirrus vs c:/cirrus, as far as I saw).

Mmm, I haven't looked deeply so I'm not sure if the perl podules are aware of
it or not, but maybe we could somehow detect the used delimiter at the
beginning after normalizing the directory, and use a $DELIM rather than a plain
"/"?

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

24 November 2022, 05:07:21

On Wed, Nov 23, 2022 at 03:56:50PM +0800, Julien Rouhaud wrote:
> The depth 0 is getting used quite a lot now, maybe we should have a define for
> it to make it easier to grep, like TOP_LEVEL_AUTH_FILE or something like that?
> And also add a define for the magical 10 for the max inclusion depth, for both
> auth files and GUC files while at it?

Sounds like a good idea to me, and it seems to me that this had better
be unified between the GUCs (see ParseConfigFp() that hardcodes a
depth of 0) and hba.c.  It looks like they could be added to
conffiles.h, as of CONF_FILE_START_{LEVEL,DEPTH} and
CONF_FILE_MAX_{LEVEL,DEPTH}.  Would you like to send a patch?

> The comment seems a bit ambiguous as with "loading the top..." you probably
> meant something like "loading the file in memory" rather than "(re)loading the
> configuration".  Maybe s/loading/opening/?

Right.  I have used "opening" at the end.

I have worked on all that, did more polishing to the docs, the
comments and some tiny bits of the logic, and applied both 0001 and
0002.  Now, to the tests..

> Mmm, I haven't looked deeply so I'm not sure if the perl podules are aware of
> it or not, but maybe we could somehow detect the used delimiter at the
> beginning after normalizing the directory, and use a $DELIM rather than a plain
> "/"?

I am not sure.  Could you have a look and see if you can get the CI
back to green?  The first thing I would test is to switch the error
patterns to be regexps based on the basenames rather than the full
paths (tweaking the queries on the system views to do htat), so as we
avoid all this business with slash and backslash transformations.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

24 November 2022, 06:37:23

On Thu, Nov 24, 2022 at 02:07:21PM +0900, Michael Paquier wrote:
> On Wed, Nov 23, 2022 at 03:56:50PM +0800, Julien Rouhaud wrote:
> > The depth 0 is getting used quite a lot now, maybe we should have a define for
> > it to make it easier to grep, like TOP_LEVEL_AUTH_FILE or something like that?
> > And also add a define for the magical 10 for the max inclusion depth, for both
> > auth files and GUC files while at it?
> 
> Sounds like a good idea to me, and it seems to me that this had better
> be unified between the GUCs (see ParseConfigFp() that hardcodes a
> depth of 0) and hba.c.  It looks like they could be added to
> conffiles.h, as of CONF_FILE_START_{LEVEL,DEPTH} and
> CONF_FILE_MAX_{LEVEL,DEPTH}.  Would you like to send a patch?

Agreed, and yes I will take care of that shortly.

> > The comment seems a bit ambiguous as with "loading the top..." you probably
> > meant something like "loading the file in memory" rather than "(re)loading the
> > configuration".  Maybe s/loading/opening/?
> 
> Right.  I have used "opening" at the end.
> 
> I have worked on all that, did more polishing to the docs, the
> comments and some tiny bits of the logic, and applied both 0001 and
> 0002.

Thanks a lot!

> Now, to the tests..
> 
> > Mmm, I haven't looked deeply so I'm not sure if the perl podules are aware of
> > it or not, but maybe we could somehow detect the used delimiter at the
> > beginning after normalizing the directory, and use a $DELIM rather than a plain
> > "/"?
> 
> I am not sure.  Could you have a look and see if you can get the CI
> back to green?  The first thing I would test is to switch the error
> patterns to be regexps based on the basenames rather than the full
> paths (tweaking the queries on the system views to do htat), so as we
> avoid all this business with slash and backslash transformations.

I'm also working on it!  Hopefully I should be able to come up with a fix soon.

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

24 November 2022, 09:07:24

Hi,

On Thu, Nov 24, 2022 at 02:37:23PM +0800, Julien Rouhaud wrote:
> On Thu, Nov 24, 2022 at 02:07:21PM +0900, Michael Paquier wrote:
> > On Wed, Nov 23, 2022 at 03:56:50PM +0800, Julien Rouhaud wrote:
> > > The depth 0 is getting used quite a lot now, maybe we should have a define for
> > > it to make it easier to grep, like TOP_LEVEL_AUTH_FILE or something like that?
> > > And also add a define for the magical 10 for the max inclusion depth, for both
> > > auth files and GUC files while at it?
> > 
> > Sounds like a good idea to me, and it seems to me that this had better
> > be unified between the GUCs (see ParseConfigFp() that hardcodes a
> > depth of 0) and hba.c.  It looks like they could be added to
> > conffiles.h, as of CONF_FILE_START_{LEVEL,DEPTH} and
> > CONF_FILE_MAX_{LEVEL,DEPTH}.  Would you like to send a patch?

So I went with CONF_FILE_START_DEPTH and CONF_FILE_MAX_DEPTH.  Attached v22
that fixes it in all the places I found.

> > Now, to the tests..
> > 
> > > Mmm, I haven't looked deeply so I'm not sure if the perl podules are aware of
> > > it or not, but maybe we could somehow detect the used delimiter at the
> > > beginning after normalizing the directory, and use a $DELIM rather than a plain
> > > "/"?
> > 
> > I am not sure.  Could you have a look and see if you can get the CI
> > back to green?  The first thing I would test is to switch the error
> > patterns to be regexps based on the basenames rather than the full
> > paths (tweaking the queries on the system views to do htat), so as we
> > avoid all this business with slash and backslash transformations.

Apparently just making sure that the $node->data_dir consistently uses forward
slashes is enough to make the CI happy, for VS 2019 [1] and MinGW64 [2], so
done this way with an extra normalization step.

[1] https://cirrus-ci.com/task/4944203946917888
[2] https://cirrus-ci.com/task/6070103853760512

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

24 November 2022, 22:41:59

On Thu, Nov 24, 2022 at 05:07:24PM +0800, Julien Rouhaud wrote:
> So I went with CONF_FILE_START_DEPTH and CONF_FILE_MAX_DEPTH.  Attached v22
> that fixes it in all the places I found.

Sounds fine.  Added one comment, fixed one comment, and applied.
Thanks!
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

25 November 2022, 02:25:30

On Fri, Nov 25, 2022 at 07:41:59AM +0900, Michael Paquier wrote:
> On Thu, Nov 24, 2022 at 05:07:24PM +0800, Julien Rouhaud wrote:
> > So I went with CONF_FILE_START_DEPTH and CONF_FILE_MAX_DEPTH.  Attached v22
> > that fixes it in all the places I found.
> 
> Sounds fine.  Added one comment, fixed one comment, and applied.
> Thanks!

Thanks a lot!

Re: Allow file inclusion in pg_hba and pg_ident files

From

Ian Lawrence Barwick

Date:

27 November 2022, 07:31:38

2022年11月25日(金) 11:25 Julien Rouhaud <rjuju123@gmail.com>:
>
> On Fri, Nov 25, 2022 at 07:41:59AM +0900, Michael Paquier wrote:
> > On Thu, Nov 24, 2022 at 05:07:24PM +0800, Julien Rouhaud wrote:
> > > So I went with CONF_FILE_START_DEPTH and CONF_FILE_MAX_DEPTH.  Attached v22
> > > that fixes it in all the places I found.
> >
> > Sounds fine.  Added one comment, fixed one comment, and applied.
> > Thanks!
>
> Thanks a lot!

Hi

I'm trying to reconcile open CommitFest entries with their actual
status; the entry for this:

  https://commitfest.postgresql.org/40/3558/

shows as "Waiting on Author", but looks like it's all been committed;
is there anything
left to do on this?

Thanks

Ian Barwick

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

27 November 2022, 07:39:44

Le dim. 27 nov. 2022 à 15:31, Ian Lawrence Barwick

I'm trying to reconcile open CommitFest entries with their actual
status; the entry for this:

https://commitfest.postgresql.org/40/3558/

shows as "Waiting on Author", but looks like it's all been committed;
is there anything
left to do on this?

right the CF entry is out of date. there's still the additional tap test that needs to be taken care of. I sent a new version recently that works on windows CI, so I guess the correct status should now be needs review, although the latest patchset probably doesn't apply anymore since the first patch has been committed. I'm traveling today I'll try to send a rebased version in the evening

Re: Allow file inclusion in pg_hba and pg_ident files

From

Julien Rouhaud

Date:

27 November 2022, 11:04:46

On Sun, Nov 27, 2022 at 03:39:44PM +0800, Julien Rouhaud wrote:
> Le dim. 27 nov. 2022 à 15:31, Ian Lawrence Barwick
> 
> >
> > I'm trying to reconcile open CommitFest entries with their actual
> > status; the entry for this:
> >
> >   https://commitfest.postgresql.org/40/3558/
> >
> > shows as "Waiting on Author", but looks like it's all been committed;
> > is there anything
> > left to do on this?
> >
> 
> right the CF entry is out of date. there's still the additional tap test
> that needs to be taken care of. I sent a new version recently that works on
> windows CI, so I guess the correct status should now be needs review,
> although the latest patchset probably doesn't apply anymore since the first
> patch has been committed. I'm traveling today I'll try to send a rebased
> version in the evening

And here's the rebased patch for the TAP tests.  I will switch the CF entry to
Needs Review.

Attachment

v23-0001-Add-regression-tests-for-file-inclusion-in-HBA-e.patch

Re: Allow file inclusion in pg_hba and pg_ident files

From

Ian Lawrence Barwick

Date:

27 November 2022, 12:49:31

2022年11月27日(日) 20:04 Julien Rouhaud <rjuju123@gmail.com>:
>
> On Sun, Nov 27, 2022 at 03:39:44PM +0800, Julien Rouhaud wrote:
> > Le dim. 27 nov. 2022 à 15:31, Ian Lawrence Barwick
> >
> > >
> > > I'm trying to reconcile open CommitFest entries with their actual
> > > status; the entry for this:
> > >
> > >   https://commitfest.postgresql.org/40/3558/
> > >
> > > shows as "Waiting on Author", but looks like it's all been committed;
> > > is there anything
> > > left to do on this?
> > >
> >
> > right the CF entry is out of date. there's still the additional tap test
> > that needs to be taken care of. I sent a new version recently that works on
> > windows CI, so I guess the correct status should now be needs review,
> > although the latest patchset probably doesn't apply anymore since the first
> > patch has been committed. I'm traveling today I'll try to send a rebased
> > version in the evening
>
> And here's the rebased patch for the TAP tests.  I will switch the CF entry to
> Needs Review.

Thanks for the quick update!

Regards

Ian Barwick

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

28 November 2022, 01:08:36

On Sun, Nov 27, 2022 at 09:49:31PM +0900, Ian Lawrence Barwick wrote:
> Thanks for the quick update!

FWIW, I do intend to tackle this last part ASAP, as the last piece to
commit to get the full picture in the tree.
--
Michael

Attachment

signature.asc

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

28 November 2022, 07:12:40

On Sun, Nov 27, 2022 at 07:04:46PM +0800, Julien Rouhaud wrote:
> And here's the rebased patch for the TAP tests.  I will switch the CF entry to
> Needs Review.

I have been looking at that, and applied the part 1 of the test for
the positive tests to cover the basic ground I'd like to see covered
for this feature.  I have done quite a few changes to this part, like
reworking the way the incrementation of the counters is done for the
rule/map numbers and the line numbers of each file.  The idea to use a
global dictionary to track the current number of lines was rather
clear, so I have kept that.  add_{hba,ident}_line() rely on the
files included in a directory to be ordered in the same way as the
backend, and I am fine to live with that.

+# Normalize the data directory for Windows
+$data_dir =~ s/\/\.\//\//g;    # reduce /./ to /
+$data_dir =~ s/\/\//\//g;      # reduce // to /
+$data_dir =~ s/\/$//;          # remove trailing /
+$data_dir =~ s/\\/\//g;        # change \ to /
The mysticism around the conversion of the data directory path is
something I'd like to avoid, or I suspect that we are going to have a
hard time debugging any issues in this area.  For the positive cases
committed, I have gone through the approach of using the base name of
the configuration file to avoid those compatibility issues.  Wouldn't
it better to do the same for the expected error messages, switching to
an approach where we only check for the configuration file name in all
these regexps?

+# Generate the expected output for the auth file view error reporting (file
+# name, file line, error), for the given array of error conditions, as
+# generated generated by add_hba_line/add_ident_line with an err_str.
+sub generate_log_err_rows
+{
generate_log_err_rows() does not strike me as a good interface.  The
issues I am pointed at here is it depends directly on the result
returned by add_hba_line() or add_ident_line() when the caller passes
down an error string.  As far as I get it, this interface is done
as-is to correctly increment the line number of one file, and I'd like
to believe that add_{hba,ident}_line() should be kept designed so as
they only return the expected matching entry in their catalog views,
without any code path returning something else (in your case the
triplet [ $filename, $fileline, $err_str ]).  That makes the whole
harder to understand.

Attached is a rebased patch of the rest.  With everything we have
dealt with in this CF, perhaps it would be better to mark this entry
as committed and switch to a new thread where the negative TAP tests
could be discussed?  It looks like the part for the error pattern
checks compared with the logs could be the easiest portion of what's
remaining.
--
Michael

Attachment

Re: Allow file inclusion in pg_hba and pg_ident files

From

Michael Paquier

Date:

28 November 2022, 23:48:44

On Mon, Nov 28, 2022 at 04:12:40PM +0900, Michael Paquier wrote:
> Attached is a rebased patch of the rest.  With everything we have
> dealt with in this CF, perhaps it would be better to mark this entry
> as committed and switch to a new thread where the negative TAP tests
> could be discussed?

For now, I have marked the CF entry as committed.
--
Michael

Attachment

signature.asc