Thread: Allow continuations in "pg_hba.conf" files

Allow continuations in "pg_hba.conf" files

From

Fabien COELHO

Date:

25 March 2020, 18:09:38

Hello,

After writing an unreadable and stupidly long line for ldap 
authentification in a "pg_hba.conf" file, I figured out that allowing 
continuations looked simple enough and should just be done.

Patch attached.

-- 
Fabien.

Attachment

pg-hba-cont-1.patch

Re: Allow continuations in "pg_hba.conf" files

From

Justin Pryzby

Date:

25 March 2020, 19:39:23

Hi,

On Wed, Mar 25, 2020 at 07:09:38PM +0100, Fabien COELHO wrote:
> 
> Hello,
> 
> After writing an unreadable and stupidly long line for ldap authentification
> in a "pg_hba.conf" file, I figured out that allowing continuations looked
> simple enough and should just be done.

I tried this briefly.

> -   Records cannot be continued across lines.
> +   Records can be backslash-continued across lines.

Maybe say: "lines ending with a backslash are logically continued on the next
line", or similar.

> +            /* else we have a continuation, just blank it and loop */
> +            continuations++;
> +            *curend++ = ' ';

Since it puts a blank there, it creates a "word" boundary, which I gather
worked for your use case.  But I wonder whether it's needed to add a space (or
otherwise, document that lines cannot be split beween words?).

You might compare this behavior with that of makefiles (or find a better
example) which I happen to recall *don't* add a space; if you want that, you
have to end the line with: " \" not just "\".

Anyway, I checked that the current patch handles users split across lines, like:
alice,\
bob,\
carol

As written, that depends on the parser's behavior of ignoring commas and
blanks, since it sees:
"alice,[SPACE]bob,[SPACE]carol"

Maybe it'd be nice to avoid depending on that.

I tried with a username called "alice,bob", split across lines:

"alice,\
bob",\

But then your patch makes it look for a user called "alice, bob" (with a
space).  I realize that's not a compelling argument :)

Note, that also appears to affect the "username maps" file.  So mention in that
chapter, too.
https://www.postgresql.org/docs/current/auth-username-maps.html

Cheers,
-- 
Justin

Re: Allow continuations in "pg_hba.conf" files

From

Fabien COELHO

Date:

25 March 2020, 20:45:40

Hello Justin,

thanks for the feedback.

>> -   Records cannot be continued across lines.
>> +   Records can be backslash-continued across lines.
>
> Maybe say: "lines ending with a backslash are logically continued on the next
> line", or similar.

I tried to change it along that.

> Since it puts a blank there, it creates a "word" boundary, which I gather
> worked for your use case.  But I wonder whether it's needed to add a space (or
> otherwise, document that lines cannot be split beween words?).

Hmmm. Ok, you are right. I hesitated while doing it. I removed the char 
instead, so that it does not add a word break.

> Note, that also appears to affect the "username maps" file.  So mention 
> in that chapter, too. 
> https://www.postgresql.org/docs/current/auth-username-maps.html

Indeed, the same tokenizer is used. I updated a sentence to point on 
continuations.

-- 
Fabien.

Attachment

pg-hba-cont-2.patch

Re: Allow continuations in "pg_hba.conf" files

From

David Zhang

Date:

02 April 2020, 00:20:12

Hi Fabien,
Should we consider the case "\ ", i.e. one or more spaces after the backslash?
For example, if I replace a user map 
"mymap   /^(.*)@mydomain\.com$      \1" with 
"mymap   /^(.*)@mydomain\.com$      \ "
"\1"
by adding one extra space after the backslash, then I got the pg_role="\\"
but I think what we expect is pg_role="\\1"

Re: Allow continuations in "pg_hba.conf" files

From

Kyotaro Horiguchi

Date:

02 April 2020, 04:25:43

At Thu, 02 Apr 2020 00:20:12 +0000, David Zhang <david.zhang@highgo.ca> wrote in 
> Hi Fabien,
> Should we consider the case "\ ", i.e. one or more spaces after the backslash?
> For example, if I replace a user map 
> "mymap   /^(.*)@mydomain\.com$      \1" with 
> "mymap   /^(.*)@mydomain\.com$      \ "
> "\1"
> by adding one extra space after the backslash, then I got the pg_role="\\"
> but I think what we expect is pg_role="\\1"

FWIW, I don't think so. Generally a trailing backspace is an escape
character for the following newline.  And '\ ' is a escaped space,
which is usualy menas a space itself.

In this case escape character doesn't work generally and I think it is
natural that a backslash in the middle of a line is a backslash
character itself.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Allow continuations in "pg_hba.conf" files

From

Fabien COELHO

Date:

02 April 2020, 05:25:36

Hello,

> FWIW, I don't think so. Generally a trailing backspace is an escape
> character for the following newline.  And '\ ' is a escaped space,
> which is usualy menas a space itself.
>
> In this case escape character doesn't work generally and I think it is 
> natural that a backslash in the middle of a line is a backslash 
> character itself.

I concur: The backslash char is only a continuation as the very last 
character of the line, before cr/nl line ending markers.

There are no assumption about backslash escaping, quotes and such, which 
seems reasonable given the lexing structure of the files, i.e. records of 
space-separated words, and # line comments.

-- 
Fabien.

Re: Allow continuations in "pg_hba.conf" files

From

Justin Pryzby

Date:

02 April 2020, 05:58:19

On Thu, Apr 02, 2020 at 07:25:36AM +0200, Fabien COELHO wrote:
> 
> Hello,
> 
> > FWIW, I don't think so. Generally a trailing backspace is an escape
> > character for the following newline.  And '\ ' is a escaped space,
> > which is usualy menas a space itself.
> > 
> > In this case escape character doesn't work generally and I think it is
> > natural that a backslash in the middle of a line is a backslash
> > character itself.
> 
> I concur: The backslash char is only a continuation as the very last
> character of the line, before cr/nl line ending markers.
> 
> There are no assumption about backslash escaping, quotes and such, which
> seems reasonable given the lexing structure of the files, i.e. records of
> space-separated words, and # line comments.

Quoting does allow words containing spaces:

https://www.postgresql.org/docs/current/auth-pg-hba-conf.html
|A record is made up of a number of fields which are separated by spaces and/or
|tabs. Fields can contain white space if the field value is double-quoted.
|Quoting one of the keywords in a database, user, or address field (e.g., all or
|replication) makes the word lose its special meaning, and just match a
|database, user, or host with that name.

-- 
Justin

Re: Allow continuations in "pg_hba.conf" files

From

Fabien COELHO

Date:

02 April 2020, 06:46:08

Hi Justin,

>> There are no assumption about backslash escaping, quotes and such, which
>> seems reasonable given the lexing structure of the files, i.e. records of
>> space-separated words, and # line comments.
>
> Quoting does allow words containing spaces:

Ok.

I meant that the continuation handling does not care of that, i.e. if the 
continuation is within quotes, then the quoted stuff is implicitely 
continuated, there is no different rule because it is within quotes.

-- 
Fabien.

Re: Allow continuations in "pg_hba.conf" files

From

David Zhang

Date:

02 April 2020, 18:05:17

On 2020-04-01 10:25 p.m., Fabien COELHO wrote:

>
> Hello,
>
>> FWIW, I don't think so. Generally a trailing backspace is an escape
>> character for the following newline.  And '\ ' is a escaped space,
>> which is usualy menas a space itself.
>>
>> In this case escape character doesn't work generally and I think it
>> is natural that a backslash in the middle of a line is a backslash
>> character itself.
>
> I concur: The backslash char is only a continuation as the very last
> character of the line, before cr/nl line ending markers.

+Agree. However, it would nice to update the sentence below if I
understand it correctly.

"+   Comments, whitespace and continuations are handled in the same way
as in" pg_hba.conf

For example, if a user provide a configuration like below (even such a
comments is not recommended)

"host    all     all     127.0.0.1/32    trust  #COMMENTS, it works"

i.e. the original pg_hba.conf allows to have comments in each line, but
with new continuation introduced, the comments has to be put to the last
line.

>
> There are no assumption about backslash escaping, quotes and such,
> which seems reasonable given the lexing structure of the files, i.e.
> records of space-separated words, and # line comments.
>
--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca

Re: Allow continuations in "pg_hba.conf" files

From

Fabien COELHO

Date:

03 April 2020, 05:46:32

Hello David,

> +Agree. However, it would nice to update the sentence below if I understand 
> it correctly.
>
> "+   Comments, whitespace and continuations are handled in the same way as 
> in" pg_hba.conf

In the attached v3, I've tried to clarify comments and doc about 
tokenization rules relating to comments, strings and continuations.

-- 
Fabien.

Attachment

pg-hba-cont-3.patch

Re: Allow continuations in "pg_hba.conf" files

From

Fabien COELHO

Date:

06 July 2020, 15:38:14

> In the attached v3, I've tried to clarify comments and doc about tokenization 
> rules relating to comments, strings and continuations.

Attached v4 improves comments & doc as suggested by Justin.

-- 
Fabien.

Attachment

pg-hba-cont-4.patch

Re: Allow continuations in "pg_hba.conf" files

From

Tom Lane

Date:

02 September 2020, 18:53:13

Fabien COELHO <coelho@cri.ensmp.fr> writes:
> [ pg-hba-cont-4.patch ]

I looked this over and I think it seems reasonable, but there's
something else we should do.  If people are writing lines long
enough that they need to continue them, how long will it be
before they overrun the line length limit?  Admittedly, there's
a good deal of daylight between 80 characters and 8K, but if
we're busy removing restrictions on password length in an adjacent
thread [1], I think we ought to get rid of pg_hba.conf's line length
restriction while we're at it.

Accordingly, I borrowed some code from that thread and present
the attached revision.  I also added some test coverage, since
that was lacking before, and wordsmithed docs and comments slightly.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/09512C4F-8CB9-4021-B455-EF4C4F0D55A0%40amazon.com

diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 5cd88b462d..d62d1a061c 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -77,13 +77,15 @@
    The general format of the <filename>pg_hba.conf</filename> file is
    a set of records, one per line. Blank lines are ignored, as is any
    text after the <literal>#</literal> comment character.
-   Records cannot be continued across lines.
+   A record can be continued onto the next line by ending the line with
+   a backslash. (Backslashes are not special except at the end of a line.)
    A record is made
    up of a number of fields which are separated by spaces and/or tabs.
    Fields can contain white space if the field value is double-quoted.
    Quoting one of the keywords in a database, user, or address field (e.g.,
    <literal>all</literal> or <literal>replication</literal>) makes the word lose its special
    meaning, and just match a database, user, or host with that name.
+   Backslash line continuation applies even within quoted text or comments.
   </para>

   <para>
@@ -821,7 +823,7 @@ local   db1,db2,@demodbs  all                                   md5
 <synopsis>
 <replaceable>map-name</replaceable> <replaceable>system-username</replaceable>
<replaceable>database-username</replaceable>
 </synopsis>
-   Comments and whitespace are handled in the same way as in
+   Comments, whitespace and line continuations are handled in the same way as in
    <filename>pg_hba.conf</filename>.  The
    <replaceable>map-name</replaceable> is an arbitrary name that will be used to
    refer to this mapping in <filename>pg_hba.conf</filename>. The other
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index 9d63830553..5991a21cf2 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -29,6 +29,7 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_type.h"
 #include "common/ip.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/ifaddr.h"
 #include "libpq/libpq.h"
@@ -54,7 +55,6 @@


 #define MAX_TOKEN    256
-#define MAX_LINE    8192

 /* callback data for check_network_callback */
 typedef struct check_network_data
@@ -166,11 +166,19 @@ pg_isblank(const char c)
 /*
  * Grab one token out of the string pointed to by *lineptr.
  *
- * Tokens are strings of non-blank
- * characters bounded by blank characters, commas, beginning of line, and
- * end of line. Blank means space or tab. Tokens can be delimited by
- * double quotes (this allows the inclusion of blanks, but not newlines).
- * Comments (started by an unquoted '#') are skipped.
+ * Tokens are strings of non-blank characters bounded by blank characters,
+ * commas, beginning of line, and end of line.  Blank means space or tab.
+ *
+ * Tokens can be delimited by double quotes (this allows the inclusion of
+ * blanks or '#', but not newlines).  As in SQL, write two double-quotes
+ * to represent a double quote.
+ *
+ * Comments (started by an unquoted '#') are skipped, i.e. the remainder
+ * of the line is ignored.
+ *
+ * (Note that line continuation processing happens before tokenization.
+ * Thus, if a continuation occurs within quoted text or a comment, the
+ * quoted text or comment is considered to continue to the next line.)
  *
  * The token, if any, is returned at *buf (a buffer of size bufsz), and
  * *lineptr is advanced past the token.
@@ -470,6 +478,7 @@ static MemoryContext
 tokenize_file(const char *filename, FILE *file, List **tok_lines, int elevel)
 {
     int            line_number = 1;
+    StringInfoData buf;
     MemoryContext linecxt;
     MemoryContext oldcxt;

@@ -478,47 +487,72 @@ tokenize_file(const char *filename, FILE *file, List **tok_lines, int elevel)
                                     ALLOCSET_SMALL_SIZES);
     oldcxt = MemoryContextSwitchTo(linecxt);

+    initStringInfo(&buf);
+
     *tok_lines = NIL;

     while (!feof(file) && !ferror(file))
     {
-        char        rawline[MAX_LINE];
         char       *lineptr;
         List       *current_line = NIL;
         char       *err_msg = NULL;
+        int            last_backslash_buflen = 0;
+        int            continuations = 0;

-        if (!fgets(rawline, sizeof(rawline), file))
-        {
-            int            save_errno = errno;
+        /* Collect the next input line, handling backslash continuations */
+        resetStringInfo(&buf);

-            if (!ferror(file))
-                break;            /* normal EOF */
-            /* I/O error! */
-            ereport(elevel,
-                    (errcode_for_file_access(),
-                     errmsg("could not read file \"%s\": %m", filename)));
-            err_msg = psprintf("could not read file \"%s\": %s",
-                               filename, strerror(save_errno));
-            rawline[0] = '\0';
-        }
-        if (strlen(rawline) == MAX_LINE - 1)
+        while (!feof(file) && !ferror(file))
         {
-            /* Line too long! */
-            ereport(elevel,
-                    (errcode(ERRCODE_CONFIG_FILE_ERROR),
-                     errmsg("authentication file line too long"),
-                     errcontext("line %d of configuration file \"%s\"",
-                                line_number, filename)));
-            err_msg = "authentication file line too long";
-        }
+            /* Make sure there's a reasonable amount of room in the buffer */
+            enlargeStringInfo(&buf, 128);
+
+            /* Read some data, appending it to what we already have */
+            if (fgets(buf.data + buf.len, buf.maxlen - buf.len, file) == NULL)
+            {
+                int            save_errno = errno;
+
+                if (!ferror(file))
+                    break;        /* normal EOF */
+                /* I/O error! */
+                ereport(elevel,
+                        (errcode_for_file_access(),
+                         errmsg("could not read file \"%s\": %m", filename)));
+                err_msg = psprintf("could not read file \"%s\": %s",
+                                   filename, strerror(save_errno));
+                resetStringInfo(&buf);
+                break;
+            }
+            buf.len += strlen(buf.data + buf.len);
+
+            /* If we haven't got a whole line, loop to read more */
+            if (!(buf.len > 0 && buf.data[buf.len - 1] == '\n'))
+                continue;
+
+            /* Strip trailing newline, including \r in case we're on Windows */
+            buf.len = pg_strip_crlf(buf.data);
+
+            /*
+             * Check for backslash continuation.  The backslash must be after
+             * the last place we found a continuation, else two backslashes
+             * followed by two \n's would behave surprisingly.
+             */
+            if (buf.len > last_backslash_buflen &&
+                buf.data[buf.len - 1] == '\\')
+            {
+                /* Continuation, so strip it and keep reading */
+                buf.data[--buf.len] = '\0';
+                last_backslash_buflen = buf.len;
+                continuations++;
+                continue;
+            }

-        /* Strip trailing linebreak from rawline */
-        lineptr = rawline + strlen(rawline) - 1;
-        while (lineptr >= rawline && (*lineptr == '\n' || *lineptr == '\r'))
-            *lineptr-- = '\0';
+            /* Nope, so we have the whole line */
+            break;
+        }

         /* Parse fields */
-        lineptr = rawline;
+        lineptr = buf.data;
         while (*lineptr && err_msg == NULL)
         {
             List       *current_field;
@@ -538,12 +572,12 @@ tokenize_file(const char *filename, FILE *file, List **tok_lines, int elevel)
             tok_line = (TokenizedLine *) palloc(sizeof(TokenizedLine));
             tok_line->fields = current_line;
             tok_line->line_num = line_number;
-            tok_line->raw_line = pstrdup(rawline);
+            tok_line->raw_line = pstrdup(buf.data);
             tok_line->err_msg = err_msg;
             *tok_lines = lappend(*tok_lines, tok_line);
         }

-        line_number++;
+        line_number += continuations + 1;
     }

     MemoryContextSwitchTo(oldcxt);
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 1b4323fe2a..59b1b403c5 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -29,7 +29,8 @@ sub reset_pg_hba
     my $hba_method = shift;

     unlink($node->data_dir . '/pg_hba.conf');
-    $node->append_conf('pg_hba.conf', "local all all $hba_method");
+    # just for testing purposes, use a continuation line
+    $node->append_conf('pg_hba.conf', "local all all\\\n $hba_method");
     $node->reload;
     return;
 }

Re: Allow continuations in "pg_hba.conf" files

From

Tom Lane

Date:

03 September 2020, 16:17:51

I wrote:
> Accordingly, I borrowed some code from that thread and present
> the attached revision.  I also added some test coverage, since
> that was lacking before, and wordsmithed docs and comments slightly.

Hearing no comments, pushed that way.

            regards, tom lane

Re: Allow continuations in "pg_hba.conf" files

From

Fabien COELHO

Date:

04 September 2020, 14:05:02

Hello Tom,

>> Accordingly, I borrowed some code from that thread and present
>> the attached revision.  I also added some test coverage, since
>> that was lacking before, and wordsmithed docs and comments slightly.
>
> Hearing no comments, pushed that way.

Thanks for the fixes and improvements!

I notice that buf.data is not freed. I guess that the server memory 
management will recover it.

-- 
Fabien.

Re: Allow continuations in "pg_hba.conf" files

From

Tom Lane

Date:

04 September 2020, 15:00:31

Fabien COELHO <coelho@cri.ensmp.fr> writes:
> I notice that buf.data is not freed. I guess that the server memory 
> management will recover it.

Yeah, it's in the transient context holding all of the results of
reading the file.  I considered pfree'ing it at the end of the
function, but I concluded there's no point.  The space will be
recycled when the context is destroyed, and since we're not (IIRC)
going to allocate anything more in that context, nothing would be
gained by freeing it earlier --- it'd just stay as unused memory
within the context.

            regards, tom lane