Thread: Allow continuations in "pg_hba.conf" files
Hello, After writing an unreadable and stupidly long line for ldap authentification in a "pg_hba.conf" file, I figured out that allowing continuations looked simple enough and should just be done. Patch attached. -- Fabien.
Attachment
Hi, On Wed, Mar 25, 2020 at 07:09:38PM +0100, Fabien COELHO wrote: > > Hello, > > After writing an unreadable and stupidly long line for ldap authentification > in a "pg_hba.conf" file, I figured out that allowing continuations looked > simple enough and should just be done. I tried this briefly. > - Records cannot be continued across lines. > + Records can be backslash-continued across lines. Maybe say: "lines ending with a backslash are logically continued on the next line", or similar. > + /* else we have a continuation, just blank it and loop */ > + continuations++; > + *curend++ = ' '; Since it puts a blank there, it creates a "word" boundary, which I gather worked for your use case. But I wonder whether it's needed to add a space (or otherwise, document that lines cannot be split beween words?). You might compare this behavior with that of makefiles (or find a better example) which I happen to recall *don't* add a space; if you want that, you have to end the line with: " \" not just "\". Anyway, I checked that the current patch handles users split across lines, like: alice,\ bob,\ carol As written, that depends on the parser's behavior of ignoring commas and blanks, since it sees: "alice,[SPACE]bob,[SPACE]carol" Maybe it'd be nice to avoid depending on that. I tried with a username called "alice,bob", split across lines: "alice,\ bob",\ But then your patch makes it look for a user called "alice, bob" (with a space). I realize that's not a compelling argument :) Note, that also appears to affect the "username maps" file. So mention in that chapter, too. https://www.postgresql.org/docs/current/auth-username-maps.html Cheers, -- Justin
Hello Justin, thanks for the feedback. >> - Records cannot be continued across lines. >> + Records can be backslash-continued across lines. > > Maybe say: "lines ending with a backslash are logically continued on the next > line", or similar. I tried to change it along that. > Since it puts a blank there, it creates a "word" boundary, which I gather > worked for your use case. But I wonder whether it's needed to add a space (or > otherwise, document that lines cannot be split beween words?). Hmmm. Ok, you are right. I hesitated while doing it. I removed the char instead, so that it does not add a word break. > Note, that also appears to affect the "username maps" file. So mention > in that chapter, too. > https://www.postgresql.org/docs/current/auth-username-maps.html Indeed, the same tokenizer is used. I updated a sentence to point on continuations. -- Fabien.
Attachment
Hi Fabien, Should we consider the case "\ ", i.e. one or more spaces after the backslash? For example, if I replace a user map "mymap /^(.*)@mydomain\.com$ \1" with "mymap /^(.*)@mydomain\.com$ \ " "\1" by adding one extra space after the backslash, then I got the pg_role="\\" but I think what we expect is pg_role="\\1"
At Thu, 02 Apr 2020 00:20:12 +0000, David Zhang <david.zhang@highgo.ca> wrote in > Hi Fabien, > Should we consider the case "\ ", i.e. one or more spaces after the backslash? > For example, if I replace a user map > "mymap /^(.*)@mydomain\.com$ \1" with > "mymap /^(.*)@mydomain\.com$ \ " > "\1" > by adding one extra space after the backslash, then I got the pg_role="\\" > but I think what we expect is pg_role="\\1" FWIW, I don't think so. Generally a trailing backspace is an escape character for the following newline. And '\ ' is a escaped space, which is usualy menas a space itself. In this case escape character doesn't work generally and I think it is natural that a backslash in the middle of a line is a backslash character itself. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Hello, > FWIW, I don't think so. Generally a trailing backspace is an escape > character for the following newline. And '\ ' is a escaped space, > which is usualy menas a space itself. > > In this case escape character doesn't work generally and I think it is > natural that a backslash in the middle of a line is a backslash > character itself. I concur: The backslash char is only a continuation as the very last character of the line, before cr/nl line ending markers. There are no assumption about backslash escaping, quotes and such, which seems reasonable given the lexing structure of the files, i.e. records of space-separated words, and # line comments. -- Fabien.
On Thu, Apr 02, 2020 at 07:25:36AM +0200, Fabien COELHO wrote: > > Hello, > > > FWIW, I don't think so. Generally a trailing backspace is an escape > > character for the following newline. And '\ ' is a escaped space, > > which is usualy menas a space itself. > > > > In this case escape character doesn't work generally and I think it is > > natural that a backslash in the middle of a line is a backslash > > character itself. > > I concur: The backslash char is only a continuation as the very last > character of the line, before cr/nl line ending markers. > > There are no assumption about backslash escaping, quotes and such, which > seems reasonable given the lexing structure of the files, i.e. records of > space-separated words, and # line comments. Quoting does allow words containing spaces: https://www.postgresql.org/docs/current/auth-pg-hba-conf.html |A record is made up of a number of fields which are separated by spaces and/or |tabs. Fields can contain white space if the field value is double-quoted. |Quoting one of the keywords in a database, user, or address field (e.g., all or |replication) makes the word lose its special meaning, and just match a |database, user, or host with that name. -- Justin
Hi Justin, >> There are no assumption about backslash escaping, quotes and such, which >> seems reasonable given the lexing structure of the files, i.e. records of >> space-separated words, and # line comments. > > Quoting does allow words containing spaces: Ok. I meant that the continuation handling does not care of that, i.e. if the continuation is within quotes, then the quoted stuff is implicitely continuated, there is no different rule because it is within quotes. -- Fabien.
On 2020-04-01 10:25 p.m., Fabien COELHO wrote: > > Hello, > >> FWIW, I don't think so. Generally a trailing backspace is an escape >> character for the following newline. And '\ ' is a escaped space, >> which is usualy menas a space itself. >> >> In this case escape character doesn't work generally and I think it >> is natural that a backslash in the middle of a line is a backslash >> character itself. > > I concur: The backslash char is only a continuation as the very last > character of the line, before cr/nl line ending markers. +Agree. However, it would nice to update the sentence below if I understand it correctly. "+ Comments, whitespace and continuations are handled in the same way as in" pg_hba.conf For example, if a user provide a configuration like below (even such a comments is not recommended) "host all all 127.0.0.1/32 trust #COMMENTS, it works" i.e. the original pg_hba.conf allows to have comments in each line, but with new continuation introduced, the comments has to be put to the last line. > > There are no assumption about backslash escaping, quotes and such, > which seems reasonable given the lexing structure of the files, i.e. > records of space-separated words, and # line comments. > -- David Software Engineer Highgo Software Inc. (Canada) www.highgo.ca
Hello David, > +Agree. However, it would nice to update the sentence below if I understand > it correctly. > > "+ Comments, whitespace and continuations are handled in the same way as > in" pg_hba.conf In the attached v3, I've tried to clarify comments and doc about tokenization rules relating to comments, strings and continuations. -- Fabien.
Attachment
> In the attached v3, I've tried to clarify comments and doc about tokenization > rules relating to comments, strings and continuations. Attached v4 improves comments & doc as suggested by Justin. -- Fabien.
Attachment
Fabien COELHO <coelho@cri.ensmp.fr> writes: > [ pg-hba-cont-4.patch ] I looked this over and I think it seems reasonable, but there's something else we should do. If people are writing lines long enough that they need to continue them, how long will it be before they overrun the line length limit? Admittedly, there's a good deal of daylight between 80 characters and 8K, but if we're busy removing restrictions on password length in an adjacent thread [1], I think we ought to get rid of pg_hba.conf's line length restriction while we're at it. Accordingly, I borrowed some code from that thread and present the attached revision. I also added some test coverage, since that was lacking before, and wordsmithed docs and comments slightly. regards, tom lane [1] https://www.postgresql.org/message-id/flat/09512C4F-8CB9-4021-B455-EF4C4F0D55A0%40amazon.com diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml index 5cd88b462d..d62d1a061c 100644 --- a/doc/src/sgml/client-auth.sgml +++ b/doc/src/sgml/client-auth.sgml @@ -77,13 +77,15 @@ The general format of the <filename>pg_hba.conf</filename> file is a set of records, one per line. Blank lines are ignored, as is any text after the <literal>#</literal> comment character. - Records cannot be continued across lines. + A record can be continued onto the next line by ending the line with + a backslash. (Backslashes are not special except at the end of a line.) A record is made up of a number of fields which are separated by spaces and/or tabs. Fields can contain white space if the field value is double-quoted. Quoting one of the keywords in a database, user, or address field (e.g., <literal>all</literal> or <literal>replication</literal>) makes the word lose its special meaning, and just match a database, user, or host with that name. + Backslash line continuation applies even within quoted text or comments. </para> <para> @@ -821,7 +823,7 @@ local db1,db2,@demodbs all md5 <synopsis> <replaceable>map-name</replaceable> <replaceable>system-username</replaceable> <replaceable>database-username</replaceable> </synopsis> - Comments and whitespace are handled in the same way as in + Comments, whitespace and line continuations are handled in the same way as in <filename>pg_hba.conf</filename>. The <replaceable>map-name</replaceable> is an arbitrary name that will be used to refer to this mapping in <filename>pg_hba.conf</filename>. The other diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c index 9d63830553..5991a21cf2 100644 --- a/src/backend/libpq/hba.c +++ b/src/backend/libpq/hba.c @@ -29,6 +29,7 @@ #include "catalog/pg_collation.h" #include "catalog/pg_type.h" #include "common/ip.h" +#include "common/string.h" #include "funcapi.h" #include "libpq/ifaddr.h" #include "libpq/libpq.h" @@ -54,7 +55,6 @@ #define MAX_TOKEN 256 -#define MAX_LINE 8192 /* callback data for check_network_callback */ typedef struct check_network_data @@ -166,11 +166,19 @@ pg_isblank(const char c) /* * Grab one token out of the string pointed to by *lineptr. * - * Tokens are strings of non-blank - * characters bounded by blank characters, commas, beginning of line, and - * end of line. Blank means space or tab. Tokens can be delimited by - * double quotes (this allows the inclusion of blanks, but not newlines). - * Comments (started by an unquoted '#') are skipped. + * Tokens are strings of non-blank characters bounded by blank characters, + * commas, beginning of line, and end of line. Blank means space or tab. + * + * Tokens can be delimited by double quotes (this allows the inclusion of + * blanks or '#', but not newlines). As in SQL, write two double-quotes + * to represent a double quote. + * + * Comments (started by an unquoted '#') are skipped, i.e. the remainder + * of the line is ignored. + * + * (Note that line continuation processing happens before tokenization. + * Thus, if a continuation occurs within quoted text or a comment, the + * quoted text or comment is considered to continue to the next line.) * * The token, if any, is returned at *buf (a buffer of size bufsz), and * *lineptr is advanced past the token. @@ -470,6 +478,7 @@ static MemoryContext tokenize_file(const char *filename, FILE *file, List **tok_lines, int elevel) { int line_number = 1; + StringInfoData buf; MemoryContext linecxt; MemoryContext oldcxt; @@ -478,47 +487,72 @@ tokenize_file(const char *filename, FILE *file, List **tok_lines, int elevel) ALLOCSET_SMALL_SIZES); oldcxt = MemoryContextSwitchTo(linecxt); + initStringInfo(&buf); + *tok_lines = NIL; while (!feof(file) && !ferror(file)) { - char rawline[MAX_LINE]; char *lineptr; List *current_line = NIL; char *err_msg = NULL; + int last_backslash_buflen = 0; + int continuations = 0; - if (!fgets(rawline, sizeof(rawline), file)) - { - int save_errno = errno; + /* Collect the next input line, handling backslash continuations */ + resetStringInfo(&buf); - if (!ferror(file)) - break; /* normal EOF */ - /* I/O error! */ - ereport(elevel, - (errcode_for_file_access(), - errmsg("could not read file \"%s\": %m", filename))); - err_msg = psprintf("could not read file \"%s\": %s", - filename, strerror(save_errno)); - rawline[0] = '\0'; - } - if (strlen(rawline) == MAX_LINE - 1) + while (!feof(file) && !ferror(file)) { - /* Line too long! */ - ereport(elevel, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("authentication file line too long"), - errcontext("line %d of configuration file \"%s\"", - line_number, filename))); - err_msg = "authentication file line too long"; - } + /* Make sure there's a reasonable amount of room in the buffer */ + enlargeStringInfo(&buf, 128); + + /* Read some data, appending it to what we already have */ + if (fgets(buf.data + buf.len, buf.maxlen - buf.len, file) == NULL) + { + int save_errno = errno; + + if (!ferror(file)) + break; /* normal EOF */ + /* I/O error! */ + ereport(elevel, + (errcode_for_file_access(), + errmsg("could not read file \"%s\": %m", filename))); + err_msg = psprintf("could not read file \"%s\": %s", + filename, strerror(save_errno)); + resetStringInfo(&buf); + break; + } + buf.len += strlen(buf.data + buf.len); + + /* If we haven't got a whole line, loop to read more */ + if (!(buf.len > 0 && buf.data[buf.len - 1] == '\n')) + continue; + + /* Strip trailing newline, including \r in case we're on Windows */ + buf.len = pg_strip_crlf(buf.data); + + /* + * Check for backslash continuation. The backslash must be after + * the last place we found a continuation, else two backslashes + * followed by two \n's would behave surprisingly. + */ + if (buf.len > last_backslash_buflen && + buf.data[buf.len - 1] == '\\') + { + /* Continuation, so strip it and keep reading */ + buf.data[--buf.len] = '\0'; + last_backslash_buflen = buf.len; + continuations++; + continue; + } - /* Strip trailing linebreak from rawline */ - lineptr = rawline + strlen(rawline) - 1; - while (lineptr >= rawline && (*lineptr == '\n' || *lineptr == '\r')) - *lineptr-- = '\0'; + /* Nope, so we have the whole line */ + break; + } /* Parse fields */ - lineptr = rawline; + lineptr = buf.data; while (*lineptr && err_msg == NULL) { List *current_field; @@ -538,12 +572,12 @@ tokenize_file(const char *filename, FILE *file, List **tok_lines, int elevel) tok_line = (TokenizedLine *) palloc(sizeof(TokenizedLine)); tok_line->fields = current_line; tok_line->line_num = line_number; - tok_line->raw_line = pstrdup(rawline); + tok_line->raw_line = pstrdup(buf.data); tok_line->err_msg = err_msg; *tok_lines = lappend(*tok_lines, tok_line); } - line_number++; + line_number += continuations + 1; } MemoryContextSwitchTo(oldcxt); diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl index 1b4323fe2a..59b1b403c5 100644 --- a/src/test/authentication/t/001_password.pl +++ b/src/test/authentication/t/001_password.pl @@ -29,7 +29,8 @@ sub reset_pg_hba my $hba_method = shift; unlink($node->data_dir . '/pg_hba.conf'); - $node->append_conf('pg_hba.conf', "local all all $hba_method"); + # just for testing purposes, use a continuation line + $node->append_conf('pg_hba.conf', "local all all\\\n $hba_method"); $node->reload; return; }
I wrote: > Accordingly, I borrowed some code from that thread and present > the attached revision. I also added some test coverage, since > that was lacking before, and wordsmithed docs and comments slightly. Hearing no comments, pushed that way. regards, tom lane
Hello Tom, >> Accordingly, I borrowed some code from that thread and present >> the attached revision. I also added some test coverage, since >> that was lacking before, and wordsmithed docs and comments slightly. > > Hearing no comments, pushed that way. Thanks for the fixes and improvements! I notice that buf.data is not freed. I guess that the server memory management will recover it. -- Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: > I notice that buf.data is not freed. I guess that the server memory > management will recover it. Yeah, it's in the transient context holding all of the results of reading the file. I considered pfree'ing it at the end of the function, but I concluded there's no point. The space will be recycled when the context is destroyed, and since we're not (IIRC) going to allocate anything more in that context, nothing would be gained by freeing it earlier --- it'd just stay as unused memory within the context. regards, tom lane