Thread: Removing the fixed-size buffer restriction in hba.c

Removing the fixed-size buffer restriction in hba.c

From
Tom Lane
Date:
We got a complaint at [1] about how a not-so-unreasonable LDAP
configuration can hit the "authentication file token too long,
skipping" error case in hba.c's next_token().  I think we've
seen similar complaints before, although a desultory archives
search didn't turn one up.

A minimum-change response would be to increase the MAX_TOKEN
constant from 256 to (say) 1K or 10K.  But it wouldn't be all
that hard to replace the fixed-size buffer with a StringInfo,
as attached.

Given the infrequency of complaints, I'm inclined to apply
the more thorough fix only in HEAD, and to just raise MAX_TOKEN
in the back branches.  Thoughts?

            regards, tom lane

[1]
https://www.postgresql.org/message-id/PH0PR04MB8294A4C5A65D9D492CBBD349C002A%40PH0PR04MB8294.namprd04.prod.outlook.com

diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index 5d4ddbb04d..629463a00f 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -56,8 +56,6 @@
 #endif
 
 
-#define MAX_TOKEN    256
-
 /* callback data for check_network_callback */
 typedef struct check_network_data
 {
@@ -161,8 +159,8 @@ pg_isblank(const char c)
  * commas, beginning of line, and end of line.  Blank means space or tab.
  *
  * Tokens can be delimited by double quotes (this allows the inclusion of
- * blanks or '#', but not newlines).  As in SQL, write two double-quotes
- * to represent a double quote.
+ * commas, blanks, and '#', but not newlines).  As in SQL, write two
+ * double-quotes to represent a double quote.
  *
  * Comments (started by an unquoted '#') are skipped, i.e. the remainder
  * of the line is ignored.
@@ -171,7 +169,7 @@ pg_isblank(const char c)
  * Thus, if a continuation occurs within quoted text or a comment, the
  * quoted text or comment is considered to continue to the next line.)
  *
- * The token, if any, is returned at *buf (a buffer of size bufsz), and
+ * The token, if any, is returned into *buf, and
  * *lineptr is advanced past the token.
  *
  * Also, we set *initial_quote to indicate whether there was quoting before
@@ -180,30 +178,28 @@ pg_isblank(const char c)
  * we want to allow that to support embedded spaces in file paths.)
  *
  * We set *terminating_comma to indicate whether the token is terminated by a
- * comma (which is not returned).
+ * comma (which is not returned, nor advanced over).
  *
  * In event of an error, log a message at ereport level elevel, and also
- * set *err_msg to a string describing the error.  Currently the only
- * possible error is token too long for buf.
+ * set *err_msg to a string describing the error.  Currently no error
+ * conditions are defined.
  *
- * If successful: store null-terminated token at *buf and return true.
- * If no more tokens on line: set *buf = '\0' and return false.
+ * If successful: store dequoted token in *buf and return true.
+ * If no more tokens on line: set *buf to empty and return false.
  * If error: fill buf with truncated or misformatted token and return false.
  */
 static bool
-next_token(char **lineptr, char *buf, int bufsz,
+next_token(char **lineptr, StringInfo buf,
            bool *initial_quote, bool *terminating_comma,
            int elevel, char **err_msg)
 {
     int            c;
-    char       *start_buf = buf;
-    char       *end_buf = buf + (bufsz - 1);
     bool        in_quote = false;
     bool        was_quote = false;
     bool        saw_quote = false;
 
-    Assert(end_buf > start_buf);
-
+    /* Initialize output parameters */
+    resetStringInfo(buf);
     *initial_quote = false;
     *terminating_comma = false;
 
@@ -226,22 +222,6 @@ next_token(char **lineptr, char *buf, int bufsz,
             break;
         }
 
-        if (buf >= end_buf)
-        {
-            *buf = '\0';
-            ereport(elevel,
-                    (errcode(ERRCODE_CONFIG_FILE_ERROR),
-                     errmsg("authentication file token too long, skipping: \"%s\"",
-                            start_buf)));
-            *err_msg = "authentication file token too long";
-            /* Discard remainder of line */
-            while ((c = (*(*lineptr)++)) != '\0')
-                ;
-            /* Un-eat the '\0', in case we're called again */
-            (*lineptr)--;
-            return false;
-        }
-
         /* we do not pass back a terminating comma in the token */
         if (c == ',' && !in_quote)
         {
@@ -250,7 +230,7 @@ next_token(char **lineptr, char *buf, int bufsz,
         }
 
         if (c != '"' || was_quote)
-            *buf++ = c;
+            appendStringInfoChar(buf, c);
 
         /* Literal double-quote is two double-quotes */
         if (in_quote && c == '"')
@@ -262,7 +242,7 @@ next_token(char **lineptr, char *buf, int bufsz,
         {
             in_quote = !in_quote;
             saw_quote = true;
-            if (buf == start_buf)
+            if (buf->len == 0)
                 *initial_quote = true;
         }
 
@@ -275,9 +255,7 @@ next_token(char **lineptr, char *buf, int bufsz,
      */
     (*lineptr)--;
 
-    *buf = '\0';
-
-    return (saw_quote || buf > start_buf);
+    return (saw_quote || buf->len > 0);
 }
 
 /*
@@ -409,21 +387,23 @@ static List *
 next_field_expand(const char *filename, char **lineptr,
                   int elevel, int depth, char **err_msg)
 {
-    char        buf[MAX_TOKEN];
+    StringInfoData buf;
     bool        trailing_comma;
     bool        initial_quote;
     List       *tokens = NIL;
 
+    initStringInfo(&buf);
+
     do
     {
-        if (!next_token(lineptr, buf, sizeof(buf),
+        if (!next_token(lineptr, &buf,
                         &initial_quote, &trailing_comma,
                         elevel, err_msg))
             break;
 
         /* Is this referencing a file? */
-        if (!initial_quote && buf[0] == '@' && buf[1] != '\0')
-            tokens = tokenize_expand_file(tokens, filename, buf + 1,
+        if (!initial_quote && buf.len > 1 && buf.data[0] == '@')
+            tokens = tokenize_expand_file(tokens, filename, buf.data + 1,
                                           elevel, depth + 1, err_msg);
         else
         {
@@ -434,11 +414,13 @@ next_field_expand(const char *filename, char **lineptr,
              * for the list of tokens.
              */
             oldcxt = MemoryContextSwitchTo(tokenize_context);
-            tokens = lappend(tokens, make_auth_token(buf, initial_quote));
+            tokens = lappend(tokens, make_auth_token(buf.data, initial_quote));
             MemoryContextSwitchTo(oldcxt);
         }
     } while (trailing_comma && (*err_msg == NULL));
 
+    pfree(buf.data);
+
     return tokens;
 }


Re: Removing the fixed-size buffer restriction in hba.c

From
Fabrízio de Royes Mello
Date:

On Mon, Jul 24, 2023 at 2:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> We got a complaint at [1] about how a not-so-unreasonable LDAP
> configuration can hit the "authentication file token too long,
> skipping" error case in hba.c's next_token().  I think we've
> seen similar complaints before, although a desultory archives
> search didn't turn one up.
>
> A minimum-change response would be to increase the MAX_TOKEN
> constant from 256 to (say) 1K or 10K.  But it wouldn't be all
> that hard to replace the fixed-size buffer with a StringInfo,
> as attached.
>

+1 for replacing it with StringInfo. And the patch LGTM!

 
>
> Given the infrequency of complaints, I'm inclined to apply
> the more thorough fix only in HEAD, and to just raise MAX_TOKEN
> in the back branches.  Thoughts?
>

It makes sense to change it only in HEAD. 

Regards,

--
Fabrízio de Royes Mello

Re: Removing the fixed-size buffer restriction in hba.c

From
Nathan Bossart
Date:
On Mon, Jul 24, 2023 at 03:07:07PM -0300, Fabrízio de Royes Mello wrote:
> On Mon, Jul 24, 2023 at 2:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> We got a complaint at [1] about how a not-so-unreasonable LDAP
>> configuration can hit the "authentication file token too long,
>> skipping" error case in hba.c's next_token().  I think we've
>> seen similar complaints before, although a desultory archives
>> search didn't turn one up.
>>
>> A minimum-change response would be to increase the MAX_TOKEN
>> constant from 256 to (say) 1K or 10K.  But it wouldn't be all
>> that hard to replace the fixed-size buffer with a StringInfo,
>> as attached.
>>
> 
> +1 for replacing it with StringInfo. And the patch LGTM!

I'd suggest noting that next_token() clears buf, but otherwise it looks
reasonable to me, too.

>> Given the infrequency of complaints, I'm inclined to apply
>> the more thorough fix only in HEAD, and to just raise MAX_TOKEN
>> in the back branches.  Thoughts?
>>
> 
> It makes sense to change it only in HEAD.

I wouldn't be opposed to back-patching the more thorough fix, but I don't
feel too strongly about it.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Removing the fixed-size buffer restriction in hba.c

From
Michael Paquier
Date:
On Mon, Jul 24, 2023 at 03:58:31PM -0700, Nathan Bossart wrote:
> On Mon, Jul 24, 2023 at 03:07:07PM -0300, Fabrízio de Royes Mello wrote:
>>> Given the infrequency of complaints, I'm inclined to apply
>>> the more thorough fix only in HEAD, and to just raise MAX_TOKEN
>>> in the back branches.  Thoughts?
>>>
>>
>> It makes sense to change it only in HEAD.
>
> I wouldn't be opposed to back-patching the more thorough fix, but I don't
> feel too strongly about it.

- * The token, if any, is returned at *buf (a buffer of size bufsz), and
+ * The token, if any, is returned into *buf, and
  * *lineptr is advanced past the token.

The comment indentation is a bit off here.

  * In event of an error, log a message at ereport level elevel, and also
- * set *err_msg to a string describing the error.  Currently the only
- * possible error is token too long for buf.
+ * set *err_msg to a string describing the error.  Currently no error
+ * conditions are defined.

I find the choice to keep err_msg in next_token() a bit confusing in
next_field_expand().  If no errors are possible, why not just get rid
of it?

FWIW, I don't feel strongly about backpatching that either, so for the
back branches I'd choose to bump up the token size limit and call it a
day.
--
Michael

Attachment

Re: Removing the fixed-size buffer restriction in hba.c

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> I find the choice to keep err_msg in next_token() a bit confusing in
> next_field_expand().  If no errors are possible, why not just get rid
> of it?

Yeah, I dithered about that.  I felt like removing it only to put it
back later would be silly, but then again maybe there won't ever be
a need to put it back.  I'm OK with removing it if there's not
objections.

            regards, tom lane



Re: Removing the fixed-size buffer restriction in hba.c

From
Michael Paquier
Date:
On Mon, Jul 24, 2023 at 08:21:54PM -0400, Tom Lane wrote:
> Yeah, I dithered about that.  I felt like removing it only to put it
> back later would be silly, but then again maybe there won't ever be
> a need to put it back.  I'm OK with removing it if there's not
> objections.

Seeing what this code does, the odds of needing an err_msg seem rather
low to me, so I would just remove it for now for the sake of clarity.
It can always be added later if need be.
--
Michael

Attachment

Re: Removing the fixed-size buffer restriction in hba.c

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> Seeing what this code does, the odds of needing an err_msg seem rather
> low to me, so I would just remove it for now for the sake of clarity.
> It can always be added later if need be.

Done that way.  Thanks for looking at it!

            regards, tom lane



Re: Removing the fixed-size buffer restriction in hba.c

From
Michael Paquier
Date:
On Thu, Jul 27, 2023 at 12:11:38PM -0400, Tom Lane wrote:
> Done that way.  Thanks for looking at it!

Thanks for applying!
--
Michael

Attachment