Thread: reading uninitialized buffer
I've been testing pg using valgrind and have found a read of an uninitialized buffer. In the hba-tokenizer when we have not read any characters (or too few) we still perform a couple of: strncmp(start_buf,"sameuser",8) Since this is done on random data it might return true although we have not read anything. The result is that we can (even if the probability is low) return the wrong thing. The solution is simply to terminate the buffer with '\0' before the strncmp(). I also moved our test inside the previous if, outside of that block our test can never be true anyway. I don't know why it was outside in the first place. -- /Dennis Björklund
Attachment
This time it is my fault, rather than freebsd's ;-) I think I can do something slightly cleaner than this, though, by hoisting the buf termination above the test. We could also replace the strncmp calls with strcmp calls if the buffer has its nul. I will post something soon. cheers andrew Dennis Bjorklund wrote: >I've been testing pg using valgrind and have found a read of an >uninitialized buffer. In the hba-tokenizer when we have not read any >characters (or too few) we still perform a couple of: > > strncmp(start_buf,"sameuser",8) > >Since this is done on random data it might return true although we have >not read anything. The result is that we can (even if the probability is >low) return the wrong thing. > >The solution is simply to terminate the buffer with '\0' before the >strncmp(). > >I also moved our test inside the previous if, outside of that block our >test can never be true anyway. I don't know why it was outside in the >first place. > > > >------------------------------------------------------------------------ > >Index: src/backend/libpq/hba.c >=================================================================== >RCS file: /cvsroot/pgsql-server/src/backend/libpq/hba.c,v >retrieving revision 1.119 >diff -u -c -r1.119 hba.c >*** src/backend/libpq/hba.c 25 Dec 2003 03:44:04 -0000 1.119 >--- src/backend/libpq/hba.c 1 Feb 2004 07:40:00 -0000 >*************** >*** 166,188 **** > */ > if (c != EOF) > ungetc(c, fp); >- } > > >! if ( !saw_quote && >! ( >! strncmp(start_buf,"all",3) == 0 || >! strncmp(start_buf,"sameuser",8) == 0 || >! strncmp(start_buf,"samegroup",9) == 0 >! ) >! ) >! { >! /* append newline to a magical keyword */ >! *buf++ = '\n'; > } > > *buf = '\0'; >- > } > > /* >--- 166,188 ---- > */ > if (c != EOF) > ungetc(c, fp); > >+ if (!saw_quote) >+ { >+ *buf = '\0'; > >! if (strncmp(start_buf,"all",3) == 0 || >! strncmp(start_buf,"sameuser",8) == 0 || >! strncmp(start_buf,"samegroup",9) == 0 >! ) >! { >! /* append newline to a magical keyword */ >! *buf++ = '\n'; >! } >! } > } > > *buf = '\0'; > } > > /* > > >------------------------------------------------------------------------ > > >---------------------------(end of broadcast)--------------------------- >TIP 8: explain analyze is your friend > >
... and here it is. As for the test being outside the "if" statement, it is true that that might waste a few cycles, but it hardly matters. Personally, I would prefer to replace the if statement with this: if (c == EOF || c == '\n') { *buf = '\0'; return; } and then it wouldn't be an issue at all, but I know some people don't like early function returns - is there a general postgres style rule about it? cheers andrew I wrote: > > This time it is my fault, rather than freebsd's ;-) > > I think I can do something slightly cleaner than this, though, by > hoisting the buf termination above the test. We could also replace the > strncmp calls with strcmp calls if the buffer has its nul. I will post > something soon. > > cheers > > andrew > > > Dennis Bjorklund wrote: > >> I've been testing pg using valgrind and have found a read of an >> uninitialized buffer. In the hba-tokenizer when we have not read any >> characters (or too few) we still perform a couple of: >> >> strncmp(start_buf,"sameuser",8) >> >> Since this is done on random data it might return true although we have >> not read anything. The result is that we can (even if the probability is >> low) return the wrong thing. >> >> The solution is simply to terminate the buffer with '\0' before the >> strncmp(). >> >> I also moved our test inside the previous if, outside of that block our >> test can never be true anyway. I don't know why it was outside in the >> first place. >> >> >> > Index: src/backend/libpq/hba.c =================================================================== RCS file: /projects/cvsroot/pgsql-server/src/backend/libpq/hba.c,v retrieving revision 1.119 diff -c -w -r1.119 hba.c *** src/backend/libpq/hba.c 25 Dec 2003 03:44:04 -0000 1.119 --- src/backend/libpq/hba.c 1 Feb 2004 13:53:51 -0000 *************** *** 169,187 **** } if ( !saw_quote && ( ! strncmp(start_buf,"all",3) == 0 || ! strncmp(start_buf,"sameuser",8) == 0 || ! strncmp(start_buf,"samegroup",9) == 0 ) ) { /* append newline to a magical keyword */ *buf++ = '\n'; } - *buf = '\0'; } --- 169,189 ---- } + *buf = '\0'; + if ( !saw_quote && ( ! strcmp(start_buf,"all") == 0 || ! strcmp(start_buf,"sameuser") == 0 || ! strcmp(start_buf,"samegroup") == 0 ) ) { /* append newline to a magical keyword */ *buf++ = '\n'; + *buf = '\0'; } }
On Sun, 1 Feb 2004, Andrew Dunstan wrote: > As for the test being outside the "if" statement, it is true that that > might waste a few cycles, but it hardly matters. The cycles are not important. My "fix" wasn't the most optimized either if one should count cycles. It was terminating the string twice in some cases. That I thought about and came to the conclusion that it was not important. That I didn't rewrite the strncmp() to strcmp() is strange to me, the length is obviously not needed. Good thing you looked at it. > Personally, I would prefer to replace the if statement with this: > > if (c == EOF || c == '\n') > { > *buf = '\0'; > return; > } > > and then it wouldn't be an issue at all, but I know some people don't > like early function returns - is there a general postgres style rule I don't know what the style rules say. I have nothing against early returns if used with grace. Early exits for odd cases, before the main part of the function, just helps readability if you ask me. On the other hand it does not matter since the correct is always to use whatever style the rest of the program uses. -- /Dennis Björklund
Andrew Dunstan <andrew@dunslane.net> writes: > ... and then it wouldn't be an issue at all, but I know some people don't > like early function returns - is there a general postgres style rule > about it? Hardly; we do that a lot. Write it however it seems clearest. regards, tom lane
OK, then *This* patch does it the way I think is clearest. Most of it is just reindenting. cheers andrew Dennis Bjorklund wrote: >On Sun, 1 Feb 2004, Andrew Dunstan wrote: > > > >>As for the test being outside the "if" statement, it is true that that >>might waste a few cycles, but it hardly matters. >> >> > >The cycles are not important. My "fix" wasn't the most optimized either if >one should count cycles. It was terminating the string twice in some >cases. That I thought about and came to the conclusion that it was not >important. That I didn't rewrite the strncmp() to strcmp() is strange to >me, the length is obviously not needed. Good thing you looked at it. > > > >>Personally, I would prefer to replace the if statement with this: >> >> if (c == EOF || c == '\n') >> { >> *buf = '\0'; >> return; >> } >> >>and then it wouldn't be an issue at all, but I know some people don't >>like early function returns - is there a general postgres style rule >> >> > >I don't know what the style rules say. I have nothing against early >returns if used with grace. Early exits for odd cases, before the main >part of the function, just helps readability if you ask me. On the other >hand it does not matter since the correct is always to use whatever style >the rest of the program uses. > > > Index: src/backend/libpq/hba.c =================================================================== RCS file: /projects/cvsroot/pgsql-server/src/backend/libpq/hba.c,v retrieving revision 1.119 diff -c -r1.119 hba.c *** src/backend/libpq/hba.c 25 Dec 2003 03:44:04 -0000 1.119 --- src/backend/libpq/hba.c 1 Feb 2004 17:17:34 -0000 *************** *** 105,187 **** while ((c = getc(fp)) != EOF && (pg_isblank(c) || c == ',')) ; ! if (c != EOF && c != '\n') { ! /* ! * Build a token in buf of next characters up to EOF, EOL, ! * unquoted comma, or unquoted whitespace. ! */ ! while (c != EOF && c != '\n' && ! (!pg_isblank(c) || in_quote == true)) { ! /* skip comments to EOL */ ! if (c == '#' && !in_quote) ! { ! while ((c = getc(fp)) != EOF && c != '\n') ! ; ! /* If only comment, consume EOL too; return EOL */ ! if (c != EOF && buf == start_buf) ! c = getc(fp); ! break; ! } ! ! if (buf >= end_buf) ! { ! ereport(LOG, ! (errcode(ERRCODE_CONFIG_FILE_ERROR), ! errmsg("authentication file token too long, skipping: \"%s\"", ! buf))); ! /* Discard remainder of line */ ! while ((c = getc(fp)) != EOF && c != '\n') ! ; ! buf[0] = '\0'; ! break; ! } ! ! if (c != '"' || (c == '"' && was_quote)) ! *buf++ = c; ! ! /* We pass back the comma so the caller knows there is more */ ! if ((pg_isblank(c) || c == ',') && !in_quote) ! break; ! ! /* Literal double-quote is two double-quotes */ ! if (in_quote && c == '"') ! was_quote = !was_quote; ! else ! was_quote = false; ! ! if (c == '"') ! { ! in_quote = !in_quote; ! saw_quote = true; ! } ! c = getc(fp); } ! /* ! * Put back the char right after the token (critical in case it is ! * EOL, since we need to detect end-of-line at next call). ! */ ! if (c != EOF) ! ungetc(c, fp); } if ( !saw_quote && ( ! strncmp(start_buf,"all",3) == 0 || ! strncmp(start_buf,"sameuser",8) == 0 || ! strncmp(start_buf,"samegroup",9) == 0 ) ) { /* append newline to a magical keyword */ *buf++ = '\n'; } - *buf = '\0'; } --- 105,191 ---- while ((c = getc(fp)) != EOF && (pg_isblank(c) || c == ',')) ; ! if (c == EOF || c == '\n') { ! *buf = '\0'; ! return; ! } ! ! /* ! * Build a token in buf of next characters up to EOF, EOL, ! * unquoted comma, or unquoted whitespace. ! */ ! while (c != EOF && c != '\n' && ! (!pg_isblank(c) || in_quote == true)) ! { ! /* skip comments to EOL */ ! if (c == '#' && !in_quote) { ! while ((c = getc(fp)) != EOF && c != '\n') ! ; ! /* If only comment, consume EOL too; return EOL */ ! if (c != EOF && buf == start_buf) ! c = getc(fp); ! break; ! } ! if (buf >= end_buf) ! { ! ereport(LOG, ! (errcode(ERRCODE_CONFIG_FILE_ERROR), ! errmsg("authentication file token too long, skipping: \"%s\"", ! buf))); ! /* Discard remainder of line */ ! while ((c = getc(fp)) != EOF && c != '\n') ! ; ! buf[0] = '\0'; ! break; } ! if (c != '"' || (c == '"' && was_quote)) ! *buf++ = c; ! ! /* We pass back the comma so the caller knows there is more */ ! if ((pg_isblank(c) || c == ',') && !in_quote) ! break; ! ! /* Literal double-quote is two double-quotes */ ! if (in_quote && c == '"') ! was_quote = !was_quote; ! else ! was_quote = false; ! ! if (c == '"') ! { ! in_quote = !in_quote; ! saw_quote = true; ! } ! ! c = getc(fp); } + /* + * Put back the char right after the token (critical in case it is + * EOL, since we need to detect end-of-line at next call). + */ + if (c != EOF) + ungetc(c, fp); + + *buf = '\0'; if ( !saw_quote && ( ! strcmp(start_buf,"all") == 0 || ! strcmp(start_buf,"sameuser") == 0 || ! strcmp(start_buf,"samegroup") == 0 ) ) { /* append newline to a magical keyword */ *buf++ = '\n'; + *buf = '\0'; } }
Andrew Dunstan <andrew@dunslane.net> writes: > OK, then *This* patch does it the way I think is clearest. Most of > it is just reindenting. Unless anyone objects, I'll review and apply this patch within 24 hours. Thanks for the patch, Dennis and Andrew. -Neil
Andrew Dunstan <andrew@dunslane.net> writes: > OK, then *This* patch does it the way I think is clearest. Most of it > is just reindenting. Okay, I applied this patch, and made a few additional cleanups along the way (the whole function and a couple other things don't need to be declared in hba.h, for example). The incremental diff between what I applied and what Andrew posted is attached. BTW, I think the proper fix is to rewrite next_token(), which is rather obfuscated at present. Anyone want to take a shot at this? -Neil
Attachment
Neil Conway wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>OK, then *This* patch does it the way I think is clearest. Most of it >>is just reindenting. >> >> > >Okay, I applied this patch, and made a few additional cleanups along >the way (the whole function and a couple other things don't need to be >declared in hba.h, for example). The incremental diff between what I >applied and what Andrew posted is attached. > >BTW, I think the proper fix is to rewrite next_token(), which is >rather obfuscated at present. Anyone want to take a shot at this? > > the file probably has several candidates for cleanup - parse_hba() not least among them, but I am trying to do several things during my involuntary "time off", which I expect to end RSN, so I doubt I can get to either of these soon. cheers andrew