Thread: fatal flex error in guc-file.l kills the postmaster
While testing the configuration include directory patch, I decided to try the regular postgresql.conf "include" on a directory: include 'base' Reloading the configuration then rather rudely kills the postmaster: FATAL: input in flex scanner failed Setting aside whether we should offer a better diagnostic when a user points "include" at a directory, the yy_fatal_error hack that made sense for scan.l doesn't cut it for guc-file.l. Like other guc-file.l errors, we need to choose the elevel based on which process is running the code. ERROR is never the right choice. We should instead stash the message, longjmp out of the flex parser, and proceed to handle the error mostly like a regular syntax error. See attached small patch. It seems plainly hackish to mutate the body of yy_fatal_error() by #define'ing fprintf(). Why didn't we instead #define YY_FATAL_ERROR and provide our own complete replacement? (Both strategies are undocumented.) I haven't changed that decision in this patch, though. On a related note, the out-of-memory handling during config file reload is inconsistent. guc-file.l uses palloc/pstrdup in various places. An OOM at those sites during a reload would kill the postmaster. guc.c has functions like guc_malloc that handle variable elevels. However, various check_* and assign_* functions, such as check_log_destination(), use pstrdup() and guc_malloc(ERROR). Failing there during a reload would also kill the postmaster. Is it worth doing better here? Incidentally, the wal writer process no longer exits promptly on postmaster death. I didn't look into that further. Thanks, nm
Attachment
Noah Misch <noah@leadboat.com> writes: > Setting aside whether we should offer a better diagnostic when a user points > "include" at a directory, the yy_fatal_error hack that made sense for scan.l > doesn't cut it for guc-file.l. Like other guc-file.l errors, we need to > choose the elevel based on which process is running the code. ERROR is never > the right choice. We should instead stash the message, longjmp out of the > flex parser, and proceed to handle the error mostly like a regular syntax > error. See attached small patch. Well, if you're going to criticize the original method as being hackish, you really need to offer a cleaner substitute than this one ;-). In particular I'm not happy with adding a sigsetjmp() cycle for every single token parsed. > It seems plainly hackish to mutate the body of yy_fatal_error() by #define'ing > fprintf(). I agree, but the flex boys haven't provided any nicer API ... at least not unless we'd like to convert to C++. > Why didn't we instead #define YY_FATAL_ERROR and provide our own > complete replacement? For one reason, yy_fatal_error would still be there and would draw an "unused static function" warning. > On a related note, the out-of-memory handling during config file reload is > inconsistent. guc-file.l uses palloc/pstrdup in various places. An OOM at > those sites during a reload would kill the postmaster. If you've got OOM in the postmaster, you're dead anyway. I feel no compulsion to worry about this. regards, tom lane
On Sat, Dec 17, 2011 at 11:04:43PM -0500, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > Setting aside whether we should offer a better diagnostic when a user points > > "include" at a directory, the yy_fatal_error hack that made sense for scan.l > > doesn't cut it for guc-file.l. Like other guc-file.l errors, we need to > > choose the elevel based on which process is running the code. ERROR is never > > the right choice. We should instead stash the message, longjmp out of the > > flex parser, and proceed to handle the error mostly like a regular syntax > > error. See attached small patch. > > Well, if you're going to criticize the original method as being hackish, > you really need to offer a cleaner substitute than this one ;-). In > particular I'm not happy with adding a sigsetjmp() cycle for every > single token parsed. On the contrary, I want to make it even more of a hack to get better behavior! Here's a version that calls sigsetjmp() once per file. While postgresql.conf scanning hardly seems like the place to be shaving cycles, this does catch fatal errors in functions other than yylex(), notably yy_create_buffer(). > > On a related note, the out-of-memory handling during config file reload is > > inconsistent. guc-file.l uses palloc/pstrdup in various places. An OOM at > > those sites during a reload would kill the postmaster. > > If you've got OOM in the postmaster, you're dead anyway. I feel no > compulsion to worry about this. Works for me. Thanks, nm
Attachment
On Sun, Dec 18, 2011 at 11:53 AM, Noah Misch <noah@leadboat.com> wrote: > Here's a version that calls sigsetjmp() once per file. =A0While postgresq= l.conf > scanning hardly seems like the place to be shaving cycles, this does catch > fatal errors in functions other than yylex(), notably yy_create_buffer(). This strikes me as more clever than necessary: #define fprintf(file, fmt, msg) \ 0; /* eat cast to void */ \ GUC_flex_fatal_errmsg =3D msg; \ siglongjmp(*GUC_flex_fatal_jmp, 1) Can't we just define a function jumpoutoftheparser() here and call that function rather than doing that /* eat cast to void */ hack? That would also involve fewer assumptions about the coding style emitted by flex. For example, if flex chose to do something like: if (bumpity) fprintf(__FILE__, "%s", "dinglewump"); ...the proposed definition would be a disaster. I doubt that inlining is a material performance benefit here; siglongjmp() can't be all that cheap, and the error path shouldn't be all that frequent. Instead of making ParseConfigFp responsible for restoring GUC_flex_fatal_jmp after calling anything that might recursively call ParseConfigFp, I think it would make more sense to define it to stash away the previous value and restore it on exit. That way it wouldn't need to know which of the things that it calls might in turn recursively call it, which seems likely to reduce the chances of present or future bugs. A few comments about whichever way we go with it seem like a good idea, too. --=20 Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jan 16, 2012 at 08:54:53PM -0500, Robert Haas wrote: > On Sun, Dec 18, 2011 at 11:53 AM, Noah Misch <noah@leadboat.com> wrote: > > Here's a version that calls sigsetjmp() once per file. ?While postgresql.conf > > scanning hardly seems like the place to be shaving cycles, this does catch > > fatal errors in functions other than yylex(), notably yy_create_buffer(). > > This strikes me as more clever than necessary: > > #define fprintf(file, fmt, msg) \ > 0; /* eat cast to void */ \ > GUC_flex_fatal_errmsg = msg; \ > siglongjmp(*GUC_flex_fatal_jmp, 1) > > Can't we just define a function jumpoutoftheparser() here and call > that function rather than doing that /* eat cast to void */ hack? > That would also involve fewer assumptions about the coding style > emitted by flex. For example, if flex chose to do something like: > > if (bumpity) fprintf(__FILE__, "%s", "dinglewump"); > > ...the proposed definition would be a disaster. I doubt that inlining > is a material performance benefit here; siglongjmp() can't be all that > cheap, and the error path shouldn't be all that frequent. > > Instead of making ParseConfigFp responsible for restoring > GUC_flex_fatal_jmp after calling anything that might recursively call > ParseConfigFp, I think it would make more sense to define it to stash > away the previous value and restore it on exit. That way it wouldn't > need to know which of the things that it calls might in turn > recursively call it, which seems likely to reduce the chances of > present or future bugs. A few comments about whichever way we go with > it seem like a good idea, too. Agreed on all points. I also changed the save/restore of ConfigFileLineno to work the same way. New version attached. Thanks, nm
Attachment
On Tue, Jan 17, 2012 at 1:23 PM, Noah Misch <noah@leadboat.com> wrote: > On Mon, Jan 16, 2012 at 08:54:53PM -0500, Robert Haas wrote: >> On Sun, Dec 18, 2011 at 11:53 AM, Noah Misch <noah@leadboat.com> wrote: >> > Here's a version that calls sigsetjmp() once per file. ?While postgres= ql.conf >> > scanning hardly seems like the place to be shaving cycles, this does c= atch >> > fatal errors in functions other than yylex(), notably yy_create_buffer= (). >> >> This strikes me as more clever than necessary: >> >> #define fprintf(file, fmt, msg) \ >> =A0 =A0 0; /* eat cast to void */ \ >> =A0 =A0 GUC_flex_fatal_errmsg =3D msg; \ >> =A0 =A0 siglongjmp(*GUC_flex_fatal_jmp, 1) >> >> Can't we just define a function jumpoutoftheparser() here and call >> that function rather than doing that /* eat cast to void */ hack? >> That would also involve fewer assumptions about the coding style >> emitted by flex. =A0For example, if flex chose to do something like: >> >> =A0 if (bumpity) fprintf(__FILE__, "%s", "dinglewump"); >> >> ...the proposed definition would be a disaster. =A0I doubt that inlining >> is a material performance benefit here; siglongjmp() can't be all that >> cheap, and the error path shouldn't be all that frequent. >> >> Instead of making ParseConfigFp responsible for restoring >> GUC_flex_fatal_jmp after calling anything that might recursively call >> ParseConfigFp, I think it would make more sense to define it to stash >> away the previous value and restore it on exit. =A0That way it wouldn't >> need to know which of the things that it calls might in turn >> recursively call it, which seems likely to reduce the chances of >> present or future bugs. =A0A few comments about whichever way we go with >> it seem like a good idea, too. > > Agreed on all points. =A0I also changed the save/restore of ConfigFileLin= eno to > work the same way. =A0New version attached. OK, committed. --=20 Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company