Thread: Obsolete coding in fork_process.c
fork_process.c quoth: /* * Flush stdio channels just before fork, to avoid double-output problems. * Ideally we'd use fflush(NULL) here,but there are still a few non-ANSI * stdio libraries out there (like SunOS 4.1.x) that coredump if we do. * Presentlystdout and stderr are the only stdio output channels used by * the postmaster, so fflush'ing them should be sufficient. */ fflush(stdout); fflush(stderr); Is there any reason not to change this to just fflush(NULL)? We dropped support for SunOS 4.1 quite some time ago ... While it's still true that the postmaster proper doesn't need to fflush anything but stdout and stderr, this coding seems a bit less than safe when you consider the possibility of third-party libraries loaded into the postmaster process. regards, tom lane
On Thu, May 01, 2014 at 12:13:28PM -0400, Tom Lane wrote: > fork_process.c quoth: > > /* > * Flush stdio channels just before fork, to avoid double-output problems. > * Ideally we'd use fflush(NULL) here, but there are still a few non-ANSI > * stdio libraries out there (like SunOS 4.1.x) that coredump if we do. > * Presently stdout and stderr are the only stdio output channels used by > * the postmaster, so fflush'ing them should be sufficient. > */ > fflush(stdout); > fflush(stderr); > > Is there any reason not to change this to just fflush(NULL)? We dropped > support for SunOS 4.1 quite some time ago ... Modern systems have other fflush(NULL) problems: http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Noah Misch <noah@leadboat.com> writes: > On Thu, May 01, 2014 at 12:13:28PM -0400, Tom Lane wrote: >> Is there any reason not to change this to just fflush(NULL)? We dropped >> support for SunOS 4.1 quite some time ago ... > Modern systems have other fflush(NULL) problems: > http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html > http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U Fun. I doubt that the postmaster's stdin would ever be a pipe, but maybe we'd better leave well enough alone. Should update the comment though. regards, tom lane
I wrote: > Noah Misch <noah@leadboat.com> writes: >> On Thu, May 01, 2014 at 12:13:28PM -0400, Tom Lane wrote: >>> Is there any reason not to change this to just fflush(NULL)? We dropped >>> support for SunOS 4.1 quite some time ago ... >> Modern systems have other fflush(NULL) problems: >> http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html >> http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U > Fun. I doubt that the postmaster's stdin would ever be a pipe, but > maybe we'd better leave well enough alone. Should update the comment > though. However ... after looking around I notice that fflush(NULL) is already being used in parallel pg_dump and pg_upgrade; and at least in the latter case I'm afraid to change that because it looks like there are probably other stdio output files open in the process. I think we should probably go ahead and switch over to using fflush(NULL). The capability is required by C89 for crissake; are we really going to rely on hearsay evidence that there are current platforms that are still broken? Also, I found at least one claim on the net that this has been fixed in Solaris for awhile: http://grokbase.com/t/perl/perl5-porters/021hn4z9pq/fflush-null-on-solaris-9 Attached is a draft patch that I was working on before I decided that making the indicated changes in pg_upgrade was probably a bad idea. This is mainly to memorialize the places we should look at if we change it. BTW, while working on this I noticed that there are a boatload of places where we use system() or popen() without a prior fflush. I suspect most of them are safe, or we'd have heard more complaints --- but shouldn't we clamp down on that? regards, tom lane diff --git a/contrib/pg_upgrade/controldata.c b/contrib/pg_upgrade/controldata.c index fa0a005..3bf9c6a 100644 *** a/contrib/pg_upgrade/controldata.c --- b/contrib/pg_upgrade/controldata.c *************** get_control_data(ClusterInfo *cluster, b *** 114,119 **** --- 113,120 ---- cluster->bindir, live_check ? "pg_controldata\"" : "pg_resetxlog\" -n", cluster->pgdata); + + /* Don't use fflush(NULL); see comments in fork_process.c */ fflush(stdout); fflush(stderr); diff --git a/contrib/pg_upgrade/parallel.c b/contrib/pg_upgrade/parallel.c index f4201e1..7807687 100644 *** a/contrib/pg_upgrade/parallel.c --- b/contrib/pg_upgrade/parallel.c *************** parallel_exec_prog(const char *log_file, *** 118,125 **** /* set this before we start the job */ parallel_jobs++; ! /* Ensure stdio state is quiesced before forking */ ! fflush(NULL); #ifndef WIN32 child = fork(); --- 118,126 ---- /* set this before we start the job */ parallel_jobs++; ! /* Don't use fflush(NULL); see comments in fork_process.c */ ! fflush(stdout); ! fflush(stderr); #ifndef WIN32 child = fork(); *************** parallel_transfer_all_new_dbs(DbInfoArr *** 227,234 **** /* set this before we start the job */ parallel_jobs++; ! /* Ensure stdio state is quiesced before forking */ ! fflush(NULL); #ifndef WIN32 child = fork(); --- 228,236 ---- /* set this before we start the job */ parallel_jobs++; ! /* Don't use fflush(NULL); see comments in fork_process.c */ ! fflush(stdout); ! fflush(stderr); #ifndef WIN32 child = fork(); diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 7c1e59e..cf84fcf 100644 *** a/contrib/pgbench/pgbench.c --- b/contrib/pgbench/pgbench.c *************** pthread_create(pthread_t *thread, *** 3328,3333 **** --- 3328,3337 ---- return errno; } + /* Don't use fflush(NULL); see comments in fork_process.c */ + fflush(stdout); + fflush(stderr); + th->pid = fork(); if (th->pid == -1) /* error */ { diff --git a/src/backend/postmaster/fork_process.c b/src/backend/postmaster/fork_process.c index 3e2acdd..28f4ca7 100644 *** a/src/backend/postmaster/fork_process.c --- b/src/backend/postmaster/fork_process.c *************** fork_process(void) *** 38,45 **** /* * Flush stdio channels just before fork, to avoid double-output problems. ! * Ideally we'd use fflush(NULL) here, but there are still a few non-ANSI ! * stdio libraries out there (like SunOS 4.1.x) that coredump if we do. * Presently stdout and stderr are the only stdio output channels used by * the postmaster, so fflush'ing them should be sufficient. */ --- 38,50 ---- /* * Flush stdio channels just before fork, to avoid double-output problems. ! * ! * Ideally we'd use fflush(NULL) here, but that has portability issues --- ! * notably, it's reported that current Solaris as of 2013 still has some ! * odd behaviors in fflush(NULL), such as closing stdin if it's a pipe. ! * While it's not clear that that would affect the postmaster, it still ! * seems best to stay away from fflush(NULL). ! * * Presently stdout and stderr are the only stdio output channels used by * the postmaster, so fflush'ing them should be sufficient. */ diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c index 0560bf9..268584a 100644 *** a/src/backend/storage/file/fd.c --- b/src/backend/storage/file/fd.c *************** OpenPipeStream(const char *command, cons *** 1741,1746 **** --- 1741,1747 ---- ReleaseLruFiles(); TryAgain: + /* Don't use fflush(NULL); see comments in fork_process.c */ fflush(stdout); fflush(stderr); errno = 0; diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index b53fa8b..55e8c71 100644 *** a/src/bin/initdb/initdb.c --- b/src/bin/initdb/initdb.c *************** popen_check(const char *command, const c *** 692,697 **** --- 692,698 ---- { FILE *cmdfd; + /* Don't use fflush(NULL); see comments in fork_process.c */ fflush(stdout); fflush(stderr); errno = 0; diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 1a468fa..73a2428 100644 *** a/src/bin/pg_basebackup/pg_basebackup.c --- b/src/bin/pg_basebackup/pg_basebackup.c *************** StartLogStreamer(char *startpos, uint32 *** 437,442 **** --- 437,446 ---- * a fork(). On Windows, we create a thread. */ #ifndef WIN32 + /* Don't use fflush(NULL); see comments in fork_process.c */ + fflush(stdout); + fflush(stderr); + bgchild = fork(); if (bgchild == 0) { diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c index afe941f..1f35ab9 100644 *** a/src/bin/pg_dump/parallel.c --- b/src/bin/pg_dump/parallel.c *************** ParallelBackupStart(ArchiveHandle *AH, R *** 489,497 **** Assert(AH->public.numWorkers > 0); - /* Ensure stdio state is quiesced before forking */ - fflush(NULL); - pstate = (ParallelState *) pg_malloc(sizeof(ParallelState)); pstate->numWorkers = AH->public.numWorkers; --- 489,494 ---- *************** ParallelBackupStart(ArchiveHandle *AH, R *** 539,544 **** --- 536,542 ---- pstate->parallelSlot[i].args = (ParallelArgs *) pg_malloc(sizeof(ParallelArgs)); pstate->parallelSlot[i].args->AH = NULL; pstate->parallelSlot[i].args->te = NULL; + #ifdef WIN32 /* Allocate a new structure for every worker */ wi = (WorkerInfo *) pg_malloc(sizeof(WorkerInfo)); *************** ParallelBackupStart(ArchiveHandle *AH, R *** 553,558 **** --- 551,561 ---- wi, 0, &(pstate->parallelSlot[i].threadId)); pstate->parallelSlot[i].hThread = handle; #else + + /* Don't use fflush(NULL); see comments in fork_process.c */ + fflush(stdout); + fflush(stderr); + pid = fork(); if (pid == 0) { diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c index 47fe6cc..fbb3822 100644 *** a/src/bin/pg_dump/pg_dumpall.c --- b/src/bin/pg_dump/pg_dumpall.c *************** runPgDump(const char *dbname) *** 1692,1697 **** --- 1692,1698 ---- if (verbose) fprintf(stderr, _("%s: running \"%s\"\n"), progname, cmd->data); + /* Don't use fflush(NULL); see comments in fork_process.c */ fflush(stdout); fflush(stderr); diff --git a/src/bin/psql/copy.c b/src/bin/psql/copy.c index d706206..09b2021 100644 *** a/src/bin/psql/copy.c --- b/src/bin/psql/copy.c *************** do_copy(const char *args) *** 288,293 **** --- 288,294 ---- { if (options->program) { + /* Don't use fflush(NULL); see comments in fork_process.c */ fflush(stdout); fflush(stderr); errno = 0; *************** do_copy(const char *args) *** 307,318 **** { if (options->program) { fflush(stdout); fflush(stderr); - errno = 0; #ifndef WIN32 pqsignal(SIGPIPE, SIG_IGN); #endif copystream = popen(options->file, PG_BINARY_W); } else --- 308,320 ---- { if (options->program) { + /* Don't use fflush(NULL); see comments in fork_process.c */ fflush(stdout); fflush(stderr); #ifndef WIN32 pqsignal(SIGPIPE, SIG_IGN); #endif + errno = 0; copystream = popen(options->file, PG_BINARY_W); } else diff --git a/src/common/exec.c b/src/common/exec.c index 037bef2..6547896 100644 *** a/src/common/exec.c --- b/src/common/exec.c *************** pipe_read_line(char *cmd, char *line, in *** 350,356 **** #ifndef WIN32 FILE *pgver; ! /* flush output buffers in case popen does not... */ fflush(stdout); fflush(stderr); --- 350,356 ---- #ifndef WIN32 FILE *pgver; ! /* Don't use fflush(NULL); see comments in fork_process.c */ fflush(stdout); fflush(stderr); diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 07dd803..65b69a8 100644 *** a/src/test/regress/pg_regress.c --- b/src/test/regress/pg_regress.c *************** stop_postmaster(void) *** 288,300 **** char buf[MAXPGPATH * 2]; int r; - /* On Windows, system() seems not to force fflush, so... */ - fflush(stdout); - fflush(stderr); - snprintf(buf, sizeof(buf), SYSTEMQUOTE "\"%s/pg_ctl\" stop -D \"%s/data\" -s -m fast" SYSTEMQUOTE, bindir, temp_install); r = system(buf); if (r != 0) { --- 288,301 ---- char buf[MAXPGPATH * 2]; int r; snprintf(buf, sizeof(buf), SYSTEMQUOTE "\"%s/pg_ctl\" stop -D \"%s/data\" -s -m fast" SYSTEMQUOTE, bindir, temp_install); + + /* Don't use fflush(NULL); see comments in fork_process.c */ + fflush(stdout); + fflush(stderr); + r = system(buf); if (r != 0) { *************** spawn_process(const char *cmdline) *** 929,938 **** #ifndef WIN32 pid_t pid; ! /* ! * Must flush I/O buffers before fork. Ideally we'd use fflush(NULL) here ! * ... does anyone still care about systems where that doesn't work? ! */ fflush(stdout); fflush(stderr); if (logfile) --- 930,936 ---- #ifndef WIN32 pid_t pid; ! /* Don't use fflush(NULL); see comments in fork_process.c */ fflush(stdout); fflush(stderr); if (logfile)
On Thu, May 1, 2014 at 5:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> Noah Misch <noah@leadboat.com> writes: >>> On Thu, May 01, 2014 at 12:13:28PM -0400, Tom Lane wrote: >>>> Is there any reason not to change this to just fflush(NULL)? We dropped >>>> support for SunOS 4.1 quite some time ago ... > >>> Modern systems have other fflush(NULL) problems: > >>> http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html >>> http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U > >> Fun. I doubt that the postmaster's stdin would ever be a pipe, but >> maybe we'd better leave well enough alone. Should update the comment >> though. > > However ... after looking around I notice that fflush(NULL) is already > being used in parallel pg_dump and pg_upgrade; and at least in the latter > case I'm afraid to change that because it looks like there are probably > other stdio output files open in the process. > > I think we should probably go ahead and switch over to using > fflush(NULL). The capability is required by C89 for crissake; > are we really going to rely on hearsay evidence that there are > current platforms that are still broken? Also, I found at least > one claim on the net that this has been fixed in Solaris for awhile: > http://grokbase.com/t/perl/perl5-porters/021hn4z9pq/fflush-null-on-solaris-9 I am having a hard time understanding what the real impetus to change this is. IIUC, we have zero reports of the current coding being a problem, so I'm not sure why we want to go make it different. Sure, there are hypothetical problems with the current code, but there are also hypothetical problems with the proposed change, and the current code has survived quite a bit of real-world deployment. I guess it's hard for me to be dead-set against this if we're using that pattern elsewhere, but I won't be very surprised if more weird cases where it doesn't work turn up down the road; and I'm a bit worried that if we let it proliferate we'll find it hard to get rid of if and when those reports turn up. > Attached is a draft patch that I was working on before I decided that > making the indicated changes in pg_upgrade was probably a bad idea. > This is mainly to memorialize the places we should look at if we > change it. > > BTW, while working on this I noticed that there are a boatload of places > where we use system() or popen() without a prior fflush. I suspect most > of them are safe, or we'd have heard more complaints --- but shouldn't > we clamp down on that? I think that would probably be a good idea. I wouldn't be shocked if there are problems there that we've failed to notice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, May 01, 2014 at 05:59:08PM -0400, Tom Lane wrote: > I wrote: > > Noah Misch <noah@leadboat.com> writes: > >> Modern systems have other fflush(NULL) problems: > > >> http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html > >> http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U > > > Fun. I doubt that the postmaster's stdin would ever be a pipe, but > > maybe we'd better leave well enough alone. Should update the comment > > though. > > However ... after looking around I notice that fflush(NULL) is already > being used in parallel pg_dump and pg_upgrade; and at least in the latter > case I'm afraid to change that because it looks like there are probably > other stdio output files open in the process. Do those programs, operating in those modes, read from stdin or some other long-lived, pipe-backed FILE*? > BTW, while working on this I noticed that there are a boatload of places > where we use system() or popen() without a prior fflush. I suspect most > of them are safe, or we'd have heard more complaints --- but shouldn't > we clamp down on that? You need the fflush() when forking and then using stdio in the child before any exec(). Have you caught wind of any system() or popen() implementation having that property? -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Noah Misch <noah@leadboat.com> writes: > On Thu, May 01, 2014 at 05:59:08PM -0400, Tom Lane wrote: >> However ... after looking around I notice that fflush(NULL) is already >> being used in parallel pg_dump and pg_upgrade; and at least in the latter >> case I'm afraid to change that because it looks like there are probably >> other stdio output files open in the process. > Do those programs, operating in those modes, read from stdin or some other > long-lived, pipe-backed FILE*? Probably not. However, if that's your criterion, I'd say that I'd much rather assume that the postmaster doesn't have a pipe connected to stdin than that it has no stdio outputs besides stdout/stderr. >> BTW, while working on this I noticed that there are a boatload of places >> where we use system() or popen() without a prior fflush. I suspect most >> of them are safe, or we'd have heard more complaints --- but shouldn't >> we clamp down on that? > You need the fflush() when forking and then using stdio in the child before > any exec(). Have you caught wind of any system() or popen() implementation > having that property? You're only considering one aspect of the problem. Yeah, you might not get duplicated output unless system() prints something before exec(), but we would also like to have causality: that is, whatever we sent to stdout before calling system() should appear there before anything the child process sends to stdout. regards, tom lane
On Thu, May 01, 2014 at 08:44:46PM -0400, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > On Thu, May 01, 2014 at 05:59:08PM -0400, Tom Lane wrote: > >> However ... after looking around I notice that fflush(NULL) is already > >> being used in parallel pg_dump and pg_upgrade; and at least in the latter > >> case I'm afraid to change that because it looks like there are probably > >> other stdio output files open in the process. > > > Do those programs, operating in those modes, read from stdin or some other > > long-lived, pipe-backed FILE*? > > Probably not. Then the use of fflush(NULL) in those programs tells us nothing about the status of the aforementioned HP-UX bug. > However, if that's your criterion, I'd say that I'd much > rather assume that the postmaster doesn't have a pipe connected to stdin > than that it has no stdio outputs besides stdout/stderr. Either way, the beneficiary is theoretical; who can tell which one deserves priority? Let's do what we usually do in the absence of a clear improvement: keep the longstanding behavior. > >> BTW, while working on this I noticed that there are a boatload of places > >> where we use system() or popen() without a prior fflush. I suspect most > >> of them are safe, or we'd have heard more complaints --- but shouldn't > >> we clamp down on that? > > > You need the fflush() when forking and then using stdio in the child before > > any exec(). Have you caught wind of any system() or popen() implementation > > having that property? > > You're only considering one aspect of the problem. Yeah, you might not > get duplicated output unless system() prints something before exec(), > but we would also like to have causality: that is, whatever we sent to > stdout before calling system() should appear there before anything the > child process sends to stdout. Good point. I suppose a couple of fflush() calls have negligible cost next to a system() or popen(). Introduce pg_popen()/pg_system(), and adopt a rule that they are [almost] our only callers of raw popen()/system()? -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Noah Misch <noah@leadboat.com> writes: > On Thu, May 01, 2014 at 08:44:46PM -0400, Tom Lane wrote: >> You're only considering one aspect of the problem. Yeah, you might not >> get duplicated output unless system() prints something before exec(), >> but we would also like to have causality: that is, whatever we sent to >> stdout before calling system() should appear there before anything the >> child process sends to stdout. > Good point. I suppose a couple of fflush() calls have negligible cost next to > a system() or popen(). Introduce pg_popen()/pg_system(), and adopt a rule > that they are [almost] our only callers of raw popen()/system()? Meh. I'm not usually in favor of adopting nonstandard notation, and this doesn't seem like a place to start. In particular, if you don't want to use fflush(NULL) in these proposed wrappers, then call sites are still going to have an issue with needing to do manual fflushes; pg_regress.c's spawn_process is an example: /* * Must flush I/O buffers before fork. Ideally we'd use fflush(NULL) here * ... does anyone still care aboutsystems where that doesn't work? */ fflush(stdout); fflush(stderr); if (logfile) fflush(logfile); pid = fork(); I think that removing the need for fflush(stdout) and fflush(stderr) in this context would mostly result in people forgetting to fflush other output files. I'd rather have the two lines of boilerplate (and a comment about why we're refusing to depend on fflush(NULL)) than take that risk. Now, if you were willing to put fflush(NULL) into the wrappers, I could get on board with that ;-) regards, tom lane
On Thu, May 01, 2014 at 11:07:51PM -0400, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > On Thu, May 01, 2014 at 08:44:46PM -0400, Tom Lane wrote: > >> You're only considering one aspect of the problem. Yeah, you might not > >> get duplicated output unless system() prints something before exec(), > >> but we would also like to have causality: that is, whatever we sent to > >> stdout before calling system() should appear there before anything the > >> child process sends to stdout. > > > Good point. I suppose a couple of fflush() calls have negligible cost next to > > a system() or popen(). Introduce pg_popen()/pg_system(), and adopt a rule > > that they are [almost] our only callers of raw popen()/system()? > > Meh. I'm not usually in favor of adopting nonstandard notation, and > this doesn't seem like a place to start. In particular, if you don't > want to use fflush(NULL) in these proposed wrappers, then call sites > are still going to have an issue with needing to do manual fflushes; > pg_regress.c's spawn_process is an example: > > /* > * Must flush I/O buffers before fork. Ideally we'd use fflush(NULL) here > * ... does anyone still care about systems where that doesn't work? > */ > fflush(stdout); > fflush(stderr); > if (logfile) > fflush(logfile); > > pid = fork(); > > I think that removing the need for fflush(stdout) and fflush(stderr) > in this context would mostly result in people forgetting to fflush > other output files. I'd rather have the two lines of boilerplate > (and a comment about why we're refusing to depend on fflush(NULL)) > than take that risk. Works for me. -- Noah Misch EnterpriseDB http://www.enterprisedb.com