Thread: Detecting glibc getopt?
I have traced down the postmaster-option-processing failure that Thomas reported this morning. It appears to be specific to systems running glibc: the problem is that resetting optind to 1 is not enough to put glibc's getopt() subroutine into a good state to process a fresh set of options. (Internally it has a "nextchar" pointer that is still pointing at the old argv list, and only if the pointer points to a null character will it wake up enough to reexamine the argv pointer you give it.) The reason we see this now, and didn't see it before, is that I rearranged startup to set the ps process title as soon as possible after forking a subprocess --- and at least on Linux machines, that "nextchar" pointer is pointing into the argv array that's overwritten by init_ps_display.

While I could revert that change, I don't want to. The idea was to be sure that a postmaster child running its authentication cycle could be identified, and I still think that's an important feature. So I want to find a way to make it work.

Looking at the source code of glibc's getopt, it seems there are two ways to force a reset:

* set __getopt_initialized to 0. I thought this was an ideal solution since configure could check for the presence of __getopt_initialized. Unfortunately it seems that glibc is built in such a way that that symbol isn't exported :-(, even though it looks global in the source.

* set optind to 0, instead of the more usual 1. This will work, but it requires us to know that we're dealing with glibc getopt and not anyone else's getopt.

I have thought of two ways to detect glibc getopt: one is to assume that if getopt_long() is available, we should set optind=0. The other is to try a runtime test in configure and see if it works to set optind=0. Runtime configure tests aren't very appealing, but I don't much care for equating HAVE_GETOPT_LONG to how we should reset optind, either.

Opinions anyone? Better ideas?

regards, tom lane
(I still see the symptom btw; did a make distclean and configure after updating my tree)
Thomas Lockhart <lockhart@fourpalms.org> writes:
> (I still see the symptom btw; did a make distclean and configure after
> updating my tree)

Yeah, it's still busted; my first try was wrong. I have confirmed the "optind = 0" fix works on my LinuxPPC machine, but we need to decide how to autoconfigure that hack.

regards, tom lane
Tom Lane writes:
> The reason we see this now, and didn't see it before, is that
> I rearranged startup to set the ps process title as soon as possible
> after forking a subprocess --- and at least on Linux machines, that
> "nextchar" pointer is pointing into the argv array that's overwritten
> by init_ps_display.

How about copying the entire argv[] array to a new location before the very first call to getopt()? Then you can use getopt() without hackery and can do anything you want to the "real" argv area. That should be a lot safer. (We don't know yet what other platforms might play optimization tricks in getopt().)

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
Peter Eisentraut <peter_e@gmx.net> writes:
> How about copying the entire argv[] array to a new location before the
> very first call to getopt(). Then you can use getopt() without hackery
> and can do anything you want to the "real" argv area. That should be a
> lot safer. (We don't know yet what other platforms might play
> optimization tricks in getopt().)

Well, mumble --- strictly speaking, there is *NO* way to use getopt over multiple cycles "without hackery". The standard for getopt (http://www.opengroup.org/onlinepubs/7908799/xsh/getopt.html) doesn't say you're allowed to scribble on optind in the first place. But you're probably right that having a read-only copy of the argv vector will make things safer. Will do it that way.

regards, tom lane
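The argv-copy approach agreed on above could look roughly like the sketch below. This is only an illustration of the idea, not what was committed: copy_argv is a hypothetical helper name, and the choice of plain malloc with a NULL return on failure is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a heap-allocated, NULL-terminated deep copy of argv[0..argc-1],
 * or NULL if memory runs out.  The copy shares no storage with the
 * original, so the original argv area can be overwritten afterwards
 * (for example by a ps-title routine) without disturbing getopt().
 */
static char **
copy_argv(int argc, char *const argv[])
{
	char	  **copy;
	int			i;

	assert(argc >= 0 && argv != NULL);

	copy = malloc((size_t) (argc + 1) * sizeof(char *));
	if (copy == NULL)
		return NULL;

	for (i = 0; i < argc; i++)
	{
		size_t		len = strlen(argv[i]) + 1;

		copy[i] = malloc(len);
		if (copy[i] == NULL)
		{
			/* release what was allocated so far before giving up */
			while (i > 0)
				free(copy[--i]);
			free(copy);
			return NULL;
		}
		memcpy(copy[i], argv[i], len);
	}
	copy[argc] = NULL;
	return copy;
}
```

The caller would make the copy once at startup, before the first getopt() call, and pass the copy to every later getopt() cycle; how optind gets reset between cycles is still a separate question.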
Is this resolved?

-- Bruce Momjian
> Is this resolved?

Sure. Within a day or two of the initial problem report.

- Thomas