Detecting glibc getopt?
I have tracked down the postmaster-option-processing failure that Thomas
reported this morning. It appears to be specific to systems running
glibc: the problem is that resetting optind to 1 is not enough to
put glibc's getopt() subroutine into a good state to process a fresh
set of options. (Internally it has a "nextchar" pointer that is still
pointing at the old argv list, and only if the pointer points to a null
character will it wake up enough to reexamine the argv pointer you give
it.) The reason we see this now, and didn't see it before, is that
I rearranged startup to set the ps process title as soon as possible
after forking a subprocess --- and at least on Linux machines, that
"nextchar" pointer is pointing into the argv array that's overwritten
by init_ps_display.
While I could revert that change, I don't want to. The idea was to be
sure that a postmaster child running its authentication cycle could be
identified, and I still think that's an important feature. So I want to
find a way to make it work.
Looking at the source code of glibc's getopt, it seems there are two
ways to force a reset:
* set __getopt_initialized to 0. I thought this was an ideal solution
since configure could check for the presence of __getopt_initialized.
Unfortunately it seems that glibc is built in such a way that that
symbol isn't exported :-(, even though it looks global in the source.
* set optind to 0, instead of the more usual 1. This will work, but
it requires us to know that we're dealing with glibc getopt and not
anyone else's getopt.
I have thought of two ways to detect glibc getopt: one is to assume that
if getopt_long() is available, we should set optind=0. The other is to
try a runtime test in configure and see if it works to set optind=0.
Runtime configure tests aren't very appealing, but I don't much care
for equating HAVE_GETOPT_LONG to how we should reset optind, either.
Opinions anyone? Better ideas?
regards, tom lane
Thomas Lockhart <lockhart@fourpalms.org> writes:
(I still see the symptom btw; did a make distclean and configure after
updating my tree)
Yeah, it's still busted; my first try was wrong. I have confirmed the
"optind = 0" fix works on my LinuxPPC machine, but we need to decide
how to autoconfigure that hack.
regards, tom lane
Tom Lane writes:
The reason we see this now, and didn't see it before, is that
I rearranged startup to set the ps process title as soon as possible
after forking a subprocess --- and at least on Linux machines, that
"nextchar" pointer is pointing into the argv array that's overwritten
by init_ps_display.
How about copying the entire argv[] array to a new location before the
very first call to getopt()? Then you can use getopt() without hackery
and can do anything you want to the "real" argv area. That should be a
lot safer. (We don't know yet what other platforms might play
optimization tricks in getopt().)
--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
Peter Eisentraut <peter_e@gmx.net> writes:
How about copying the entire argv[] array to a new location before the
very first call to getopt()? Then you can use getopt() without hackery
and can do anything you want to the "real" argv area. That should be a
lot safer. (We don't know yet what other platforms might play
optimization tricks in getopt().)
Well, mumble --- strictly speaking, there is *NO* way to use getopt
over multiple cycles "without hackery". The standard for getopt
(http://www.opengroup.org/onlinepubs/7908799/xsh/getopt.html)
doesn't say you're allowed to scribble on optind in the first place.
But you're probably right that having a read-only copy of the argv
vector will make things safer. Will do it that way.
regards, tom lane
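
The argv copy agreed on above could be sketched as follows. The function name copy_argv is hypothetical and this is not the code that was committed; it only illustrates making a private deep copy before the first getopt() call, so that later overwriting of the original argv area (as init_ps_display does) cannot disturb option parsing.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a NULL-terminated deep copy of argv.
 * Both the pointer array and every string are newly allocated, so the
 * copy stays valid even if the original argv area is overwritten.
 * Returns NULL if memory runs out.
 */
char **
copy_argv(int argc, char *const argv[])
{
    char      **copy;
    int         i;

    copy = malloc((size_t) (argc + 1) * sizeof(char *));
    if (copy == NULL)
        return NULL;

    for (i = 0; i < argc; i++)
    {
        size_t      len = strlen(argv[i]) + 1;  /* include the '\0' */

        copy[i] = malloc(len);
        if (copy[i] == NULL)
        {
            /* release what was allocated so far */
            while (i-- > 0)
                free(copy[i]);
            free(copy);
            return NULL;
        }
        memcpy(copy[i], argv[i], len);
    }
    copy[argc] = NULL;

    return copy;
}
```

getopt() would then be given the copy on every cycle, while the original argv remains free for the ps display code to overwrite.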
Is this resolved?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026