add non-option reordering to in-tree getopt_long

Started by Nathan Bossartover 2 years ago37 messages
#1Nathan Bossart
nathandbossart@gmail.com
1 attachment(s)

While working on 2dcd157, I noticed cfbot failures for Windows due to tests
with commands that had non-options specified before options. For example,
"createuser myrole -a myadmin" specified a non-option (myrole) before the
option/argument pair (-a admin). To get the tests passing for Windows,
non-options must be placed at the end (e.g., "createuser -a myadmin
myrole"). This same problem was encountered while working on 08951a7 [0]/messages/by-id/20220525.110752.305692011781436338.horikyota.ntt@gmail.com,
but it was again fortunately caught with cfbot. Others have not been so
lucky [1]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=869aa40 [2]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ffd3980 [3]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d9ddc50.

The reason for this discrepancy is because Windows uses the in-tree
implementation of getopt_long(), which differs from the other
implementations I've found in that it doesn't reorder non-options to the
end of argv by default. Instead, it returns -1 as soon as the first
non-option is found, even if there are other options listed afterwards. By
moving non-options to the end of argv, getopt_long() can parse all
specified options and return -1 when only non-options remain. The
implementations I reviewed all reorder argv unless the POSIXLY_CORRECT
environment variable is set or the "optstring" argument begins with '+'.

The best reasons I can think of to keep the current behavior are 1)
reordering involves writing to the original argv array, which could be
risky (as noted by Tom [4]/messages/by-id/20539.1237314382@sss.pgh.pa.us) and 2) any systems with a getopt_long()
implementation that doesn't reorder non-options could begin failing tests
that take advantage of this behavior. However, this quirk in the in-tree
getopt_long() is periodically missed by hackers, the only systems I'm aware
of that use it are Windows and AIX, and every other implementation of
getopt_long() I saw reorders non-options by default. Furthermore, C99
omits const decorations in main()'s signature, so modifying argv might not
be too scary.

Thus, I propose introducing this non-option reordering behavior but
allowing it to be disabled via the POSIXLY_CORRECT environment variable.
The attached patch is my first attempt at implementing this proposal. I
don't think we need to bother with handling '+' at the beginning of
"optstring" since it seems unlikely to be used in PostgreSQL, but it would
probably be easy enough to add if folks want it.

I briefly looked at getopt() and concluded that we should probably retain
its POSIX-compliant behavior for non-options, as reordering support seems
much less universal than with getopt_long(). AFAICT all client utilities
use getopt_long(), anyway.

Thoughts?

[0]: /messages/by-id/20220525.110752.305692011781436338.horikyota.ntt@gmail.com
[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=869aa40
[2]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ffd3980
[3]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d9ddc50
[4]: /messages/by-id/20539.1237314382@sss.pgh.pa.us

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Teach-in-tree-getopt_long-to-move-non-options-to-.patchtext/x-diff; charset=us-asciiDownload
From 68c866da205592a370279d6b2a180e0888b0587c Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 9 Jun 2023 15:51:58 -0700
Subject: [PATCH v1 1/1] Teach in-tree getopt_long() to move non-options to the
 end of argv.

---
 src/bin/scripts/t/040_createuser.pl | 10 ++++---
 src/port/getopt_long.c              | 43 ++++++++++++++++++++++++++---
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/src/bin/scripts/t/040_createuser.pl b/src/bin/scripts/t/040_createuser.pl
index 9ca282181d..ba40ab11a3 100644
--- a/src/bin/scripts/t/040_createuser.pl
+++ b/src/bin/scripts/t/040_createuser.pl
@@ -42,9 +42,9 @@ $node->issues_sql_like(
 	'add a role as a member with admin option of the newly created role');
 $node->issues_sql_like(
 	[
-		'createuser', '-m',
-		'regress_user3', '-m',
-		'regress user #4', 'REGRESS_USER5'
+		'createuser', 'REGRESS_USER5',
+		'-m', 'regress_user3',
+		'-m', 'regress user #4'
 	],
 	qr/statement: CREATE ROLE "REGRESS_USER5" NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS ROLE regress_user3,"regress user #4";/,
 	'add a role as a member of the newly created role');
@@ -73,11 +73,13 @@ $node->issues_sql_like(
 	qr/statement: CREATE ROLE regress_user11 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--role');
 $node->issues_sql_like(
-	[ 'createuser', '--member-of', 'regress_user1', 'regress_user12' ],
+	[ 'createuser', 'regress_user12', '--member-of', 'regress_user1' ],
 	qr/statement: CREATE ROLE regress_user12 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--member-of');
 
 $node->command_fails([ 'createuser', 'regress_user1' ],
 	'fails if role already exists');
+$node->command_fails([ 'createuser', 'regress_user1', '-m', 'regress_user2', 'regress_user3' ],
+	'fails for too many non-options');
 
 done_testing();
diff --git a/src/port/getopt_long.c b/src/port/getopt_long.c
index c989276988..94e76769a3 100644
--- a/src/port/getopt_long.c
+++ b/src/port/getopt_long.c
@@ -50,8 +50,8 @@
  * This implementation does not use optreset.  Instead, we guarantee that
  * it can be restarted on a new argv array after a previous call returned -1,
  * if the caller resets optind to 1 before the first call of the new series.
- * (Internally, this means we must be sure to reset "place" to EMSG before
- * returning -1.)
+ * (Internally, this means we must be sure to reset "place" to EMSG and
+ * nonopt_start to -1 before returning -1.)
  */
 int
 getopt_long(int argc, char *const argv[],
@@ -60,21 +60,54 @@ getopt_long(int argc, char *const argv[],
 {
 	static char *place = EMSG;	/* option letter processing */
 	char	   *oli;			/* option letter list index */
+	static int	nonopt_start = -1;
+	static bool posix = false;
+	static bool posix_init = false;
+
+	if (!posix_init)
+	{
+		posix = (getenv("POSIXLY_CORRECT") != NULL);
+		posix_init = true;
+	}
 
 	if (!*place)
 	{							/* update scanning pointer */
 		if (optind >= argc)
 		{
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
+retry:
 		place = argv[optind];
 
 		if (place[0] != '-')
 		{
-			place = EMSG;
-			return -1;
+			char	  **args = (char **) argv;
+
+			/*
+			 * If POSIXLY_CORRECT is set, we must return -1 as soon as we see
+			 * a non-option. Else, move the non-options to the end and return
+			 * -1 when only non-options remain.
+			 */
+			if (posix || optind == nonopt_start)
+			{
+				place = EMSG;
+				nonopt_start = -1;
+				return -1;
+			}
+
+			for (int i = optind; i < argc - 1; i++)
+				args[i] = args[i + 1];
+			args[argc - 1] = place;
+
+			if (nonopt_start == -1)
+				nonopt_start = argc - 1;
+			else
+				nonopt_start--;
+
+			goto retry;
 		}
 
 		place++;
@@ -83,6 +116,7 @@ getopt_long(int argc, char *const argv[],
 		{
 			/* treat "-" as not being an option */
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
@@ -91,6 +125,7 @@ getopt_long(int argc, char *const argv[],
 			/* found "--", treat it as end of options */
 			++optind;
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
-- 
2.25.1

#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Nathan Bossart (#1)
Re: add non-option reordering to in-tree getopt_long

At Fri, 9 Jun 2023 16:22:57 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in

While working on 2dcd157, I noticed cfbot failures for Windows due to tests
with commands that had non-options specified before options. For example,
"createuser myrole -a myadmin" specified a non-option (myrole) before the
option/argument pair (-a admin). To get the tests passing for Windows,
non-options must be placed at the end (e.g., "createuser -a myadmin
myrole"). This same problem was encountered while working on 08951a7 [0],
but it was again fortunately caught with cfbot. Others have not been so
lucky [1] [2] [3].

While I don't see it as reason to change the behavior, I do believe
the change could be beneficial from a user's perspective.

The reason for this discrepancy is because Windows uses the in-tree
implementation of getopt_long(), which differs from the other
implementations I've found in that it doesn't reorder non-options to the
end of argv by default. Instead, it returns -1 as soon as the first
non-option is found, even if there are other options listed afterwards. By
moving non-options to the end of argv, getopt_long() can parse all
specified options and return -1 when only non-options remain. The
implementations I reviewed all reorder argv unless the POSIXLY_CORRECT
environment variable is set or the "optstring" argument begins with '+'.

The best reasons I can think of to keep the current behavior are 1)
reordering involves writing to the original argv array, which could be
risky (as noted by Tom [4]) and 2) any systems with a getopt_long()
implementation that doesn't reorder non-options could begin failing tests
that take advantage of this behavior. However, this quirk in the in-tree
getopt_long() is periodically missed by hackers, the only systems I'm aware
of that use it are Windows and AIX, and every other implementation of
getopt_long() I saw reorders non-options by default. Furthermore, C99
omits const decorations in main()'s signature, so modifying argv might not
be too scary.

POSIXLY_CORRECT appears to be intended for debugging or feature
validation. If we know we can always rearrange argv on those
platforms, we don't need it. I would suggest that we turn on the new
feature at the compile time on those platforms where we know this
rearrangement works, instead of switching at runtime.

Thus, I propose introducing this non-option reordering behavior but
allowing it to be disabled via the POSIXLY_CORRECT environment variable.
The attached patch is my first attempt at implementing this proposal. I
don't think we need to bother with handling '+' at the beginning of
"optstring" since it seems unlikely to be used in PostgreSQL, but it would
probably be easy enough to add if folks want it.

I briefly looked at getopt() and concluded that we should probably retain
its POSIX-compliant behavior for non-options, as reordering support seems
much less universal than with getopt_long(). AFAICT all client utilities
use getopt_long(), anyway.

Thoughts?

As far as I can see, getopt_long on Rocky9 does *not* rearrange argv
until it reaches the end of the array. But it won't matter much.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#3Nathan Bossart
nathandbossart@gmail.com
In reply to: Kyotaro Horiguchi (#2)
Re: add non-option reordering to in-tree getopt_long

On Tue, Jun 13, 2023 at 12:00:01PM +0900, Kyotaro Horiguchi wrote:

POSIXLY_CORRECT appears to be intended for debugging or feature
validation. If we know we can always rearrange argv on those
platforms, we don't need it. I would suggest that we turn on the new
feature at the compile time on those platforms where we know this
rearrangement works, instead of switching at runtime.

I'd be okay with leaving it out wherever possible. I'm curious whether any
supported systems do not allow this.

As far as I can see, getopt_long on Rocky9 does *not* rearrange argv
until it reaches the end of the array. But it won't matter much.

Do you mean that it rearranges argv once all the options have been
returned, or that it doesn't rearrange argv at all?

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#4Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Nathan Bossart (#3)
Re: add non-option reordering to in-tree getopt_long

At Mon, 12 Jun 2023 22:13:43 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in

On Tue, Jun 13, 2023 at 12:00:01PM +0900, Kyotaro Horiguchi wrote:

POSIXLY_CORRECT appears to be intended for debugging or feature
validation. If we know we can always rearrange argv on those
platforms, we don't need it. I would suggest that we turn on the new
feature at the compile time on those platforms where we know this
rearrangement works, instead of switching at runtime.

I'd be okay with leaving it out wherever possible. I'm curious whether any
supported systems do not allow this.

Hmm. from the initial mail, I got the impression that AIX and Windows
allow this, so I thought that we can do that for them. While there
could be other platforms that allow it, perhaps we don't need to go as
far as extending this until we come across another platform that does.

As far as I can see, getopt_long on Rocky9 does *not* rearrange argv
until it reaches the end of the array. But it won't matter much.

Do you mean that it rearranges argv once all the options have been
returned, or that it doesn't rearrange argv at all?

I meant the former. argv remains unaltered until getopt_long returns
-1. Once it does, non-optional arguments are being collected at the
end of argv. (But in my opinion, that behavior isn't very significant
in this context..)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#5Nathan Bossart
nathandbossart@gmail.com
In reply to: Kyotaro Horiguchi (#4)
Re: add non-option reordering to in-tree getopt_long

On Tue, Jun 13, 2023 at 04:02:01PM +0900, Kyotaro Horiguchi wrote:

Hmm. from the initial mail, I got the impression that AIX and Windows
allow this, so I thought that we can do that for them. While there
could be other platforms that allow it, perhaps we don't need to go as
far as extending this until we come across another platform that does.

Windows seems to allow rearranging argv, based upon cfbot's results. I do
not know about AIX. In any case, C99 explicitly mentions that argv should
be modifiable.

As far as I can see, getopt_long on Rocky9 does *not* rearrange argv
until it reaches the end of the array. But it won't matter much.

Do you mean that it rearranges argv once all the options have been
returned, or that it doesn't rearrange argv at all?

I meant the former. argv remains unaltered until getopt_long returns
-1. Once it does, non-optional arguments are being collected at the
end of argv. (But in my opinion, that behavior isn't very significant
in this context..)

Got it.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#6Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#5)
Re: add non-option reordering to in-tree getopt_long

On Tue, Jun 13, 2023 at 03:36:57PM -0700, Nathan Bossart wrote:

Windows seems to allow rearranging argv, based upon cfbot's results. I do
not know about AIX. In any case, C99 explicitly mentions that argv should
be modifiable.

Few people have AIX machines around these days, but looking around it
seems like the answer to this question would be no:
https://github.com/nodejs/node/pull/10633

Noah, do you have an idea?

Got it.

Making the internal implementation of getopt_long more flexible would
be really nice in the long-term. This is something that people have
stepped on for many years, like ffd3980.

(TBH, I think that there is little value in spending resources on AIX
these days. For one, few have an access to it, and getopt is not the
only tweak in the tree related to it. On top of that, C99 is required
since v12.)
--
Michael

#7Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#6)
Re: add non-option reordering to in-tree getopt_long

On Wed, Jun 14, 2023 at 08:52:27AM +0900, Michael Paquier wrote:

On Tue, Jun 13, 2023 at 03:36:57PM -0700, Nathan Bossart wrote:

Windows seems to allow rearranging argv, based upon cfbot's results. I do
not know about AIX. In any case, C99 explicitly mentions that argv should
be modifiable.

Few people have AIX machines around these days, but looking around it
seems like the answer to this question would be no:
https://github.com/nodejs/node/pull/10633

Noah, do you have an idea?

Forgot to add Noah in CC about this part.
--
Michael

#8Noah Misch
noah@leadboat.com
In reply to: Michael Paquier (#7)
Re: add non-option reordering to in-tree getopt_long

On Wed, Jun 14, 2023 at 09:03:22AM +0900, Michael Paquier wrote:

On Wed, Jun 14, 2023 at 08:52:27AM +0900, Michael Paquier wrote:

On Tue, Jun 13, 2023 at 03:36:57PM -0700, Nathan Bossart wrote:

Windows seems to allow rearranging argv, based upon cfbot's results. I do
not know about AIX. In any case, C99 explicitly mentions that argv should
be modifiable.

Few people have AIX machines around these days, but looking around it
seems like the answer to this question would be no:
https://github.com/nodejs/node/pull/10633

Noah, do you have an idea?

No, I don't have specific knowledge about mutating argv on AIX. PostgreSQL
includes this, which makes some statement about it working:

#elif ... || defined(_AIX) || ...
#define PS_USE_CLOBBER_ARGV

If you have a test program to be run, I can run it on AIX.

#9Nathan Bossart
nathandbossart@gmail.com
In reply to: Noah Misch (#8)
Re: add non-option reordering to in-tree getopt_long

On Tue, Jun 13, 2023 at 05:17:37PM -0700, Noah Misch wrote:

If you have a test program to be run, I can run it on AIX.

Thanks. The patch above [0]/messages/by-id/attachment/147420/v1-0001-Teach-in-tree-getopt_long-to-move-non-options-to-.patch adjusts 040_createuser.pl to test modifying
argv, so that's one test program. And here's a few lines for reversing
argv:

#include <stdio.h>

int
main(int argc, char *argv[])
{
for (int i = 0; i < argc / 2; i++)
{
char *tmp = argv[i];

argv[i] = argv[argc - i - 1];
argv[argc - i - 1] = tmp;
}

for (int i = 0; i < argc; i++)
printf("%s ", argv[i]);
printf("\n");
}

[0]: /messages/by-id/attachment/147420/v1-0001-Teach-in-tree-getopt_long-to-move-non-options-to-.patch

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#10Noah Misch
noah@leadboat.com
In reply to: Nathan Bossart (#9)
Re: add non-option reordering to in-tree getopt_long

On Wed, Jun 14, 2023 at 02:28:16PM -0700, Nathan Bossart wrote:

On Tue, Jun 13, 2023 at 05:17:37PM -0700, Noah Misch wrote:

If you have a test program to be run, I can run it on AIX.

Thanks. The patch above [0] adjusts 040_createuser.pl to test modifying
argv, so that's one test program. And here's a few lines for reversing
argv:

#include <stdio.h>

int
main(int argc, char *argv[])
{
for (int i = 0; i < argc / 2; i++)
{
char *tmp = argv[i];

argv[i] = argv[argc - i - 1];
argv[argc - i - 1] = tmp;
}

for (int i = 0; i < argc; i++)
printf("%s ", argv[i]);
printf("\n");
}

Here's some output from this program (on AIX 7.1, same output when compiled
32-bit or 64-bit):

$ ./a.out a b c d e f
f e d c b a ./a.out

Interesting discussion here, too:
https://github.com/libuv/libuv/pull/1187

#11Nathan Bossart
nathandbossart@gmail.com
In reply to: Noah Misch (#10)
Re: add non-option reordering to in-tree getopt_long

On Wed, Jun 14, 2023 at 03:11:54PM -0700, Noah Misch wrote:

Here's some output from this program (on AIX 7.1, same output when compiled
32-bit or 64-bit):

$ ./a.out a b c d e f
f e d c b a ./a.out

Thanks again.

Interesting discussion here, too:
https://github.com/libuv/libuv/pull/1187

Hm. IIUC modifying the argv pointers on AIX will modify the process title,
which could cause 'ps' to temporarily show duplicate/missing arguments
during option parsing. That doesn't seem too terrible, but if pointer
assignments aren't atomic, maybe 'ps' could be sent off to another part of
memory, which does seem terrible.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#12Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Nathan Bossart (#11)
Re: add non-option reordering to in-tree getopt_long

At Wed, 14 Jun 2023 15:46:08 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in

On Wed, Jun 14, 2023 at 03:11:54PM -0700, Noah Misch wrote:

Here's some output from this program (on AIX 7.1, same output when compiled
32-bit or 64-bit):

$ ./a.out a b c d e f
f e d c b a ./a.out

Thanks again.

Interesting discussion here, too:
https://github.com/libuv/libuv/pull/1187

Hm. IIUC modifying the argv pointers on AIX will modify the process title,
which could cause 'ps' to temporarily show duplicate/missing arguments
during option parsing. That doesn't seem too terrible, but if pointer
assignments aren't atomic, maybe 'ps' could be sent off to another part of
memory, which does seem terrible.

Hmm, the discussion seems to be based on the assumption that argv[0]
can be safely redirected to a different memory location. If that's the
case, we can prpbably rearrange the array, even if there's a small
window where ps might display a confusing command line, right?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#13Nathan Bossart
nathandbossart@gmail.com
In reply to: Kyotaro Horiguchi (#12)
1 attachment(s)
Re: add non-option reordering to in-tree getopt_long

On Thu, Jun 15, 2023 at 02:30:34PM +0900, Kyotaro Horiguchi wrote:

At Wed, 14 Jun 2023 15:46:08 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in

Hm. IIUC modifying the argv pointers on AIX will modify the process title,
which could cause 'ps' to temporarily show duplicate/missing arguments
during option parsing. That doesn't seem too terrible, but if pointer
assignments aren't atomic, maybe 'ps' could be sent off to another part of
memory, which does seem terrible.

Hmm, the discussion seems to be based on the assumption that argv[0]
can be safely redirected to a different memory location. If that's the
case, we can prpbably rearrange the array, even if there's a small
window where ps might display a confusing command line, right?

If that's the extent of the breakage, then it seems alright to me. I've
attached a new version of the patch that omits the POSIXLY_CORRECT stuff.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Teach-in-tree-getopt_long-to-move-non-options-to-.patchtext/x-diff; charset=us-asciiDownload
From 89ec098d2515d7cf03b630b787e8f53ca25916b9 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 9 Jun 2023 15:51:58 -0700
Subject: [PATCH v2 1/1] Teach in-tree getopt_long() to move non-options to the
 end of argv.

---
 src/bin/scripts/t/040_createuser.pl | 10 +++++----
 src/port/getopt_long.c              | 34 +++++++++++++++++++++++++----
 2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/src/bin/scripts/t/040_createuser.pl b/src/bin/scripts/t/040_createuser.pl
index 9ca282181d..ba40ab11a3 100644
--- a/src/bin/scripts/t/040_createuser.pl
+++ b/src/bin/scripts/t/040_createuser.pl
@@ -42,9 +42,9 @@ $node->issues_sql_like(
 	'add a role as a member with admin option of the newly created role');
 $node->issues_sql_like(
 	[
-		'createuser', '-m',
-		'regress_user3', '-m',
-		'regress user #4', 'REGRESS_USER5'
+		'createuser', 'REGRESS_USER5',
+		'-m', 'regress_user3',
+		'-m', 'regress user #4'
 	],
 	qr/statement: CREATE ROLE "REGRESS_USER5" NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS ROLE regress_user3,"regress user #4";/,
 	'add a role as a member of the newly created role');
@@ -73,11 +73,13 @@ $node->issues_sql_like(
 	qr/statement: CREATE ROLE regress_user11 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--role');
 $node->issues_sql_like(
-	[ 'createuser', '--member-of', 'regress_user1', 'regress_user12' ],
+	[ 'createuser', 'regress_user12', '--member-of', 'regress_user1' ],
 	qr/statement: CREATE ROLE regress_user12 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--member-of');
 
 $node->command_fails([ 'createuser', 'regress_user1' ],
 	'fails if role already exists');
+$node->command_fails([ 'createuser', 'regress_user1', '-m', 'regress_user2', 'regress_user3' ],
+	'fails for too many non-options');
 
 done_testing();
diff --git a/src/port/getopt_long.c b/src/port/getopt_long.c
index c989276988..4bbd8e0b85 100644
--- a/src/port/getopt_long.c
+++ b/src/port/getopt_long.c
@@ -50,8 +50,8 @@
  * This implementation does not use optreset.  Instead, we guarantee that
  * it can be restarted on a new argv array after a previous call returned -1,
  * if the caller resets optind to 1 before the first call of the new series.
- * (Internally, this means we must be sure to reset "place" to EMSG before
- * returning -1.)
+ * (Internally, this means we must be sure to reset "place" to EMSG and
+ * nonopt_start to -1 before returning -1.)
  */
 int
 getopt_long(int argc, char *const argv[],
@@ -60,21 +60,45 @@ getopt_long(int argc, char *const argv[],
 {
 	static char *place = EMSG;	/* option letter processing */
 	char	   *oli;			/* option letter list index */
+	static int	nonopt_start = -1;
 
 	if (!*place)
 	{							/* update scanning pointer */
 		if (optind >= argc)
 		{
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
+retry:
 		place = argv[optind];
 
 		if (place[0] != '-')
 		{
-			place = EMSG;
-			return -1;
+			char	  **args = (char **) argv;
+
+			/*
+			 * If only non-options remain, return -1.  Else, move the
+			 * non-option to the end and try again.
+			 */
+			if (optind == nonopt_start)
+			{
+				place = EMSG;
+				nonopt_start = -1;
+				return -1;
+			}
+
+			for (int i = optind; i < argc - 1; i++)
+				args[i] = args[i + 1];
+			args[argc - 1] = place;
+
+			if (nonopt_start == -1)
+				nonopt_start = argc - 1;
+			else
+				nonopt_start--;
+
+			goto retry;
 		}
 
 		place++;
@@ -83,6 +107,7 @@ getopt_long(int argc, char *const argv[],
 		{
 			/* treat "-" as not being an option */
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
@@ -91,6 +116,7 @@ getopt_long(int argc, char *const argv[],
 			/* found "--", treat it as end of options */
 			++optind;
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
-- 
2.25.1

#14Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#13)
Re: add non-option reordering to in-tree getopt_long

On Thu, Jun 15, 2023 at 05:09:59PM -0700, Nathan Bossart wrote:

On Thu, Jun 15, 2023 at 02:30:34PM +0900, Kyotaro Horiguchi wrote:

Hmm, the discussion seems to be based on the assumption that argv[0]
can be safely redirected to a different memory location. If that's the
case, we can prpbably rearrange the array, even if there's a small
window where ps might display a confusing command line, right?

If that's the extent of the breakage, then it seems alright to me.

Okay by me to live with this burden.

I've attached a new version of the patch that omits the
POSIXLY_CORRECT stuff.

This looks OK at quick glance, though you may want to document at the
top of getopt_long() the reordering business with the non-options?
--
Michael

#15Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#14)
1 attachment(s)
Re: add non-option reordering to in-tree getopt_long

On Fri, Jun 16, 2023 at 10:30:09AM +0900, Michael Paquier wrote:

On Thu, Jun 15, 2023 at 05:09:59PM -0700, Nathan Bossart wrote:

I've attached a new version of the patch that omits the
POSIXLY_CORRECT stuff.

This looks OK at quick glance, though you may want to document at the
top of getopt_long() the reordering business with the non-options?

I added a comment to this effect in v3. I also noticed that '-' wasn't
properly handled as a non-option, so I've tried to fix that as well.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Teach-in-tree-getopt_long-to-move-non-options-to-.patchtext/x-diff; charset=us-asciiDownload
From e31fa0ab5237d3ad35bdb44404fd5b5eeea3f5c6 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 9 Jun 2023 15:51:58 -0700
Subject: [PATCH v3 1/1] Teach in-tree getopt_long() to move non-options to the
 end of argv.

---
 src/bin/scripts/t/040_createuser.pl | 10 ++++---
 src/port/getopt_long.c              | 45 +++++++++++++++++++++--------
 2 files changed, 39 insertions(+), 16 deletions(-)

diff --git a/src/bin/scripts/t/040_createuser.pl b/src/bin/scripts/t/040_createuser.pl
index 9ca282181d..ba40ab11a3 100644
--- a/src/bin/scripts/t/040_createuser.pl
+++ b/src/bin/scripts/t/040_createuser.pl
@@ -42,9 +42,9 @@ $node->issues_sql_like(
 	'add a role as a member with admin option of the newly created role');
 $node->issues_sql_like(
 	[
-		'createuser', '-m',
-		'regress_user3', '-m',
-		'regress user #4', 'REGRESS_USER5'
+		'createuser', 'REGRESS_USER5',
+		'-m', 'regress_user3',
+		'-m', 'regress user #4'
 	],
 	qr/statement: CREATE ROLE "REGRESS_USER5" NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS ROLE regress_user3,"regress user #4";/,
 	'add a role as a member of the newly created role');
@@ -73,11 +73,13 @@ $node->issues_sql_like(
 	qr/statement: CREATE ROLE regress_user11 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--role');
 $node->issues_sql_like(
-	[ 'createuser', '--member-of', 'regress_user1', 'regress_user12' ],
+	[ 'createuser', 'regress_user12', '--member-of', 'regress_user1' ],
 	qr/statement: CREATE ROLE regress_user12 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--member-of');
 
 $node->command_fails([ 'createuser', 'regress_user1' ],
 	'fails if role already exists');
+$node->command_fails([ 'createuser', 'regress_user1', '-m', 'regress_user2', 'regress_user3' ],
+	'fails for too many non-options');
 
 done_testing();
diff --git a/src/port/getopt_long.c b/src/port/getopt_long.c
index c989276988..da233728e1 100644
--- a/src/port/getopt_long.c
+++ b/src/port/getopt_long.c
@@ -50,8 +50,11 @@
  * This implementation does not use optreset.  Instead, we guarantee that
  * it can be restarted on a new argv array after a previous call returned -1,
  * if the caller resets optind to 1 before the first call of the new series.
- * (Internally, this means we must be sure to reset "place" to EMSG before
- * returning -1.)
+ * (Internally, this means we must be sure to reset "place" to EMSG and
+ * nonopt_start to -1 before returning -1.)
+ *
+ * Note that this routine reorders the pointers in argv (despite the const
+ * qualifier) so that all non-options will be at the end when -1 is returned.
  */
 int
 getopt_long(int argc, char *const argv[],
@@ -60,37 +63,55 @@ getopt_long(int argc, char *const argv[],
 {
 	static char *place = EMSG;	/* option letter processing */
 	char	   *oli;			/* option letter list index */
+	static int	nonopt_start = -1;
 
 	if (!*place)
 	{							/* update scanning pointer */
 		if (optind >= argc)
 		{
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
+retry:
 		place = argv[optind];
 
-		if (place[0] != '-')
+		if (place[0] != '-' || place[1] == '\0')
 		{
-			place = EMSG;
-			return -1;
-		}
+			char	  **args = (char **) argv;
 
-		place++;
+			/*
+			 * If only non-options remain, return -1.  Else, move the
+			 * non-option to the end and try again.
+			 */
+			if (optind == nonopt_start)
+			{
+				place = EMSG;
+				nonopt_start = -1;
+				return -1;
+			}
 
-		if (!*place)
-		{
-			/* treat "-" as not being an option */
-			place = EMSG;
-			return -1;
+			for (int i = optind; i < argc - 1; i++)
+				args[i] = args[i + 1];
+			args[argc - 1] = place;
+
+			if (nonopt_start == -1)
+				nonopt_start = argc - 1;
+			else
+				nonopt_start--;
+
+			goto retry;
 		}
 
+		place++;
+
 		if (place[0] == '-' && place[1] == '\0')
 		{
 			/* found "--", treat it as end of options */
 			++optind;
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
-- 
2.25.1

#16Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Nathan Bossart (#15)
Re: add non-option reordering to in-tree getopt_long

At Thu, 15 Jun 2023 21:58:28 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in

On Fri, Jun 16, 2023 at 10:30:09AM +0900, Michael Paquier wrote:

On Thu, Jun 15, 2023 at 05:09:59PM -0700, Nathan Bossart wrote:

I've attached a new version of the patch that omits the
POSIXLY_CORRECT stuff.

This looks OK at quick glance, though you may want to document at the
top of getopt_long() the reordering business with the non-options?

I added a comment to this effect in v3. I also noticed that '-' wasn't
properly handled as a non-option, so I've tried to fix that as well.

(Honestly, the rearrangement code looks somewhat tricky to grasp..)

It doesn't work properly if '--' is involved.

For example, consider the following options (even though they don't
work for the command).

psql -t -T hoho -a hoge -- -1 hage hige huge

After the function returns -1, the argv array looks like this. The
"[..]" indicates the position of optind.

psql -t -T hoho -a -- [-1] hage hige huge hoge

This is clearly incorrect since "hoge" gets moved to the end. The
rearrangement logic needs to take into account the '--'. For your
information, the glibc version yields the following result for the
same options.

psql -t -T hoho -a -- [hoge] -1 hage hige huge

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#17Nathan Bossart
nathandbossart@gmail.com
In reply to: Kyotaro Horiguchi (#16)
1 attachment(s)
Re: add non-option reordering to in-tree getopt_long

On Fri, Jun 16, 2023 at 04:51:38PM +0900, Kyotaro Horiguchi wrote:

(Honestly, the rearrangement code looks somewhat tricky to grasp..)

Yeah, I think there's still some room for improvement here.

It doesn't work properly if '--' is involved.

For example, consider the following options (even though they don't
work for the command).

psql -t -T hoho -a hoge -- -1 hage hige huge

After the function returns -1, the argv array looks like this. The
"[..]" indicates the position of optind.

psql -t -T hoho -a -- [-1] hage hige huge hoge

This is clearly incorrect since "hoge" gets moved to the end. The
rearrangement logic needs to take into account the '--'. For your
information, the glibc version yields the following result for the
same options.

psql -t -T hoho -a -- [hoge] -1 hage hige huge

Ah, so it effectively retains the non-option ordering, even if there is a
'--'. I think this behavior is worth keeping. I attempted to fix this in
the attached patch.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Teach-in-tree-getopt_long-to-move-non-options-to-.patchtext/x-diff; charset=us-asciiDownload
From 51e1033374c464ed90484f641d31b0ab705672f2 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 9 Jun 2023 15:51:58 -0700
Subject: [PATCH v4 1/1] Teach in-tree getopt_long() to move non-options to the
 end of argv.

---
 src/bin/scripts/t/040_createuser.pl | 10 +++---
 src/port/getopt_long.c              | 50 +++++++++++++++++++++--------
 2 files changed, 42 insertions(+), 18 deletions(-)

diff --git a/src/bin/scripts/t/040_createuser.pl b/src/bin/scripts/t/040_createuser.pl
index 9ca282181d..ba40ab11a3 100644
--- a/src/bin/scripts/t/040_createuser.pl
+++ b/src/bin/scripts/t/040_createuser.pl
@@ -42,9 +42,9 @@ $node->issues_sql_like(
 	'add a role as a member with admin option of the newly created role');
 $node->issues_sql_like(
 	[
-		'createuser', '-m',
-		'regress_user3', '-m',
-		'regress user #4', 'REGRESS_USER5'
+		'createuser', 'REGRESS_USER5',
+		'-m', 'regress_user3',
+		'-m', 'regress user #4'
 	],
 	qr/statement: CREATE ROLE "REGRESS_USER5" NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS ROLE regress_user3,"regress user #4";/,
 	'add a role as a member of the newly created role');
@@ -73,11 +73,13 @@ $node->issues_sql_like(
 	qr/statement: CREATE ROLE regress_user11 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--role');
 $node->issues_sql_like(
-	[ 'createuser', '--member-of', 'regress_user1', 'regress_user12' ],
+	[ 'createuser', 'regress_user12', '--member-of', 'regress_user1' ],
 	qr/statement: CREATE ROLE regress_user12 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--member-of');
 
 $node->command_fails([ 'createuser', 'regress_user1' ],
 	'fails if role already exists');
+$node->command_fails([ 'createuser', 'regress_user1', '-m', 'regress_user2', 'regress_user3' ],
+	'fails for too many non-options');
 
 done_testing();
diff --git a/src/port/getopt_long.c b/src/port/getopt_long.c
index c989276988..a4bdc6c8f0 100644
--- a/src/port/getopt_long.c
+++ b/src/port/getopt_long.c
@@ -50,8 +50,9 @@
  * This implementation does not use optreset.  Instead, we guarantee that
  * it can be restarted on a new argv array after a previous call returned -1,
  * if the caller resets optind to 1 before the first call of the new series.
- * (Internally, this means we must be sure to reset "place" to EMSG before
- * returning -1.)
+ *
+ * Note that this routine reorders the pointers in argv (despite the const
+ * qualifier) so that all non-options will be at the end when -1 is returned.
  */
 int
 getopt_long(int argc, char *const argv[],
@@ -60,38 +61,59 @@ getopt_long(int argc, char *const argv[],
 {
 	static char *place = EMSG;	/* option letter processing */
 	char	   *oli;			/* option letter list index */
+	static int	nonopt_start = -1;
+	static bool force_nonopt = false;
 
 	if (!*place)
 	{							/* update scanning pointer */
+retry:
 		if (optind >= argc)
 		{
 			place = EMSG;
+			nonopt_start = -1;
+			force_nonopt = false;
 			return -1;
 		}
 
 		place = argv[optind];
 
-		if (place[0] != '-')
+		/* non-options, including '-' and anything after '--' */
+		if (place[0] != '-' || place[1] == '\0' || force_nonopt)
 		{
-			place = EMSG;
-			return -1;
-		}
+			char	  **args = (char **) argv;
 
-		place++;
+			/*
+			 * If only non-options remain, return -1.  Else, move the
+			 * non-option to the end and try again.
+			 */
+			if (optind == nonopt_start)
+			{
+				place = EMSG;
+				nonopt_start = -1;
+				force_nonopt = false;
+				return -1;
+			}
 
-		if (!*place)
-		{
-			/* treat "-" as not being an option */
-			place = EMSG;
-			return -1;
+			for (int i = optind; i < argc - 1; i++)
+				args[i] = args[i + 1];
+			args[argc - 1] = place;
+
+			if (nonopt_start == -1)
+				nonopt_start = argc - 1;
+			else
+				nonopt_start--;
+
+			goto retry;
 		}
 
+		place++;
+
 		if (place[0] == '-' && place[1] == '\0')
 		{
 			/* found "--", treat it as end of options */
 			++optind;
-			place = EMSG;
-			return -1;
+			force_nonopt = true;
+			goto retry;
 		}
 
 		if (place[0] == '-' && place[1])
-- 
2.25.1

#18Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Nathan Bossart (#17)
Re: add non-option reordering to in-tree getopt_long

At Fri, 16 Jun 2023 11:28:47 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in

On Fri, Jun 16, 2023 at 04:51:38PM +0900, Kyotaro Horiguchi wrote:

(Honestly, the rearrangement code looks somewhat tricky to grasp..)

Yeah, I think there's still some room for improvement here.

The argv elements get shuffled around many times with the
patch. However, I couldn't find a way to decrease the count without
resorting to a forward scan. So I've concluded the current approach
is them most effeicient, considering the complexity.

Ah, so it effectively retains the non-option ordering, even if there is a
'--'. I think this behavior is worth keeping. I attempted to fix this in
the attached patch.

I tried some patterns with the patch and it generates the same results
with the glibc version.

The TAP test looks fine and it catches the change.

Everything looks fine to me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#19Nathan Bossart
nathandbossart@gmail.com
In reply to: Kyotaro Horiguchi (#18)
Re: add non-option reordering to in-tree getopt_long

On Tue, Jun 20, 2023 at 02:12:44PM +0900, Kyotaro Horiguchi wrote:

The argv elements get shuffled around many times with the
patch. However, I couldn't find a way to decrease the count without
resorting to a forward scan. So I've concluded the current approach
is them most effeicient, considering the complexity.

Yeah, I'm not sure it's worth doing anything more sophisticated.

I tried some patterns with the patch and it generates the same results
with the glibc version.

The TAP test looks fine and it catches the change.

Everything looks fine to me.

Thanks for reviewing.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#20Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#19)
1 attachment(s)
Re: add non-option reordering to in-tree getopt_long

I spent some time tidying up the patch and adding a more detailed commit
message.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v5-0001-Teach-in-tree-getopt_long-to-move-non-options-to-.patchtext/x-diff; charset=us-asciiDownload
From 2525e6aed066fe8eafdaab0d33c8c5df055c7e09 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 7 Jul 2023 20:00:47 -0700
Subject: [PATCH v5 1/1] Teach in-tree getopt_long() to move non-options to the
 end of argv.

Unlike the other implementations of getopt_long() I could find, the
in-tree implementation does not reorder non-options to the end of
argv.  Instead, it returns -1 as soon as the first non-option is
found, even if there are other options listed afterwards.  By
moving non-options to the end of argv, getopt_long() can parse all
specified options and return -1 when only non-options remain.
This quirk is periodically missed by hackers (e.g., 869aa40a27,
ffd398021c, and d9ddc50baf).

This commit introduces the aforementioned non-option reordering
behavior to the in-tree getopt_long() implementation.  The only
systems I'm aware of that use it are Windows and AIX, both of which
have been tested.

Special thanks to Noah Misch for his help validating behavior on
AIX.

Reviewed-by: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/20230609232257.GA121461%40nathanxps13
---
 src/bin/scripts/t/040_createuser.pl | 10 ++--
 src/port/getopt_long.c              | 79 ++++++++++++++++++++---------
 2 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/src/bin/scripts/t/040_createuser.pl b/src/bin/scripts/t/040_createuser.pl
index 9ca282181d..3290e30c88 100644
--- a/src/bin/scripts/t/040_createuser.pl
+++ b/src/bin/scripts/t/040_createuser.pl
@@ -42,9 +42,8 @@ $node->issues_sql_like(
 	'add a role as a member with admin option of the newly created role');
 $node->issues_sql_like(
 	[
-		'createuser', '-m',
-		'regress_user3', '-m',
-		'regress user #4', 'REGRESS_USER5'
+		'createuser', 'REGRESS_USER5', '-m', 'regress_user3',
+		'-m', 'regress user #4'
 	],
 	qr/statement: CREATE ROLE "REGRESS_USER5" NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS ROLE regress_user3,"regress user #4";/,
 	'add a role as a member of the newly created role');
@@ -73,11 +72,14 @@ $node->issues_sql_like(
 	qr/statement: CREATE ROLE regress_user11 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--role');
 $node->issues_sql_like(
-	[ 'createuser', '--member-of', 'regress_user1', 'regress_user12' ],
+	[ 'createuser', 'regress_user12', '--member-of', 'regress_user1' ],
 	qr/statement: CREATE ROLE regress_user12 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--member-of');
 
 $node->command_fails([ 'createuser', 'regress_user1' ],
 	'fails if role already exists');
+$node->command_fails(
+	[ 'createuser', 'regress_user1', '-m', 'regress_user2', 'regress_user3' ],
+	'fails for too many non-options');
 
 done_testing();
diff --git a/src/port/getopt_long.c b/src/port/getopt_long.c
index c989276988..3fc2cf36e9 100644
--- a/src/port/getopt_long.c
+++ b/src/port/getopt_long.c
@@ -50,8 +50,9 @@
  * This implementation does not use optreset.  Instead, we guarantee that
  * it can be restarted on a new argv array after a previous call returned -1,
  * if the caller resets optind to 1 before the first call of the new series.
- * (Internally, this means we must be sure to reset "place" to EMSG before
- * returning -1.)
+ *
+ * Note that this routine reorders the pointers in argv (despite the const
+ * qualifier) so that all non-options will be at the end when -1 is returned.
  */
 int
 getopt_long(int argc, char *const argv[],
@@ -60,40 +61,68 @@ getopt_long(int argc, char *const argv[],
 {
 	static char *place = EMSG;	/* option letter processing */
 	char	   *oli;			/* option letter list index */
+	static int	nonopt_start = -1;
+	static bool force_nonopt = false;
 
 	if (!*place)
 	{							/* update scanning pointer */
-		if (optind >= argc)
+		for (;;)
 		{
-			place = EMSG;
-			return -1;
-		}
+			char	  **args = (char **) argv;
 
-		place = argv[optind];
+			if (optind >= argc)
+			{
+				place = EMSG;
+				nonopt_start = -1;
+				force_nonopt = false;
+				return -1;
+			}
 
-		if (place[0] != '-')
-		{
-			place = EMSG;
-			return -1;
-		}
+			place = argv[optind];
+
+			/*
+			 * Any string that starts with '-' but is not equivalent to '-' or
+			 * '--' is considered an option.  If we see an option, we can
+			 * immediately break out of the loop since no argument reordering
+			 * is required.  Note that we treat '--' as the end of options and
+			 * immediately force reordering for every subsequent argument.
+			 * This reinstates the original order of all non-options (which
+			 * includes everything after the '--') before we return.
+			 */
+			if (!force_nonopt && place[0] == '-' && place[1])
+			{
+				if (place[1] != '-' || place[2])
+					break;
 
-		place++;
+				optind++;
+				force_nonopt = true;
+				continue;
+			}
 
-		if (!*place)
-		{
-			/* treat "-" as not being an option */
-			place = EMSG;
-			return -1;
-		}
+			/*
+			 * If only non-options remain, return -1.  Else, move the
+			 * non-option to the end and try again.
+			 */
+			if (optind == nonopt_start)
+			{
+				place = EMSG;
+				nonopt_start = -1;
+				force_nonopt = false;
+				return -1;
+			}
 
-		if (place[0] == '-' && place[1] == '\0')
-		{
-			/* found "--", treat it as end of options */
-			++optind;
-			place = EMSG;
-			return -1;
+			for (int i = optind; i < argc - 1; i++)
+				args[i] = args[i + 1];
+			args[argc - 1] = place;
+
+			if (nonopt_start == -1)
+				nonopt_start = argc - 1;
+			else
+				nonopt_start--;
 		}
 
+		place++;
+
 		if (place[0] == '-' && place[1])
 		{
 			/* long option */
-- 
2.25.1

#21Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Nathan Bossart (#20)
Re: add non-option reordering to in-tree getopt_long

At Fri, 7 Jul 2023 20:52:24 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in

I spent some time tidying up the patch and adding a more detailed commit
message.

The commit message and the change to TAP script looks good.

Two conditions are to be reversed and one of them look somewhat
unintuitive to me.

+			if (!force_nonopt && place[0] == '-' && place[1])
+			{
+				if (place[1] != '-' || place[2])
+					break;
+
+				optind++;
+				force_nonopt = true;
+				continue;
+			}

The first if looks good to me, but the second if is a bit hard to get the meaning at a glance. "!(place[1] == '-' && place[2] == 0)" is easier to read *to me*. Or I'm fine with the following structure here.

if (!force_nonopt ... )
{
if (place[1] == '-' && place[2] == 0)
{
optind+;
force_nonopt = true;
continue;
}
break;
}

(To be honest, I see the goto looked clear than for(;;)..)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#22Nathan Bossart
nathandbossart@gmail.com
In reply to: Kyotaro Horiguchi (#21)
Re: add non-option reordering to in-tree getopt_long

On Mon, Jul 10, 2023 at 04:57:11PM +0900, Kyotaro Horiguchi wrote:

+			if (!force_nonopt && place[0] == '-' && place[1])
+			{
+				if (place[1] != '-' || place[2])
+					break;
+
+				optind++;
+				force_nonopt = true;
+				continue;
+			}

The first if looks good to me, but the second if is a bit hard to get the meaning at a glance. "!(place[1] == '-' && place[2] == 0)" is easier to read *to me*. Or I'm fine with the following structure here.

I'd readily admit that it's tough to read these lines. I briefly tried to
make use of strcmp() to improve readability, but I wasn't having much luck.
I'll give it another try.

if (!force_nonopt ... )
{
if (place[1] == '-' && place[2] == 0)
{
optind+;
force_nonopt = true;
continue;
}
break;
}

(To be honest, I see the goto looked clear than for(;;)..)

Okay. I don't mind adding it back if it improves readability.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#23Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#22)
1 attachment(s)
Re: add non-option reordering to in-tree getopt_long

Here's a new version of the patch with the latest feedback addressed.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v6-0001-Teach-in-tree-getopt_long-to-move-non-options-to-.patchtext/x-diff; charset=us-asciiDownload
From e662a1b6b73c92ff7862444e092406ed82b0c2c7 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 7 Jul 2023 20:00:47 -0700
Subject: [PATCH v6 1/1] Teach in-tree getopt_long() to move non-options to the
 end of argv.

Unlike the other implementations of getopt_long() I could find, the
in-tree implementation does not reorder non-options to the end of
argv.  Instead, it returns -1 as soon as the first non-option is
found, even if there are other options listed afterwards.  By
moving non-options to the end of argv, getopt_long() can parse all
specified options and return -1 when only non-options remain.
This quirk is periodically missed by hackers (e.g., 869aa40a27,
ffd398021c, and d9ddc50baf).

This commit introduces the aforementioned non-option reordering
behavior to the in-tree getopt_long() implementation.  The only
systems I'm aware of that use it are Windows and AIX, both of which
have been tested.

Special thanks to Noah Misch for his help validating behavior on
AIX.

Reviewed-by: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/20230609232257.GA121461%40nathanxps13
---
 src/bin/scripts/t/040_createuser.pl | 10 +++---
 src/port/getopt_long.c              | 54 ++++++++++++++++++++---------
 2 files changed, 43 insertions(+), 21 deletions(-)

diff --git a/src/bin/scripts/t/040_createuser.pl b/src/bin/scripts/t/040_createuser.pl
index 9ca282181d..3290e30c88 100644
--- a/src/bin/scripts/t/040_createuser.pl
+++ b/src/bin/scripts/t/040_createuser.pl
@@ -42,9 +42,8 @@ $node->issues_sql_like(
 	'add a role as a member with admin option of the newly created role');
 $node->issues_sql_like(
 	[
-		'createuser', '-m',
-		'regress_user3', '-m',
-		'regress user #4', 'REGRESS_USER5'
+		'createuser', 'REGRESS_USER5', '-m', 'regress_user3',
+		'-m', 'regress user #4'
 	],
 	qr/statement: CREATE ROLE "REGRESS_USER5" NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS ROLE regress_user3,"regress user #4";/,
 	'add a role as a member of the newly created role');
@@ -73,11 +72,14 @@ $node->issues_sql_like(
 	qr/statement: CREATE ROLE regress_user11 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--role');
 $node->issues_sql_like(
-	[ 'createuser', '--member-of', 'regress_user1', 'regress_user12' ],
+	[ 'createuser', 'regress_user12', '--member-of', 'regress_user1' ],
 	qr/statement: CREATE ROLE regress_user12 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--member-of');
 
 $node->command_fails([ 'createuser', 'regress_user1' ],
 	'fails if role already exists');
+$node->command_fails(
+	[ 'createuser', 'regress_user1', '-m', 'regress_user2', 'regress_user3' ],
+	'fails for too many non-options');
 
 done_testing();
diff --git a/src/port/getopt_long.c b/src/port/getopt_long.c
index c989276988..3ba6094d93 100644
--- a/src/port/getopt_long.c
+++ b/src/port/getopt_long.c
@@ -50,8 +50,9 @@
  * This implementation does not use optreset.  Instead, we guarantee that
  * it can be restarted on a new argv array after a previous call returned -1,
  * if the caller resets optind to 1 before the first call of the new series.
- * (Internally, this means we must be sure to reset "place" to EMSG before
- * returning -1.)
+ *
+ * Note that this routine reorders the pointers in argv (despite the const
+ * qualifier) so that all non-options will be at the end when -1 is returned.
  */
 int
 getopt_long(int argc, char *const argv[],
@@ -60,40 +61,59 @@ getopt_long(int argc, char *const argv[],
 {
 	static char *place = EMSG;	/* option letter processing */
 	char	   *oli;			/* option letter list index */
+	static int	nonopt_start = -1;
+	static bool force_nonopt = false;
 
 	if (!*place)
 	{							/* update scanning pointer */
-		if (optind >= argc)
+		char	  **args = (char **) argv;
+
+retry:
+
+		/*
+		 * If we are out of arguments or only non-options remain, return -1.
+		 */
+		if (optind >= argc || optind == nonopt_start)
 		{
 			place = EMSG;
+			nonopt_start = -1;
+			force_nonopt = false;
 			return -1;
 		}
 
 		place = argv[optind];
 
-		if (place[0] != '-')
+		/*
+		 * An argument is a non-option if it meets any of the following
+		 * criteria: it follows an argument that is equivalent to the string
+		 * "--", it is equivalent to the string "-", or it does not start with
+		 * '-'.  When we encounter a non-option, we move it to the end of argv
+		 * (after shifting all remaining arguments over to make room), and
+		 * then we try again with the next argument.
+		 */
+		if (force_nonopt || strcmp("-", place) == 0 || place[0] != '-')
 		{
-			place = EMSG;
-			return -1;
-		}
+			for (int i = optind; i < argc - 1; i++)
+				args[i] = args[i + 1];
+			args[argc - 1] = place;
 
-		place++;
+			if (nonopt_start == -1)
+				nonopt_start = argc - 1;
+			else
+				nonopt_start--;
 
-		if (!*place)
-		{
-			/* treat "-" as not being an option */
-			place = EMSG;
-			return -1;
+			goto retry;
 		}
-
-		if (place[0] == '-' && place[1] == '\0')
+		else if (strcmp("--", place) == 0)
 		{
 			/* found "--", treat it as end of options */
 			++optind;
-			place = EMSG;
-			return -1;
+			force_nonopt = true;
+			goto retry;
 		}
 
+		place++;
+
 		if (place[0] == '-' && place[1])
 		{
 			/* long option */
-- 
2.25.1

#24Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Nathan Bossart (#23)
Re: add non-option reordering to in-tree getopt_long

At Mon, 10 Jul 2023 13:06:58 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in

Here's a new version of the patch with the latest feedback addressed.

Thanks!

+		 * An argument is a non-option if it meets any of the following
+		 * criteria: it follows an argument that is equivalent to the string
+		 * "--", it is equivalent to the string "-", or it does not start with
+		 * '-'.  When we encounter a non-option, we move it to the end of argv
+		 * (after shifting all remaining arguments over to make room), and
+		 * then we try again with the next argument.
+		 */
+		if (force_nonopt || strcmp("-", place) == 0 || place[0] != '-')
...
+		else if (strcmp("--", place) == 0)

I like it. We don't need to overcomplicate things just for the sake of
speed here. Plus, this version looks the most simple to me. That being
said, it might be better if the last term is positioned in the second
place. This is mainly because a lone hyphen is less common than words
that starts with characters other than a pyphen.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#25Nathan Bossart
nathandbossart@gmail.com
In reply to: Kyotaro Horiguchi (#24)
1 attachment(s)
Re: add non-option reordering to in-tree getopt_long

On Tue, Jul 11, 2023 at 04:16:09PM +0900, Kyotaro Horiguchi wrote:

I like it. We don't need to overcomplicate things just for the sake of
speed here. Plus, this version looks the most simple to me. That being
said, it might be better if the last term is positioned in the second
place. This is mainly because a lone hyphen is less common than words
that starts with characters other than a pyphen.

Sure. І did it this way in v7.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v7-0001-Teach-in-tree-getopt_long-to-move-non-options-to-.patchtext/x-diff; charset=us-asciiDownload
From 7c978c11c8012cbfa67646bfc37849cb061f4e07 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 7 Jul 2023 20:00:47 -0700
Subject: [PATCH v7 1/1] Teach in-tree getopt_long() to move non-options to the
 end of argv.

Unlike the other implementations of getopt_long() I could find, the
in-tree implementation does not reorder non-options to the end of
argv.  Instead, it returns -1 as soon as the first non-option is
found, even if there are other options listed afterwards.  By
moving non-options to the end of argv, getopt_long() can parse all
specified options and return -1 when only non-options remain.
This quirk is periodically missed by hackers (e.g., 869aa40a27,
ffd398021c, and d9ddc50baf).

This commit introduces the aforementioned non-option reordering
behavior to the in-tree getopt_long() implementation.  The only
systems I'm aware of that use it are Windows and AIX, both of which
have been tested.

Special thanks to Noah Misch for his help validating behavior on
AIX.

Reviewed-by: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/20230609232257.GA121461%40nathanxps13
---
 src/bin/scripts/t/040_createuser.pl | 10 +++---
 src/port/getopt_long.c              | 54 ++++++++++++++++++++---------
 2 files changed, 43 insertions(+), 21 deletions(-)

diff --git a/src/bin/scripts/t/040_createuser.pl b/src/bin/scripts/t/040_createuser.pl
index 9ca282181d..3290e30c88 100644
--- a/src/bin/scripts/t/040_createuser.pl
+++ b/src/bin/scripts/t/040_createuser.pl
@@ -42,9 +42,8 @@ $node->issues_sql_like(
 	'add a role as a member with admin option of the newly created role');
 $node->issues_sql_like(
 	[
-		'createuser', '-m',
-		'regress_user3', '-m',
-		'regress user #4', 'REGRESS_USER5'
+		'createuser', 'REGRESS_USER5', '-m', 'regress_user3',
+		'-m', 'regress user #4'
 	],
 	qr/statement: CREATE ROLE "REGRESS_USER5" NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS ROLE regress_user3,"regress user #4";/,
 	'add a role as a member of the newly created role');
@@ -73,11 +72,14 @@ $node->issues_sql_like(
 	qr/statement: CREATE ROLE regress_user11 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--role');
 $node->issues_sql_like(
-	[ 'createuser', '--member-of', 'regress_user1', 'regress_user12' ],
+	[ 'createuser', 'regress_user12', '--member-of', 'regress_user1' ],
 	qr/statement: CREATE ROLE regress_user12 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--member-of');
 
 $node->command_fails([ 'createuser', 'regress_user1' ],
 	'fails if role already exists');
+$node->command_fails(
+	[ 'createuser', 'regress_user1', '-m', 'regress_user2', 'regress_user3' ],
+	'fails for too many non-options');
 
 done_testing();
diff --git a/src/port/getopt_long.c b/src/port/getopt_long.c
index c989276988..b290a2bab9 100644
--- a/src/port/getopt_long.c
+++ b/src/port/getopt_long.c
@@ -50,8 +50,9 @@
  * This implementation does not use optreset.  Instead, we guarantee that
  * it can be restarted on a new argv array after a previous call returned -1,
  * if the caller resets optind to 1 before the first call of the new series.
- * (Internally, this means we must be sure to reset "place" to EMSG before
- * returning -1.)
+ *
+ * Note that this routine reorders the pointers in argv (despite the const
+ * qualifier) so that all non-options will be at the end when -1 is returned.
  */
 int
 getopt_long(int argc, char *const argv[],
@@ -60,40 +61,59 @@ getopt_long(int argc, char *const argv[],
 {
 	static char *place = EMSG;	/* option letter processing */
 	char	   *oli;			/* option letter list index */
+	static int	nonopt_start = -1;
+	static bool force_nonopt = false;
 
 	if (!*place)
 	{							/* update scanning pointer */
-		if (optind >= argc)
+		char	  **args = (char **) argv;
+
+retry:
+
+		/*
+		 * If we are out of arguments or only non-options remain, return -1.
+		 */
+		if (optind >= argc || optind == nonopt_start)
 		{
 			place = EMSG;
+			nonopt_start = -1;
+			force_nonopt = false;
 			return -1;
 		}
 
 		place = argv[optind];
 
-		if (place[0] != '-')
+		/*
+		 * An argument is a non-option if it meets any of the following
+		 * criteria: it follows an argument that is equivalent to the string
+		 * "--", it does not start with '-', or it is equivalent to the string
+		 * "-".  When we encounter a non-option, we move it to the end of argv
+		 * (after shifting all remaining arguments over to make room), and
+		 * then we try again with the next argument.
+		 */
+		if (force_nonopt || place[0] != '-' || strcmp("-", place) == 0)
 		{
-			place = EMSG;
-			return -1;
-		}
+			for (int i = optind; i < argc - 1; i++)
+				args[i] = args[i + 1];
+			args[argc - 1] = place;
 
-		place++;
+			if (nonopt_start == -1)
+				nonopt_start = argc - 1;
+			else
+				nonopt_start--;
 
-		if (!*place)
-		{
-			/* treat "-" as not being an option */
-			place = EMSG;
-			return -1;
+			goto retry;
 		}
-
-		if (place[0] == '-' && place[1] == '\0')
+		else if (strcmp("--", place) == 0)
 		{
 			/* found "--", treat it as end of options */
 			++optind;
-			place = EMSG;
-			return -1;
+			force_nonopt = true;
+			goto retry;
 		}
 
+		place++;
+
 		if (place[0] == '-' && place[1])
 		{
 			/* long option */
-- 
2.25.1

#26Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#25)
1 attachment(s)
Re: add non-option reordering to in-tree getopt_long

On Tue, Jul 11, 2023 at 09:32:32AM -0700, Nathan Bossart wrote:

Sure. І did it this way in v7.

After a couple more small edits, I've committed this. I looked through all
uses of getopt_long() in PostgreSQL earlier today, and of the programs that
accepted non-options, most accepted only one, some others accepted 2-3, and
ecpg and pg_regress accepted any number. Given this analysis, I'm not too
worried about the O(n^2) behavior that the patch introduces. You'd likely
need to provide an enormous number of non-options for it to be noticeable,
and I'm dubious such use-cases exist.

During my analysis, I noticed that pg_ctl contains a workaround for the
lack of argument reordering. I think this can now be removed. Patch
attached.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v8-0002-Rely-on-getopt_long-argument-reordering-in-pg_ctl.patchtext/x-diff; charset=us-asciiDownload
From 10931d7b83f3ef02f510385e1e595bc260b69a5c Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 12 Jul 2023 19:57:44 -0700
Subject: [PATCH v8 2/2] Rely on getopt_long() argument reordering in pg_ctl.

---
 src/bin/pg_ctl/pg_ctl.c | 268 +++++++++++++++++++---------------------
 1 file changed, 128 insertions(+), 140 deletions(-)

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 465948e707..807d7023a9 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -2260,163 +2260,151 @@ main(int argc, char **argv)
 	if (env_wait != NULL)
 		wait_seconds = atoi(env_wait);
 
-	/*
-	 * 'Action' can be before or after args so loop over both. Some
-	 * getopt_long() implementations will reorder argv[] to place all flags
-	 * first (GNU?), but we don't rely on it. Our /port version doesn't do
-	 * that.
-	 */
-	optind = 1;
-
 	/* process command-line options */
-	while (optind < argc)
+	while ((c = getopt_long(argc, argv, "cD:e:l:m:N:o:p:P:sS:t:U:wW",
+							long_options, &option_index)) != -1)
 	{
-		while ((c = getopt_long(argc, argv, "cD:e:l:m:N:o:p:P:sS:t:U:wW",
-								long_options, &option_index)) != -1)
+		switch (c)
 		{
-			switch (c)
-			{
-				case 'D':
-					{
-						char	   *pgdata_D;
-
-						pgdata_D = pg_strdup(optarg);
-						canonicalize_path(pgdata_D);
-						setenv("PGDATA", pgdata_D, 1);
-
-						/*
-						 * We could pass PGDATA just in an environment
-						 * variable but we do -D too for clearer postmaster
-						 * 'ps' display
-						 */
-						pgdata_opt = psprintf("-D \"%s\" ", pgdata_D);
-						free(pgdata_D);
-						break;
-					}
-				case 'e':
-					event_source = pg_strdup(optarg);
-					break;
-				case 'l':
-					log_file = pg_strdup(optarg);
-					break;
-				case 'm':
-					set_mode(optarg);
-					break;
-				case 'N':
-					register_servicename = pg_strdup(optarg);
-					break;
-				case 'o':
-					/* append option? */
-					if (!post_opts)
-						post_opts = pg_strdup(optarg);
-					else
-					{
-						char	   *old_post_opts = post_opts;
-
-						post_opts = psprintf("%s %s", old_post_opts, optarg);
-						free(old_post_opts);
-					}
-					break;
-				case 'p':
-					exec_path = pg_strdup(optarg);
-					break;
-				case 'P':
-					register_password = pg_strdup(optarg);
-					break;
-				case 's':
-					silent_mode = true;
+			case 'D':
+				{
+					char	   *pgdata_D;
+
+					pgdata_D = pg_strdup(optarg);
+					canonicalize_path(pgdata_D);
+					setenv("PGDATA", pgdata_D, 1);
+
+					/*
+					 * We could pass PGDATA just in an environment variable
+					 * but we do -D too for clearer postmaster 'ps' display
+					 */
+					pgdata_opt = psprintf("-D \"%s\" ", pgdata_D);
+					free(pgdata_D);
 					break;
-				case 'S':
+				}
+			case 'e':
+				event_source = pg_strdup(optarg);
+				break;
+			case 'l':
+				log_file = pg_strdup(optarg);
+				break;
+			case 'm':
+				set_mode(optarg);
+				break;
+			case 'N':
+				register_servicename = pg_strdup(optarg);
+				break;
+			case 'o':
+				/* append option? */
+				if (!post_opts)
+					post_opts = pg_strdup(optarg);
+				else
+				{
+					char	   *old_post_opts = post_opts;
+
+					post_opts = psprintf("%s %s", old_post_opts, optarg);
+					free(old_post_opts);
+				}
+				break;
+			case 'p':
+				exec_path = pg_strdup(optarg);
+				break;
+			case 'P':
+				register_password = pg_strdup(optarg);
+				break;
+			case 's':
+				silent_mode = true;
+				break;
+			case 'S':
 #ifdef WIN32
-					set_starttype(optarg);
+				set_starttype(optarg);
 #else
-					write_stderr(_("%s: -S option not supported on this platform\n"),
-								 progname);
-					exit(1);
+				write_stderr(_("%s: -S option not supported on this platform\n"),
+							 progname);
+				exit(1);
 #endif
-					break;
-				case 't':
-					wait_seconds = atoi(optarg);
-					wait_seconds_arg = true;
-					break;
-				case 'U':
-					if (strchr(optarg, '\\'))
-						register_username = pg_strdup(optarg);
-					else
-						/* Prepend .\ for local accounts */
-						register_username = psprintf(".\\%s", optarg);
-					break;
-				case 'w':
-					do_wait = true;
-					break;
-				case 'W':
-					do_wait = false;
-					break;
-				case 'c':
-					allow_core_files = true;
-					break;
-				default:
-					/* getopt_long already issued a suitable error message */
-					do_advice();
-					exit(1);
-			}
+				break;
+			case 't':
+				wait_seconds = atoi(optarg);
+				wait_seconds_arg = true;
+				break;
+			case 'U':
+				if (strchr(optarg, '\\'))
+					register_username = pg_strdup(optarg);
+				else
+					/* Prepend .\ for local accounts */
+					register_username = psprintf(".\\%s", optarg);
+				break;
+			case 'w':
+				do_wait = true;
+				break;
+			case 'W':
+				do_wait = false;
+				break;
+			case 'c':
+				allow_core_files = true;
+				break;
+			default:
+				/* getopt_long already issued a suitable error message */
+				do_advice();
+				exit(1);
 		}
+	}
 
-		/* Process an action */
-		if (optind < argc)
+	/* Process an action */
+	if (optind < argc)
+	{
+		if (strcmp(argv[optind], "init") == 0
+			|| strcmp(argv[optind], "initdb") == 0)
+			ctl_command = INIT_COMMAND;
+		else if (strcmp(argv[optind], "start") == 0)
+			ctl_command = START_COMMAND;
+		else if (strcmp(argv[optind], "stop") == 0)
+			ctl_command = STOP_COMMAND;
+		else if (strcmp(argv[optind], "restart") == 0)
+			ctl_command = RESTART_COMMAND;
+		else if (strcmp(argv[optind], "reload") == 0)
+			ctl_command = RELOAD_COMMAND;
+		else if (strcmp(argv[optind], "status") == 0)
+			ctl_command = STATUS_COMMAND;
+		else if (strcmp(argv[optind], "promote") == 0)
+			ctl_command = PROMOTE_COMMAND;
+		else if (strcmp(argv[optind], "logrotate") == 0)
+			ctl_command = LOGROTATE_COMMAND;
+		else if (strcmp(argv[optind], "kill") == 0)
 		{
-			if (ctl_command != NO_COMMAND)
+			if (argc - optind < 3)
 			{
-				write_stderr(_("%s: too many command-line arguments (first is \"%s\")\n"), progname, argv[optind]);
+				write_stderr(_("%s: missing arguments for kill mode\n"), progname);
 				do_advice();
 				exit(1);
 			}
-
-			if (strcmp(argv[optind], "init") == 0
-				|| strcmp(argv[optind], "initdb") == 0)
-				ctl_command = INIT_COMMAND;
-			else if (strcmp(argv[optind], "start") == 0)
-				ctl_command = START_COMMAND;
-			else if (strcmp(argv[optind], "stop") == 0)
-				ctl_command = STOP_COMMAND;
-			else if (strcmp(argv[optind], "restart") == 0)
-				ctl_command = RESTART_COMMAND;
-			else if (strcmp(argv[optind], "reload") == 0)
-				ctl_command = RELOAD_COMMAND;
-			else if (strcmp(argv[optind], "status") == 0)
-				ctl_command = STATUS_COMMAND;
-			else if (strcmp(argv[optind], "promote") == 0)
-				ctl_command = PROMOTE_COMMAND;
-			else if (strcmp(argv[optind], "logrotate") == 0)
-				ctl_command = LOGROTATE_COMMAND;
-			else if (strcmp(argv[optind], "kill") == 0)
-			{
-				if (argc - optind < 3)
-				{
-					write_stderr(_("%s: missing arguments for kill mode\n"), progname);
-					do_advice();
-					exit(1);
-				}
-				ctl_command = KILL_COMMAND;
-				set_sig(argv[++optind]);
-				killproc = atol(argv[++optind]);
-			}
+			ctl_command = KILL_COMMAND;
+			set_sig(argv[++optind]);
+			killproc = atol(argv[++optind]);
+		}
 #ifdef WIN32
-			else if (strcmp(argv[optind], "register") == 0)
-				ctl_command = REGISTER_COMMAND;
-			else if (strcmp(argv[optind], "unregister") == 0)
-				ctl_command = UNREGISTER_COMMAND;
-			else if (strcmp(argv[optind], "runservice") == 0)
-				ctl_command = RUN_AS_SERVICE_COMMAND;
+		else if (strcmp(argv[optind], "register") == 0)
+			ctl_command = REGISTER_COMMAND;
+		else if (strcmp(argv[optind], "unregister") == 0)
+			ctl_command = UNREGISTER_COMMAND;
+		else if (strcmp(argv[optind], "runservice") == 0)
+			ctl_command = RUN_AS_SERVICE_COMMAND;
 #endif
-			else
-			{
-				write_stderr(_("%s: unrecognized operation mode \"%s\"\n"), progname, argv[optind]);
-				do_advice();
-				exit(1);
-			}
-			optind++;
+		else
+		{
+			write_stderr(_("%s: unrecognized operation mode \"%s\"\n"), progname, argv[optind]);
+			do_advice();
+			exit(1);
 		}
+		optind++;
+	}
+
+	if (optind < argc)
+	{
+		write_stderr(_("%s: too many command-line arguments (first is \"%s\")\n"), progname, argv[optind]);
+		do_advice();
+		exit(1);
 	}
 
 	if (ctl_command == NO_COMMAND)
-- 
2.25.1

#27Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#26)
Re: add non-option reordering to in-tree getopt_long

On Wed, Jul 12, 2023 at 08:49:03PM -0700, Nathan Bossart wrote:

After a couple more small edits, I've committed this. I looked through all
uses of getopt_long() in PostgreSQL earlier today, and of the programs that
accepted non-options, most accepted only one, some others accepted 2-3, and
ecpg and pg_regress accepted any number. Given this analysis, I'm not too
worried about the O(n^2) behavior that the patch introduces. You'd likely
need to provide an enormous number of non-options for it to be noticeable,
and I'm dubious such use-cases exist.

During my analysis, I noticed that pg_ctl contains a workaround for the
lack of argument reordering. I think this can now be removed. Patch
attached.

Interesting piece of history that you have found here, coming from
f3d6d94 back in 2004. The old pg_ctl.sh before that did not need any
of that. It looks sensible to do something about that.

Something does not seem to be right seen from here, a CI run with
Windows 2019 fails when using pg_ctl at the beginning of most of the
tests, like:
[04:56:07.404] ------------------------------------- 8< -------------------------------------
[04:56:07.404] stderr:
[04:56:07.404] pg_ctl: too many command-line arguments (first is "-D")
--
Michael

#28Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#27)
Re: add non-option reordering to in-tree getopt_long

At Thu, 13 Jul 2023 14:39:32 +0900, Michael Paquier <michael@paquier.xyz> wrote in

[04:56:07.404] pg_ctl: too many command-line arguments (first is "-D")

Mmm. It checks, for example, for "pg_ctl initdb -D $tempdir/data -o
-N". This version of getopt_long() returns -1 as soon as it meets the
first non-option "initdb". This is somewhat different from the last
time what I saw the patch and looks strange at a glance..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#29Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#27)
Re: add non-option reordering to in-tree getopt_long

On Thu, Jul 13, 2023 at 02:39:32PM +0900, Michael Paquier wrote:

Something does not seem to be right seen from here, a CI run with
Windows 2019 fails when using pg_ctl at the beginning of most of the
tests, like:
[04:56:07.404] ------------------------------------- 8< -------------------------------------
[04:56:07.404] stderr:
[04:56:07.404] pg_ctl: too many command-line arguments (first is "-D")

Assuming you are referring to [0]https://github.com/michaelpq/postgres/commits/getopt_test, it looks like you are missing 411b720.

[0]: https://github.com/michaelpq/postgres/commits/getopt_test

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#30Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#29)
Re: add non-option reordering to in-tree getopt_long

On Thu, Jul 13, 2023 at 07:57:12AM -0700, Nathan Bossart wrote:

Assuming you are referring to [0], it looks like you are missing 411b720.

[0] https://github.com/michaelpq/postgres/commits/getopt_test

Indeed, it looks like I've fat-fingered a rebase here. I am able to
get a clean CI run when running this patch, sorry for the noise.

Anyway, this introduces a surprising behavior when specifying too many
subcommands. On HEAD:
$ pg_ctl stop -D $PGDATA kill -t 20 start
pg_ctl: too many command-line arguments (first is "stop")
Try "pg_ctl --help" for more information.
$ pg_ctl stop -D $PGDATA -t 20 start
pg_ctl: too many command-line arguments (first is "stop")
Try "pg_ctl --help" for more information.

With the patch:
$ pg_ctl stop -D $PGDATA -t 20 start
pg_ctl: too many command-line arguments (first is "start")
Try "pg_ctl --help" for more information.
$ pg_ctl stop -D $PGDATA kill -t 20 start
pg_ctl: too many command-line arguments (first is "kill")
Try "pg_ctl --help" for more information.

So the error message reported is incorrect now, referring to an
incorrect first subcommand.
--
Michael

#31Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#30)
Re: add non-option reordering to in-tree getopt_long

On Fri, Jul 14, 2023 at 01:27:26PM +0900, Michael Paquier wrote:

Indeed, it looks like I've fat-fingered a rebase here. I am able to
get a clean CI run when running this patch, sorry for the noise.

Anyway, this introduces a surprising behavior when specifying too many
subcommands. On HEAD:
$ pg_ctl stop -D $PGDATA kill -t 20 start
pg_ctl: too many command-line arguments (first is "stop")
Try "pg_ctl --help" for more information.
$ pg_ctl stop -D $PGDATA -t 20 start
pg_ctl: too many command-line arguments (first is "stop")
Try "pg_ctl --help" for more information.

With the patch:
$ pg_ctl stop -D $PGDATA -t 20 start
pg_ctl: too many command-line arguments (first is "start")
Try "pg_ctl --help" for more information.
$ pg_ctl stop -D $PGDATA kill -t 20 start
pg_ctl: too many command-line arguments (first is "kill")
Try "pg_ctl --help" for more information.

So the error message reported is incorrect now, referring to an
incorrect first subcommand.

I did notice this, but I had the opposite reaction. Take the following
examples of client programs that accept one non-option:

~$ pg_resetwal a b c
pg_resetwal: error: too many command-line arguments (first is "b")
pg_resetwal: hint: Try "pg_resetwal --help" for more information.

~$ createuser a b c
createuser: error: too many command-line arguments (first is "b")
createuser: hint: Try "createuser --help" for more information.

~$ pgbench a b c
pgbench: error: too many command-line arguments (first is "b")
pgbench: hint: Try "pgbench --help" for more information.

~$ pg_restore a b c
pg_restore: error: too many command-line arguments (first is "b")
pg_restore: hint: Try "pg_restore --help" for more information.

Yet pg_ctl gives:

~$ pg_ctl start a b c
pg_ctl: too many command-line arguments (first is "start")
Try "pg_ctl --help" for more information.

In this example, isn't "a" the first extra non-option that should be
reported?

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#32Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#31)
Re: add non-option reordering to in-tree getopt_long

On Thu, Jul 13, 2023 at 09:38:42PM -0700, Nathan Bossart wrote:

I did notice this, but I had the opposite reaction.

Ahah, well ;)

Take the following examples of client programs that accept one non-option:

~$ pg_resetwal a b c
pg_resetwal: error: too many command-line arguments (first is "b")
pg_resetwal: hint: Try "pg_resetwal --help" for more information.

Yet pg_ctl gives:

~$ pg_ctl start a b c
pg_ctl: too many command-line arguments (first is "start")
Try "pg_ctl --help" for more information.

In this example, isn't "a" the first extra non-option that should be
reported?

Good point. This is interpreting "first" as being the first option
that's invalid. Here my first impression was that pg_ctl got that
right, where "first" refers to the first subcommand that would be
valid. Objection withdrawn.
--
Michael

#33Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#32)
Re: add non-option reordering to in-tree getopt_long

On Fri, Jul 14, 2023 at 02:02:28PM +0900, Michael Paquier wrote:

Objection withdrawn.

Committed.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#32)
Re: add non-option reordering to in-tree getopt_long

Michael Paquier <michael@paquier.xyz> writes:

On Thu, Jul 13, 2023 at 09:38:42PM -0700, Nathan Bossart wrote:

Take the following examples of client programs that accept one non-option:

~$ pg_resetwal a b c
pg_resetwal: error: too many command-line arguments (first is "b")
pg_resetwal: hint: Try "pg_resetwal --help" for more information.

Yet pg_ctl gives:

~$ pg_ctl start a b c
pg_ctl: too many command-line arguments (first is "start")
Try "pg_ctl --help" for more information.

In this example, isn't "a" the first extra non-option that should be
reported?

Good point. This is interpreting "first" as being the first option
that's invalid. Here my first impression was that pg_ctl got that
right, where "first" refers to the first subcommand that would be
valid. Objection withdrawn.

We just had a user complaint that seems to trace to exactly this
bogus reporting in pg_ctl [1]/messages/by-id/CANzqJaDCagH5wNkPQ42=Fx3mJPR-YnB3PWFdCAYAVdb9=Q+t-A@mail.gmail.com. Although I was originally not
very pleased with changing our getopt_long to do switch reordering,
I'm now wondering if we should back-patch these changes as bug
fixes. It's probably not worth the risk, but ...

regards, tom lane

[1]: /messages/by-id/CANzqJaDCagH5wNkPQ42=Fx3mJPR-YnB3PWFdCAYAVdb9=Q+t-A@mail.gmail.com

#35Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#34)
Re: add non-option reordering to in-tree getopt_long

On Mon, Dec 18, 2023 at 02:41:22PM -0500, Tom Lane wrote:

We just had a user complaint that seems to trace to exactly this
bogus reporting in pg_ctl [1]. Although I was originally not
very pleased with changing our getopt_long to do switch reordering,
I'm now wondering if we should back-patch these changes as bug
fixes. It's probably not worth the risk, but ...

I'm not too concerned about the risks of back-patching these commits, but
if this 19-year-old bug was really first reported today, I'd agree that
fixing it in the stable branches is probably not worth it.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#36Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#35)
Re: add non-option reordering to in-tree getopt_long

Nathan Bossart <nathandbossart@gmail.com> writes:

On Mon, Dec 18, 2023 at 02:41:22PM -0500, Tom Lane wrote:

We just had a user complaint that seems to trace to exactly this
bogus reporting in pg_ctl [1]. Although I was originally not
very pleased with changing our getopt_long to do switch reordering,
I'm now wondering if we should back-patch these changes as bug
fixes. It's probably not worth the risk, but ...

I'm not too concerned about the risks of back-patching these commits, but
if this 19-year-old bug was really first reported today, I'd agree that
fixing it in the stable branches is probably not worth it.

Agreed, if it actually is 19 years old. I'm wondering a little bit
if there could be some moderately-recent glibc behavior change
involved. I'm not excited enough about it to go trawl their change
log, but we should keep our ears cocked for similar reports.

regards, tom lane

#37Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#36)
Re: add non-option reordering to in-tree getopt_long

On Mon, Dec 18, 2023 at 09:31:54PM -0500, Tom Lane wrote:

Agreed, if it actually is 19 years old. I'm wondering a little bit
if there could be some moderately-recent glibc behavior change
involved. I'm not excited enough about it to go trawl their change
log, but we should keep our ears cocked for similar reports.

From a brief glance, I believe this is long-standing behavior. Even though
we advance optind at the bottom of the loop, the next getopt_long() call
seems to reset it to the first non-option (which was saved in a previous
call).

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com