Incorrect usage of strtol, atoi for non-numeric junk inputs

Started by Bharath Rupireddyalmost 5 years ago26 messageshackers
Jump to latest
#1Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com

Hi,

While working on [1]/messages/by-id/CALj2ACVMO6wY5Pc4oe1OCgUOAtdjHuFsBDw8R5uoYR86eWFQDA@mail.gmail.com, I found that some parts of the code is using
strtol and atoi without checking for non-numeric junk input strings. I
found this strange. Most of the time users provide proper numeric
strings but there can be some scenarios where these strings are not
user-supplied but generated by some other code which may contain
non-numeric strings. Shouldn't the code use strtol or atoi
appropriately and error out in such cases? One way to fix this once
and for all is to have a common API something like int
pg_strtol/pg_str_convert_to_int(char *opt_name, char *opt_value) which
returns a generic message upon invalid strings ("invalid value \"%s\"
is provided for option \"%s\", opt_name, opt_value) and returns
integers on successful parsing.

Although this is not a critical issue, I would like to seek thoughts.

[1]: /messages/by-id/CALj2ACVMO6wY5Pc4oe1OCgUOAtdjHuFsBDw8R5uoYR86eWFQDA@mail.gmail.com
[2]: strtol: vacuumlo.c --> ./vacuumlo -l '2323adfd' postgres -p '5432ERE' libpq_pipeline.c --> ./libpq_pipeline -r '2232adf' tests
vacuumlo.c --> ./vacuumlo -l '2323adfd' postgres -p '5432ERE'
libpq_pipeline.c --> ./libpq_pipeline -r '2232adf' tests

atoi:
pg_amcheck.c --> ./pg_amcheck -j '1211efe'
pg_basebackup --> ./pg_basebackup -Z 'EFEF' -s 'a$##'
pg_receivewal.c --> ./pg_receivewal -p '54343GDFD' -s '54343GDFD'
pg_recvlogical.c --> ./pg_recvlogical -F '-$#$#' -p '5432$$$' -s '100$$$'
pg_checksums.c. --> ./pg_checksums -f '1212abc'
pg_ctl.c --> ./pg_ctl -t 'efc'
pg_dump.c --> ./pg_dump -j '454adc' -Z '4adc' --extra-float-digits '-14adc'
pg_upgrade/option.c
pgbench.c
reindexdb.c
vacuumdb.c
pg_regress.c

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#1)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On 2021-May-19, Bharath Rupireddy wrote:

While working on [1], I found that some parts of the code is using
strtol and atoi without checking for non-numeric junk input strings. I
found this strange. Most of the time users provide proper numeric
strings but there can be some scenarios where these strings are not
user-supplied but generated by some other code which may contain
non-numeric strings. Shouldn't the code use strtol or atoi
appropriately and error out in such cases? One way to fix this once
and for all is to have a common API something like int
pg_strtol/pg_str_convert_to_int(char *opt_name, char *opt_value) which
returns a generic message upon invalid strings ("invalid value \"%s\"
is provided for option \"%s\", opt_name, opt_value) and returns
integers on successful parsing.

Hi, how is this related to
/messages/by-id/20191028012000.GA59064@begriffs.com ?

--
�lvaro Herrera Valdivia, Chile

#3Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#2)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Thu, May 27, 2021 at 3:05 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2021-May-19, Bharath Rupireddy wrote:

While working on [1], I found that some parts of the code is using
strtol and atoi without checking for non-numeric junk input strings. I
found this strange. Most of the time users provide proper numeric
strings but there can be some scenarios where these strings are not
user-supplied but generated by some other code which may contain
non-numeric strings. Shouldn't the code use strtol or atoi
appropriately and error out in such cases? One way to fix this once
and for all is to have a common API something like int
pg_strtol/pg_str_convert_to_int(char *opt_name, char *opt_value) which
returns a generic message upon invalid strings ("invalid value \"%s\"
is provided for option \"%s\", opt_name, opt_value) and returns
integers on successful parsing.

Hi, how is this related to
/messages/by-id/20191028012000.GA59064@begriffs.com ?

Thanks. The proposed approach there was to implement postgres's own
strtol i.e. string parsing, conversion to integers and use it in the
places where atoi is being used. I'm not sure how far that can go.
What I'm proposing here is to use strtol inplace of atoi to properly
detect errors in case of inputs like '1211efe', '-14adc' and so on as
atoi can't detect such errors. Thoughts?

With Regards,
Bharath Rupireddy.

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#3)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On 2021-Jun-04, Bharath Rupireddy wrote:

On Thu, May 27, 2021 at 3:05 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Hi, how is this related to
/messages/by-id/20191028012000.GA59064@begriffs.com ?

Thanks. The proposed approach there was to implement postgres's own
strtol i.e. string parsing, conversion to integers and use it in the
places where atoi is being used. I'm not sure how far that can go.
What I'm proposing here is to use strtol inplace of atoi to properly
detect errors in case of inputs like '1211efe', '-14adc' and so on as
atoi can't detect such errors. Thoughts?

Well, if you scroll back to Surafel's initial submission in that thread,
it looks very similar in spirit to what you have here.

Another thing I just noticed which I hadn't realized is that Joe
Nelson's patch depends on Fabien Coelho's patch in this other thread,
/messages/by-id/alpine.DEB.2.21.1904201223040.29102@lancre
which was closed as returned-with-feedback, I suppose mostly due to
exhaustion/frustration at the lack of progress/interest.

I would suggest that the best way forward in this area is to rebase both
there patches on current master.

--
�lvaro Herrera Valdivia, Chile
"La virtud es el justo medio entre dos defectos" (Arist�teles)

#5Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#4)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Fri, Jun 4, 2021 at 8:58 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2021-Jun-04, Bharath Rupireddy wrote:

On Thu, May 27, 2021 at 3:05 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Hi, how is this related to
/messages/by-id/20191028012000.GA59064@begriffs.com ?

Thanks. The proposed approach there was to implement postgres's own
strtol i.e. string parsing, conversion to integers and use it in the
places where atoi is being used. I'm not sure how far that can go.
What I'm proposing here is to use strtol inplace of atoi to properly
detect errors in case of inputs like '1211efe', '-14adc' and so on as
atoi can't detect such errors. Thoughts?

Well, if you scroll back to Surafel's initial submission in that thread,
it looks very similar in spirit to what you have here.

Another thing I just noticed which I hadn't realized is that Joe
Nelson's patch depends on Fabien Coelho's patch in this other thread,
/messages/by-id/alpine.DEB.2.21.1904201223040.29102@lancre
which was closed as returned-with-feedback, I suppose mostly due to
exhaustion/frustration at the lack of progress/interest.

I would suggest that the best way forward in this area is to rebase both
there patches on current master.

Thanks. I will read both the threads [1]/messages/by-id/alpine.DEB.2.21.1904201223040.29102@lancre, [2]/messages/by-id/20191028012000.GA59064@begriffs.com and try to rebase the
patches. If at all I get to rebase them, do you prefer the patches to
be in this thread or in a new thread?

[1]: /messages/by-id/alpine.DEB.2.21.1904201223040.29102@lancre
[2]: /messages/by-id/20191028012000.GA59064@begriffs.com

With Regards,
Bharath Rupireddy.

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#5)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On 2021-Jun-04, Bharath Rupireddy wrote:

On Fri, Jun 4, 2021 at 8:58 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

I would suggest that the best way forward in this area is to rebase both
there patches on current master.

Thanks. I will read both the threads [1], [2] and try to rebase the
patches. If at all I get to rebase them, do you prefer the patches to
be in this thread or in a new thread?

Thanks, that would be helpful. This thread is a good place.

--
�lvaro Herrera Valdivia, Chile

#7Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#6)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Fri, Jun 4, 2021 at 10:23 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2021-Jun-04, Bharath Rupireddy wrote:

On Fri, Jun 4, 2021 at 8:58 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

I would suggest that the best way forward in this area is to rebase both
there patches on current master.

Thanks. I will read both the threads [1], [2] and try to rebase the
patches. If at all I get to rebase them, do you prefer the patches to
be in this thread or in a new thread?

Thanks, that would be helpful. This thread is a good place.

I'm unable to spend time on this work as promised. I'd be happy if
someone could take it forward, although it's not critical work(IMO)
that needs immediate focus. I will try to spend time maybe later this
year.

Regards,
Bharath Rupireddy.

#8Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Bharath Rupireddy (#7)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

At Wed, 7 Jul 2021 17:40:13 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in

On Fri, Jun 4, 2021 at 10:23 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2021-Jun-04, Bharath Rupireddy wrote:

On Fri, Jun 4, 2021 at 8:58 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

I would suggest that the best way forward in this area is to rebase both
there patches on current master.

Thanks. I will read both the threads [1], [2] and try to rebase the
patches. If at all I get to rebase them, do you prefer the patches to
be in this thread or in a new thread?

Thanks, that would be helpful. This thread is a good place.

I'm unable to spend time on this work as promised. I'd be happy if
someone could take it forward, although it's not critical work(IMO)
that needs immediate focus. I will try to spend time maybe later this
year.

Looked through the three threads.

[1]: /messages/by-id/alpine.DEB.2.21.1904201223040.29102@lancre
much point in doing that in conjunction with [2]/messages/by-id/20191028012000.GA59064@begriffs.com or this thread. Since
the integral parameter values of pg-commands are in int, which the
exising function strtoint() is sufficient to read. So even [2]/messages/by-id/20191028012000.GA59064@begriffs.com itself
doesn't need to utilize [1]/messages/by-id/alpine.DEB.2.21.1904201223040.29102@lancre.

So I agree to the Bharath's point.

So the attached is a standalone patch that:

- doesn't contain [1]/messages/by-id/alpine.DEB.2.21.1904201223040.29102@lancre, since that functions are not needed for this
purpose.

- basically does the same thing with [2]/messages/by-id/20191028012000.GA59064@begriffs.com, but uses
strtoint/strtol/strtod instead of pg_strtoint16/32.

- doesn't try to make error messages verbose. That results in a
somewhat strange message like this but I'm not sure we should be so
strict at that point.

reindexdb: error: number of parallel jobs must be at least 1: hoge

- is extended to cover all usages of atoi/l/f in command line
processing, which are not fully covered by [2]/messages/by-id/20191028012000.GA59064@begriffs.com. (Maybe)

- is extended to cover psql's meta command parameters.

- is extended to cover integral environment variables. (PGPORTOLD/NEW
of pg_upgrade and COLUMNS of psql). The commands emit a warning for
invalid values, but I'm not sure it's worthwhile. (The second attached.)

psql: warning: ignored invalid setting of environemt variable COLUMNS: 3x

- doesn't cover pgbench's meta command parameters (for speed).

[1]: /messages/by-id/alpine.DEB.2.21.1904201223040.29102@lancre
[2]: /messages/by-id/20191028012000.GA59064@begriffs.com

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Be-strict-in-numeric-parameters-on-command-line.patchtext/x-patch; charset=us-asciiDownload+219-89
0002-Make-complain-for-invalid-numeirc-values-in-environe.patchtext/x-patch; charset=us-asciiDownload+46-4
#9Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#8)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Thu, Jul 08, 2021 at 05:30:23PM +0900, Kyotaro Horiguchi wrote:

Looked through the three threads.

Thanks!

[1] is trying to expose pg_strtoint16/32 to frontend, but I don't see
much point in doing that in conjunction with [2] or this thread. Since
the integral parameter values of pg-commands are in int, which the
exising function strtoint() is sufficient to read. So even [2] itself
doesn't need to utilize [1].

It sounds sensible from here to just use strtoint(), some strtol(),
son strtod() and call it a day as these are already available.

-                    wait_seconds = atoi(optarg);
+                    errno = 0;
+                    wait_seconds = strtoint(optarg, &endptr, 10);
+                    if (*endptr || errno == ERANGE || wait_seconds < 0)
+                    {
+                        pg_log_error("invalid timeout \"%s\"", optarg);
+                        exit(1);
+                    }
[ ... ]
-                killproc = atol(argv[++optind]);
+                errno = 0;
+                killproc = strtol(argv[++optind], &endptr, 10);
+                if (*endptr || errno == ERANGE || killproc < 0)
+                {
+                    pg_log_error("invalid process ID \"%s\"", argv[optind]);
+                    exit(1);
+                }

Er, wait. We've actually allowed negative values for pg_ctl
--timeout or the subcommand kill!?

case 'j':
-                user_opts.jobs = atoi(optarg);
+                errno = 0;
+                user_opts.jobs = strtoint(optarg, &endptr, 10);
+                /**/
+                if (*endptr || errno == ERANGE)
+                    pg_fatal("invalid number of jobs %s\n", optarg);
+                    
break;

This one in pg_upgrade is incomplete. Perhaps the missing comment
should tell that negative job values are checked later on?
--
Michael

#10Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#9)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

Thank you for the comments.

At Fri, 9 Jul 2021 10:29:07 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Thu, Jul 08, 2021 at 05:30:23PM +0900, Kyotaro Horiguchi wrote:

[1] is trying to expose pg_strtoint16/32 to frontend, but I don't see
much point in doing that in conjunction with [2] or this thread. Since
the integral parameter values of pg-commands are in int, which the
exising function strtoint() is sufficient to read. So even [2] itself
doesn't need to utilize [1].

It sounds sensible from here to just use strtoint(), some strtol(),
son strtod() and call it a day as these are already available.

Thanks.

-                    wait_seconds = atoi(optarg);
+                    errno = 0;
+                    wait_seconds = strtoint(optarg, &endptr, 10);
+                    if (*endptr || errno == ERANGE || wait_seconds < 0)
+                    {
+                        pg_log_error("invalid timeout \"%s\"", optarg);
+                        exit(1);
+                    }
[ ... ]
-                killproc = atol(argv[++optind]);
+                errno = 0;
+                killproc = strtol(argv[++optind], &endptr, 10);
+                if (*endptr || errno == ERANGE || killproc < 0)
+                {
+                    pg_log_error("invalid process ID \"%s\"", argv[optind]);
+                    exit(1);
+                }

Er, wait. We've actually allowed negative values for pg_ctl
--timeout or the subcommand kill!?

For killproc, leading minus sign suggests that it is an command line
option. Since pg_ctl doesn't have such an option, that values is
recognized as invalid *options*, even with the additional check. The
additional check is useless in that sense. My intension is just to
make the restriction explicit so I won't feel sad even if it should be
removed.

$ pg_ctl kill HUP -12345
cg_ctl: infalid option -- '1'

--timeout accepts values less than 1, which values cause the command
ends with the timeout error before checking for the ending state. Not
destructive but useless as a behavior. We have two choices here. One
is simply inhibiting zero or negative timeouts, and another is
allowing zero as timeout and giving it the same meaning to
--no-wait. The former is strictly right behavior but the latter is
casual and convenient. I took the former way in this version.

$ pg_ctl -w -t 0 start
pg_ctl: error: invalid timeout value "0", use --no-wait to finish without waiting

The same message is shown for non-decimal values but that would not matters.

case 'j':
-                user_opts.jobs = atoi(optarg);
+                errno = 0;
+                user_opts.jobs = strtoint(optarg, &endptr, 10);
+                /**/
+                if (*endptr || errno == ERANGE)
+                    pg_fatal("invalid number of jobs %s\n", optarg);
+                    
break;

This one in pg_upgrade is incomplete. Perhaps the missing comment
should tell that negative job values are checked later on?

Zero or negative job numbers mean non-parallel mode and don't lead to
an error. If we don't need to preserve that behavior (I personally
don't think it is ether useful and/or breaks someone's existing
usage.), it is sensible to do range-check here.

I noticed that I overlooked PGCTLTIMEOUT.

The attached is:

- disallowed less-than-one numbers as jobs in pg_upgrade.
- disallowed less-than-one timeout in pg_ctl
- Use strtoint for PGCTLTIMEOUT in pg_ctl is

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v2-0001-Be-strict-in-numeric-parameters-on-command-line.patchtext/x-patch; charset=us-asciiDownload+219-89
v2-0002-Make-complain-for-invalid-numeirc-values-in-envir.patchtext/x-patch; charset=us-asciiDownload+57-5
#11Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#10)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Fri, Jul 09, 2021 at 04:50:28PM +0900, Kyotaro Horiguchi wrote:

At Fri, 9 Jul 2021 10:29:07 +0900, Michael Paquier <michael@paquier.xyz> wrote in

Er, wait. We've actually allowed negative values for pg_ctl
--timeout or the subcommand kill!?

--timeout accepts values less than 1, which values cause the command
ends with the timeout error before checking for the ending state. Not
destructive but useless as a behavior. We have two choices here. One
is simply inhibiting zero or negative timeouts, and another is
allowing zero as timeout and giving it the same meaning to
--no-wait. The former is strictly right behavior but the latter is
casual and convenient. I took the former way in this version.

Yeah, that's not useful.

This one in pg_upgrade is incomplete. Perhaps the missing comment
should tell that negative job values are checked later on?

Zero or negative job numbers mean non-parallel mode and don't lead to
an error. If we don't need to preserve that behavior (I personally
don't think it is ether useful and/or breaks someone's existing
usage.), it is sensible to do range-check here.

Hmm. It would be good to preserve some compatibility here, but I can
see more benefits in being consistent between all the tools, and make
people aware that such commands are not generated more carefully.

case 'j':
-                opts.jobs = atoi(optarg);
-                if (opts.jobs < 1)
+                errno = 0;
+                opts.jobs = strtoint(optarg, &endptr, 10);
+                if (*endptr || errno == ERANGE || opts.jobs < 1)
{
pg_log_error("number of parallel jobs must be at least 1");
exit(1);

specifying a value that triggers ERANGE could be confusing for values
higher than INT_MAX with pg_amcheck --jobs:
$ pg_amcheck --jobs=4000000000
pg_amcheck: error: number of parallel jobs must be at least 1
I think that this should be reworded as "invalid number of parallel
jobs \"$s\"", or "number of parallel jobs must be in range %d..%d".
Perhaps we could have a combination of both? Using the first message
is useful with junk, non-numeric values or trailing characters. The
second is useful to make users aware that the value is numeric, but
not quite right.

--- a/src/bin/pg_checksums/pg_checksums.c
case 'f':
-                if (atoi(optarg) == 0)
+                errno = 0;
+                if (strtoint(optarg, &endptr, 10) == 0
+                    || *endptr || errno == ERANGE)
{
pg_log_error("invalid filenode specification, must be numeric: %s", optarg);
exit(1);

The confusion is equal here with pg_checksums -f:
$ ./pg_checksums --f 4000000000
pg_checksums: error: invalid filenode specification, must be numeric: 400000000
We could say "invalid file specification: \"%s\"". Another idea to be
crystal-clear about the range requirements is to use that:
"file specification must be in range %d..%d"

@@ -587,8 +602,10 @@ main(int argc, char **argv)

case 8:
have_extra_float_digits = true;
-                extra_float_digits = atoi(optarg);
-                if (extra_float_digits < -15 || extra_float_digits > 3)
+                errno = 0;
+                extra_float_digits = strtoint(optarg, &endptr, 10);
+                if (*endptr || errno == ERANGE ||
+                    extra_float_digits < -15 || extra_float_digits > 3)
{
pg_log_error("extra_float_digits must be in range -15..3");
exit_nicely(1);

Should we take this occasion to reduce the burden of translators and
reword that as "%s must be in range %d..%d"? That could be a separate
patch.

case 'p':
-                if ((old_cluster.port = atoi(optarg)) <= 0)
-                    pg_fatal("invalid old port number\n");
+                errno = 0;
+                if ((old_cluster.port = strtoint(optarg, &endptr, 10)) <= 0 ||
+                    *endptr || errno == ERANGE)
+                    pg_fatal("invalid old port number %s\n", optarg);
break;

You may want to use columns here, or specify the port range:
"invalid old port number: %s" or "old port number must be in range
%d..%d".

case 'P':
-                if ((new_cluster.port = atoi(optarg)) <= 0)
-                    pg_fatal("invalid new port number\n");
+                errno = 0;
+                if ((new_cluster.port = strtoint(optarg, &endptr, 10)) <= 0 ||
+                    *endptr || errno == ERANGE)
+                    pg_fatal("invalid new port number %s\n", optarg);
break;

Ditto.

+                if (*endptr || errno == ERANGE || concurrentCons <= 0)
{
-                    pg_log_error("number of parallel jobs must be at least 1");
+                    pg_log_error("number of parallel jobs must be at least 1: %s", optarg);
exit(1);
}

This one is also confusing with optorg > INT_MAX.

+                concurrentCons = strtoint(optarg, &endptr, 10);
+                if (*endptr || errno == ERANGE || concurrentCons <= 0)
{
pg_log_error("number of parallel jobs must be at least 1");
exit(1);
}
break;

And ditto for all the ones of vacuumdb.
--
Michael

#12Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#11)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

Thanks for the discussion.

At Tue, 13 Jul 2021 09:28:30 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Fri, Jul 09, 2021 at 04:50:28PM +0900, Kyotaro Horiguchi wrote:

At Fri, 9 Jul 2021 10:29:07 +0900, Michael Paquier <michael@paquier.xyz> wrote in

Er, wait. We've actually allowed negative values for pg_ctl
--timeout or the subcommand kill!?

--timeout accepts values less than 1, which values cause the command
ends with the timeout error before checking for the ending state. Not
destructive but useless as a behavior. We have two choices here. One
is simply inhibiting zero or negative timeouts, and another is
allowing zero as timeout and giving it the same meaning to
--no-wait. The former is strictly right behavior but the latter is
casual and convenient. I took the former way in this version.

Yeah, that's not useful.

^^; Ok, I'm fine with taking the second way. Changed it.

"-t 0" means "-W" in the attached and that behavior is described in
the doc part. The environment variable PGCTLTIMEOUT accepts the same
range of values. I added a warning that notifies that -t 0 overrides
explicit -w .

$ pg_ctl -w -t 0 start
pg_ctl: WARNING: -w is ignored because timeout is set to 0
server starting

This one in pg_upgrade is incomplete. Perhaps the missing comment
should tell that negative job values are checked later on?

Zero or negative job numbers mean non-parallel mode and don't lead to
an error. If we don't need to preserve that behavior (I personally
don't think it is ether useful and/or breaks someone's existing
usage.), it is sensible to do range-check here.

Hmm. It would be good to preserve some compatibility here, but I can
see more benefits in being consistent between all the tools, and make
people aware that such commands are not generated more carefully.

Ageed.

case 'j':
-                opts.jobs = atoi(optarg);
-                if (opts.jobs < 1)
+                errno = 0;
+                opts.jobs = strtoint(optarg, &endptr, 10);
+                if (*endptr || errno == ERANGE || opts.jobs < 1)
{
pg_log_error("number of parallel jobs must be at least 1");
exit(1);

specifying a value that triggers ERANGE could be confusing for values
higher than INT_MAX with pg_amcheck --jobs:
$ pg_amcheck --jobs=4000000000
pg_amcheck: error: number of parallel jobs must be at least 1
I think that this should be reworded as "invalid number of parallel
jobs \"$s\"", or "number of parallel jobs must be in range %d..%d".
Perhaps we could have a combination of both? Using the first message
is useful with junk, non-numeric values or trailing characters. The
second is useful to make users aware that the value is numeric, but
not quite right.

Yeah, I noticed that but ignored as a kind of impossible, or
something-needless-to-say:p

"invalid number of parallel jobs \"$s\""
"number of parallel jobs must be in range %d..%d"

The resulting combined message looks like this:

"number of parallel jobs must be an integer in range 1..2147483647"

I don't think it's not great that the message explicitly suggests a
limit like INT_MAX, which is far above the practical limit. So, (even
though I avoided to do that but) in the attached, I changed my mind to
split most of the errors into two messages to avoid suggesting such
impractical limits.

$ pg_amcheck -j -1
$ pg_amcheck -j 1x
$ pg_amcheck -j 10000000000000x
pg_amcheck: error: number of parallel jobs must be an integer greater than zero: "...."
$ pg_amcheck -j 10000000000000
pg_amcheck: error: number of parallel jobs out of range: "10000000000000"

If you feel it's too-much or pointless to split that way, I'll happy
to change it the "%d..%d" form.

Still I used the "%d..%d" notation for some parameters that has a
finite range by design. They look like the following:
(%d..%d doesn't work well for real numbers.)

"sampling rate must be an real number between 0.0 and 1.0: \"%s\""

I'm not sure what to do for numWorkers of pg_dump/restore. The limit
for numWorkers are lowered on Windows to quite low value, which would
be worth showing, but otherwise the limit is INT_MAX. The attached
makes pg_dump respond to -j 100 with the following error message which
doesn't suggest the finite limit 64 on Windows.

$ pg_dump -j 100
pg_dump: error: number of parallel jobs out of range: "100"

@@ -587,8 +602,10 @@ main(int argc, char **argv)

case 8:
have_extra_float_digits = true;
-                extra_float_digits = atoi(optarg);
-                if (extra_float_digits < -15 || extra_float_digits > 3)
+                errno = 0;
+                extra_float_digits = strtoint(optarg, &endptr, 10);
+                if (*endptr || errno == ERANGE ||
+                    extra_float_digits < -15 || extra_float_digits > 3)
{
pg_log_error("extra_float_digits must be in range -15..3");
exit_nicely(1);

Should we take this occasion to reduce the burden of translators and
reword that as "%s must be in range %d..%d"? That could be a separate
patch.

The first %s is not always an invariable symbol name so it could
result in concatenating several phrases into one, like this.

pg_log..("%s must be in range %s..%s", _("compression level"), "0", "9"))

It is translatable at least into Japanese but I'm not sure about other
languages.

In the attached, most of the messages are not in this shape since I
avoided to suggest substantially infinite limits, thus this doesn't
fully work. I'll consider it if the current shape is found to be
unacceptable. In that case I'll use the option names in the messages
instead of the natural names.

case 'p':
-                if ((old_cluster.port = atoi(optarg)) <= 0)
-                    pg_fatal("invalid old port number\n");
+                errno = 0;
+                if ((old_cluster.port = strtoint(optarg, &endptr, 10)) <= 0 ||
+                    *endptr || errno == ERANGE)
+                    pg_fatal("invalid old port number %s\n", optarg);
break;

You may want to use columns here, or specify the port range:
"invalid old port number: %s" or "old port number must be in range
%d..%d".

I'm not sure whether the colons(?) are required, since pg_receivewal
currently complains as "invalid port number \"%s\"" but I agree that
it would be better if we had one.

By the way, in the attached version, the message is split into the
following combination. ("an integer" might be useless..)

pg_fatal("old port number must be an integer greater than zero: \"%s\"\n",
pg_fatal("old port number out of range: \"%s\"\n", optarg);

As the result.

+                if (*endptr || errno == ERANGE || concurrentCons <= 0)
{
-                    pg_log_error("number of parallel jobs must be at least 1");
+                    pg_log_error("number of parallel jobs must be at least 1: %s", optarg);
exit(1);
}

This one is also confusing with optorg > INT_MAX.

The attached version says just "out of range" in that case.

- Is it worth avoiding suggesting a virtually infinite upper limit by
splitting out "out of range" from an error message?

- If not, I'll use the single message "xxx must be in range
1..2147483647" or "xxx must be an integer in range 1..2147483647".

Do you think we need the parameter type "an integer" there?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v3-0001-Be-strict-in-numeric-parameters-on-command-line.patchtext/x-patch; charset=us-asciiDownload+484-128
v3-0002-Make-complain-for-invalid-numeirc-values-in-envir.patchtext/x-patch; charset=us-asciiDownload+57-5
v3-0003-Doc-change-for-pg_ctl.patchtext/x-patch; charset=us-asciiDownload+5-5
#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kyotaro Horiguchi (#12)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On 2021-Jul-14, Kyotaro Horiguchi wrote:

pg_log_error("extra_float_digits must be in range -15..3");
exit_nicely(1);

Should we take this occasion to reduce the burden of translators and
reword that as "%s must be in range %d..%d"? That could be a separate
patch.

Yes, please, let's do it here.

The first %s is not always an invariable symbol name so it could
result in concatenating several phrases into one, like this.

pg_log..("%s must be in range %s..%s", _("compression level"), "0", "9"))

It is translatable at least into Japanese but I'm not sure about other
languages.

No, this doesn't work. When the first word is something that is
not to be translated (a literal parameter name), let's use a %s (but of
course the values must be taken out of the phrase too). But when it is
a translatable phrase, it must be present a full phrase, no
substitution:

pg_log_error("%s must be in range %s..%s", "extra_float_digits", "-15", "3");
pg_log..("compression level must be in range %s..%s", "0", "9"))

I think the purity test is whether you want to use a _() around the
argument, then you have to include it into the original message.

Thanks

--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
"After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was
amazing when I first started using it at 7.2, and I'm continually astounded by
learning new features and techniques made available by the continuing work of
the development team."
Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php

#14Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#13)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Wed, Jul 14, 2021 at 11:02:47AM -0400, Alvaro Herrera wrote:

On 2021-Jul-14, Kyotaro Horiguchi wrote:

Should we take this occasion to reduce the burden of translators and
reword that as "%s must be in range %d..%d"? That could be a separate
patch.

Yes, please, let's do it here.

Okay.

No, this doesn't work. When the first word is something that is
not to be translated (a literal parameter name), let's use a %s (but of
course the values must be taken out of the phrase too). But when it is
a translatable phrase, it must be present a full phrase, no
substitution:

pg_log_error("%s must be in range %s..%s", "extra_float_digits", "-15", "3");
pg_log..("compression level must be in range %s..%s", "0", "9"))

I think the purity test is whether you want to use a _() around the
argument, then you have to include it into the original message.

After thinking about that, it seems to me that we don't have that much
context to lose once we build those error messages to show the option
name of the command. And the patch, as proposed, finishes with the
same error message patterns all over the place, which would be a
recipe for more inconsistencies in the future. I think that we should
centralize all that, say with a small-ish routine in a new file called
src/fe_utils/option_parsing.c that uses strtoint() as follows:
bool option_parse_int(const char *optarg,
const char *optname,
int min_range,
int max_range,
int *result);

Then this routine may print two types of errors through
pg_log_error():
- Incorrect range, using min_range/max_range.
- Junk input.
The boolean status is here to let the caller do any custom exit()
actions he wishes if there one of those two failures. pg_dump has
some of that with exit_nicely(), for one.
--
Michael

#15Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#14)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

At Thu, 15 Jul 2021 16:19:07 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Wed, Jul 14, 2021 at 11:02:47AM -0400, Alvaro Herrera wrote:

No, this doesn't work. When the first word is something that is
not to be translated (a literal parameter name), let's use a %s (but of
course the values must be taken out of the phrase too). But when it is
a translatable phrase, it must be present a full phrase, no
substitution:

pg_log_error("%s must be in range %s..%s", "extra_float_digits", "-15", "3");
pg_log..("compression level must be in range %s..%s", "0", "9"))

I think the purity test is whether you want to use a _() around the
argument, then you have to include it into the original message.

After thinking about that, it seems to me that we don't have that much
context to lose once we build those error messages to show the option
name of the command. And the patch, as proposed, finishes with the

Agreed.

same error message patterns all over the place, which would be a
recipe for more inconsistencies in the future. I think that we should
centralize all that, say with a small-ish routine in a new file called
src/fe_utils/option_parsing.c that uses strtoint() as follows:
bool option_parse_int(const char *optarg,
const char *optname,
int min_range,
int max_range,
int *result);

Then this routine may print two types of errors through
pg_log_error():
- Incorrect range, using min_range/max_range.
- Junk input.
The boolean status is here to let the caller do any custom exit()
actions he wishes if there one of those two failures. pg_dump has
some of that with exit_nicely(), for one.

It is substantially the same suggestion with [1] in the original
thread. The original proposal in the old thread was

bool pg_strtoint64_range(arg, &result, min, max, &error_message)

The difference is your suggestion is making the function output the
message within. I guess that the reason for the original proposal is
different style of message is required in several places.

Actually there are several irregulars.

1. Some "bare" options (that is not preceded by a hyphen option) like
PID of pg_ctl kill doesn't fit the format. \pset parameters of
pg_ctl is the same.

2. pg_ctl, pg_upgrade use their own error reporting mechanisms.

3. parameters that take real numbers doesn't fit the scheme specifying
range borders. For example boundary values may or may not be included
in the range.

4. Most of the errors are PG_LOG_ERROR, but a few ones are
PG_LOG_FATAL.

That being said, most of the caller sites are satisfied by
fixed-formed messages.

So I changed the interface as the following to deal with the above issues:

+extern optparse_result option_parse_int(int loglevel,
+										const char *optarg, const char *optname,
+										int min_range, int max_range,
+										int *result);

loglevel specifies the loglevel to use to emit error messages. If it
is the newly added item PG_LOG_OMIT, the function never emits an error
message. Addition to that, the return type is changed to an enum which
indicates what trouble the given string has. The caller can print
arbitrary error messages consulting the value. (killproc in pg_ctl.c)

Other points:

I added two more similar functions option_parse_long/double. The
former is a clone of _int. The latter doesn't perform explicit range
checks for the reason described above.

Maybe we need to make pg_upgraded use the common-logging features
instead of its own, but it is not included in this patch.

pgbench's -L option accepts out-of-range values for internal
variable. As the added comment says, we can limit the value with the
large exact number but I limited it to 3600s since I can't imagine
people needs larger latency limit than that.

Similarly to the above, -R option can take for example 1E-300, which
leads to int64 overflow later. Similar to -L, I don't think people
don't need less than 1E-5 or so as this parameter.

The attached patch needs more polish but should be enough to tell what
I have in my mind.

regards.

1: /messages/by-id/CAKJS1f94kkuB_53LcEf0HF+uxMiTCk5FtLx9gSsXcCByUKMz1g@mail.gmail.com

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v3-0001-Be-strict-in-numeric-parameters-on-command-line.patchtext/x-patch; charset=us-asciiDownload+457-184
v3-0002-Make-complain-for-invalid-numeirc-values-in-envir.patchtext/x-patch; charset=us-asciiDownload+38-6
v3-0003-Doc-change-for-pg_ctl.patchtext/x-patch; charset=us-asciiDownload+5-5
#16Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#15)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Wed, Jul 21, 2021 at 05:02:29PM +0900, Kyotaro Horiguchi wrote:

The difference is your suggestion is making the function output the
message within. I guess that the reason for the original proposal is
different style of message is required in several places.

That's one step toward having a maximum number of frontend tools to
use the central logging APIs of src/common/.

1. Some "bare" options (that is not preceded by a hyphen option) like
PID of pg_ctl kill doesn't fit the format. \pset parameters of
pg_ctl is the same.

Yep. I was reviewing this one, but I have finished by removing it.
The argument 2 just below also came into my mind.

2. pg_ctl, pg_upgrade use their own error reporting mechanisms.

Yeah, for this reason I don't think that it is a good idea to switch
those areas to use the parsing of option_utils.c. Perhaps we should
consider switching pg_upgrade to have a better logging infra, but
there are also reasons behind what we have now. pg_ctl is out of
scope as it needs to cover WIN32 event logging.

3. parameters that take real numbers doesn't fit the scheme specifying
range borders. For example boundary values may or may not be included
in the range.

This concerns only pgbench, which I'd be fine to let as-is.

4. Most of the errors are PG_LOG_ERROR, but a few ones are
PG_LOG_FATAL.

I would take it that pgbench is inconsistent with the rest. Note that
pg_dump uses fatal(), but that's just a wrapper to pg_log_error().

loglevel specifies the loglevel to use to emit error messages. If it
is the newly added item PG_LOG_OMIT, the function never emits an error
message. Addition to that, the return type is changed to an enum which
indicates what trouble the given string has. The caller can print
arbitrary error messages consulting the value. (killproc in pg_ctl.c)

I am not much a fan of that. If we do so, what's the point in having
a dependency to logging.c anyway in option_utils.c? This OMIT option
only exists to bypass the specific logging needs where this gets
added. That does not seem a design adapted to me in the long term,
neither am I a fan of specific error codes for a code path that's just
going to be used to parse command options.

I added two more similar functions option_parse_long/double. The
former is a clone of _int. The latter doesn't perform explicit range
checks for the reason described above.

These have a limited impact, so I would limit things to int32 for
now.

Maybe we need to make pg_upgrade use the common-logging features
instead of its own, but it is not included in this patch.

Maybe. That would be good in the long term, though its case is very
particular.

The attached patch needs more polish but should be enough to tell what
I have in my mind.

This breaks some of the TAP tests of pgbench and pg_dump, at short
sight.

The checks for the port value in pg_receivewal and pg_recvlogical is
strange to have. We don't care about that in any other tools.

The number of checks for --jobs and workers could be made more
consistent across the board, but I have let that out for now.

Hacking on that, I am finishing with the attached. It is less
ambitious, still very useful as it removes a dozen of custom error
messages in favor of the two ones introduced in option_utils.c. On
top of that this reduces a bit the code:
15 files changed, 156 insertions(+), 169 deletions(-)

What do you think?
--
Michael

Attachments:

v4-0001-Introduce-and-use-routine-for-parsing-of-int32-op.patchtext/x-diff; charset=us-asciiDownload+156-170
#17David Rowley
dgrowleyml@gmail.com
In reply to: Michael Paquier (#16)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Wed, 21 Jul 2021 at 23:50, Michael Paquier <michael@paquier.xyz> wrote:

Hacking on that, I am finishing with the attached. It is less
ambitious, still very useful as it removes a dozen of custom error
messages in favor of the two ones introduced in option_utils.c. On
top of that this reduces a bit the code:
15 files changed, 156 insertions(+), 169 deletions(-)

What do you think?

This is just a driveby review, but I think that it's good to see no
increase in the number of lines of code to make these improvements.
It's also good to see the added consistency introduced by this patch.

I didn't test the patch, so this is just from reading through.

I wondered about the TAP tests here:

command_fails_like(
[ 'pg_dump', '-j', '-1' ],
qr/\Qpg_dump: error: -j\/--jobs must be in range 0..2147483647\E/,
'pg_dump: invalid number of parallel jobs');

command_fails_like(
[ 'pg_restore', '-j', '-1', '-f -' ],
qr/\Qpg_restore: error: -j\/--jobs must be in range 0..2147483647\E/,
'pg_restore: invalid number of parallel jobs');

I see both of these are limited to 64 on windows. Won't those fail on Windows?

I also wondered if it would be worth doing #define MAX_JOBS somewhere
away from the option parsing code. This part is pretty ugly:

/*
* On Windows we can only have at most MAXIMUM_WAIT_OBJECTS
* (= 64 usually) parallel jobs because that's the maximum
* limit for the WaitForMultipleObjects() call.
*/
if (!option_parse_int(optarg, "-j/--jobs", 0,
#ifdef WIN32
MAXIMUM_WAIT_OBJECTS,
#else
INT_MAX,
#endif
&numWorkers))
exit(1);
break;

David

#18Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#17)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Thu, Jul 22, 2021 at 12:32:39AM +1200, David Rowley wrote:

I see both of these are limited to 64 on windows. Won't those fail on Windows?

Yes, thanks, they would. I would just cut the range numbers from the
expected output here. This does not matter in terms of coverage
either.

x> I also wondered if it would be worth doing #define MAX_JOBS somewhere

away from the option parsing code. This part is pretty ugly:

Agreed as well. pg_dump and pg_restore have their own idea of
parallelism in parallel.{c.h}. What about putting MAX_JOBS in
parallel.h then?
--
Michael

#19David Rowley
dgrowleyml@gmail.com
In reply to: Michael Paquier (#18)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Thu, 22 Jul 2021 at 00:44, Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Jul 22, 2021 at 12:32:39AM +1200, David Rowley wrote:

I see both of these are limited to 64 on windows. Won't those fail on Windows?

Yes, thanks, they would. I would just cut the range numbers from the
expected output here. This does not matter in terms of coverage
either.

Sounds good.

x> I also wondered if it would be worth doing #define MAX_JOBS somewhere

away from the option parsing code. This part is pretty ugly:

Agreed as well. pg_dump and pg_restore have their own idea of
parallelism in parallel.{c.h}. What about putting MAX_JOBS in
parallel.h then?

parallel.h looks ok to me.

David

#20Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#19)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

On Thu, Jul 22, 2021 at 01:19:41AM +1200, David Rowley wrote:

On Thu, 22 Jul 2021 at 00:44, Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Jul 22, 2021 at 12:32:39AM +1200, David Rowley wrote:

I see both of these are limited to 64 on windows. Won't those fail on Windows?

Yes, thanks, they would. I would just cut the range numbers from the
expected output here. This does not matter in terms of coverage
either.

Sounds good.

x> I also wondered if it would be worth doing #define MAX_JOBS somewhere

away from the option parsing code. This part is pretty ugly:

Agreed as well. pg_dump and pg_restore have their own idea of
parallelism in parallel.{c.h}. What about putting MAX_JOBS in
parallel.h then?

parallel.h looks ok to me.

Okay, done those parts as per the attached. While on it, I noticed an
extra one for pg_dump --rows-per-insert. I am counting 25 translated
strings cut in total.

Any objections to this first step?
--
Michael

Attachments:

v5-0001-Introduce-and-use-routine-for-parsing-of-int32-op.patchtext/x-diff; charset=us-asciiDownload+174-191
#21Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#16)
#22Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#21)
#23Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#20)
#24Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#23)
#25Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#24)
#26Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#25)