Change atoi to strtol in same place

Started by Surafel Temesgenalmost 7 years ago42 messageshackers
Jump to latest
#1Surafel Temesgen
surafel3000@gmail.com

Hello,

we use atoi for user argument processing in same place which return zero
for both invalid input and input value zero. In most case its ok because we
error out with appropriate error message for input zero but in same place
where we accept zero as valued input it case a problem by preceding for
invalid input as input value zero. The attached patch change those place to
strtol which can handle invalid input

regards

Surafel

Attachments:

atoi-to-strtol-v1.patchtext/x-patch; charset=US-ASCII; name=atoi-to-strtol-v1.patchDownload+191-38
#2Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Surafel Temesgen (#1)
Re: Change atoi to strtol in same place

Hi Surafel,

On Mon, Jul 01, 2019 at 08:48:27PM +0300, Surafel Temesgen wrote:

Hello,

we use atoi for user argument processing in same place which return zero
for both invalid input and input value zero. In most case its ok because we
error out with appropriate error message for input zero but in same place
where we accept zero as valued input it case a problem by preceding for
invalid input as input value zero. The attached patch change those place to
strtol which can handle invalid input

regards

Surafel

This seems to have bit-rotted (due to minor changes to pg_basebackup).
Can you fix that and post an updated version?

In general, I think it's a good idea to fix those places, but I wonder
if we need to change the error messages too.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Joe Nelson
joe@begriffs.com
In reply to: Tomas Vondra (#2)
Re: Change atoi to strtol in same place

Surafel Temesgen wrote:

we use atoi for user argument processing in same place which return
zero for both invalid input and input value zero. [...] in same
place where we accept zero as valued input it case a problem by
preceding for invalid input as input value zero. strtol which can
handle invalid input

Not only that, but atoi causes Undefined Behavior on erroneous input.
The C99 standard says this:

7.20.1 Numeric conversion functions
The functions atof, atoi, atol, and atoll need not affect the value of the
integer expression errno on an error. If the value of the result cannot be
represented, the behavior is undefined.

Tomas Vondra wrote:

This seems to have bit-rotted (due to minor changes to pg_basebackup).
Can you fix that and post an updated version?

I adjusted the patch to apply cleanly on a0555ddab9.

In general, I think it's a good idea to fix those places, but I wonder
if we need to change the error messages too.

I'll leave that decision for the community to debate. I did, however,
remove the newlines for the new error messages being passed to
pg_log_error().

As discussed in message [0], the logging functions in common/logging.c
now contain an assertion that messages do not end in newline:

Assert(fmt[strlen(fmt) - 1] != '\n');

(in pg_log_error via pg_log_generic via pg_log_generic_v)

I also added limits.h to some places it was missing, so the patch would
build.

0: /messages/by-id/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com

Attachments:

atoi-to-strtol-v2.patchtext/x-patch; charset=utf-8Download+193-38
#4David Rowley
dgrowleyml@gmail.com
In reply to: Joe Nelson (#3)
Re: Change atoi to strtol in same place

On Wed, 24 Jul 2019 at 16:02, Joe Nelson <joe@begriffs.com> wrote:

In general, I think it's a good idea to fix those places, but I wonder
if we need to change the error messages too.

I'll leave that decision for the community to debate. I did, however,
remove the newlines for the new error messages being passed to
pg_log_error().

I'd like to put my vote not to add this complex code to each option
validation that requires an integer number. I'm not sure there
currently is a home for it, but if there was, wouldn't it be better
writing a function that takes a lower and upper bound and sets some
output param with the value and returns a bool to indicate if it's
within range or not?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#5Andres Freund
andres@anarazel.de
In reply to: David Rowley (#4)
Re: Change atoi to strtol in same place

On 2019-07-24 16:57:42 +1200, David Rowley wrote:

On Wed, 24 Jul 2019 at 16:02, Joe Nelson <joe@begriffs.com> wrote:

In general, I think it's a good idea to fix those places, but I wonder
if we need to change the error messages too.

I'll leave that decision for the community to debate. I did, however,
remove the newlines for the new error messages being passed to
pg_log_error().

I'd like to put my vote not to add this complex code to each option
validation that requires an integer number. I'm not sure there
currently is a home for it, but if there was, wouldn't it be better
writing a function that takes a lower and upper bound and sets some
output param with the value and returns a bool to indicate if it's
within range or not?

+many

#6Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#4)
Re: Change atoi to strtol in same place

On Wed, Jul 24, 2019 at 04:57:42PM +1200, David Rowley wrote:

I'd like to put my vote not to add this complex code to each option
validation that requires an integer number. I'm not sure there
currently is a home for it, but if there was, wouldn't it be better
writing a function that takes a lower and upper bound and sets some
output param with the value and returns a bool to indicate if it's
within range or not?

Perhaps. When I see this patch calling strtol basically only for 10
as base, this reminds me of Fabien Coelho's patch refactor all the
strtoint routines we have in the code:
https://commitfest.postgresql.org/23/2099/

The conclusion that we are reaching on the thread is to remove more
dependencies on strtol that we have in the code, and replace it with
our own, more performant wrappers. This thread makes me wondering
that we had better wait before doing this move.
--
Michael

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#6)
Re: Change atoi to strtol in same place

On 2019-Jul-24, Michael Paquier wrote:

The conclusion that we are reaching on the thread is to remove more
dependencies on strtol that we have in the code, and replace it with
our own, more performant wrappers. This thread makes me wondering
that we had better wait before doing this move.

Okay, so who is submitting a new version here? Surafel, Joe?

Waiting on Author.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#8Joe Nelson
joe@begriffs.com
In reply to: Alvaro Herrera (#7)
Re: Change atoi to strtol in same place

Alvaro Herrera from 2ndQuadrant wrote:

Okay, so who is submitting a new version here? Surafel, Joe?

Sure, I'll look into it over the weekend.

#9Joe Nelson
joe@begriffs.com
In reply to: Alvaro Herrera (#7)
Re: Change atoi to strtol in same place

Alvaro Herrera from 2ndQuadrant wrote:

Okay, so who is submitting a new version here? Surafel, Joe?

I've attached a revision that uses pg_strtoint64 from str2int-10.patch
(posted in <alpine.DEB.2.21.1909032007230.21690@lancre>). My patch is
based off that one and c5bc7050af.

It covers the same front-end utilities as atoi-to-strtol-v2.patch. Other
utilities in the pg codebase still have atoi inside, but I thought I'd
share my progress to see if people like the direction. If so, I can
update the rest of the utils.

I added a helper function to a new file in src/fe_utils, but had to
modify Makefiles in ways that might not be too clean. Maybe there's a
better place for the function.

--
Joe Nelson https://begriffs.com

Attachments:

atoi-to-strtol-v3.patchtext/x-patch; charset=utf-8Download+209-44
#10Michael Paquier
michael@paquier.xyz
In reply to: Joe Nelson (#9)
Re: Change atoi to strtol in same place

On Mon, Sep 09, 2019 at 11:02:49PM -0500, Joe Nelson wrote:

Alvaro Herrera from 2ndQuadrant wrote:

Okay, so who is submitting a new version here? Surafel, Joe?

I've attached a revision that uses pg_strtoint64 from str2int-10.patch
(posted in <alpine.DEB.2.21.1909032007230.21690@lancre>). My patch is
based off that one and c5bc7050af.

It covers the same front-end utilities as atoi-to-strtol-v2.patch. Other
utilities in the pg codebase still have atoi inside, but I thought I'd
share my progress to see if people like the direction. If so, I can
update the rest of the utils.

I added a helper function to a new file in src/fe_utils, but had to
modify Makefiles in ways that might not be too clean. Maybe there's a
better place for the function.

Using a wrapper in src/fe_utils makes sense. I would have used
options.c for the new file, or would that be too much generic?

The code indentation is weird, particularly the switch/case in
pg_strtoint64_range().

The error handling is awkward. I think that you should just call
pg_log_error in pg_strtoint64_range instead of returning an error
string as you do. You could do that by passing down the option name
to the routine, and generate a new set of error messages using that.

Why do you need to update common/Makefile?

The builds on Windows are broken. You are adding one file into
fe_utils, but forgot to update @pgfeutilsfiles in Mkvcbuild.pm.
--
Michael

#11Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#10)
Re: Change atoi to strtol in same place

On Tue, Sep 10, 2019 at 1:36 AM Michael Paquier <michael@paquier.xyz> wrote:

The error handling is awkward. I think that you should just call
pg_log_error in pg_strtoint64_range instead of returning an error
string as you do. You could do that by passing down the option name
to the routine, and generate a new set of error messages using that.

-1. I think it's very useful to have routines for this sort of thing
that return an error message rather than emitting an error report
directly. That gives the caller a lot more control.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#11)
Re: Change atoi to strtol in same place

On Tue, Sep 10, 2019 at 08:03:32AM -0400, Robert Haas wrote:

-1. I think it's very useful to have routines for this sort of thing
that return an error message rather than emitting an error report
directly. That gives the caller a lot more control.

Please let me counter-argue here.  There are a couple of reasons to
not do as the patch proposes:
1) Consistency with the error messages makes less work for translators,
who already have a lot to deal with.  The patch is awkward in this
sense, to give some examples:
+   if (s != PG_STRTOINT_OK)
    {
-       pg_log_error("invalid status interval \"%s\"", optarg);
+       pg_log_error("invalid status interval: %s", parse_error);
}
[...]
-       pg_log_error("invalid compression level \"%s\"", optarg);
+       pg_log_error("invalid compression level: %s", parse_error);

2) A centralized error message can provide the same level of details.
Here are suggestions for each error status:
pg_log_error("could not parse value for option %s", optname);
pg_log_error("invalid value for option %s", optname);
optname should be defined by the caller with strings like
"-t/--timeout" or such. Then, if ranges are specified and the error
is on a range, I think that we should just add a second error message
to provide a hint to the user, if wanted by the caller of
pg_strtoint64_range() so an extra argument could do handle that:
pg_log_error("value must be in range %d..%d")

3) I think that we should not expose directly the status values of
pg_strtoint_status in pg_strtoint64_range(), that's less for module
authors to worry about, and that would be the same approach as we are
using for the wrappers of pg_strto[u]intXX() in the patch of the other
thread (see pg_strto[u]intXX_check for example in [1]/messages/by-id/20190910030525.GA22934@paquier.xyz -- Michael).

[1]: /messages/by-id/20190910030525.GA22934@paquier.xyz -- Michael
--
Michael

#13Joe Nelson
joe@begriffs.com
In reply to: Michael Paquier (#10)
Re: Change atoi to strtol in same place

Michael Paquier wrote:

Using a wrapper in src/fe_utils makes sense. I would have used
options.c for the new file, or would that be too much generic?

Sure, options.c sounds fine -- it doesn't seem any more generic than
"arg_utils" and is a little simpler. The only other question I have is
if the function inside -- with some adjustment in its interface -- might
be useful in other contexts beyond front-end args checking.

The code indentation is weird, particularly the switch/case in
pg_strtoint64_range().

I did run pgindent... Do I need to tell it about the existence of the
new file?

The error handling is awkward.

Let's continue this point in your follow-up
<20190911035356.GE1953@paquier.xyz>.

Why do you need to update common/Makefile?

Thought I needed to do this so that other parts would link properly, but
just removed the changes there and stuff still links OK, so I'll remove
that change.

The builds on Windows are broken. You are adding one file into
fe_utils, but forgot to update @pgfeutilsfiles in Mkvcbuild.pm. --

Thanks for the tip, I'm still learning about the build process. Is there
a service I can use to test my patches across multiple platforms? I'd
rather not bother reviewers with build problems that I can catch in a
more automated way.

--
Joe Nelson https://begriffs.com

#14Joe Nelson
joe@begriffs.com
In reply to: Michael Paquier (#12)
Re: Change atoi to strtol in same place

Robert Haas wrote:

-1. I think it's very useful to have routines for this sort of thing
that return an error message rather than emitting an error report
directly. That gives the caller a lot more control.

Michael Paquier wrote:

1) Consistency with the error messages makes less work for translators,
who already have a lot to deal with.

Agreed that the messages can be slightly inconsistent. I tried to make
the new messages match the styles of other messages in their respective
utilities. Maybe the bigger issue here is inconsistent output styles
across the utilities in general:

pg_standby.c includes flag names
%s: -r maxretries %s
pg_basebackup.c writes the settings out in words
invalid compression level: %s

Note that the final %s in those examples will expand to a more detailed
message. For example passing "-Z 10" to pg_dump in the current patch will
output:

pg_dump: error: invalid compression level: 10 is outside range 0..9

2) A centralized error message can provide the same level of details.

Even assuming we standardize the message format, different callers have
different means to handle the messages. The front-end utilities affected in my
patch use calls as varied as fprintf, pg_log_error, write_stderr and pg_fatal.
Thus pg_strtoint64_range needs more flexibility than calling pg_log_error
internally.

3) I think that we should not expose directly the status values of
pg_strtoint_status in pg_strtoint64_range(), that's less for module
authors to worry about, and that would be the same approach as we are
using for the wrappers of pg_strto[u]intXX() in the patch of the other
thread (see pg_strto[u]intXX_check for example in [1]).

The pg_strto[u]intXX_check functions can return the integer directly only
because they handle errors with ereport(ERROR, ...). However, as I mentioned
earlier, this is not always what the front-end utilities need to do.

--
Joe Nelson https://begriffs.com

#15Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#12)
Re: Change atoi to strtol in same place

On Tue, Sep 10, 2019 at 11:54 PM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Sep 10, 2019 at 08:03:32AM -0400, Robert Haas wrote:

-1. I think it's very useful to have routines for this sort of thing
that return an error message rather than emitting an error report
directly. That gives the caller a lot more control.

Please let me counter-argue here.

OK, but on the other hand, Joe's example of a custom message "invalid
compression level: 10 is outside range 0..9" is a world better than
"invalid compression level: %s". We might even be able to do better
"argument to -Z must be a compression level between 0 and 9". In
backend error-reporting, it's often important to show the misguided
value, because it may be coming from a complex query where it's hard
to find the source of the problematic value. But if the user types
-Z42 or -Zborked, I'm not sure it's important to tell them that the
problem is with "42" or "borked". It's more important to explain the
concept, or such would be my judgement.

Also, consider an option where the value must be an integer between 1
and 100 or one of several fixed strings (e.g. think of
recovery_target_timeline). The user clearly can't use the generic
error message in that case. Perhaps the answer is to say that such
users shouldn't use the provided range-checking function but rather
implement the logic from scratch. But that seems a bit limiting.

Also, suppose the user doesn't want to pg_log_error(). Maybe it's a
warning. Maybe it doesn't even need to be logged.

What this boils down to in the end is that putting more of the policy
decisions into the function helps ensure consistency and save code
when the function is used, but it also results in the function being
used less often. Reasonable people can differ on the merits of
different approaches, but for me the down side of returning the error
message appears minor at most, and the up sides seem significant.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#15)
Re: Change atoi to strtol in same place

... can we have a new patch? Not only because there seems to have been
some discussion points that have gone unaddressed (?), but also because
Appveyor complains really badly about this one.
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.58672

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17Joe Nelson
joe@begriffs.com
In reply to: Alvaro Herrera (#16)
Re: Change atoi to strtol in same place

Alvaro Herrera wrote:

... can we have a new patch? Not only because there seems to have
been some discussion points that have gone unaddressed (?)

Yes, I'll work on it over the weekend.

but also because Appveyor complains really badly about this one.
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.58672

Note that it requires functions from str2int-10.patch, and will not
compile when applied to master by itself. I didn't want to duplicate
functionality from that other uncommitted patch in mine.

#18Michael Paquier
michael@paquier.xyz
In reply to: Joe Nelson (#17)
Re: Change atoi to strtol in same place

On Fri, Sep 27, 2019 at 09:35:53PM -0500, Joe Nelson wrote:

Note that it requires functions from str2int-10.patch, and will not
compile when applied to master by itself. I didn't want to duplicate
functionality from that other uncommitted patch in mine.

If we have a dependency between both threads, perhaps more people
could comment there? Here is the most relevant update:
/messages/by-id/20190917022913.GB1660@paquier.xyz
--
Michael

#19Joe Nelson
joe@begriffs.com
In reply to: Alvaro Herrera (#16)
Re: Change atoi to strtol in same place

Alvaro Herrera wrote:

... can we have a new patch?

OK, I've attached v4. It works cleanly on 55282fa20f with
str2int-16.patch applied. My patch won't compile without the other one
applied too.

Changed:
[x] revert my changes in common/Makefile
[x] rename arg_utils.[ch] to option.[ch]
[x] update @pgfeutilsfiles in Mkvcbuild.pm
[x] pgindent everything
[x] get rid of atoi() in more utilities

One question about how the utilities parse port numbers. I currently
have it check that the value can be parsed as an integer, and that its
range is within 1 .. (1<<16)-1. I wonder if the former restriction is
(un)desirable, because ultimately getaddrinfo() takes a "service name
description" for the port, which can be a name such as found in
'/etc/services' as well as the string representation of a number. If
desired, I *could* treat only range errors as a failure for ports, and
allow integer parse errors.

--
Joe Nelson https://begriffs.com

Attachments:

atoi-to-strtol-v4.patchtext/x-patch; charset=utf-8Download+307-92
#20Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Joe Nelson (#19)
Re: Change atoi to strtol in same place

Hello.

At Sun, 29 Sep 2019 23:51:23 -0500, Joe Nelson <joe@begriffs.com> wrote in <20190930045123.GC68117@begriffs.com>

Alvaro Herrera wrote:

... can we have a new patch?

OK, I've attached v4. It works cleanly on 55282fa20f with
str2int-16.patch applied. My patch won't compile without the other one
applied too.

Changed:
[x] revert my changes in common/Makefile
[x] rename arg_utils.[ch] to option.[ch]
[x] update @pgfeutilsfiles in Mkvcbuild.pm
[x] pgindent everything
[x] get rid of atoi() in more utilities

Compiler complained as "INT_MAX undeclared" (gcc 7.3 / CentOS7.6).

One question about how the utilities parse port numbers. I currently
have it check that the value can be parsed as an integer, and that its
range is within 1 .. (1<<16)-1. I wonder if the former restriction is
(un)desirable, because ultimately getaddrinfo() takes a "service name
description" for the port, which can be a name such as found in
'/etc/services' as well as the string representation of a number. If
desired, I *could* treat only range errors as a failure for ports, and
allow integer parse errors.

We could do that, but perhaps no use for our usage. We are not
likely to use named ports other than 'postgres', if any.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#21Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#20)
#22Joe Nelson
joe@begriffs.com
In reply to: Kyotaro Horiguchi (#21)
#23Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Joe Nelson (#22)
#24Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Alvaro Herrera (#23)
#25Joe Nelson
joe@begriffs.com
In reply to: Ashutosh Sharma (#24)
#26David Rowley
dgrowleyml@gmail.com
In reply to: Joe Nelson (#25)
#27Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: David Rowley (#26)
#28David Rowley
dgrowleyml@gmail.com
In reply to: Ashutosh Sharma (#27)
#29Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: David Rowley (#28)
#30Joe Nelson
joe@begriffs.com
In reply to: David Rowley (#26)
#31David Rowley
dgrowleyml@gmail.com
In reply to: Joe Nelson (#30)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#31)
#33Joe Nelson
joe@begriffs.com
In reply to: David Rowley (#26)
#34Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Joe Nelson (#25)
#35Joe Nelson
joe@begriffs.com
In reply to: Kyotaro Horiguchi (#34)
#36Joe Nelson
joe@begriffs.com
In reply to: Joe Nelson (#35)
#37Peter Eisentraut
peter_e@gmx.net
In reply to: Joe Nelson (#36)
#38Joe Nelson
joe@begriffs.com
In reply to: Peter Eisentraut (#37)
#39Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Joe Nelson (#36)
#40Daniel Gustafsson
daniel@yesql.se
In reply to: Alvaro Herrera (#39)
#41Joe Nelson
joe@begriffs.com
In reply to: Daniel Gustafsson (#40)
#42Daniel Gustafsson
daniel@yesql.se
In reply to: Joe Nelson (#41)