Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

Started by Michael Paquierover 5 years ago16 messageshackers
Jump to latest
#1Michael Paquier
michael@paquier.xyz

Hi all,

As $subject says, pg_test_fsync and pg_test_timing don't really check
the range of option values specified. It is possible for example to
make pg_test_fsync run an infinite amount of time, and pg_test_timing
does not handle overflows with --duration at all.

These are far from being critical issues, but let's fix them at least
on HEAD. So, please see the attached, where I have also added some
basic TAP tests for both tools.

Thanks,
--
Michael

Attachments:

pgtest-fix-range.patchtext/x-diff; charset=us-asciiDownload+78-16
#2Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#1)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On 2020-08-06 08:27, Michael Paquier wrote:

As $subject says, pg_test_fsync and pg_test_timing don't really check
the range of option values specified. It is possible for example to
make pg_test_fsync run an infinite amount of time, and pg_test_timing
does not handle overflows with --duration at all.

These are far from being critical issues, but let's fix them at least
on HEAD. So, please see the attached, where I have also added some
basic TAP tests for both tools.

According to the POSIX standard, atoi() is not required to do any error
checking, and if you want error checking, you should use strtol().

And if you do that, you might as well change the variables to unsigned
and use strtoul(), and then drop the checks for <=0. I would allow 0.
It's not very useful, but it's not harmful and could be applicable in
testing.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#2)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On Fri, Sep 04, 2020 at 11:24:39PM +0200, Peter Eisentraut wrote:

According to the POSIX standard, atoi() is not required to do any error
checking, and if you want error checking, you should use strtol().

And if you do that, you might as well change the variables to unsigned and
use strtoul(), and then drop the checks for <=0.

Switching to unsigned makes sense, indeed.

I would allow 0. It's not
very useful, but it's not harmful and could be applicable in testing.

Hmm, OK. For pg_test_fsync, 0 means infinity, and for pg_test_timing
that means stopping immediately (we currently don't allow that). How
does this apply to testing? For pg_test_fsync, using 0 would mean to
just remain stuck in the first fsync() pattern, while for
pg_test_fsync this means doing no test loops at all, generating a
useless log once done. Or do you mean to change the logic of
pg_test_fsync so as --secs-per-test=0 means doing one single write?
That's something I thought about for this thread, but I am not sure
that the extra regression test gain is worth more complexity in this
code.
--
Michael

Attachments:

pgtest-fix-range-v2.patchtext/x-diff; charset=us-asciiDownload+105-22
#4Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#3)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On 2020-09-06 05:04, Michael Paquier wrote:

I would allow 0. It's not
very useful, but it's not harmful and could be applicable in testing.

Hmm, OK. For pg_test_fsync, 0 means infinity, and for pg_test_timing
that means stopping immediately (we currently don't allow that). How
does this apply to testing? For pg_test_fsync, using 0 would mean to
just remain stuck in the first fsync() pattern, while for
pg_test_fsync this means doing no test loops at all, generating a
useless log once done. Or do you mean to change the logic of
pg_test_fsync so as --secs-per-test=0 means doing one single write?
That's something I thought about for this thread, but I am not sure
that the extra regression test gain is worth more complexity in this
code.

I think in general doing something 0 times should be allowed if possible.

However, I see that in the case of pg_test_fsync you end up in alarm(0),
which does something different, so it's okay in that case to disallow it.

I notice that the error checking you introduce is different from the
checks for pgbench -t and -T (the latter having no errno checks). I'm
not sure which is correct, but it's perhaps worth making them the same.

(pgbench -t 0, which is also currently not allowed, is a good example of
why this could be useful, because that would allow checking whether the
script etc. can be loaded without running an actual test.)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#4)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On Mon, Sep 07, 2020 at 10:06:57AM +0200, Peter Eisentraut wrote:

However, I see that in the case of pg_test_fsync you end up in alarm(0),
which does something different, so it's okay in that case to disallow it.

Yep.

I notice that the error checking you introduce is different from the checks
for pgbench -t and -T (the latter having no errno checks). I'm not sure
which is correct, but it's perhaps worth making them the same.

pgbench currently uses atoi() to parse the options of -t and -T. Are
you suggesting to switch that to strtoXX() as well or perhaps you are
referring to the parsing of the weight in parseScriptWeight()? FWIW,
the error handling introduced in this patch is similar to what we do
for example in pg_resetwal. This has its own problems as strtoul()
would not report ERANGE except for values higher than ULONG_MAX, but
the returned results are stored in 32 bits. We could switch to just
use uint64 where we could of course, but is that really worth it for
such tools? For example, pg_test_timing could overflow the
total_timing calculated if using a too high value, but nobody would
use such values anyway. So I'd rather just use uint32 and call it a
day, for simplicity's sake mainly..

(pgbench -t 0, which is also currently not allowed, is a good example of why
this could be useful, because that would allow checking whether the script
etc. can be loaded without running an actual test.)

Perhaps. That looks like a separate item to me though.
--
Michael

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#5)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On 2020-09-10 09:59, Michael Paquier wrote:

I notice that the error checking you introduce is different from the checks
for pgbench -t and -T (the latter having no errno checks). I'm not sure
which is correct, but it's perhaps worth making them the same.

pgbench currently uses atoi() to parse the options of -t and -T. Are
you suggesting to switch that to strtoXX() as well or perhaps you are
referring to the parsing of the weight in parseScriptWeight()? FWIW,
the error handling introduced in this patch is similar to what we do
for example in pg_resetwal. This has its own problems as strtoul()
would not report ERANGE except for values higher than ULONG_MAX, but
the returned results are stored in 32 bits. We could switch to just
use uint64 where we could of course, but is that really worth it for
such tools? For example, pg_test_timing could overflow the
total_timing calculated if using a too high value, but nobody would
use such values anyway. So I'd rather just use uint32 and call it a
day, for simplicity's sake mainly..

The first patch you proposed checks for errno == ERANGE, but pgbench
code doesn't do that. So one of them is not correct.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#6)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On Thu, Sep 10, 2020 at 03:59:20PM +0200, Peter Eisentraut wrote:

The first patch you proposed checks for errno == ERANGE, but pgbench code
doesn't do that. So one of them is not correct.

Sorry for the confusion, I misunderstood what you were referring to.
Yes, the first patch is wrong to add the check on errno. FWIW, I
thought about your point to use strtol() but that does not seem worth
the complication for those tools. It is not like anybody is going to
use high values for these, and using uint64 to make sure that the
boundaries are checked just add more checks for bounds. There is
one example in pg_test_timing when compiling the total time.
--
Michael

#8Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#7)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On 2020-09-11 09:08, Michael Paquier wrote:

On Thu, Sep 10, 2020 at 03:59:20PM +0200, Peter Eisentraut wrote:

The first patch you proposed checks for errno == ERANGE, but pgbench code
doesn't do that. So one of them is not correct.

Sorry for the confusion, I misunderstood what you were referring to.
Yes, the first patch is wrong to add the check on errno. FWIW, I
thought about your point to use strtol() but that does not seem worth
the complication for those tools. It is not like anybody is going to
use high values for these, and using uint64 to make sure that the
boundaries are checked just add more checks for bounds. There is
one example in pg_test_timing when compiling the total time.

I didn't mean use strtol() to be able to process larger values, but for
the error checking. atoi() cannot detect any errors other than ERANGE.
So if you are spending effort on making the option value parsing more
robust, relying on atoi() will result in an incomplete solution.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#8)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On Tue, Sep 15, 2020 at 02:39:08PM +0200, Peter Eisentraut wrote:

I didn't mean use strtol() to be able to process larger values, but for the
error checking. atoi() cannot detect any errors other than ERANGE. So if
you are spending effort on making the option value parsing more robust,
relying on atoi() will result in an incomplete solution.

Okay, after looking at that, here is v3. This includes range checks
as well as errno checks based on strtol(). What do you think?
--
Michael

Attachments:

pgtest-fix-range-v3.patchtext/x-diff; charset=us-asciiDownload+115-20
#10Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#9)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On Fri, Sep 18, 2020 at 05:22:15PM +0900, Michael Paquier wrote:

Okay, after looking at that, here is v3. This includes range checks
as well as errno checks based on strtol(). What do you think?

This fails in the CF bot on Linux because of pg_logging_init()
returning with errno=ENOTTY in the TAP tests, for which I began a new
thread:
/messages/by-id/20200918095713.GA20887@paquier.xyz

Not sure if this will lead anywhere, but we can also address the
failure by enforcing errno=0 for the new calls of strtol() introduced
in this patch. So here is an updated patch doing so.
--
Michael

Attachments:

pgtest-fix-range-v4.patchtext/x-diff; charset=us-asciiDownload+117-20
#11Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#10)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On 2020-09-20 05:41, Michael Paquier wrote:

On Fri, Sep 18, 2020 at 05:22:15PM +0900, Michael Paquier wrote:

Okay, after looking at that, here is v3. This includes range checks
as well as errno checks based on strtol(). What do you think?

This fails in the CF bot on Linux because of pg_logging_init()
returning with errno=ENOTTY in the TAP tests, for which I began a new
thread:
/messages/by-id/20200918095713.GA20887@paquier.xyz

Not sure if this will lead anywhere, but we can also address the
failure by enforcing errno=0 for the new calls of strtol() introduced
in this patch. So here is an updated patch doing so.

I think the error checking is now structurally correct in this patch.

However, I still think the integer type use is a bit inconsistent. In
both cases, using strtoul() and dealing with unsigned integer types
between parsing and final use would be more consistent.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#11)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On Tue, Sep 22, 2020 at 11:45:14PM +0200, Peter Eisentraut wrote:

However, I still think the integer type use is a bit inconsistent. In both
cases, using strtoul() and dealing with unsigned integer types between
parsing and final use would be more consistent.

No objections to that either, so changed this way. I kept those
variables signed because applying values of 2B~4B is not really going
to matter much here ;p
--
Michael

Attachments:

pgtest-fix-range-v5.patchtext/x-diff; charset=us-asciiDownload+121-24
#13Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#12)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On 2020-09-23 03:50, Michael Paquier wrote:

On Tue, Sep 22, 2020 at 11:45:14PM +0200, Peter Eisentraut wrote:

However, I still think the integer type use is a bit inconsistent. In both
cases, using strtoul() and dealing with unsigned integer types between
parsing and final use would be more consistent.

No objections to that either, so changed this way. I kept those
variables signed because applying values of 2B~4B is not really going
to matter much here ;p

This patch mixes up unsigned int and uint32 in random ways. The
variable is uint32, but the format is %u and the max constant is UINT_MAX.

I think just use unsigned int as the variable type. There is no need to
use the bit-exact types. Note that the argument of alarm() is of type
unsigned int.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#14Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#13)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On Wed, Sep 23, 2020 at 08:11:59AM +0200, Peter Eisentraut wrote:

This patch mixes up unsigned int and uint32 in random ways. The variable is
uint32, but the format is %u and the max constant is UINT_MAX.

I think just use unsigned int as the variable type. There is no need to use
the bit-exact types. Note that the argument of alarm() is of type unsigned
int.

Makes sense, thanks.
--
Michael

Attachments:

pgtest-fix-range-v6.patchtext/x-diff; charset=us-asciiDownload+121-24
#15Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#14)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On 2020-09-24 09:12, Michael Paquier wrote:

On Wed, Sep 23, 2020 at 08:11:59AM +0200, Peter Eisentraut wrote:

This patch mixes up unsigned int and uint32 in random ways. The variable is
uint32, but the format is %u and the max constant is UINT_MAX.

I think just use unsigned int as the variable type. There is no need to use
the bit-exact types. Note that the argument of alarm() is of type unsigned
int.

Makes sense, thanks.

looks good to me

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#16Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#15)
Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

On Fri, Sep 25, 2020 at 07:52:10AM +0200, Peter Eisentraut wrote:

looks good to me

Thanks, applied.
--
Michael