Strange interval arithmetic

Started by Christopher Kings-Lynne, over 20 years ago. 27 messages. List: hackers
#1Christopher Kings-Lynne
chriskl@familyhealth.com.au

What's going on here? Some sort of integer wraparound?

WORKS
=====

mysql=# select interval '2378 seconds';
interval
----------
00:39:38
(1 row)

mysql=#
mysql=# select 2378 * interval '1 second';
?column?
----------
00:39:38
(1 row)

DOESN'T WORK
============

test=# select interval '2378234234 seconds';
interval
--------------
596523:14:07
(1 row)

test=# select 2378234234 * interval '1 second';
?column?
--------------
660620:37:14
(1 row)

#2Michael Fuhr
mike@fuhr.org
In reply to: Christopher Kings-Lynne (#1)
Re: Strange interval arithmetic

On Sun, Nov 27, 2005 at 11:15:04PM +0800, Christopher Kings-Lynne wrote:

What's going on here? Some sort of integer wraparound?

[...]

test=# select interval '2378234234 seconds';
interval
--------------
596523:14:07
(1 row)

Looks like the value is stuck at 2^31 - 1 seconds:

test=> select interval '2147483646 seconds'; -- 2^31 - 2
interval
--------------
596523:14:06
(1 row)

test=> select interval '2147483647 seconds'; -- 2^31 - 1
interval
--------------
596523:14:07
(1 row)

test=> select interval '2147483648 seconds'; -- 2^31
interval
--------------
596523:14:07
(1 row)
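
As a quick check of the arithmetic in these outputs, here is a small C sketch that splits a count of seconds into hours, minutes, and seconds. The names hms_t and split_seconds are made up for illustration; this is not PostgreSQL code.

```c
/* Hours/minutes/seconds breakdown of a second count (illustrative type). */
typedef struct
{
    long long hours;
    int       minutes;
    int       seconds;
} hms_t;

/* Split a non-negative number of seconds into h:m:s components. */
hms_t split_seconds(long long total_seconds)
{
    hms_t result;

    result.hours   = total_seconds / 3600;
    result.minutes = (int) ((total_seconds % 3600) / 60);
    result.seconds = (int) (total_seconds % 60);
    return result;
}
```

With this, 2147483647 seconds (2^31 - 1) splits into 596523:14:07, which is the value the string input gets stuck at, while 2378234234 seconds splits into 660620:37:14, which is what the multiplication in the original report returned.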

--
Michael Fuhr

#3Michael Fuhr
mike@fuhr.org
In reply to: Michael Fuhr (#2)
Re: Strange interval arithmetic

On Sun, Nov 27, 2005 at 08:45:18AM -0700, Michael Fuhr wrote:

Looks like the value is stuck at 2^31 - 1 seconds:

I see this behavior back to at least 7.3. I'd guess it's because
strtol() indicates overflow by returning LONG_MAX and setting errno
to ERANGE, but the code doesn't check for that.
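
A minimal sketch of that strtol() behavior, outside PostgreSQL. The function names parse_unchecked and overflows_long are hypothetical and only serve the illustration; the first one mimics a caller that ignores errno, the second one shows the overflow being reported.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a decimal string while ignoring errno, as the suspect code
 * appears to do. Out-of-range input silently comes back as LONG_MAX
 * (or LONG_MIN for negative input). */
long parse_unchecked(const char *text)
{
    return strtol(text, NULL, 10);
}

/* Return 1 if strtol() reports the value as out of range, else 0. */
int overflows_long(const char *text)
{
    errno = 0;                      /* clear any stale error first */
    (void) strtol(text, NULL, 10);
    return errno == ERANGE;
}
```

Whether 2378234234 itself overflows depends on the width of long on the platform: it does where long is 32 bits and does not where long is 64 bits. A much longer digit string such as "99999999999999999999999999" overflows in either case, so parse_unchecked() returns LONG_MAX for it and overflows_long() returns 1.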

--
Michael Fuhr

#4Michael Fuhr
mike@fuhr.org
In reply to: Michael Fuhr (#3)
Re: Strange interval arithmetic

On Sun, Nov 27, 2005 at 11:27:54AM -0700, Michael Fuhr wrote:

On Sun, Nov 27, 2005 at 08:45:18AM -0700, Michael Fuhr wrote:

Looks like the value is stuck at 2^31 - 1 seconds:

I see this behavior back to at least 7.3. I'd guess it's because
strtol() indicates overflow by returning LONG_MAX and setting errno
to ERANGE, but the code doesn't check for that.

Is this worth looking at for the upcoming dot releases? It's
apparently a longstanding behavior that almost nobody encounters,
yet knowingly not addressing it seems a bit MySQLish ;-) Here's
the start of the thread for anybody who missed it:

http://archives.postgresql.org/pgsql-hackers/2005-11/msg01385.php

--
Michael Fuhr

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Fuhr (#4)
Re: Strange interval arithmetic

Michael Fuhr <mike@fuhr.org> writes:

I see this behavior back to at least 7.3. I'd guess it's because
strtol() indicates overflow by returning LONG_MAX and setting errno
to ERANGE, but the code doesn't check for that.

Is this worth looking at for the upcoming dot releases?

Sure, send a patch ...

regards, tom lane

#6Michael Fuhr
mike@fuhr.org
In reply to: Tom Lane (#5)
Re: Strange interval arithmetic

On Wed, Nov 30, 2005 at 12:37:40PM -0500, Tom Lane wrote:

Michael Fuhr <mike@fuhr.org> writes:

I see this behavior back to at least 7.3. I'd guess it's because
strtol() indicates overflow by returning LONG_MAX and setting errno
to ERANGE, but the code doesn't check for that.

Is this worth looking at for the upcoming dot releases?

Sure, send a patch ...

Any preferences on an approach? The simplest and easiest to verify
would be to raise an error for just this particular case; a TODO
item might be to change how the string is parsed to allow values
larger than LONG_MAX. I see several calls to strtol() that aren't
checked for overflow but that might not be relevant to this problem,
so I'm thinking this patch ought not touch them. Maybe that's another
TODO item.

--
Michael Fuhr

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Fuhr (#6)
Re: Strange interval arithmetic

Michael Fuhr <mike@fuhr.org> writes:

On Wed, Nov 30, 2005 at 12:37:40PM -0500, Tom Lane wrote:

Sure, send a patch ...

Any preferences on an approach? The simplest and easiest to verify
would be to raise an error for just this particular case; a TODO
item might be to change how the string is parsed to allow values
larger than LONG_MAX.

I think the latter would be a feature enhancement and therefore not
good material to back-patch. Just erroring out seems appropriate
for now.

I see several calls to strtol() that aren't checked for overflow but
that might not be relevant to this problem, so I'm thinking this patch
ought not touch them. Maybe that's another TODO item.

If it's possible for them to be given overflowing input, they probably
ought to be checked.

regards, tom lane

#8Michael Fuhr
mike@fuhr.org
In reply to: Tom Lane (#7)
Re: Strange interval arithmetic

On Wed, Nov 30, 2005 at 02:01:46PM -0500, Tom Lane wrote:

Michael Fuhr <mike@fuhr.org> writes:

Any preferences on an approach? The simplest and easiest to verify
would be to raise an error for just this particular case; a TODO
item might be to change how the string is parsed to allow values
larger than LONG_MAX.

I think the latter would be a feature enhancement and therefore not
good material to back-patch. Just erroring out seems appropriate
for now.

Agreed. I'm thinking about rewriting strtol() calls in datetime.c
to look like this:

    errno = 0;
    val = strtol(field[i], &cp, 10);
    if (errno == ERANGE)
        return DTERR_FIELD_OVERFLOW;

Does that look okay? Or would you rather raise an error with ereport()?
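
For anyone who wants to try the pattern in isolation, here is a self-contained sketch of it. The wrapper parse_field and its signature are hypothetical, and the value given to DTERR_FIELD_OVERFLOW is a placeholder for this sketch, not necessarily what PostgreSQL's headers define.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Placeholder error code for this sketch only. */
#define DTERR_FIELD_OVERFLOW (-2)

/* Parse the leading decimal number in 'field'.
 * On success, store it in *val, point *cp at the first unparsed
 * character, and return 0. If the number does not fit in a long,
 * return DTERR_FIELD_OVERFLOW instead of the clamped value. */
int parse_field(const char *field, long *val, char **cp)
{
    assert(field != NULL && val != NULL && cp != NULL);

    errno = 0;                      /* strtol() only sets errno on failure */
    *val = strtol(field, cp, 10);
    if (errno == ERANGE)
        return DTERR_FIELD_OVERFLOW;
    return 0;
}
```

Given "2378 seconds" this returns 0 with *val set to 2378 and *cp pointing at the space; given a digit string too large for a long it returns DTERR_FIELD_OVERFLOW.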

I see several calls to strtol() that aren't checked for overflow but
that might not be relevant to this problem, so I'm thinking this patch
ought not touch them. Maybe that's another TODO item.

If it's possible for them to be given overflowing input, they probably
ought to be checked.

I'm looking at all the strtol() calls in datetime.c right now; I
haven't looked anywhere else yet. Should I bother checking values
that will be range checked later anyway? Time zone displacements,
for example?

--
Michael Fuhr

#9Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Fuhr (#8)
Re: Strange interval arithmetic

Michael Fuhr wrote:

Agreed. I'm thinking about rewriting strtol() calls in datetime.c
to look like this:

    errno = 0;
    val = strtol(field[i], &cp, 10);
    if (errno == ERANGE)
        return DTERR_FIELD_OVERFLOW;

Hmm, why not check both the return value _and_ errno:

    val = strtol(field[i], &cp, 10);
    if (val == LONG_MAX && errno == ERANGE)
        return DTERR_FIELD_OVERFLOW;

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#10Michael Fuhr
mike@fuhr.org
In reply to: Alvaro Herrera (#9)
Re: Strange interval arithmetic

On Wed, Nov 30, 2005 at 07:06:42PM -0300, Alvaro Herrera wrote:

Hmm, why not check both the return value _and_ errno:

    val = strtol(field[i], &cp, 10);
    if (val == LONG_MAX && errno == ERANGE)
        return DTERR_FIELD_OVERFLOW;

I usually check both in my own code but I noticed several places
where PostgreSQL doesn't, so I kept that style. I'll check both
if that's preferred.

--
Michael Fuhr

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Fuhr (#8)
Re: Strange interval arithmetic

Michael Fuhr <mike@fuhr.org> writes:

    errno = 0;
    val = strtol(field[i], &cp, 10);
    if (errno == ERANGE)
        return DTERR_FIELD_OVERFLOW;

Does that look okay? Or would you rather raise an error with ereport()?

Looks fine to me, at least in the routines that are for datetime stuff.

I'm looking at all the strtol() calls in datetime.c right now; I
haven't looked anywhere else yet. Should I bother checking values
that will be range checked later anyway? Time zone displacements,
for example?

Good question. Is strtol guaranteed to return INT_MAX or INT_MIN on
overflow, or might it return the overflowed value?

regards, tom lane

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Fuhr (#10)
Re: Strange interval arithmetic

Michael Fuhr <mike@fuhr.org> writes:

I usually check both in my own code but I noticed several places
where PostgreSQL doesn't, so I kept that style. I'll check both
if that's preferred.

I'd say not --- it's more code and it makes a possibly unwarranted
assumption about strtol's behavior.

regards, tom lane

#13Michael Fuhr
mike@fuhr.org
In reply to: Tom Lane (#11)
Re: Strange interval arithmetic

On Wed, Nov 30, 2005 at 05:20:54PM -0500, Tom Lane wrote:

Michael Fuhr <mike@fuhr.org> writes:

I'm looking at all the strtol() calls in datetime.c right now; I
haven't looked anywhere else yet. Should I bother checking values
that will be range checked later anyway? Time zone displacements,
for example?

Good question. Is strtol guaranteed to return INT_MAX or INT_MIN on
overflow, or might it return the overflowed value?

The Open Group Base Specifications say this:

Upon successful completion, these functions shall return the converted
value, if any. If no conversion could be performed, 0 shall be
returned and errno may be set to [EINVAL].

If the correct value is outside the range of representable values,
{LONG_MIN}, {LONG_MAX}, {LLONG_MIN}, or {LLONG_MAX} shall be returned
(according to the sign of the value), and errno set to [ERANGE].

http://www.opengroup.org/onlinepubs/009695399/functions/strtol.html

FreeBSD and Solaris both peg overflow at LONG_MAX, and that behavior
is what I noticed in the first place. I don't know if any systems
behave otherwise. Alvaro suggested checking for both LONG_MAX and
ERANGE; I suppose if we check for LONG_MAX then we should also check
for LONG_MIN. I don't know if any systems might set ERANGE in a
non-error situation.

--
Michael Fuhr

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Fuhr (#13)
Re: Strange interval arithmetic

Michael Fuhr <mike@fuhr.org> writes:

I suppose if we check for LONG_MAX then we should also check
for LONG_MIN.

s/should/must/, which makes the code even more complicated, in order to
buy what exactly?

I don't know if any systems might set ERANGE in a non-error situation.

The SUS saith
http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html

The strtol() function will not change the setting of errno if
successful.

Perhaps more to the point, we've been doing it that way (errno test
only) for many years without complaints. Adding a test on the return
value is venturing into less charted waters.

regards, tom lane

#15Michael Fuhr
mike@fuhr.org
In reply to: Tom Lane (#12)
Re: Strange interval arithmetic

On Wed, Nov 30, 2005 at 05:23:23PM -0500, Tom Lane wrote:

Michael Fuhr <mike@fuhr.org> writes:

I usually check both in my own code but I noticed several places
where PostgreSQL doesn't, so I kept that style. I'll check both
if that's preferred.

I'd say not --- it's more code and it makes a possibly unwarranted
assumption about strtol's behavior.

OTOH, it might be an unwarranted assumption that ERANGE alone
indicates error. It's possible that on some system errno's value
is meaningless unless strtol() returns one of the documented
indicators (LONG_MAX or LONG_MIN). I've seen system calls that
behave that way: errno might get set in a non-error situation due
to the underlying implementation (e.g., wrappers around socket
functions in userland thread implementations), but the programmer
has no business looking at errno unless the function returns -1.

--
Michael Fuhr

#16Michael Fuhr
mike@fuhr.org
In reply to: Tom Lane (#14)
Re: Strange interval arithmetic

On Wed, Nov 30, 2005 at 05:49:53PM -0500, Tom Lane wrote:

The SUS saith
http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html

The strtol() function will not change the setting of errno if
successful.

Perhaps more to the point, we've been doing it that way (errno test
only) for many years without complaints. Adding a test on the return
value is venturing into less charted waters.

Good, I'll stick with just the ERANGE check then.

--
Michael Fuhr

#17Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#14)
Re: Strange interval arithmetic

Tom Lane wrote:

Michael Fuhr <mike@fuhr.org> writes:

I suppose if we check for LONG_MAX then we should also check
for LONG_MIN.

s/should/must/, which makes the code even more complicated, in order to
buy what exactly?

I don't know if any systems might set ERANGE in a non-error situation.

The SUS saith
http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html

The strtol() function will not change the setting of errno if
successful.

Perhaps more to the point, we've been doing it that way (errno test
only) for many years without complaints. Adding a test on the return
value is venturing into less charted waters.

LONG_MIN/LONG_MAX might be the actual values provided, too, mightn't
they? Checking for ERANGE seems like the only viable test.

cheers

andrew

#18Michael Fuhr
mike@fuhr.org
In reply to: Andrew Dunstan (#17)
Re: Strange interval arithmetic

On Wed, Nov 30, 2005 at 06:00:07PM -0500, Andrew Dunstan wrote:

LONG_MIN/LONG_MAX might be the actual values provided, too, mightn't
they? checking for ERANGE seems like the only viable test.

Errno needs to be checked in any case for just that reason; the
question was whether checking *only* errno is sufficient to detect
an error. According to the standard it is.
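
A small sketch of why the return value alone cannot settle it: the text form of LONG_MAX is valid input and parses to LONG_MAX with errno left untouched, while an oversized input yields the same return value with errno set to ERANGE. The helpers parse_long and long_max_text are hypothetical names for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a decimal string into *out.
 * Returns 1 if the value was in range, 0 if strtol() reported ERANGE. */
int parse_long(const char *text, long *out)
{
    errno = 0;
    *out = strtol(text, NULL, 10);
    return errno != ERANGE;
}

/* Write the decimal representation of LONG_MAX into buf. */
void long_max_text(char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%ld", LONG_MAX);
}
```

Parsing the string produced by long_max_text() succeeds and yields LONG_MAX; parsing "99999999999999999999999999" also yields LONG_MAX, but parse_long() returns 0 because errno is ERANGE. Only errno tells the two cases apart.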

--
Michael Fuhr

#19Michael Fuhr
mike@fuhr.org
In reply to: Michael Fuhr (#16)
Re: Strange interval arithmetic

Hmmm...is this something else that needs fixing? The doc says dates
range from 4713 BC to 32767 AD.

test=> select '11754179-08-04'::date;
date
----------------
11754179-08-04
(1 row)

test=> select '11754179-08-05'::date;
date
---------------
4801-01-01 BC
(1 row)

--
Michael Fuhr

#20Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#12)
Re: Strange interval arithmetic

Tom Lane <tgl@sss.pgh.pa.us> writes:

Michael Fuhr <mike@fuhr.org> writes:

I usually check both in my own code but I noticed several places
where PostgreSQL doesn't, so I kept that style. I'll check both
if that's preferred.

I'd say not --- it's more code and it makes a possibly unwarranted
assumption about strtol's behavior.

Generally speaking looking at errno when you haven't received an error return
from a libc function is asking for trouble. It could be leftover from any
previous libc error.

That's how you get programs saying things like "strtol: No such file or
directory" ...

The strtol() function returns the result of the conversion, unless the value
would underflow or overflow. If an underflow occurs, strtol() returns
LONG_MIN. If an overflow occurs, strtol() returns LONG_MAX. In both cases,
errno is set to ERANGE. Precisely the same holds for strtoll() (with LLONG_MIN
and LLONG_MAX instead of LONG_MIN and LONG_MAX).
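
To illustrate the leftover-errno concern, here is a sketch that assumes a POSIX-conforming strtol(), which does not change errno on success. The function names are hypothetical, and the stale ENOENT is planted by hand to stand in for an earlier failed libc call.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse without clearing errno first and report what errno holds
 * afterwards. The ENOENT assignment simulates a leftover error. */
int errno_after_parse_without_reset(const char *text)
{
    errno = ENOENT;                 /* pretend an earlier call failed */
    (void) strtol(text, NULL, 10);
    return errno;
}

/* Same as above, but clear errno immediately before the call, as in
 * the pattern proposed earlier in the thread. */
int errno_after_parse_with_reset(const char *text)
{
    errno = ENOENT;                 /* pretend an earlier call failed */
    errno = 0;                      /* reset right before strtol() */
    (void) strtol(text, NULL, 10);
    return errno;
}
```

For the input "42" the first function still reports ENOENT even though the parse succeeded, which is how misleading messages like the one quoted above arise. The second reports 0 for "42" and ERANGE for an oversized input, so an errno-only test is sound as long as errno is cleared right before the call.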

--
greg
