Integer parsing bug?

Started by Steve Atkins about 22 years ago | 5 messages | bugs

steve@blighty.com

about 22 years ago

Section 8.1 of the manual gives the range of an integer
as -2147483648 to +2147483647.

template1=# select '-2147483648'::int;
int4
-------------
-2147483648
(1 row)

template1=# select -2147483648::int;
ERROR: integer out of range

Oops.

template1=# select version();
version
-------------------------------------------------------------
PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

Completely vanilla build - no options other than --prefix to
configure. Clean installation, this is immediately after an initdb.

I see the same bug on Solaris, built with Forte C in 64 bit mode.

Cheers,
Steve

bruce@momjian.us

about 22 years ago

In reply to: Steve Atkins (#1)

Re: Integer parsing bug?

Steve Atkins wrote:

Section 8.1 of the manual gives the range of an integer
as -2147483648 to +2147483647.

template1=# select '-2147483648'::int;
int4
-------------
-2147483648
(1 row)

template1=# select -2147483648::int;
ERROR: integer out of range

Oops.

template1=# select version();
version
-------------------------------------------------------------
PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

Completely vanilla build - no options other than --prefix to
configure. Clean installation, this is immediately after an initdb.

I see the same bug on Solaris, built with Forte C in 64 bit mode.

Yep, it definitely looks weird:

test=> select '-2147483648'::int;
int4
-------------
-2147483648
(1 row)

test=> select -2147483648::int;
ERROR: integer out of range
test=> select -2147483647::int;
?column?
-------------
-2147483647
(1 row)

test=> select '-2147483649'::int;
ERROR: value "-2147483649" is out of range for type integer

The non-quoting works only for *47, and the quoting works for *48, but
both fail for *49.

I looked at libc's strtol(), and that works fine, as do our existing
parser checks. The error is coming from int84, the int8-to-int4
conversion function called from the executor. Here is a test program:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    long long l = -2147483648;
    int i = l;

    if (i != l)
        printf("not equal\n");
    else
        printf("equal\n");
    return 0;
}

A compile generates the following warning:

tst1.c:6: warning: decimal constant is so large that it is unsigned

and reports "not equal".

I see in the freebsd machine/limits.h file:

* According to ANSI (section 2.2.4.2), the values below must be usable by
* #if preprocessing directives. Additionally, the expression must have the
* same type as would an expression that is an object of the corresponding
* type converted according to the integral promotions. The subtraction for
* INT_MIN, etc., is so the value is not unsigned; e.g., 0x80000000 is an
* unsigned int for 32-bit two's complement ANSI compilers (section 3.1.3.2).
* These numbers are for the default configuration of gcc. They work for
* some other compilers as well, but this should not be depended on.

#define INT_MAX 0x7fffffff /* max value for an int */
#define INT_MIN (-0x7fffffff - 1) /* min value for an int */

Basically, what is happening is that the special value -INT_MAX-1 is
being converted to an int value, and the compiler is casting it to an
unsigned. Seems this is a known C issue and I can't see a good fix for
it except perhaps to check for INT_MIN in the int84 function, but I ran
some tests and that didn't work either.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

steve@blighty.com

about 22 years ago

In reply to: Bruce Momjian (#2)

Re: Integer parsing bug?

On Wed, Mar 03, 2004 at 12:31:47PM -0500, Bruce Momjian wrote:

Yep, it definitely looks weird:

test=> select '-2147483648'::int;
int4
-------------
-2147483648
(1 row)

test=> select -2147483648::int;
ERROR: integer out of range
test=> select -2147483647::int;
?column?
-------------
-2147483647
(1 row)

test=> select '-2147483649'::int;
ERROR: value "-2147483649" is out of range for type integer

The non-quoting works only for *47, and the quoting works for *48, but
both fail for *49.

I looked at libc's strtol(), and that works fine, as do our existing
parser checks. The error is coming from int84, the int8-to-int4
conversion function called from the executor. Here is a test program:

I traced through that far and managed to convince myself that the
problem was that it was considering a -...48 to be an int8, rather
than an int4, so was hitting int84() when it shouldn't have been - and
the input values for int84() looked very, very broken.

Specifically, a breakpoint on int84() fires on -..48 and -..49, but
not on -..47, suggesting that the problem is somewhere in the parsing
before it reaches int84().

I'm happy to take a look at it, but got very lost in the maze of twisty
parse routines, all alike, when I tried to track back further. Is there
any overview documentation on that end of the code?

I see in the freebsd machine/limits.h file:

* According to ANSI (section 2.2.4.2), the values below must be usable by
* #if preprocessing directives. Additionally, the expression must have the
* same type as would an expression that is an object of the corresponding
* type converted according to the integral promotions. The subtraction for
* INT_MIN, etc., is so the value is not unsigned; e.g., 0x80000000 is an
* unsigned int for 32-bit two's complement ANSI compilers (section 3.1.3.2).
* These numbers are for the default configuration of gcc. They work for
* some other compilers as well, but this should not be depended on.

#define INT_MAX 0x7fffffff /* max value for an int */
#define INT_MIN (-0x7fffffff - 1) /* min value for an int */

Basically, what is happening is that the special value -INT_MAX-1 is
being converted to an int value, and the compiler is casting it to an
unsigned. Seems this is a known C issue and I can't see a good fix for
it except perhaps to check for INT_MIN in the int84 function, but I ran
some tests and that didn't work either.

I don't read it that way. INT_MIN is correctly read as a signed int,
but it can't be defined as -0x80000000 as that would be parsed as
-(0x80000000) and the constant 0x80000000 is unsigned.
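
A minimal C sketch of that reading, assuming a 32-bit int. C has no
negative integer constants: -2147483648 is unary minus applied to
2147483648, which does not fit in an int, so limits.h builds INT_MIN by
subtraction from constants that do fit. The helper names below are made
up for illustration and do not come from PostgreSQL or any libc.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: build the minimum int the way limits.h does,
 * using only constants that fit in a 32-bit int. */
int int_min_by_subtraction(void)
{
    /* This sketch assumes int is 32 bits wide. */
    assert(INT_MAX == 0x7fffffff);
    return -0x7fffffff - 1;
}

/* Hypothetical helper: return 1 if a 64-bit value is representable as
 * an int, 0 otherwise. The comparison is done in long long, so no
 * narrowing takes place before the range check. */
int fits_in_int(long long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}
```

With these definitions, -2147483647LL - 1 passes fits_in_int(), while
2147483648LL and -2147483649LL do not, which matches the range given in
the manual.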

Cheers,
Steve

tgl@sss.pgh.pa.us

about 22 years ago

In reply to: Steve Atkins (#3)

Re: Integer parsing bug?

Steve Atkins <steve@blighty.com> writes:

test=> select -2147483648::int;
ERROR: integer out of range

There is no bug here. You are mistakenly assuming that the above
represents
select (-2147483648)::int;
But actually the :: operator binds more tightly than unary minus,
so Postgres reads it as
select -(2147483648::int);
and quite rightly fails to convert the int8 literal to int.

If you write it with the correct parenthesization it works:

regression=# select -2147483648::int;
ERROR: integer out of range
regression=# select (-2147483648)::int;
int4
-------------
-2147483648
(1 row)
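
The two readings can be modelled outside the database. The following C
sketch is only an illustration of the order of operations, not
PostgreSQL's actual code: cast_to_int4() is a hypothetical stand-in for
a range-checked int8-to-int4 conversion, and the other two functions
apply the cast and the unary minus in the two possible orders.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a range-checked int8 -> int4 cast.
 * Returns 1 and stores the result on success, 0 if out of range. */
int cast_to_int4(int64_t v, int32_t *out)
{
    if (v < INT32_MIN || v > INT32_MAX)
        return 0;
    *out = (int32_t) v;
    return 1;
}

/* Models -(N::int): cast the unsigned literal first, then negate. */
int cast_then_negate(int64_t n, int32_t *out)
{
    int32_t tmp;

    assert(n >= 0);             /* a literal carries no sign of its own */
    if (!cast_to_int4(n, &tmp))
        return 0;               /* "integer out of range" */
    *out = -tmp;                /* safe: tmp is in 0..INT32_MAX */
    return 1;
}

/* Models (-N)::int: negate in 64 bits first, then cast. */
int negate_then_cast(int64_t n, int32_t *out)
{
    assert(n >= 0);
    return cast_to_int4(-n, out);
}
```

For n = 2147483648, cast_then_negate() fails because the positive value
is already out of range, while negate_then_cast() succeeds and yields
-2147483648. For n = 2147483647 both orders succeed.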

regards, tom lane

steve@blighty.com

about 22 years ago

In reply to: Tom Lane (#4)

Re: Integer parsing bug?

On Wed, Mar 03, 2004 at 06:27:07PM -0500, Tom Lane wrote:

Steve Atkins <steve@blighty.com> writes:

test=> select -2147483648::int;
ERROR: integer out of range

There is no bug here. You are mistakenly assuming that the above
represents
select (-2147483648)::int;
But actually the :: operator binds more tightly than unary minus,
so Postgres reads it as
select -(2147483648::int);
and quite rightly fails to convert the int8 literal to int.

If you write it with the correct parenthesization it works:

regression=# select -2147483648::int;
ERROR: integer out of range
regression=# select (-2147483648)::int;

OK... That makes sense if the parser has no support for negative
constants, but it doesn't seem like intuitive behaviour.

BTW, the original issue that led to this was:

db=> CREATE function t(integer) RETURNS integer AS '
BEGIN
    return 0;
END;
' LANGUAGE 'plpgsql';

db=> select t(-2147483648);
ERROR: function t(bigint) does not exist

Which again makes sense considering the way the parser works, but
still seems to violate the principle of least surprise.
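
A rough C model of why the lookup sees bigint, assuming the rule that an
integer literal is typed as integer if its magnitude fits in 32 bits and
as bigint otherwise, and that unary minus keeps its operand's type. The
enum and function names are hypothetical, and literals too large for
bigint (which would become numeric) are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags for this sketch only. */
enum lit_type { LIT_INT4, LIT_INT8 };

/* Type assigned to an unsigned integer literal, judged by magnitude. */
enum lit_type literal_type(uint64_t magnitude)
{
    /* Anything larger would be numeric, which this sketch ignores. */
    assert(magnitude <= INT64_MAX);
    return magnitude <= INT32_MAX ? LIT_INT4 : LIT_INT8;
}

/* Type of "-<literal>": the sign is applied after the literal is typed,
 * so the result has the same type as the unsigned literal. */
enum lit_type negated_literal_type(uint64_t magnitude)
{
    return literal_type(magnitude);
}

/* SQL name of the type tag, as it would appear in an error message. */
const char *type_name(enum lit_type t)
{
    return t == LIT_INT4 ? "integer" : "bigint";
}
```

Under this model, negated_literal_type(2147483648) is LIT_INT8, so a
call such as t(-2147483648) is resolved against t(bigint), whereas
t(-2147483647) would match t(integer).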

Cheers,
Steve