NUMERIC type conversions leave much to be desired

Started by Tom Laneover 26 years ago9 messages
#1Tom Lane
tgl@sss.pgh.pa.us

With fairly current sources:

regression=> create table dnum (f1 numeric(10,2));
CREATE
regression=> insert into dnum values ('12.34');
ERROR: overflow on numeric ABS(value) >= 10^1 for field with precision 31491 scale 52068
regression=> insert into dnum1 values ('12.34'::numeric);
ERROR: overflow on numeric ABS(value) >= 10^1 for field with precision 31491 scale 52132
regression=> insert into dnum1 values (12.34::numeric);
ERROR: parser_typecast: cannot cast this expression to type 'numeric'
regression=> insert into dnum1 values (12.34);
INSERT 950499 1

I've not put in Thomas' proposed change for handling out-of-range
constants; I don't think it'd change any of these cases anyway.

BTW, why is there no regression test for NUMERIC?

regards, tom lane

#2Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: Tom Lane (#1)
Re: [HACKERS] NUMERIC type conversions leave much to be desired

With fairly current sources:

<snip>

ERROR: overflow on numeric ABS(value) >= 10^1
for field with precision 31491 scale 52068

postgres=> create table dnum (f1 numeric(10,2));
postgres=> insert into dnum values ('12.34');
postgres=> insert into dnum values ('12.34'::numeric);
postgres=> insert into dnum values (12.34);
postgres=> insert into dnum values ('12.345');
postgres=> select * from dnum;
f1
-----
12.34
12.34
12.34
12.35
(4 rows)

fwiw, I've seen the same internal problems, and managed to fix them
with a full clean and reinstall. I'm probably a week old on my tree
vintage...

- Thomas

--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Lockhart (#2)
Re: [HACKERS] NUMERIC type conversions leave much to be desired

Thomas Lockhart <lockhart@alumni.caltech.edu> writes:

With fairly current sources:

<snip>

ERROR: overflow on numeric ABS(value) >= 10^1
for field with precision 31491 scale 52068

fwiw, I've seen the same internal problems, and managed to fix them
with a full clean and reinstall. I'm probably a week old on my tree
vintage...

Good thought but no cigar ... I did a full fresh CVS checkout, configure,
make, install & initdb from scratch, and it still behaves the same.
Possibly it's been broken sometime in the last week? utils/adt/numeric.c
looks to have been last committed on 4 May.

regards, tom lane

#4Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: Tom Lane (#3)
Re: [HACKERS] NUMERIC type conversions leave much to be desired

fwiw, I've seen the same internal problems, and managed to fix them
with a full clean and reinstall. I'm probably a week old on my tree
vintage...

Good thought but no cigar ... I did a full fresh CVS checkout, configure,
make, install & initdb from scratch, and it still behaves the same.
Possibly it's been broken sometime in the last week? utils/adt/numeric.c
looks to have been last committed on 4 May.

No, those changes were actually by myself, and just modified one line
in the float8->numeric conversion code to use a direct sprintf("%f")
to convert float8 to a string in preparation for conversion to
numeric. The old code used the float8out() routine, which for larger
floats generated exponential notation that the numeric_in() routine
wasn't prepared to handle.

I've seen the same problems you are having, and reported them just as
you have ("something is fundamentally wrong with numeric..."). And
then the problems went away, but I'm not actually certain why.

Let me know if I can help track it down, since others might get bit
too...

- Tom

--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Lockhart (#4)
Re: [HACKERS] NUMERIC type conversions leave much to be desired

Thomas Lockhart <lockhart@alumni.caltech.edu> writes:

I've seen the same problems you are having, and reported them just as
you have ("something is fundamentally wrong with numeric..."). And
then the problems went away, but I'm not actually certain why.

Ah-hah, I've sussed it. numeric_in() was declared in pg_proc as
taking one parameter, when in fact it takes three. Therefore, when
calling it via fmgr (as the parser does), the second and third
parameters were passed as random garbage. Apparently, the code
generated for your machine produced some fairly harmless garbage...
but not on mine.

I've committed a pg_proc.h update to fix this; it will take a full
rebuild and initdb to propagate the fix, of course.

I'm still seeing

regression=> insert into dnum values (12.34::numeric);
ERROR: parser_typecast: cannot cast this expression to type 'numeric'

which is arising from parser_typecast's inability to cope with
a T_Float input node. I suspect it could be readily done along
the same lines as T_Integer is handled, but I'll leave that to you.

regards, tom lane

PS: can anyone think of a reasonable way of mechanically checking
pg_proc entries against the actual C definitions of the functions?
I grovelled through all the typinput functions by hand to verify
that numeric_in was the only one with this bug ... but I ain't
about to do that for all 989 pg_proc entries for built-in functions.
Much less to do it again for future releases.

#6Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: Tom Lane (#5)
1 attachment(s)
Re: [HACKERS] NUMERIC type conversions leave much to be desired

Ah-hah, I've sussed it. numeric_in() was declared in pg_proc as
taking one parameter, when in fact it takes three. Therefore, when
calling it via fmgr (as the parser does), the second and third
parameters were passed as random garbage. Apparently, the code
generated for your machine produced some fairly harmless garbage...
but not on mine.
I've committed a pg_proc.h update to fix this...

Great.

I'm still seeing
regression=> insert into dnum values (12.34::numeric);
ERROR: parser_typecast: cannot cast this expression to type 'numeric'
which is arising from parser_typecast's inability to cope with
a T_Float input node. I suspect it could be readily done along
the same lines as T_Integer is handled, but I'll leave that to you.

postgres=> select 12.34::numeric;
--------
12.34
(1 row)

OK, and while I was looking at it I noticed that the T_Integer code
didn't bother using the int4out() routine to generate a string. imho
it should be using the official output routine unless there is some
compelling reason not to. It seems to still behave with this fix in
the T_Integer support code; should I commit both?

- Thomas

--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California

Attachments:

parse_expr.c.patchtext/plain; charset=us-ascii; name=parse_expr.c.patchDownload
*** parse_expr.c.orig	Tue Apr 27 14:14:13 1999
--- parse_expr.c	Sun May  9 03:47:34 1999
***************
*** 642,650 ****
  			const_string = DatumGetPointer(expr->val.str);
  			break;
  		case T_Integer:
- 			const_string = (char *) palloc(256);
  			string_palloced = true;
! 			sprintf(const_string, "%ld", expr->val.ival);
  			break;
  		default:
  			elog(ERROR,
--- 642,653 ----
  			const_string = DatumGetPointer(expr->val.str);
  			break;
  		case T_Integer:
  			string_palloced = true;
! 			const_string = int4out(expr->val.ival);
! 			break;
! 		case T_Float:
! 			string_palloced = true;
! 			const_string = float8out(&expr->val.dval);
  			break;
  		default:
  			elog(ERROR,
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Lockhart (#6)
Re: [HACKERS] NUMERIC type conversions leave much to be desired

Thomas Lockhart <lockhart@alumni.caltech.edu> writes:

OK, and while I was looking at it I noticed that the T_Integer code
didn't bother using the int4out() routine to generate a string. imho
it should be using the official output routine unless there is some
compelling reason not to. It seems to still behave with this fix in
the T_Integer support code; should I commit both?

One potential problem is that if the value is large/small enough to make
float8out use 'E' notation, conversion to numeric will still fail ---
this is the same problem you hacked around in float8_numeric() earlier.

I still like the idea of hanging on to the original string form of the
constant long enough so that parser_typecast can feed that directly to
the target type's xxx_in() routine, and not have to worry about
conversion errors.

regards, tom lane

#8Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: Tom Lane (#7)
Re: [HACKERS] NUMERIC type conversions leave much to be desired

One potential problem is that if the value is large/small enough to make
float8out use 'E' notation, conversion to numeric will still fail ---
this is the same problem you hacked around in float8_numeric() earlier.

Yup. But in the long run, that is a problem for numeric(), not float8.
If nothing else, numeric() should be willing to flag cases where it
has trouble, and it doesn't seem to do that. That should probably be
considered a "must fix" for v6.5.

I still like the idea of hanging on to the original string form of the
constant long enough so that parser_typecast can feed that directly to
the target type's xxx_in() routine, and not have to worry about
conversion errors.

I agree. I'm just worried about losing the typing hints provided by
scan.l if we went to a "string only" solution. Also, there might be a
performance hit if we ended up having to do the string conversion too
many times.

At this late date, I'm (so far) happy doing the kinds of fixes we've
done, but we should revisit the issue for v6.5.x...

- Thomas

--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Lockhart (#8)
Re: [HACKERS] NUMERIC type conversions leave much to be desired

Thomas Lockhart <lockhart@alumni.caltech.edu> writes:

I still like the idea of hanging on to the original string form of the
constant long enough so that parser_typecast can feed that directly to
the target type's xxx_in() routine, and not have to worry about
conversion errors.

I agree. I'm just worried about losing the typing hints provided by
scan.l if we went to a "string only" solution.

No no, I didn't say that you can't keep T_Integer and T_Float nodes
separate. I was just suggesting that the *value* of one of these nodes
might be kept as a string (or, perhaps, both as a string and the numeric
format). That way, if you need to convert to some other type, you start
from the original string and don't have to risk a "lossy compression"
into floating point.

regards, tom lane