Automatic type conversion

Started by Thomas G. Lockhartover 27 years ago10 messages

Thomas G. Lockhart

lockhart@alumni.caltech.edu

over 27 years ago

I've committed changes to allow more automatic type conversion. Lots of
files were touched, mostly in backend/parser/.

The new code tries to do the right thing for conversions, and does
handle cases which were problematic before:

-- there isn't a floating point factorial operator...
tgl=> select (4.3 !);
?column?
--------
24
(1 row)

-- there isn't an integer exponentiation operator...
tgl=> select 2 ^ 3;
?column?
--------
8
(1 row)

-- concatenation on unspecified types didn't used to work...
tgl=> select '123' || '456';
?column?
--------
123456
(1 row)

-- didn't used to correctly truncate strings into tables...
tgl=> create table cc (c char(4));
CREATE
tgl=> insert into cc select '123' || '456';
INSERT 268073 1
tgl=> select * from cc;
c
----
1234
(1 row)

So, it should fix longstanding issues. However, the main goal should be
that it doesn't do the WRONG thing at any time. So, test away and post
any problems or issues that come up; we have lots of time to fix things
before v6.4.

One change in behavior is that I defined (for builtin types) the concept
of a "preferred type" in each category/class of types (e.g. float8 is
the preferred type for numerics, datetime is the preferred type for
date/times, etc.). And, unspecified types are preferentially resolved to
use this preferred type. So, the following behavior has changed:

-- this is now done as a float8 calculation, used to be float4...
tgl=> select '123.456'::float4 * '1.99999999999';
?column?
----------------
246.912002562242
(1 row)

Before, unknown types, such as the second string above, were resolved to
be the same type as the other type, if available. So the calculation
would have been truncated at ~7 decimal places.

The good thing about this is that the behavior of the above is now the
same as if the second string was specified without the quotes:

tgl=> select '123.456'::float4 * 1.99999999999;
?column?
----------------
246.912002562242
(1 row)

where before it was evaluated differently in the two cases.

Anyway, try things out, and I'll be writing this up for the docs. Will
post the topics on hackers along the way...

I haven't yet changed the regression tests to reflect the new behavior,
just in case it needs to be different. Also, all regression tests pass
with the only differences as mentioned above. btw, the code still has
lots of cleanup needed, moving subroutines around and taking out defunct
code. But will do that later.

Have fun.

- Tom

Thomas G. Lockhart

lockhart@alumni.caltech.edu

over 27 years ago

In reply to: Thomas G. Lockhart (#1)

Re: [HACKERS] Automatic type conversion

I've committed changes to allow more automatic type conversion.

btw, this requires a dump/reload...

- Tom

Dave Chapeskie

dchapes@ddm.on.ca

over 27 years ago

In reply to: Thomas G. Lockhart (#1)

Re: [HACKERS] Automatic type conversion

On Sun, May 10, 1998 at 12:14:11AM +0000, Thomas G. Lockhart wrote:

-- there isn't a floating point factorial operator...
tgl=> select (4.3 !);
?column?
--------
24
(1 row)

Am I the only one that thinks the above is wrong? 4.3 factorial is
mathematically undefined and does NOT equal 24.

I don't think the automatic type conversion should automatically
truncate values without at least a warning. Preferably I'd like to be
forced to do the conversion myself for cases like the above.

--
Dave Chapeskie <dchapes@ddm.on.ca>, DDM Consulting

Eugene Selkov Jr.

selkovjr@mcs.anl.gov

over 27 years ago

In reply to: Thomas G. Lockhart (#1)

Re: [HACKERS] Automatic type conversion

Dave Chapeskie wrote:

On Sun, May 10, 1998 at 12:14:11AM +0000, Thomas G. Lockhart wrote:

-- there isn't a floating point factorial operator...
tgl=> select (4.3 !);
?column?
--------
24
(1 row)

Am I the only one that thinks the above is wrong? 4.3 factorial is
mathematically undefined and does NOT equal 24.

Just put the gamma function in there and assume the argument is always a
float. A decent gamma function algorithm should make a special case for
integers.

--Gene

Thomas G. Lockhart

lockhart@alumni.caltech.edu

over 27 years ago

In reply to: Thomas G. Lockhart (#1)

Re: [HACKERS] Automatic type conversion

-- there isn't a floating point factorial operator...
tgl=> select (4.3 !);
?column?
--------
24
(1 row)

Am I the only one that thinks the above is wrong? 4.3 factorial is
mathematically undefined and does NOT equal 24.

I don't think the automatic type conversion should automatically
truncate values without at least a warning. Preferably I'd like to be
forced to do the conversion myself for cases like the above.

Yes, I included this one to provoke discussion :)

Postgres has type extensibility, so the algorithms for matching up types
and functions need to be very general. In this case, there is only one
function defined for factorial, and it takes an integer argument. But of
course Postgres now says "ah! I know how to make an int from a float!"
and goes ahead and does it. If there were more than one function defined
for factorial, and if none of the arguments matched a float, then
Postgres would conclude that there are too many functions to choose from
and throw an error.

One way to address this is to never allow Postgres to "demote" a type;
i.e. Postgres would be allowed to promote arguments to a "higher" type
(e.g. int->float) but never allowed to demote arguments (e.g.
float->int). But this would severely restrict type matching. I wanted to
try the more flexible case first to see whether it really does the
"wrong thing"; in the case of factorial, the only recourse for someone
wanting to calculate a factorial from a float is to convert to an int
first anyway.

Or, again for this factorial case, we can implement a floating point
factorial with either the gamma function (whatever that is :) or with an
explicit routine which checks for non-integral values.

Could also print a notice when arguments are being converted, but that
might get annoying for most cases which are probably trivial ones.

- Tom

Maurice Gittens

mgittens@gits.nl

over 27 years ago

In reply to: Thomas G. Lockhart (#5)

Re: [HACKERS] Automatic type conversion

Postgres has type extensibility, so the algorithms for matching up types
and functions need to be very general. In this case, there is only one
function defined for factorial, and it takes an integer argument. But of
course Postgres now says "ah! I know how to make an int from a float!"
and goes ahead and does it. If there were more than one function defined
for factorial, and if none of the arguments matched a float, then
Postgres would conclude that there are too many functions to choose from
and throw an error.

Making an int from a float is only defined for "small" values of the float.
So for the general case such a conversion would simply overflow the int,
giving it an undefined value. Does this make sense to you?

One way to address this is to never allow Postgres to "demote" a type;
i.e. Postgres would be allowed to promote arguments to a "higher" type
(e.g. int->float) but never allowed to demote arguments (e.g.
float->int). But this would severely restrict type matching. I wanted to
try the more flexible case first to see whether it really does the
"wrong thing"; in the case of factorial, the only recourse for someone
wanting to calculate a factorial from a float is to convert to an int
first anyway.

Please bear with me since I haven't looked at the code. Are conversions
between types defined in a way that is also extensible? I'm trying to say
that if I add a new type to the system, can I also specify which conversions
are automatically allowed? (Something similar to the C++ "explicite"
keyword?).

Or, again for this factorial case, we can implement a floating point
factorial with either the gamma function (whatever that is :) or with an
explicit routine which checks for non-integral values.

And properly handles overflows.

Could also print a notice when arguments are being converted, but that
might get annoying for most cases which are probably trivial ones.

- Tom

Regards,
Maurice.

Import Notes

Resolved by subject fallback

Thomas G. Lockhart

lockhart@alumni.caltech.edu

over 27 years ago

In reply to: Maurice Gittens (#6)

Re: [HACKERS] Automatic type conversion

Making an int from a float is only defined for "small" values of the
float. So for the general case such a conversion would simply overflow
the int, giving it an undefined value. Does this make sense to you?

Yes, it does. Look, I'm not saying everyone _should_ call factorial with
a float, only that if someone does, Postgres will try to accomplish it.
Doesn't it make sense to you?

Are conversions between types defined in a way that is also
extensible? I'm trying to say that if I add a new type to the system,
can I also specify which conversions are automatically allowed?
(Something similar to the C++ "explicite" keyword?).

Yes, they are extensible in the sense that all conversions (except for a
few string type hacks at the moment) are done by looking for a function
named with the same name as the target type, taking as a single argument
one with the specified source type. If you define one, then Postgres can
use it for conversions.

At the moment the primary mechanism uses the pg_proc table to look for
possible conversion functions, along with a hardcoded notion of what
"preferred types" and "type categories" are for the builtin types. For
user-defined types, explicit type conversion functions must be provided
_and_ there must be a single path from source to possible targets for
the conversions. Otherwise there will result multiple possible
conversions and Postgres will ask you to use a cast, much as it does in
v6.3.x and before.

Or, again for this factorial case, we can implement a floating point
factorial with either the gamma function (whatever that is :) or with
an explicit routine which checks for non-integral values.

And properly handles overflows.

Hey, it doesn't do any worse than before...

- Tom

Maurice Gittens

mgittens@gits.nl

over 27 years ago

In reply to: Thomas G. Lockhart (#7)

Re: [HACKERS] Automatic type conversion

-----Original Message-----
From: Thomas G. Lockhart <lockhart@alumni.caltech.edu>
To: Maurice Gittens <mgittens@gits.nl>
Cc: Dave Chapeskie <dchapes@ddm.on.ca>; Postgres Hackers List
<hackers@postgresql.org>
Date: maandag 11 mei 1998 12:24
Subject: Re: [HACKERS] Automatic type conversion

Making an int from a float is only defined for "small" values of the
float. So for the general case such a conversion would simply overflow
the int, giving it an undefined value. Does this make sense to you?

Yes, it does. Look, I'm not saying everyone _should_ call factorial with
a float, only that if someone does, Postgres will try to accomplish it.
Doesn't it make sense to you?

IMO the issue is not related to the factorial function. I think we
are/should
be discussing the general issue how to handle conversions from a type A to
a type B while the conversion function F from A to B is not defined
for all values of A.

Are conversions between types defined in a way that is also
extensible? I'm trying to say that if I add a new type to the system,
can I also specify which conversions are automatically allowed?
(Something similar to the C++ "explicite" keyword?).

Yes, they are extensible in the sense that all conversions (except for a
few string type hacks at the moment) are done by looking for a function
named with the same name as the target type, taking as a single argument
one with the specified source type. If you define one, then Postgres can
use it for conversions.

At the moment the primary mechanism uses the pg_proc table to look for
possible conversion functions, along with a hardcoded notion of what
"preferred types" and "type categories" are for the builtin types. For
user-defined types, explicit type conversion functions must be provided
_and_ there must be a single path from source to possible targets for
the conversions. Otherwise there will result multiple possible
conversions and Postgres will ask you to use a cast, much as it does in
v6.3.x and before.

Thanks for the explanation.

Or, again for this factorial case, we can implement a floating point
factorial with either the gamma function (whatever that is :) or with
an explicit routine which checks for non-integral values.

And properly handles overflows.

Hey, it doesn't do any worse than before...

I don't know what the system used to do. I do however hope
that if a conversion is not defined that the system won't simply ignore
the error.

Don't worry I'll shut up now.

With regards from Maurice.

Import Notes

Resolved by subject fallback

Thomas G. Lockhart

lockhart@alumni.caltech.edu

over 27 years ago

In reply to: Maurice Gittens (#8)

Re: [HACKERS] Automatic type conversion

Don't worry I'll shut up now.

Well, try things out if you have time and see if things work in general.
It took me quite some time to work through the issues and we'll need to
try many cases and have some back-and-forth discussion before everything
is clear and we can determine what adjustments should be made.

fyi, the original SQL developers for Postgres seemed to be divided on
the subject, but one camp concluded that almost no automatic type
conversion was desirable. I think in practice that this is too extreme,
since many cases have an obvious best conversion strategy...

- Tom

#10

Noname

dg@illustra.com

over 27 years ago

In reply to: Thomas G. Lockhart (#5)

Re: [HACKERS] Automatic type conversion

Tom:

One way to address this is to never allow Postgres to "demote" a type;
i.e. Postgres would be allowed to promote arguments to a "higher" type
(e.g. int->float) but never allowed to demote arguments (e.g.
float->int). But this would severely restrict type matching. I wanted to
try the more flexible case first to see whether it really does the
"wrong thing"; in the case of factorial, the only recourse for someone
wanting to calculate a factorial from a float is to convert to an int
first anyway.

I think that never demoting is the best way to procede here. If the type
resolution search is too "thorough" it can be very confusing (see c++ for
example). As it is the interaction of SQL and the type system can create
surprises. Promoting both "up" and "down" is likely to make it very hard
to figure out what any given query will do.

-dg

David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Of course, someone who knows more about this will correct me if I'm wrong,
and someone who knows less will correct me if I'm right."
--David Palmer (palmer@tybalt.caltech.edu)