Implicit coercions need to be reined in

Started by Tom Laneover 24 years ago14 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

Currently, if PG has a single-argument function named after its result
type, the function is assumed to represent a valid implicit type
coercion. For example, I can do

regression=# select now() || now();
?column?
------------------------------------------------------------
2001-11-21 15:12:13.226482-052001-11-21 15:12:13.226482-05
(1 row)

because there is a function text(timestamp) returning text, and
this function gets invoked implicitly.

It strikes me that this is a bad idea, and will get a lot worse as we
add more conversion functions. With enough implicit coercions one will
never be entirely sure how the system will interpret a mixed-datatype
expression. Nonetheless, having more conversion functions seems like a
good idea --- I think there should be a numeric-to-text conversion
function, for example, and someone was just complaining in pggeneral
about the lack of a boolean-to-text coercion. The problem is that
there's no way to create a conversion function without inducing an
implicit coercion path (unless you give the function a nonobvious name).

What I'd like to suggest (for 7.3 or later) is adding a boolean column
to pg_proc that indicates "can be implicit coercion". A function for
which this is not set can be used as an explicitly requested coercion,
but not an implicit one. Thus it'd be possible to define text(boolean)
and allow it to be called explicitly, without creating an implicit
coercion path and thereby losing a lot of type safety.

I have not gone over the existing implicit coercions to see which ones
I like and don't like, but I think a good first cut at maintaining
sanity would be to disable any cross-type-category implicit coercions.
Thus, for example, int4 to float8 seems like an okay implicit coercion,
but not int4 to text.

Note that this would cause the system to reject some things it accepts
now, for example:

regression=# select 45::int4 || 66::int4;
?column?
----------
4566
(1 row)

This sort of thing would need explicit coercions to text under my proposal.

Comments?

regards, tom lane

#2Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Tom Lane (#1)
Re: Implicit coercions need to be reined in

On Wed, 21 Nov 2001, Tom Lane wrote:

It strikes me that this is a bad idea, and will get a lot worse as we
add more conversion functions. With enough implicit coercions one will
never be entirely sure how the system will interpret a mixed-datatype
expression. Nonetheless, having more conversion functions seems like a
good idea --- I think there should be a numeric-to-text conversion
function, for example, and someone was just complaining in pggeneral
about the lack of a boolean-to-text coercion. The problem is that
there's no way to create a conversion function without inducing an
implicit coercion path (unless you give the function a nonobvious name).

What I'd like to suggest (for 7.3 or later) is adding a boolean column
to pg_proc that indicates "can be implicit coercion". A function for
which this is not set can be used as an explicitly requested coercion,
but not an implicit one. Thus it'd be possible to define text(boolean)
and allow it to be called explicitly, without creating an implicit
coercion path and thereby losing a lot of type safety.

I think something of this sort is a good thing. :) It's a bit of a pain in
one way, but it makes understanding what's going on simpler, especially
given things like the person who was getting text cut off at 31 or
whatever characters due to the fact that there was an implicit coercion
through name.

#3Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Stephan Szabo (#2)
Re: Implicit coercions need to be reined in

Thus, for example, int4 to float8 seems like an okay implicit

coercion,

but not int4 to text.

int4 to text is imho one of the most important implicit coercions of
all.
Still, a field to mark an implicit coercion function is probably a good
thing.

Imho a numeric to text implicit coercion would also be great :-)

I come from a db where all coercions are possible implicitly,
this has not been a problem as long as there is a way to overrule.

Andreas

#4Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#1)
Re: Implicit coercions need to be reined in

Tom Lane writes:

What I'd like to suggest (for 7.3 or later) is adding a boolean column
to pg_proc that indicates "can be implicit coercion".

Great! I've always wished for this sort of thing.

I think from this there are only a few more steps to the full SQL99
specification of user-defined casts. In particular it'd involve two
boolean flags: one that means "this is a cast function" (the function is
used internally for a CAST() invocation), and one for allowing it to be
used for automatic conversion. We currently don't have the first one on
the radar because we use the function name and argument types as the
indicator.

One more thing is that instead of looking up a cast function from type A
to type B as "returns A, is named A, takes arg B" it could be looked up as
"returns A, takes arg B, is a cast function", which would generalize this
whole thing a bit more.

References: SQL99 Part 2 11.49, 11.52 (I haven't read the whole thing,
but I think it's the idea.)

--
Peter Eisentraut peter_e@gmx.net

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas SB SD (#3)
Re: Implicit coercions need to be reined in

"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:

I come from a db where all coercions are possible implicitly,
this has not been a problem as long as there is a way to overrule.

Yeah, but how rich was its type structure compared to Postgres'?

It might indeed be safe/reasonable to allow implicit coercions to
text from all other types. I'm not sure. I am sure that if any
datatype coercion one could possibly want is available implicitly,
it's going to be very difficult to predict the system's behavior.
In fact, this would probably make the default behavior appear to
have *fewer* automatic coercions not more: anytime there wasn't
an exact type match, the parser would have too many alternatives
and would be unable to select a unique function or operator candidate
from among those it could reach by means of implicit coercions.
We've seen some reports of such problems already, and it'll get worse
as we add implicit coercions.

Of course, you could always turn on the "can be implicit coercion"
flag for whichever pg_proc entries you really wanted ...

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#4)
Re: Implicit coercions need to be reined in

Peter Eisentraut <peter_e@gmx.net> writes:

One more thing is that instead of looking up a cast function from type A
to type B as "returns A, is named A, takes arg B" it could be looked up as
"returns A, takes arg B, is a cast function", which would generalize this
whole thing a bit more.

I thought about that, but it would require keeping an extra index on
pg_proc (there's no efficient way to search by result type now). Not
clear that it's worth it, especially considering that there's really
only one reasonable name for a cast function anyway. What else would
you want to call it than the name of the destination type?

Also, this convention assures that there's at most *one* cast function
for any type coercion A to B. If the name could be anything then we'd
need some auxiliary mechanism to prevent conflicting cast functions
from being declared. Seems like a lot of mechanism for a dubious goal.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#1)
Re: Implicit coercions need to be reined in

Awhile back I suggested adding a boolean column to pg_proc to control
which type coercion functions could be invoked implicitly, and which
would need an explicit cast:
http://archives.postgresql.org/pgsql-hackers/2001-11/msg00803.php
There is a relevant bug report #484 showing the dangers of too many
implicit coercion paths:
http://archives.postgresql.org/pgsql-bugs/2001-10/msg00108.php

I have added such a column as part of the pg_proc changes I'm currently
doing to migrate aggregates into pg_proc. So it's now time to debate
the nitty-gritty: exactly which coercion functions should not be
implicitly invokable anymore?

My first-cut attempt at this is shown by the two printouts below.
The first cut does not allow any implicit coercions to text from types
that are not in the text category, which seems a necessary rule to me
--- the above-cited bug report shows why free coercions to text are
dangerous.  However, it turns out that several of the regression
tests fail with this rule; see the regression diffs below.

Should I consider these regression tests wrong, and correct them?
If not, how can we limit implicit coercions to text enough to avoid
the problems illustrated by bug #484?

Another interesting point is that I allowed implicit coercions from
float8 to numeric; this is necessary to avoid breaking cases like
insert into foo(numeric_col) values(12.34);
since the constant will be initially typed as float8. However, because
I didn't allow the reverse coercion implicitly, this makes numeric
"more preferred" than float8. Thus, for example,
select '12.34'::numeric + 12.34;
which draws a can't-resolve-operator error in 7.2, is resolved as
numeric addition with these changes. Is this a good thing, or not?
We could preserve the can't-resolve behavior by marking numeric->float8
as an allowed implicit coercion, but that seems ugly. I'm not sure we
can do a whole lot better without some more wide-ranging revisions of
the way we handle untyped numeric literals (as in past proposals to
invent an UNKNOWNNUMERIC pseudo-type).

Also, does anyone have any other nits to pick with this classification
of which coercions are implicitly okay? I've started with a fairly
tough approach of disallowing most implicit coercions, but perhaps this
goes too far.

regards, tom lane

Coercions allowed implicitly:

oid | result | input | prosrc
------+-------------+-------------+-----------------------
860 | bpchar | char | char_bpchar
408 | bpchar | name | name_bpchar
861 | char | bpchar | bpchar_char
944 | char | text | text_char
312 | float4 | float8 | dtof
236 | float4 | int2 | i2tof
318 | float4 | int4 | i4tof
311 | float8 | float4 | ftod
235 | float8 | int2 | i2tod
316 | float8 | int4 | i4tod
482 | float8 | int8 | i8tod
314 | int2 | int4 | i4toi2
714 | int2 | int8 | int82
313 | int4 | int2 | i2toi4
480 | int4 | int8 | int84
754 | int8 | int2 | int28
481 | int8 | int4 | int48
1177 | interval | reltime | reltime_interval
1370 | interval | time | time_interval
409 | name | bpchar | bpchar_name
407 | name | text | text_name
1400 | name | varchar | text_name
1742 | numeric | float4 | float4_numeric
1743 | numeric | float8 | float8_numeric
1782 | numeric | int2 | int2_numeric
1740 | numeric | int4 | int4_numeric
1781 | numeric | int8 | int8_numeric
946 | text | char | char_text
406 | text | name | name_text
2046 | time | timetz | timetz_time
2023 | timestamp | abstime | abstime_timestamp
2024 | timestamp | date | date_timestamp
2027 | timestamp | timestamptz | timestamptz_timestamp
1173 | timestamptz | abstime | abstime_timestamptz
1174 | timestamptz | date | date_timestamptz
2028 | timestamptz | timestamp | timestamp_timestamptz
2047 | timetz | time | time_timetz
1401 | varchar | name | name_text
(38 rows)

Coercions that will require explicit CAST, ::type, or typename(x) syntax
(NB: in 7.2 all of these would have been allowed implicitly):

oid | result | input | prosrc
------+-------------+-------------+------------------------------------------
2030 | abstime | timestamp | timestamp_abstime
1180 | abstime | timestamptz | timestamptz_abstime
1480 | box | circle | circle_box
1446 | box | polygon | poly_box
1714 | cidr | text | text_cidr
1479 | circle | box | box_circle
1474 | circle | polygon | poly_circle
1179 | date | abstime | abstime_date
748 | date | text | text_date
2029 | date | timestamp | timestamp_date
1178 | date | timestamptz | timestamptz_date
1745 | float4 | numeric | numeric_float4
839 | float4 | text | text_float4
1746 | float8 | numeric | numeric_float8
838 | float8 | text | text_float8
1713 | inet | text | text_inet
238 | int2 | float4 | ftoi2
237 | int2 | float8 | dtoi2
1783 | int2 | numeric | numeric_int2
818 | int2 | text | text_int2
319 | int4 | float4 | ftoi4
317 | int4 | float8 | dtoi4
1744 | int4 | numeric | numeric_int4
819 | int4 | text | text_int4
483 | int8 | float8 | dtoi8
1779 | int8 | numeric | numeric_int8
1289 | int8 | text | text_int8
1263 | interval | text | text_interval
1541 | lseg | box | box_diagonal
767 | macaddr | text | text_macaddr
817 | oid | text | text_oid
1447 | path | polygon | poly_path
1534 | point | box | box_center
1416 | point | circle | circle_center
1532 | point | lseg | lseg_center
1533 | point | path | path_center
1540 | point | polygon | poly_center
1448 | polygon | box | box_poly
1544 | polygon | circle | select polygon(12, $1)
1449 | polygon | path | path_poly
1200 | reltime | int4 | int4reltime
1194 | reltime | interval | interval_reltime
749 | text | date | date_text
841 | text | float4 | float4_text
840 | text | float8 | float8_text
730 | text | inet | network_show
113 | text | int2 | int2_text
112 | text | int4 | int4_text
1288 | text | int8 | int8_text
1193 | text | interval | interval_text
752 | text | macaddr | macaddr_text
114 | text | oid | oid_text
948 | text | time | time_text
2034 | text | timestamp | timestamp_text
1192 | text | timestamptz | timestamptz_text
939 | text | timetz | timetz_text
1364 | time | abstime | select time(cast($1 as timestamp without time zone))
1419 | time | interval | interval_time
837 | time | text | text_time
1316 | time | timestamp | timestamp_time
2022 | timestamp | text | text_timestamp
1191 | timestamptz | text | text_timestamptz
938 | timetz | text | text_timetz
1388 | timetz | timestamptz | timestamptz_timetz
1619 | varchar | int4 | int4_text
1623 | varchar | int8 | int8_text
(66 rows)

Regression failures with this set of choices (I've edited the output to
remove diffs that are merely consequences of the actual failures):

*** ./expected/char.out	Mon May 21 12:54:46 2001
--- ./results/char.out	Wed Apr 10 11:48:16 2002
***************
*** 18,23 ****
--- 18,25 ----
  -- any of the following three input formats are acceptable 
  INSERT INTO CHAR_TBL (f1) VALUES ('1');
  INSERT INTO CHAR_TBL (f1) VALUES (2);
+ ERROR:  column "f1" is of type 'character' but expression is of type 'integer'
+ 	You will need to rewrite or cast the expression
  INSERT INTO CHAR_TBL (f1) VALUES ('3');
  -- zero-length char 
  INSERT INTO CHAR_TBL (f1) VALUES ('');
*** ./expected/varchar.out	Mon May 21 12:54:46 2001
--- ./results/varchar.out	Wed Apr 10 11:48:17 2002
***************
*** 7,12 ****
--- 7,14 ----
  -- any of the following three input formats are acceptable 
  INSERT INTO VARCHAR_TBL (f1) VALUES ('1');
  INSERT INTO VARCHAR_TBL (f1) VALUES (2);
+ ERROR:  column "f1" is of type 'character varying' but expression is of type 'integer'
+ 	You will need to rewrite or cast the expression
  INSERT INTO VARCHAR_TBL (f1) VALUES ('3');
  -- zero-length char 
  INSERT INTO VARCHAR_TBL (f1) VALUES ('');
*** ./expected/strings.out	Fri Jun  1 13:49:17 2001
--- ./results/strings.out	Wed Apr 10 11:49:29 2002
***************
*** 137,147 ****
  (1 row)
  SELECT POSITION(5 IN '1234567890') = '5' AS "5";
!  5 
! ---
!  t
! (1 row)
! 
  --
  -- test LIKE
  -- Be sure to form every test as a LIKE/NOT LIKE pair.
--- 137,145 ----
  (1 row)

SELECT POSITION(5 IN '1234567890') = '5' AS "5";
! ERROR: Function 'pg_catalog.position(unknown, int4)' does not exist
! Unable to identify a function that satisfies the given argument types
! You may need to add explicit typecasts
--
-- test LIKE
-- Be sure to form every test as a LIKE/NOT LIKE pair.

*** ./expected/alter_table.out	Fri Apr  5 12:03:45 2002
--- ./results/alter_table.out	Wed Apr 10 11:51:06 2002
***************
*** 363,374 ****
  CREATE TEMP TABLE FKTABLE (ftest1 varchar);
  ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable;
  NOTICE:  ALTER TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
  -- As should this
  ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable(ptest1);
  NOTICE:  ALTER TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
  DROP TABLE pktable;
- NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "fktable"
- NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "fktable"
  DROP TABLE fktable;
  CREATE TEMP TABLE PKTABLE (ptest1 int, ptest2 text,
                             PRIMARY KEY(ptest1, ptest2));
--- 363,376 ----
  CREATE TEMP TABLE FKTABLE (ftest1 varchar);
  ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable;
  NOTICE:  ALTER TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
+ ERROR:  Unable to identify an operator '=' for types 'character varying' and 'integer'
+ 	You will have to retype this query using an explicit cast
  -- As should this
  ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable(ptest1);
  NOTICE:  ALTER TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
+ ERROR:  Unable to identify an operator '=' for types 'character varying' and 'integer'
+ 	You will have to retype this query using an explicit cast
  DROP TABLE pktable;
  DROP TABLE fktable;
  CREATE TEMP TABLE PKTABLE (ptest1 int, ptest2 text,
                             PRIMARY KEY(ptest1, ptest2));
*** ./expected/rules.out	Thu Mar 21 10:24:35 2002
--- ./results/rules.out	Wed Apr 10 11:51:11 2002
***************
*** 1026,1037 ****
                                          'Al Bundy',
                                          'epoch'::text
                                      );
  UPDATE shoelace_data SET sl_avail = 6 WHERE  sl_name = 'sl7';
  SELECT * FROM shoelace_log;
    sl_name   | sl_avail | log_who  |         log_when         
! ------------+----------+----------+--------------------------
!  sl7        |        6 | Al Bundy | Thu Jan 01 00:00:00 1970
! (1 row)
      CREATE RULE shoelace_ins AS ON INSERT TO shoelace
          DO INSTEAD
--- 1026,1038 ----
                                          'Al Bundy',
                                          'epoch'::text
                                      );
+ ERROR:  column "log_when" is of type 'timestamp without time zone' but expression is of type 'text'
+ 	You will need to rewrite or cast the expression
  UPDATE shoelace_data SET sl_avail = 6 WHERE  sl_name = 'sl7';
  SELECT * FROM shoelace_log;
   sl_name | sl_avail | log_who | log_when 
! ---------+----------+---------+----------
! (0 rows)

CREATE RULE shoelace_ins AS ON INSERT TO shoelace
DO INSTEAD

*** ./expected/foreign_key.out	Wed Mar  6 01:10:56 2002
--- ./results/foreign_key.out	Wed Apr 10 11:51:17 2002
***************
*** 733,747 ****
  -- because varchar=int does exist
  CREATE TABLE FKTABLE (ftest1 varchar REFERENCES pktable);
  NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
  DROP TABLE FKTABLE;
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pktable"
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pktable"
  -- As should this
  CREATE TABLE FKTABLE (ftest1 varchar REFERENCES pktable(ptest1));
  NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
  DROP TABLE FKTABLE;
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pktable"
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pktable"
  DROP TABLE PKTABLE;
  -- Two columns, two tables
  CREATE TABLE PKTABLE (ptest1 int, ptest2 text, PRIMARY KEY(ptest1, ptest2));
--- 733,749 ----
  -- because varchar=int does exist
  CREATE TABLE FKTABLE (ftest1 varchar REFERENCES pktable);
  NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
+ ERROR:  Unable to identify an operator '=' for types 'character varying' and 'integer'
+ 	You will have to retype this query using an explicit cast
  DROP TABLE FKTABLE;
! ERROR:  table "fktable" does not exist
  -- As should this
  CREATE TABLE FKTABLE (ftest1 varchar REFERENCES pktable(ptest1));
  NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
+ ERROR:  Unable to identify an operator '=' for types 'character varying' and 'integer'
+ 	You will have to retype this query using an explicit cast
  DROP TABLE FKTABLE;
! ERROR:  table "fktable" does not exist
  DROP TABLE PKTABLE;
  -- Two columns, two tables
  CREATE TABLE PKTABLE (ptest1 int, ptest2 text, PRIMARY KEY(ptest1, ptest2));
*** ./expected/domain.out	Wed Mar 20 13:34:37 2002
--- ./results/domain.out	Wed Apr 10 11:51:23 2002
***************
*** 111,116 ****
--- 111,118 ----
  create domain ddef2 oid DEFAULT '12';
  -- Type mixing, function returns int8
  create domain ddef3 text DEFAULT 5;
+ ERROR:  Column "ddef3" is of type text but default expression is of type integer
+ 	You will need to rewrite or cast the expression
  create sequence ddef4_seq;
  create domain ddef4 int4 DEFAULT nextval(cast('ddef4_seq' as text));
  create domain ddef5 numeric(8,2) NOT NULL DEFAULT '12.12';
#8Barry Lind
barry@xythos.com
In reply to: Tom Lane (#1)
Re: Implicit coercions need to be reined in

Tom,

My feeling is that this change as currently scoped will break a lot of
existing apps. Especially the case where people are using where clauses
of the form: bigintcolumn = '999' to get a query to use the index on
a column of type bigint.

thanks,
--Barry

Tom Lane wrote:

Show quoted text

Awhile back I suggested adding a boolean column to pg_proc to control
which type coercion functions could be invoked implicitly, and which
would need an explicit cast:
http://archives.postgresql.org/pgsql-hackers/2001-11/msg00803.php
There is a relevant bug report #484 showing the dangers of too many
implicit coercion paths:
http://archives.postgresql.org/pgsql-bugs/2001-10/msg00108.php

I have added such a column as part of the pg_proc changes I'm currently
doing to migrate aggregates into pg_proc. So it's now time to debate
the nitty-gritty: exactly which coercion functions should not be
implicitly invokable anymore?

My first-cut attempt at this is shown by the two printouts below.
The first cut does not allow any implicit coercions to text from types
that are not in the text category, which seems a necessary rule to me
--- the above-cited bug report shows why free coercions to text are
dangerous.  However, it turns out that several of the regression
tests fail with this rule; see the regression diffs below.

Should I consider these regression tests wrong, and correct them?
If not, how can we limit implicit coercions to text enough to avoid
the problems illustrated by bug #484?

Another interesting point is that I allowed implicit coercions from
float8 to numeric; this is necessary to avoid breaking cases like
insert into foo(numeric_col) values(12.34);
since the constant will be initially typed as float8. However, because
I didn't allow the reverse coercion implicitly, this makes numeric
"more preferred" than float8. Thus, for example,
select '12.34'::numeric + 12.34;
which draws a can't-resolve-operator error in 7.2, is resolved as
numeric addition with these changes. Is this a good thing, or not?
We could preserve the can't-resolve behavior by marking numeric->float8
as an allowed implicit coercion, but that seems ugly. I'm not sure we
can do a whole lot better without some more wide-ranging revisions of
the way we handle untyped numeric literals (as in past proposals to
invent an UNKNOWNNUMERIC pseudo-type).

Also, does anyone have any other nits to pick with this classification
of which coercions are implicitly okay? I've started with a fairly
tough approach of disallowing most implicit coercions, but perhaps this
goes too far.

regards, tom lane

Coercions allowed implicitly:

oid | result | input | prosrc
------+-------------+-------------+-----------------------
860 | bpchar | char | char_bpchar
408 | bpchar | name | name_bpchar
861 | char | bpchar | bpchar_char
944 | char | text | text_char
312 | float4 | float8 | dtof
236 | float4 | int2 | i2tof
318 | float4 | int4 | i4tof
311 | float8 | float4 | ftod
235 | float8 | int2 | i2tod
316 | float8 | int4 | i4tod
482 | float8 | int8 | i8tod
314 | int2 | int4 | i4toi2
714 | int2 | int8 | int82
313 | int4 | int2 | i2toi4
480 | int4 | int8 | int84
754 | int8 | int2 | int28
481 | int8 | int4 | int48
1177 | interval | reltime | reltime_interval
1370 | interval | time | time_interval
409 | name | bpchar | bpchar_name
407 | name | text | text_name
1400 | name | varchar | text_name
1742 | numeric | float4 | float4_numeric
1743 | numeric | float8 | float8_numeric
1782 | numeric | int2 | int2_numeric
1740 | numeric | int4 | int4_numeric
1781 | numeric | int8 | int8_numeric
946 | text | char | char_text
406 | text | name | name_text
2046 | time | timetz | timetz_time
2023 | timestamp | abstime | abstime_timestamp
2024 | timestamp | date | date_timestamp
2027 | timestamp | timestamptz | timestamptz_timestamp
1173 | timestamptz | abstime | abstime_timestamptz
1174 | timestamptz | date | date_timestamptz
2028 | timestamptz | timestamp | timestamp_timestamptz
2047 | timetz | time | time_timetz
1401 | varchar | name | name_text
(38 rows)

Coercions that will require explicit CAST, ::type, or typename(x) syntax
(NB: in 7.2 all of these would have been allowed implicitly):

oid | result | input | prosrc
------+-------------+-------------+------------------------------------------
2030 | abstime | timestamp | timestamp_abstime
1180 | abstime | timestamptz | timestamptz_abstime
1480 | box | circle | circle_box
1446 | box | polygon | poly_box
1714 | cidr | text | text_cidr
1479 | circle | box | box_circle
1474 | circle | polygon | poly_circle
1179 | date | abstime | abstime_date
748 | date | text | text_date
2029 | date | timestamp | timestamp_date
1178 | date | timestamptz | timestamptz_date
1745 | float4 | numeric | numeric_float4
839 | float4 | text | text_float4
1746 | float8 | numeric | numeric_float8
838 | float8 | text | text_float8
1713 | inet | text | text_inet
238 | int2 | float4 | ftoi2
237 | int2 | float8 | dtoi2
1783 | int2 | numeric | numeric_int2
818 | int2 | text | text_int2
319 | int4 | float4 | ftoi4
317 | int4 | float8 | dtoi4
1744 | int4 | numeric | numeric_int4
819 | int4 | text | text_int4
483 | int8 | float8 | dtoi8
1779 | int8 | numeric | numeric_int8
1289 | int8 | text | text_int8
1263 | interval | text | text_interval
1541 | lseg | box | box_diagonal
767 | macaddr | text | text_macaddr
817 | oid | text | text_oid
1447 | path | polygon | poly_path
1534 | point | box | box_center
1416 | point | circle | circle_center
1532 | point | lseg | lseg_center
1533 | point | path | path_center
1540 | point | polygon | poly_center
1448 | polygon | box | box_poly
1544 | polygon | circle | select polygon(12, $1)
1449 | polygon | path | path_poly
1200 | reltime | int4 | int4reltime
1194 | reltime | interval | interval_reltime
749 | text | date | date_text
841 | text | float4 | float4_text
840 | text | float8 | float8_text
730 | text | inet | network_show
113 | text | int2 | int2_text
112 | text | int4 | int4_text
1288 | text | int8 | int8_text
1193 | text | interval | interval_text
752 | text | macaddr | macaddr_text
114 | text | oid | oid_text
948 | text | time | time_text
2034 | text | timestamp | timestamp_text
1192 | text | timestamptz | timestamptz_text
939 | text | timetz | timetz_text
1364 | time | abstime | select time(cast($1 as timestamp without time zone))
1419 | time | interval | interval_time
837 | time | text | text_time
1316 | time | timestamp | timestamp_time
2022 | timestamp | text | text_timestamp
1191 | timestamptz | text | text_timestamptz
938 | timetz | text | text_timetz
1388 | timetz | timestamptz | timestamptz_timetz
1619 | varchar | int4 | int4_text
1623 | varchar | int8 | int8_text
(66 rows)

Regression failures with this set of choices (I've edited the output to
remove diffs that are merely consequences of the actual failures):

*** ./expected/char.out	Mon May 21 12:54:46 2001
--- ./results/char.out	Wed Apr 10 11:48:16 2002
***************
*** 18,23 ****
--- 18,25 ----
-- any of the following three input formats are acceptable 
INSERT INTO CHAR_TBL (f1) VALUES ('1');
INSERT INTO CHAR_TBL (f1) VALUES (2);
+ ERROR:  column "f1" is of type 'character' but expression is of type 'integer'
+ 	You will need to rewrite or cast the expression
INSERT INTO CHAR_TBL (f1) VALUES ('3');
-- zero-length char 
INSERT INTO CHAR_TBL (f1) VALUES ('');
*** ./expected/varchar.out	Mon May 21 12:54:46 2001
--- ./results/varchar.out	Wed Apr 10 11:48:17 2002
***************
*** 7,12 ****
--- 7,14 ----
-- any of the following three input formats are acceptable 
INSERT INTO VARCHAR_TBL (f1) VALUES ('1');
INSERT INTO VARCHAR_TBL (f1) VALUES (2);
+ ERROR:  column "f1" is of type 'character varying' but expression is of type 'integer'
+ 	You will need to rewrite or cast the expression
INSERT INTO VARCHAR_TBL (f1) VALUES ('3');
-- zero-length char 
INSERT INTO VARCHAR_TBL (f1) VALUES ('');
*** ./expected/strings.out	Fri Jun  1 13:49:17 2001
--- ./results/strings.out	Wed Apr 10 11:49:29 2002
***************
*** 137,147 ****
(1 row)
SELECT POSITION(5 IN '1234567890') = '5' AS "5";
!  5 
! ---
!  t
! (1 row)
! 
--
-- test LIKE
-- Be sure to form every test as a LIKE/NOT LIKE pair.
--- 137,145 ----
(1 row)

SELECT POSITION(5 IN '1234567890') = '5' AS "5";
! ERROR: Function 'pg_catalog.position(unknown, int4)' does not exist
! Unable to identify a function that satisfies the given argument types
! You may need to add explicit typecasts
--
-- test LIKE
-- Be sure to form every test as a LIKE/NOT LIKE pair.

*** ./expected/alter_table.out	Fri Apr  5 12:03:45 2002
--- ./results/alter_table.out	Wed Apr 10 11:51:06 2002
***************
*** 363,374 ****
CREATE TEMP TABLE FKTABLE (ftest1 varchar);
ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable;
NOTICE:  ALTER TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
-- As should this
ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable(ptest1);
NOTICE:  ALTER TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
DROP TABLE pktable;
- NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "fktable"
- NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "fktable"
DROP TABLE fktable;
CREATE TEMP TABLE PKTABLE (ptest1 int, ptest2 text,
PRIMARY KEY(ptest1, ptest2));
--- 363,376 ----
CREATE TEMP TABLE FKTABLE (ftest1 varchar);
ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable;
NOTICE:  ALTER TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
+ ERROR:  Unable to identify an operator '=' for types 'character varying' and 'integer'
+ 	You will have to retype this query using an explicit cast
-- As should this
ALTER TABLE FKTABLE ADD FOREIGN KEY(ftest1) references pktable(ptest1);
NOTICE:  ALTER TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
+ ERROR:  Unable to identify an operator '=' for types 'character varying' and 'integer'
+ 	You will have to retype this query using an explicit cast
DROP TABLE pktable;
DROP TABLE fktable;
CREATE TEMP TABLE PKTABLE (ptest1 int, ptest2 text,
PRIMARY KEY(ptest1, ptest2));
*** ./expected/rules.out	Thu Mar 21 10:24:35 2002
--- ./results/rules.out	Wed Apr 10 11:51:11 2002
***************
*** 1026,1037 ****
'Al Bundy',
'epoch'::text
);
UPDATE shoelace_data SET sl_avail = 6 WHERE  sl_name = 'sl7';
SELECT * FROM shoelace_log;
sl_name   | sl_avail | log_who  |         log_when         
! ------------+----------+----------+--------------------------
!  sl7        |        6 | Al Bundy | Thu Jan 01 00:00:00 1970
! (1 row)
CREATE RULE shoelace_ins AS ON INSERT TO shoelace
DO INSTEAD
--- 1026,1038 ----
'Al Bundy',
'epoch'::text
);
+ ERROR:  column "log_when" is of type 'timestamp without time zone' but expression is of type 'text'
+ 	You will need to rewrite or cast the expression
UPDATE shoelace_data SET sl_avail = 6 WHERE  sl_name = 'sl7';
SELECT * FROM shoelace_log;
sl_name | sl_avail | log_who | log_when 
! ---------+----------+---------+----------
! (0 rows)

CREATE RULE shoelace_ins AS ON INSERT TO shoelace
DO INSTEAD

*** ./expected/foreign_key.out	Wed Mar  6 01:10:56 2002
--- ./results/foreign_key.out	Wed Apr 10 11:51:17 2002
***************
*** 733,747 ****
-- because varchar=int does exist
CREATE TABLE FKTABLE (ftest1 varchar REFERENCES pktable);
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
DROP TABLE FKTABLE;
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pktable"
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pktable"
-- As should this
CREATE TABLE FKTABLE (ftest1 varchar REFERENCES pktable(ptest1));
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
DROP TABLE FKTABLE;
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pktable"
! NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "pktable"
DROP TABLE PKTABLE;
-- Two columns, two tables
CREATE TABLE PKTABLE (ptest1 int, ptest2 text, PRIMARY KEY(ptest1, ptest2));
--- 733,749 ----
-- because varchar=int does exist
CREATE TABLE FKTABLE (ftest1 varchar REFERENCES pktable);
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
+ ERROR:  Unable to identify an operator '=' for types 'character varying' and 'integer'
+ 	You will have to retype this query using an explicit cast
DROP TABLE FKTABLE;
! ERROR:  table "fktable" does not exist
-- As should this
CREATE TABLE FKTABLE (ftest1 varchar REFERENCES pktable(ptest1));
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
+ ERROR:  Unable to identify an operator '=' for types 'character varying' and 'integer'
+ 	You will have to retype this query using an explicit cast
DROP TABLE FKTABLE;
! ERROR:  table "fktable" does not exist
DROP TABLE PKTABLE;
-- Two columns, two tables
CREATE TABLE PKTABLE (ptest1 int, ptest2 text, PRIMARY KEY(ptest1, ptest2));
*** ./expected/domain.out	Wed Mar 20 13:34:37 2002
--- ./results/domain.out	Wed Apr 10 11:51:23 2002
***************
*** 111,116 ****
--- 111,118 ----
create domain ddef2 oid DEFAULT '12';
-- Type mixing, function returns int8
create domain ddef3 text DEFAULT 5;
+ ERROR:  Column "ddef3" is of type text but default expression is of type integer
+ 	You will need to rewrite or cast the expression
create sequence ddef4_seq;
create domain ddef4 int4 DEFAULT nextval(cast('ddef4_seq' as text));
create domain ddef5 numeric(8,2) NOT NULL DEFAULT '12.12';

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Barry Lind (#8)
Re: Implicit coercions need to be reined in

Barry Lind <barry@xythos.com> writes:

My feeling is that this change as currently scoped will break a lot of
existing apps. Especially the case where people are using where clauses
of the form: bigintcolumn = '999' to get a query to use the index on
a column of type bigint.

Eh? That case will not change behavior in the slightest, because
there's no type conversion --- the literal is interpreted as the target
type to start with.

regards, tom lane

#10Barry Lind
barry@xythos.com
In reply to: Tom Lane (#1)
Re: Implicit coercions need to be reined in

OK. My mistake. In looking at the regression failures in your post, I
thought I saw errors being reported of this type. My bad.

--Barry

Tom Lane wrote:

Show quoted text

Barry Lind <barry@xythos.com> writes:

My feeling is that this change as currently scoped will break a lot of
existing apps. Especially the case where people are using where clauses
of the form: bigintcolumn = '999' to get a query to use the index on
a column of type bigint.

Eh? That case will not change behavior in the slightest, because
there's no type conversion --- the literal is interpreted as the target
type to start with.

regards, tom lane

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Barry Lind (#10)
Re: Implicit coercions need to be reined in

Barry Lind <barry@xythos.com> writes:

OK. My mistake. In looking at the regression failures in your post, I
thought I saw errors being reported of this type. My bad.

Well, although that particular case isn't a problem, I am sure that this
change will break some existing applications --- the question is how
many, and do we feel that they're all poorly coded?

I suspect that the main thing that will cause issues is removal of
implicit coercions to text. For example, in 7.2 and before you can do

test72=# select 'At the tone, the time will be ' || now();
?column?
-------------------------------------------------------------
At the tone, the time will be 2002-04-11 11:49:27.309181-04
(1 row)

since there is an implicit timestamp->text coercion; or in a less
plausible example,

test72=# select 123 || (33.0/7.0);
?column?
---------------------
1234.71428571428571
(1 row)

With my proposed changes, both of these examples will require explicit
casts. The latter case might not bother people but I'm sure that
someone out there is using code much like the former case.

Since I didn't see an immediate batch of squawks, I think I will go
ahead and commit what I have; we can always revisit the implicit-allowed
flag settings later.

regards, tom lane

#12Thomas Lockhart
lockhart@fourpalms.org
In reply to: Tom Lane (#1)
Re: Implicit coercions need to be reined in

...

Since I didn't see an immediate batch of squawks, I think I will go
ahead and commit what I have; we can always revisit the implicit-allowed
flag settings later.

Squawk. But I haven't had time to look at the full ramifications of your
proposed change, so it is inappropriate to comment, right?

We have never been in complete agreement on the optimal behavior for
type coersion, but it seems that most users are blissfully ignorant of
the potential downsides of the current behavior. Another way to phrase
that would be to say that it actually does the right thing in the vast
majority of cases out in the field.

We'll probably both agree that it would be nice to avoid *hard coded*
rules of any kind for this, but do you share my concern that moving this
to a database table-driven set of rules will affect performance too
much?

- Thomas

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Lockhart (#12)
Re: Implicit coercions need to be reined in

Thomas Lockhart <lockhart@fourpalms.org> writes:

We have never been in complete agreement on the optimal behavior for
type coersion, but it seems that most users are blissfully ignorant of
the potential downsides of the current behavior. Another way to phrase
that would be to say that it actually does the right thing in the vast
majority of cases out in the field.

Could be; we probably see more complaints about the lack of any coercion
path for particular cases than about inappropriate implicit coercions.
But we do see a fair number of the latter. (And in the cases where I've
resisted adding more coercions, it was precisely because I thought it'd
be dangerous to allow them implicitly --- that concern goes away once
we can mark a coercion function as not implicitly invokable.)

We'll probably both agree that it would be nice to avoid *hard coded*
rules of any kind for this, but do you share my concern that moving this
to a database table-driven set of rules will affect performance too
much?

AFAICT the performance cost is negligible: find_coercion_function has to
look at the pg_proc row anyway. The relevant change looks like

  						  PointerGetDatum(oid_array),
  						  ObjectIdGetDatum(typnamespace));
! 	if (!HeapTupleIsValid(ftup))
! 	{
! 		ReleaseSysCache(targetType);
! 		return InvalidOid;
! 	}
! 	/* Make sure the function's result type is as expected, too */
! 	pform = (Form_pg_proc) GETSTRUCT(ftup);
! 	if (pform->prorettype != targetTypeId)
  	{
  		ReleaseSysCache(ftup);
- 		ReleaseSysCache(targetType);
- 		return InvalidOid;
  	}
! 	funcid = ftup->t_data->t_oid;
! 	ReleaseSysCache(ftup);
  	ReleaseSysCache(targetType);
  	return funcid;
  }
--- 711,734 ----
  						  Int16GetDatum(nargs),
  						  PointerGetDatum(oid_array),
  						  ObjectIdGetDatum(typnamespace));
! 	if (HeapTupleIsValid(ftup))
  	{
+ 		Form_pg_proc pform = (Form_pg_proc) GETSTRUCT(ftup);
+ 
+ 		/* Make sure the function's result type is as expected */
+ 		if (pform->prorettype == targetTypeId && !pform->proretset &&
+ 			!pform->proisagg)
+ 		{
+ 			/* If needed, make sure it can be invoked implicitly */
+ 			if (isExplicit || pform->proimplicit)
+ 			{
+ 				/* Okay to use it */
+ 				funcid = ftup->t_data->t_oid;
+ 			}
+ 		}
  		ReleaseSysCache(ftup);
  	}
! 
  	ReleaseSysCache(targetType);
  	return funcid;
  }

I do not see any reason not to install the mechanism; we can fine-tune
the actual pg_class.proimplicit settings as we get experience with them.

regards, tom lane

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#13)
Re: Implicit coercions need to be reined in

Since it seems we still want to debate this a little, I've modified the
initial set of implicit-coercion-allowed flags to allow silent coercions
from the standard datatypes to text. This un-breaks most of the
regression tests that were failing before. I still want to debate the
wisdom of allowing this, but there's no point in changing the regress
tests until we're agreed.

An interesting breakage that remained was that the foreign_key tests
were assuming a "text = integer" comparison would fail, while a
"varchar = integer" comparison would succeed ... which is not only
pretty bogus in itself, but becomes even more so when you notice that
there isn't a varchar = integer operator. Apparently, because we had
implicit coercions in *both* directions between text and integer,
the system couldn't figure out how to resolve text = integer; but
since there was an int->varchar and no varchar->int coercion, it
would resolve varchar = integer as varchar = integer::varchar.

With the attached settings, both cases are accepted as doing text =
int::text. I'm not convinced that this is a step forward; I'd prefer
to see explicit coercion needed to cross type categories. But that's
the matter for debate.

The lines marked XXX are the ones that I enabled since yesterday, and
would like to disable again:

implicit | result | input | prosrc
----------+-------------+-------------+--------------------------------------
no | abstime | timestamp | timestamp_abstime
no | abstime | timestamptz | timestamptz_abstime
no | box | circle | circle_box
no | box | polygon | poly_box
yes | bpchar | char | char_bpchar
yes | bpchar | name | name_bpchar
yes | char | text | text_char
no | cidr | text | text_cidr
no | circle | box | box_circle
no | circle | polygon | poly_circle
no | date | abstime | abstime_date
no | date | text | text_date
no | date | timestamp | timestamp_date
no | date | timestamptz | timestamptz_date
yes | float4 | float8 | dtof
yes | float4 | int2 | i2tof
yes | float4 | int4 | i4tof
no | float4 | numeric | numeric_float4
no | float4 | text | text_float4
yes | float8 | float4 | ftod
yes | float8 | int2 | i2tod
yes | float8 | int4 | i4tod
yes | float8 | int8 | i8tod
no | float8 | numeric | numeric_float8
no | float8 | text | text_float8
no | inet | text | text_inet
no | int2 | float4 | ftoi2
no | int2 | float8 | dtoi2
yes | int2 | int4 | i4toi2
yes | int2 | int8 | int82
no | int2 | numeric | numeric_int2
no | int2 | text | text_int2
no | int4 | float4 | ftoi4
no | int4 | float8 | dtoi4
yes | int4 | int2 | i2toi4
yes | int4 | int8 | int84
no | int4 | numeric | numeric_int4
no | int4 | text | text_int4
no | int8 | float8 | dtoi8
yes | int8 | int2 | int28
yes | int8 | int4 | int48
no | int8 | numeric | numeric_int8
no | int8 | text | text_int8
yes | interval | reltime | reltime_interval
no | interval | text | text_interval
yes | interval | time | time_interval
no | lseg | box | box_diagonal
no | macaddr | text | text_macaddr
yes | name | bpchar | bpchar_name
yes | name | text | text_name
yes | name | varchar | text_name
yes | numeric | float4 | float4_numeric
yes | numeric | float8 | float8_numeric
yes | numeric | int2 | int2_numeric
yes | numeric | int4 | int4_numeric
yes | numeric | int8 | int8_numeric
no | oid | text | text_oid
no | path | polygon | poly_path
no | point | box | box_center
no | point | circle | circle_center
no | point | lseg | lseg_center
no | point | path | path_center
no | point | polygon | poly_center
no | polygon | box | box_poly
no | polygon | circle | select polygon(12, $1)
no | polygon | path | path_poly
no | reltime | int4 | int4reltime
no | reltime | interval | interval_reltime
yes | text | char | char_text
XXX | text | date | date_text
XXX | text | float4 | float4_text
XXX | text | float8 | float8_text
no | text | inet | network_show
XXX | text | int2 | int2_text
XXX | text | int4 | int4_text
XXX | text | int8 | int8_text
XXX | text | interval | interval_text
no | text | macaddr | macaddr_text
yes | text | name | name_text
no | text | oid | oid_text
XXX | text | time | time_text
XXX | text | timestamp | timestamp_text
XXX | text | timestamptz | timestamptz_text
XXX | text | timetz | timetz_text
no | time | abstime | select time(cast($1 as timestamp without time zone))
no | time | interval | interval_time
no | time | text | text_time
no | time | timestamp | timestamp_time
yes | time | timetz | timetz_time
yes | timestamp | abstime | abstime_timestamp
yes | timestamp | date | date_timestamp
no | timestamp | text | text_timestamp
yes | timestamp | timestamptz | timestamptz_timestamp
yes | timestamptz | abstime | abstime_timestamptz
yes | timestamptz | date | date_timestamptz
no | timestamptz | text | text_timestamptz
yes | timestamptz | timestamp | timestamp_timestamptz
no | timetz | text | text_timetz
yes | timetz | time | time_timetz
no | timetz | timestamptz | timestamptz_timetz
no | varchar | int4 | int4_text
no | varchar | int8 | int8_text
yes | varchar | name | name_text
(103 rows)

regards, tom lane