string cast/compare broken?
Mac OS X:
postgres% psql --version
psql (PostgreSQL) 7.2.1
contains support for: multibyte
LEDEV=# create table test1 (foo varchar(5));
CREATE
LEDEV=# create table test2 (foo char(5));
CREATE
LEDEV=# insert into test2 (foo) values ('S');
INSERT 3724249 1
LEDEV=# insert into test1 (foo) values ('S');
INSERT 3724250 1
LEDEV=# select a.foo, b.foo from test1 a, test2 b where a.foo =
b.foo::text;
foo | foo
-----+-----
(0 rows)
LEDEV=# select a.foo = 'S', b.foo = 'S' from test1 a, test2 b;
?column? | ?column?
----------+----------
t | t
(1 row)
LEDEV=# select a.foo, b.foo from test1 a, test2 b where CAST(a.foo as
CHAR) = b.foo;
foo | foo
-----+-------
S | S
(1 row)
Scott Royston <scroyston@mac.com> writes:
[ various examples of comparing char and varchar ]
I see no bug here. For the CHAR datatype, trailing spaces are defined
to be insignificant. For VARCHAR and TEXT, trailing spaces are
significant. If you want to compare a CHAR value to a VARCHAR or TEXT
value, your best bet is a locution like
rtrim(charval) = varcharval
regards, tom lane
On Fri, 2002-07-12 at 08:50, Tom Lane wrote:
Scott Royston <scroyston@mac.com> writes:
[ various examples of comparing char and varchar ]
I see no bug here. For the CHAR datatype, trailing spaces are defined
to be insignificant. For VARCHAR and TEXT, trailing spaces are
significant. If you want to compare a CHAR value to a VARCHAR or TEXT
value, your best bet is a locution like
rtrim(charval) = varcharval
I guess the strangest part was that both a.foo = 'S' and b.foo = 'S' but
not a.foo=b.foo; (a.foo is varchar(5) , b.foo is char(5) )
I guess that tha 'S' that b.foo gets compared to is converted to 'S '
before comparison but when comparing varchar(5) and char(5) they are
both compared by converting them to varchar which keeps the trailing
spaces from char(5). If the conversion where varchar(5) --> char(5) then
they would compare equal.
I vaguely remember something in the standard about cases when comparing
char() types should discard extra spaces.
-------------
Hannu
I guess the strangest part was that both a.foo = 'S' and b.foo = 'S' but
not a.foo=b.foo; (a.foo is varchar(5) , b.foo is char(5) )I guess that tha 'S' that b.foo gets compared to is converted to 'S '
before comparison but when comparing varchar(5) and char(5) they are
both compared by converting them to varchar which keeps the trailing
spaces from char(5).
Yes, I think this is inconvenient/unintuitive. If it is doable according to
standards, this should imho be fixed.
If the conversion where varchar(5) --> char(5) then
they would compare equal.
I am not sure, since, if the varchar stored 'S ' then the comparison to a char 'S'
should probably still fail, since those spaces in the varchar are significant.
Informix compares them equal, so I guess argumentation can be made in that direction
too (that currently evades my understanding of intuitive reasoning :-).
Andreas
Import Notes
Resolved by subject fallback
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
If the conversion where varchar(5) --> char(5) then
they would compare equal.
I am not sure, since, if the varchar stored 'S ' then the comparison
to a char 'S' should probably still fail,
There is no comparison of varchar to char:
regression=# select 'z'::char = 'z'::varchar;
ERROR: Unable to identify an operator '=' for types 'character' and 'character varying'
You will have to retype this query using an explicit cast
regression=#
I consider this a feature, not a bug, since it's quite unclear which
semantics ought to be used.
The cases Scott originally posted all involved various forms of
coercion to force both sides to be the same type; I'm not sure
that he quite understood why he had to do that, but perhaps it's now
becoming clear.
I wonder whether it would be a good idea to stop considering char
as binary-compatible to varchar and text. Instead we could set
things up so that there is a coercion function involved, namely
rtrim(). But that would probably make us diverge even further
from the spec.
Has anyone studied how other DBMSs handle CHAR vs VARCHAR? Judging
from the number of questions we get on this point, I have to wonder
if we are not out of step with the way other systems do it.
regards, tom lane
Has anyone studied how other DBMSs handle CHAR vs VARCHAR? Judging
from the number of questions we get on this point, I have to wonder
if we are not out of step with the way other systems do it.
Well, I already gave the Informix example, that compares them as equal.
(they obviously coerce varchar to char)
In nearly all cases I have seen so far the different handling of trailing
blanks is not wanted. In most of these varchar is simply used instead of char to
save disk space.
In Informix ESQL/C there is a host variable type CSTRINGTYPE that automatically
rtrims columns of char type upon select.
Imho the advantages of an automatic coercion would outweigh the few corner cases
where the behavior would not be intuitive to everybody.
Andreas
Import Notes
Resolved by subject fallback
On Fri, 12 Jul 2002, Tom Lane wrote:
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
If the conversion where varchar(5) --> char(5) then
they would compare equal.I am not sure, since, if the varchar stored 'S ' then the comparison
to a char 'S' should probably still fail,There is no comparison of varchar to char:
regression=# select 'z'::char = 'z'::varchar;
ERROR: Unable to identify an operator '=' for types 'character' and 'character varying'
You will have to retype this query using an explicit cast
regression=#I consider this a feature, not a bug, since it's quite unclear which
semantics ought to be used.The cases Scott originally posted all involved various forms of
coercion to force both sides to be the same type; I'm not sure
that he quite understood why he had to do that, but perhaps it's now
becoming clear.I wonder whether it would be a good idea to stop considering char
as binary-compatible to varchar and text. Instead we could set
things up so that there is a coercion function involved, namely
rtrim(). But that would probably make us diverge even further
from the spec.Has anyone studied how other DBMSs handle CHAR vs VARCHAR? Judging
from the number of questions we get on this point, I have to wonder
if we are not out of step with the way other systems do it.
I don't think it's just a CHAR vs VARCHAR issue. AFAICT the spec defines
all of this in terms of the collations used and there are (imho arcane)
rules about converting between them for comparisons and operations.
Technically I think varcharcol=charcol *is* illegal if we are
saying that char has a collation with PAD SPACE and varchar
has a collation with NO PAD, because they're different collations
and character value expressions from column reference are implicit
and that doesn't allow comparison between two different collations.
Of course I could also be misreading it.
There is no comparison of varchar to char in Oracle too.
Scott provided cast cases are some unique features in psql,
each database MAY handle those casting differently.
In good design/application, char should be replaced by
varchar type unless you know the exact bytes. It would be
not bad idea to get rid of char gradually in the future
to avoid such inconsistency
between databases, that's just my view.
johnl
Show quoted text
-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Zeugswetter
Andreas SB SD
Sent: Friday, July 12, 2002 8:49 AM
To: Tom Lane
Cc: Hannu Krosing; Scott Royston; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] string cast/compare broken?Has anyone studied how other DBMSs handle CHAR vs VARCHAR? Judging
from the number of questions we get on this point, I have to wonder
if we are not out of step with the way other systems do it.Well, I already gave the Informix example, that compares them as equal.
(they obviously coerce varchar to char)In nearly all cases I have seen so far the different handling of trailing
blanks is not wanted. In most of these varchar is simply used
instead of char to
save disk space.In Informix ESQL/C there is a host variable type CSTRINGTYPE that
automatically
rtrims columns of char type upon select.Imho the advantages of an automatic coercion would outweigh the
few corner cases
where the behavior would not be intuitive to everybody.Andreas
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
On Fri, Jul 12, 2002 at 03:48:59PM +0200, Zeugswetter Andreas SB SD wrote:
Imho the advantages of an automatic coercion would outweigh the few
corner cases where the behavior would not be intuitive to
everybody.
How then would one get the correct behaviour from char()?
A
--
----
Andrew Sullivan 87 Mowat Avenue
Liberty RMS Toronto, Ontario Canada
<andrew@libertyrms.info> M6K 3E3
+1 416 646 3304 x110