String Comparison Weirdness
We had major problems after migrating the DB to a more powerful server; we
managed to trace the problem to a type conversion bug in our software.
Nevertheless, this thing puzzles us a lot:
NBTEST2=# select '-1'>'0';
?column?
----------
t
(1 row)
We've tried this query on several servers with different versions of
postgresql and different versions of glibc - some return true, others
return false - and it seems to be related neither to the postgresql version
nor to the glibc version. On all servers we tested, strcmp("-1","0") returned
a negative value - on some -3 and on others -1 - and not related to postgresql.
The correct result above should be false, since ascii('-')=45 while
ascii('0')=48.
Can the character set in use be significant?
--
Notice of Confidentiality: This email is sent unencrypted over the network,
and may be stored on several email servers; it can be read by third parties
as easy as a postcard. Do not rely on email for confidential information.
[Tobias Brox - Mon at 12:17:33PM +0200]
On all servers we tested, strcmp("-1","0") returned
a negative value - on some -3 and on others -1 - and not related to postgresql.
Er, uncorrelated with the result of evaluating '-1'>'0' in postgresql,
I mean.
On Mon, 26 Sep 2005, Tobias Brox wrote:
We had major problems after migrating the DB to a more powerful server; we
managed to trace the problem to a type conversion bug in our software.
Nevertheless, this thing puzzles us a lot:
NBTEST2=# select '-1'>'0';
?column?
----------
t
(1 row)
We've tried this query on several servers with different versions of
postgresql and different versions of glibc - some return true, others
return false - and it seems to be related neither to the postgresql version
nor to the glibc version. On all servers we tested, strcmp("-1","0") returned
a negative value - on some -3 and on others -1 - and not related to postgresql.
The correct result above should be false, since ascii('-')=45 while
ascii('0')=48.
Can the character set in use be significant?
It's more likely to be the locale in use. For example, on my machine,
given a file containing the two lines -1 and 0:
LANG="C" sort file
-1
0
LANG="en_US" sort file
0
-1
Many locales do a more complicated comparison than the plain byte-value
(ASCII) comparison that strcmp performs. For example, symbols and spaces may
only be used as tiebreakers after effectively comparing the strings without
them.
[Stephan Szabo - Mon at 06:07:48AM -0700]
It's more likely to be the locale in use. For example, on my machine,
given a file with -1 and 0.
(...)
LANG="en_US" sort file
0
-1
Hah, those Americans don't even know how to count ;-)
I find this weird, but it's clearly not a problem with postgresql at least.
We should obviously check the locale and stick to "C" on the servers.
Thanks for the effort.