BUG #2070: Encoding dependent error in comparison operators

Started by Jan Jockusch, over 20 years ago, 2 messages, bugs
#1 Jan Jockusch
jan@jockusch.de

The following bug has been logged online:

Bug reference: 2070
Logged by: Jan Jockusch
Email address: jan@jockusch.de
PostgreSQL version: 8.1.0
Operating system: Linux
Description: Encoding dependent error in comparison operators
Details:

With terminal encoding Latin-1, client encoding Latin-1
and database encoding LATIN1, I do:

\l
     Name      |  Owner   | Encoding
---------------+----------+-----------
 encoding_test | postgres | LATIN1
...
encoding_test=# select 'ä' = 'ö';
?column?
----------
t
(1 row)

And although the two values are quite clearly
different, the operator finds them equal.

I hope you see the different umlauts in the query
(also latin-1 encoded).

The comparison operator works correctly for 7-bit ASCII values,
and it finds characters with codes below 128 different from those
with codes of 128 and above. It finds all characters in the upper
range equal to one another, though.

The bug also affects strings which are the same
except for a different umlaut at the same
position, e.g. 'Größe' = 'Grüße'. This comparison
also evaluates to true in Latin-1 setups.

The bug does not occur in pure UTF-8 setups.
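
To make the byte values in the report concrete, here is a minimal Python sketch (not part of the original report) that encodes the reported strings as Latin-1 and compares them byte by byte. The helper names `latin1_bytes` and `first_difference` are made up for this illustration. It only shows that the inputs are distinct single-byte values in the upper half of the code page; it does not reproduce the server's behaviour.

```python
from typing import Optional


def latin1_bytes(text: str) -> bytes:
    """Encode text as Latin-1 (ISO 8859-1), one byte per character."""
    return text.encode("latin-1")


def first_difference(x: bytes, y: bytes) -> Optional[int]:
    """Return the index of the first differing byte, or None if equal."""
    for i, (bx, by) in enumerate(zip(x, y)):
        if bx != by:
            return i
    if len(x) != len(y):
        return min(len(x), len(y))
    return None


a_uml = latin1_bytes("\u00e4")  # 'ä'
o_uml = latin1_bytes("\u00f6")  # 'ö'
# Both are single bytes with values of 128 or more, and they differ.
print(a_uml, o_uml, a_uml == o_uml)  # b'\xe4' b'\xf6' False

groesse = latin1_bytes("Gr\u00f6\u00dfe")  # 'Größe'
gruesse = latin1_bytes("Gr\u00fc\u00dfe")  # 'Grüße'
# The two words differ only at the position of the umlaut.
print(first_difference(groesse, gruesse))  # 2
```

A plain bytewise comparison therefore tells these strings apart, which suggests that the equality result in the report comes from a comparison that interprets the bytes rather than from the bytes themselves.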

I think this is a serious bug which produces surprising
and very hard-to-find problems. If I can be of any
assistance in diagnosing or fixing it, please contact me.

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Jockusch (#1)
Re: BUG #2070: Encoding dependent error in comparison operators

"Jan Jockusch" <jan@jockusch.de> writes:

> With terminal encoding Latin-1, client encoding Latin-1
> and database encoding LATIN1, I do:

... and what database locale? This sort of misbehavior is a common
symptom of having a database encoding that's not what the locale
expects.

regards, tom lane
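
One way to picture the kind of mismatch described in the reply: the thread does not state the database locale, but if it expected UTF-8 (an assumption made only for this sketch), the Latin-1 bytes stored in the database would not form valid UTF-8 sequences, so locale-aware routines would have nothing meaningful to compare. The helper `is_valid_utf8` below is a hypothetical name for this illustration, and the sketch checks validity only; it does not call the C library's collation functions.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Latin-1 encodings of the strings from the report.
latin1_a = "\u00e4".encode("latin-1")              # 'ä' as one byte
latin1_word = "Gr\u00f6\u00dfe".encode("latin-1")  # 'Größe'

# The same text encoded as UTF-8, for contrast.
utf8_a = "\u00e4".encode("utf-8")                  # 'ä' as two bytes

print(is_valid_utf8(latin1_a))     # False
print(is_valid_utf8(latin1_word))  # False
print(is_valid_utf8(utf8_a))       # True
print(is_valid_utf8(b"Grosse"))    # True, pure ASCII is valid UTF-8
```

Under that assumption, ASCII-only strings remain well-formed while every string containing a Latin-1 umlaut does not, which is consistent with the split between characters below 128 and the upper range described in the report.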