BUG: ILIKE with single-byte encoding
Hello,
PostgreSQL 8.3.0 introduced the following bug in the ILIKE (~~*) operator:
In a database with a single-byte encoding such as LATIN1, the expression
SELECT 'aü' ILIKE '%ü';
returns false.
The error occurs for every pattern in which a % is followed by a character
with a byte value between 128 and 255.
I was able to track down the error to the file
src/backend/utils/adt/like_match.c
In the single-byte case there are some places where a (signed) char value is
compared to the return value of tolower(), which is an int. The 'ü' in LATIN1
is -4 as a signed char but 252 as the int returned by tolower(), so the two
never compare equal.
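To make the mismatch concrete, here is a minimal sketch of the comparison described above. It is not the code from like_match.c: the function names are hypothetical, the byte is passed to tolower() as unsigned char to keep the example well defined, and it assumes the default "C" locale and a platform where converting 252 to signed char yields -4 (true on common two's-complement systems).

```c
#include <ctype.h>

/* Hypothetical helper: compares a stored signed char with the int
 * returned by tolower(). For bytes >= 128 the stored value is negative
 * after integer promotion, while tolower() returns a value in 0..255,
 * so the comparison fails. */
int matches_without_cast(signed char stored, unsigned char raw)
{
    return stored == tolower(raw);
}

/* Hypothetical helper: same comparison, but the tolower() result is
 * narrowed back to signed char first, as the proposed patch does for
 * TCHAR(). Both sides now have the same representation. */
int matches_with_cast(signed char stored, unsigned char raw)
{
    return stored == (signed char) tolower(raw);
}
```

With the LATIN1 byte 0xFC ('ü'), matches_without_cast((signed char) 0xFC, 0xFC) compares -4 with 252 and yields 0, while matches_with_cast() yields 1. Plain ASCII bytes behave the same in both versions, which would explain why only characters above 127 are affected.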
It can be fixed with the attached patch.
cu
Rolf Jentsch
Entwicklung Mitglieder-Systeme Dezentral
ElectronicPartner GmbH
Mündelheimer Weg 40
40472 Düsseldorf
phone: +49-(0)211-4156-0
fax: +49-(0)211-4156-6865
eMail: rjentsch@electronicpartner.de
Sitz der Gesellschaft Düsseldorf
Amtsgericht - Registergericht Düsseldorf - HRB 4078
Geschäftsführer: Oliver Haubrich,
Dr. Sven-Olaf Krauß, Karl Trautman
--- src/backend/utils/adt/like_match.c 2008-02-28 18:19:30.000000000 +0100
+++ src/backend/utils/adt/like_match.c 2008-02-28 18:19:43.000000000 +0100
@@ -71,7 +71,7 @@
*/
#ifdef MATCH_LOWER
-#define TCHAR(t) tolower((t))
+#define TCHAR(t) ((char)tolower((t)))
#else
#define TCHAR(t) (t)
#endif
Rolf Jentsch <RJentsch@electronicpartner.de> writes:
With PostgreSQL 8.3.0 the following bug has been introduced with the ILIKE or
~~* operator:
In a database with single-byte encoding as LATIN1 the expression
SELECT 'aü' ILIKE '%ü';
returns false.
For the single-byte case there are some places where a (signed) char
value is compared to the return value of tolower() which is an int.
Patch applied, thanks! It turns out there was a second bug on the very
same line: some machines have problems if the argument of tolower()
isn't explicitly cast to unsigned char ...
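For illustration, a macro that addresses both issues might look like the sketch below. The committed change is not shown in this thread, so this is an assumption about its general shape; LOWER_BYTE and ibyte_equal are hypothetical names. The inner cast keeps the argument of tolower() within the range the C standard requires (an unsigned char value or EOF), and the outer cast narrows the int result back to char so it can be compared with other char values.

```c
#include <ctype.h>

/* Hypothetical macro: cast the argument to unsigned char before calling
 * tolower() (a negative value other than EOF is undefined behavior),
 * then cast the int result back to char for comparison. */
#define LOWER_BYTE(c) ((char) tolower((unsigned char) (c)))

/* Hypothetical helper: case-insensitive equality of two single bytes. */
int ibyte_equal(char a, char b)
{
    return LOWER_BYTE(a) == LOWER_BYTE(b);
}
```

Under these assumptions, ibyte_equal('A', 'a') is true, and two copies of the LATIN1 byte 0xFC compare equal regardless of whether plain char is signed or unsigned on the platform.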
regards, tom lane