BUG #2554: ILIKE operator works incorrectly

Started by Jaros�aw Bojarover 19 years ago3 messagesbugs
Jump to latest
#1Jaros�aw Bojar
jarek.bojar@gmail.com

The following bug has been logged online:

Bug reference: 2554
Logged by: Jarosław Bojar
Email address: jarek.bojar@gmail.com
PostgreSQL version: 8.1.4
Operating system: i386-redhat-linux-gnu, compiled by GCC
i386-redhat-linux-gcc (GCC) 4.1.0
Description: ILIKE operator works incorrectly
Details:

ILIKE operator works incorrectly with UTF8 encoding and Polish characters.
Consider following SQL statements:

CREATE DATABASE test ENCODING='UTF8';
\c test
CREATE TABLE the_table (val VARCHAR(50));
INSERT INTO the_table (val) VALUES ('Świat');
INSERT INTO the_table (val) VALUES ('Łąka');
INSERT INTO the_table (val) VALUES ('Ćma');
INSERT INTO the_table (val) VALUES ('abc');
INSERT INTO the_table (val) VALUES ('ABC');

Without Polish characters ILIKE works correctly:
SELECT * FROM the_table WHERE val ilike 'abc';
val
-----
abc
ABC
(2 rows)

But with Polish characters it does not work correctly. Following queries
should give single row results, but they do not return any rows:

SELECT * FROM the_table WHERE val ilike 'świat';
val
-----
(0 rows)

SELECT * FROM the_table WHERE val ilike 'łąka';
val
-----
(0 rows)

SELECT * FROM the_table WHERE val ilike 'ćma';
val
-----
(0 rows)

On the contrary functions like UPPER work correctly with Polish characters
and following queries produce correct results:

SELECT * FROM the_table WHERE UPPER(val) like UPPER('świat');
val
-------
Świat
(1 row)

SELECT * FROM the_table WHERE UPPER(val) like UPPER('ćma');
val
-----
Ćma
(1 row)

SELECT * FROM the_table WHERE UPPER(val) like UPPER('łąka');
val
------
Łąka
(1 row)

The bug is also present in PostgreSQL 8.1.0 on Windows XP.

#2Peter Eisentraut
peter_e@gmx.net
In reply to: Jaros�aw Bojar (#1)
Re: BUG #2554: ILIKE operator works incorrectly

Jaros�aw Bojar wrote:

ILIKE operator works incorrectly with UTF8 encoding and Polish
characters.

What does SHOW lc_collate say?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#2)
Re: BUG #2554: ILIKE operator works incorrectly

Peter Eisentraut <peter_e@gmx.net> writes:

Jaros�aw Bojar wrote:

ILIKE operator works incorrectly with UTF8 encoding and Polish
characters.

What does SHOW lc_collate say?

Even if the locale is set right, iwchareq() is completely broken
for multibyte charsets: it's trying to apply the system's tolower()
to a pg_wchar, which is highly unlikely to work. This has been a
known problem for awhile, but no one's stepped up to fix it.

regards, tom lane