BUG #2261: ILIKE seems to be buggy on koi8 input
The following bug has been logged online:
Bug reference: 2261
Logged by: Evgeny Gridasov
Email address: eugrid@fpm.kubsu.ru
PostgreSQL version: 8.1.2
Operating system: Debian Linux
Description: ILIKE seems to be buggy on koi8 input
Details:
My terminal is RU_ru.KOI8-R,
template1's encoding is UTF8.
ILIKE seems to be buggy when comparing Russian strings,
while UPPER/LOWER work OK.
template1=# \encoding koi8;
Try to get the uppercase of some Russian letters:
template1=# select upper('фыва');
upper
-------
ФЫВА
(1 row)
The result is OK!
Next, try to compare uppercase and lowercase using
ILIKE:
template1=# select true where 'фыва' ilike 'ФЫВА';
bool
------
(0 rows)
Oops! No row is returned. But why?
Try the same with Latin letters:
template1=# select true where 'asdf' ilike 'ASDF';
bool
------
t
(1 row)
Try to compare lowercase with lowercase (Russian):
template1=# select true where 'фыва' ilike 'фыва';
bool
------
t
(1 row)
It works.
"Evgeny Gridasov" <eugrid@fpm.kubsu.ru> writes:
> My terminal is RU_ru.KOI8-R,
> template1's encoding is UTF8.
> ILIKE seems to be buggy when comparing Russian strings,
> while UPPER/LOWER work OK.
I'll bet that the database's locale setting is expecting some encoding
other than UTF8 :-(. You need to have compatible locale and encoding
settings inside the database. You didn't say exactly what the database
LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
match UTF8.
regards, tom lane
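To make the encoding mismatch concrete, here is a minimal Python sketch (Python is used only for illustration; nothing here is PostgreSQL code). It assumes the test string from the report and shows that the same text has different byte sequences in KOI8-R and UTF-8, so a locale that expects one encoding cannot interpret data stored in the other. The helper name `decodes_as_utf8` is made up for this example.

```python
# Illustration only: the same Russian text has different bytes in KOI8-R and UTF-8.
text = "фыва"

koi8_bytes = text.encode("koi8_r")  # single-byte encoding: one byte per letter
utf8_bytes = text.encode("utf-8")   # multi-byte encoding: two bytes per Cyrillic letter


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(len(koi8_bytes), len(utf8_bytes))  # byte lengths differ
print(koi8_bytes == utf8_bytes)          # the byte sequences are not equal
print(decodes_as_utf8(koi8_bytes))       # KOI8-R bytes are not valid UTF-8
```

This is why the locale's encoding and the database encoding have to agree: character classification done under the wrong encoding sees bytes it cannot map to letters.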
The PostgreSQL server starts with this environment:
LC_COLLATE=en_US.UTF-8
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8
I've tried different LC_COLLATE/LC_ALL/LANG settings,
but it did not help.
I've also tried switching my psql input to Unicode Russian, but that did not help either.
'show all' says I've got lc_collate and other lc_* set to en_US.UTF-8.
initdb was run with this locale.
It cannot be changed in postgresql.conf (is it fixed at database creation?)
Should I reinit the database to get this working, or what?
If I should reinit the database, which locale should I choose?
BTW, the ~* operator also does not work with upper/lower case Russian letters,
while upper()/lower() still work OK.
On Wed, 15 Feb 2006 12:44:18 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Evgeny Gridasov" <eugrid@fpm.kubsu.ru> writes:
> > My terminal is RU_ru.KOI8-R,
> > template1's encoding is UTF8.
> > ILIKE seems to be buggy when comparing Russian strings,
> > while UPPER/LOWER work OK.
> I'll bet that the database's locale setting is expecting some encoding
> other than UTF8 :-(. You need to have compatible locale and encoding
> settings inside the database. You didn't say exactly what the database
> LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
> match UTF8.
> regards, tom lane
--
Evgeny Gridasov
Software Engineer
I-Free, Russia
Evgeny Gridasov <eugrid@fpm.kubsu.ru> writes:
> The PostgreSQL server starts with this environment:
> LC_COLLATE=en_US.UTF-8
> LC_ALL=en_US.UTF-8
> LANG=en_US.UTF-8
Well, that setting shouldn't translate much except A-Z/a-z. If you want
Cyrillic upper/lower case conversions, you need the database's LC_CTYPE to be
ru_RU.something.
regards, tom lane
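The symptom in the report can be reproduced outside the database with a small Python sketch. This is an illustration under a stated assumption, not PostgreSQL's actual ILIKE code: it assumes a case-insensitive comparison that folds only the ASCII letters A-Z, which is roughly what "shouldn't translate much except A-Z/a-z" implies. The function names are hypothetical.

```python
# Sketch of a case-insensitive comparison that only knows ASCII letters.
# Assumption: this mimics the reported symptom; it is not PostgreSQL's implementation.

def ascii_only_lower(data: bytes) -> bytes:
    """Lower-case only the bytes for A-Z; leave every other byte untouched."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in data)


def naive_ilike_equal(left: str, right: str, encoding: str = "utf-8") -> bool:
    """Compare two strings after ASCII-only case folding of their encoded bytes."""
    return ascii_only_lower(left.encode(encoding)) == ascii_only_lower(right.encode(encoding))


results = {
    "latin": naive_ilike_equal("asdf", "ASDF"),          # True: A-Z is folded
    "cyrillic_mixed": naive_ilike_equal("фыва", "ФЫВА"),  # False: Cyrillic is not folded
    "cyrillic_same": naive_ilike_equal("фыва", "фыва"),   # True: identical bytes
}
print(results)
```

The three results line up with the three queries in the original report: Latin letters match regardless of case, Cyrillic letters match only when the case is already the same.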
Evgeny Gridasov wrote:
> It cannot be changed in postgresql.conf (is it fixed at database
> creation?) Should I reinit the database to get this working, or what?
Yes.
> If I should reinit the database, which locale should I choose?
Something like ru_RU.utf8.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/