Different result when using '=' and 'like' in unicode mode
I used PostgreSQL 7.2.1 on redhat 7.3, installed using the rpm from redhat,.
When I createdb -E UNICODE, and import data using \copy command, after
converted all the imported text files to UTF-8 encoding. It looks well, I
can get
the data from several platforms in different locale. But, a strange problem,
when i using '=' in where clause, to match the varchar value, the result is
not correct,
for example, the table structure is
Table "tbl_test"
Column | Type | Modifiers
-------------+-----------------------+-----------
id | integer |
name | character varying(40) |
the sql is
select * from tbl_test where name = '石田';
but the result contains other data, such as 石町, 柳町, 柳田
and I changed the sql to
select * from tbl_test where name like '石田';
select * from tbl_test where name ilike '石田';
the results both correct.
I don't know the reason of this problem, and in my application it not
allowed
to use 'like' replaced '=', the data amount is too large.
Is this a bug of PostgreSQL? or only a problem in redhat environment? and
how to fix it?
Thx,
Jiang Sheng
Jiang Sheng writes:
the sql is
select * from tbl_test where name = 'ç³ç°';but the result contains other data, such as ç³çº, æ³çº, æ³ç°
and I changed the sql to
select * from tbl_test where name like 'ç³ç°';
select * from tbl_test where name ilike 'ç³ç°';the results both correct.
The LIKE operator uses a character-by-character comparison (which could
quite possibly behave oddly if your strings contain combining characters,
but that is a different issue). The = operator uses the host operating
system's locale facilities to do a locale-aware comparison. So in general
LIKE and = are not interchangeable.
If the result you get for = is wrong even under some locale, then you
probably don't have the right locale set on your server. Recent glibc
systems require you to set your locale to xx_YY.utf8 (rather than just
xx_YY) if you want to use Unicode.
--
Peter Eisentraut peter_e@gmx.net
Thanks a lot!
In your opinion, should I re-compile the PostgreSQL after set the $LANG
to ja_JP.utf8? or I need to set other variables' value..
I changed the $LANG value, but the result still not correct, and not found
anything about it in PostgrSQL's document.
Thx
Jiang Sheng
----- Original Message -----
送信者 : "Peter Eisentraut" <peter_e@gmx.net>
宛先 : "Jiang Sheng" <jiangsheng@sis.sh.cn>
Cc: <pgsql-bugs@postgresql.org>
送信日時 : 2002年11月18日 6:21
件名 : Re: [BUGS] Different result when using '=' and 'like' in unicode mode
Show quoted text
Jiang Sheng writes:
the sql is
select * from tbl_test where name = 'ç³ç°';but the result contains other data, such as ç³çº, æ³çº, æ³ç°
and I changed the sql to
select * from tbl_test where name like 'ç³ç°';
select * from tbl_test where name ilike 'ç³ç°';the results both correct.
The LIKE operator uses a character-by-character comparison (which could
quite possibly behave oddly if your strings contain combining characters,
but that is a different issue). The = operator uses the host operating
system's locale facilities to do a locale-aware comparison. So in general
LIKE and = are not interchangeable.If the result you get for = is wrong even under some locale, then you
probably don't have the right locale set on your server. Recent glibc
systems require you to set your locale to xx_YY.utf8 (rather than just
xx_YY) if you want to use Unicode.--
Peter Eisentraut peter_e@gmx.net
Jiang Sheng writes:
In your opinion, should I re-compile the PostgreSQL after set the $LANG
to ja_JP.utf8? or I need to set other variables' value..
Recompiling has nothing to do with that. Please see the Administrator's
Guide chapter on localization for all the information.
--
Peter Eisentraut peter_e@gmx.net