unicode and =
= is not working on a char(30) column for me.
I want to find rows with equal name.
I have my database set to unicode.
SQL1
SELECT h1.key,h1.name,h2.key,h2.name
FROM table1 as h1, table1 as h2
WHERE h1.name=h2.name
and h1.OID = 730716
produces result rows where name does not match.
name contains multibyte UTF-8 values.
SQL2
SELECT h1.key,h1.name,h2.key,h2.name
FROM table1 as h1, table1 as h2
WHERE h1.key=h2.key
and h1.OID = 730716
produces correct results.
key contains single-byte UTF-8 values only (digits only).
I have a hash index on name; I dropped it and got a different but still wrong result.
key is part of a multicolumn primary key.
version 8.0.3 - gcc 3.4.3 fedora 3
Any suggestions on how to match multibyte characters? Do I need to use a different comparison operator?
Thanks,
Grant
"Grant Morgan" <grant@ryuuguu.com> writes:
> = is not working on a char(30) column for me.
> I want to find rows with equal name.
> I have my database set to unicode.
I'll bet you are running the postmaster in a locale that isn't expecting
utf-8 encoding. The locale and encoding have to match or you're going
to get very strange behavior.
regards, tom lane
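One way to check for the mismatch described above, as a sketch only: from a psql session connected to the affected database, ask the server which locale and encoding it is running with. The thread does not show the actual values on Grant's system, so the comments below describe what to look for rather than an expected output.

```sql
-- Locale used for string comparison and sorting (fixed when initdb was run).
SHOW lc_collate;

-- Locale used for character classification (also fixed at initdb time).
SHOW lc_ctype;

-- Encoding of the current database; a UTF-8 database pairs with a UTF-8 locale.
SHOW server_encoding;

-- Encoding the client session is using; shown for completeness.
SHOW client_encoding;
```

If lc_collate names a locale that is not UTF-8 based while server_encoding is a Unicode encoding, that is the kind of mismatch the advice above refers to.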
I am not sure what locale I was running, as I had not set it when doing initdb.
I created a new DB with --locale=en_US.utf8 -E UNICODE
and imported my data from the original source (not copied from the old DB), and I still have the same problem: UNICODE strings with double-byte characters that are not equal get selected as equal.
To test things further:
md5(h1.name)=md5(h2.name)
works and only matches equal values, whereas
h1.name=h2.name
matches unequal values.
Does anyone have any other ideas? Or is en_US.utf8 not a proper UTF-8 locale? (I got the name by running locale -a.)
I am not so concerned about sorting on this project, just equality, but a general solution would be appreciated.
Thanks,
Grant
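Building on the md5 test described in this message, the following is a minimal diagnostic sketch. It reuses the table and column names from the first message (table1, key, name, OID 730716) and lists the pairs that = reports as equal although their md5 hashes differ. The second query is the md5-based stopgap written out in full. Both are illustrations under those assumptions, not a fix for the underlying problem.

```sql
-- Pairs that "=" treats as equal even though the md5 hashes differ.
-- Rows returned here are the suspect matches.
SELECT h1.key, h1.name, h2.key, h2.name
FROM table1 AS h1, table1 AS h2
WHERE h1.name = h2.name
  AND md5(h1.name) <> md5(h2.name)
  AND h1.OID = 730716;

-- Stopgap: match on the md5 hash of name instead of on name itself.
SELECT h1.key, h1.name, h2.key, h2.name
FROM table1 AS h1, table1 AS h2
WHERE md5(h1.name) = md5(h2.name)
  AND h1.OID = 730716;
```

Note that the md5 comparison only answers the equality question; it does nothing for sorting and cannot use an ordinary index on name.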
On Mon, 20 Jun 2005 10:13:39 +0900, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Grant Morgan" <grant@ryuuguu.com> writes:
>> = is not working on a char(30) column for me.
>> I want to find rows with equal name.
>> I have my database set to unicode.
> I'll bet you are running the postmaster in a locale that isn't expecting
> utf-8 encoding. The locale and encoding have to match or you're going
> to get very strange behavior.
> regards, tom lane