Multibyte (Japanese Character) Sorting
Hi there,
I'm having a problem sorting multibyte characters.
I am using EUC-JP for my database encoding because we need to support
Japanese (hiragana, katakana, kanji) text, since our clients are Japanese.
I have a table named "user_info" with the following fields:
first_name character(60) NOT NULL
last_name character(60) NOT NULL
We've forced all our entries to double-byte characters, so all data stored
in the table is double-byte. The problem is the sorting: when you use
ORDER BY last_name ASC, the list is not sorted properly. Please help me
fix this problem. Thank you in advance.
--
==================================================
Morgan Gonzales - 1st BU (MSI) - Tsukiden Software
There are two kinds of people in this world.
One says to God, thy will be done,
and the other to whom God says, thy will be done.
I'm not sure why you think it is "not sorted properly", but my wild guess is
that your OS's locale data is broken. Use the C locale.
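For illustration only, here is a rough Python sketch of what C-locale ordering amounts to, assuming the server compares the raw bytes of each string in the database encoding (EUC-JP here). The sample names and the helper `c_locale_key` are made up for this example and are not from the original table.

```python
# Sketch: emulate C-locale ordering by comparing raw EUC-JP bytes.
# The names below are hypothetical sample data, not the poster's rows.
names = ["やまだ", "ＡＢＣ", "タナカ", "あべ"]


def c_locale_key(s: str) -> bytes:
    """Return the EUC-JP byte string used as the sort key."""
    return s.encode("euc_jp")


# Full-width Latin letters, hiragana and katakana sit in consecutive
# JIS X 0208 rows, so their lead bytes are 0xA3, 0xA4 and 0xA5.
ordered = sorted(names, key=c_locale_key)
print(ordered)
```

Under that assumption the full-width Latin entry sorts first, then the hiragana entries, then the katakana entry, regardless of string length.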
--
Tatsuo Ishii
SRA OSS, Inc. Japan
I have taken a look at the screenshot. Yes, the sort order seems
pretty ridiculous. I tested similar data on my Linux box and the
result showed nothing strange. Do you have an index on the field? What is
the platform PostgreSQL is running on? Do you see the same problem
using psql? Can you give me the pg_dump data if possible?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Thank you for your reply. But I believe our locale was already set to C
(since this is the default setting).
I've attached the result of my query using "ORDER BY <field> ASC". This
field contains double-byte characters for both English and Japanese text.
I think the problem with this sorting is that it sorts by length and then
by ASCII code value.
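One way to check that guess outside the database is to take the rows in the order the query returned them and see which rule reproduces that order. The sketch below is only an illustration: the helper names and the sample rows are hypothetical, and it assumes the values are compared as EUC-JP bytes.

```python
# Sketch: test whether an observed row order matches plain byte order
# or a "length first, then byte value" order. All names are hypothetical.

def by_bytes(s: str) -> bytes:
    """Key for plain byte-wise (C-locale style) ordering in EUC-JP."""
    return s.encode("euc_jp")


def by_length_then_bytes(s: str):
    """Key for the suspected ordering: byte length first, then bytes."""
    b = s.encode("euc_jp")
    return (len(b), b)


def diagnose(observed):
    """Return the names of the rules that reproduce the observed order."""
    fits = []
    if observed == sorted(observed, key=by_bytes):
        fits.append("bytes")
    if observed == sorted(observed, key=by_length_then_bytes):
        fits.append("length-then-bytes")
    return fits


# Hypothetical rows, listed in the order a query might have returned them.
observed_rows = ["あべ", "ＡＢＣ", "やまだ", "タナカ"]
print(diagnose(observed_rows))
```

If only "length-then-bytes" is reported for the real data, that would support the suspicion; if "bytes" is reported, the order is what byte-wise comparison would give.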
--
==================================================
Morgan Gonzales - 1st BU (MSI) - Tsukiden Software
There are two kinds of people in this world.
One says to God, thy will be done,
and the other to whom God says, thy will be done.