Re: A question multibye

Started by Tatsuo Ishii · almost 25 years ago · 1 message · general

t-ishii@sra.co.jp

From: "Siamack Jabbarzadeh" <safamack@hotmail.com>
Subject: A question multibye
Date: Fri, 06 Jul 2001 18:56:53
Message-ID: <F81FKpRcTCi90gP8yJr00010c91@hotmail.com>

Dear Sir/Madam:
I have some questions on multibyte languages and I hope you can help
me. First, I was wondering if there is a table (like the ASCII table) for
multibyte languages?

I am not sure what you are asking, but PostgreSQL lets you set a default
encoding per database, not per table.

Second, assume we have an input made up of some Japanese letters mixed
with special characters like & and % (which have ASCII values). Now I would
like to write a parser that takes & and % out and leaves only the Japanese
letters. Knowing that & and % are ASCII and the letters are
multibyte, I cannot do the parsing by comparing byte by byte (as we do in
normal ASCII). How can I do that? Do % and & have multibyte values in
multibyte systems? If yes, how can I get those values? Could you kindly (if
you have some solutions to the problem) give me some hints on that?

Japanese has several encodings. I recommend that you use EUC-JP
(Extended Unix Code for Japanese). With EUC-JP, it is very easy to
distinguish Japanese from ASCII even when parsing byte by byte. If a
byte is greater than 0x7f, it is part of a Japanese character;
otherwise it is ASCII.
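
As a rough illustration of the byte test described above, here is a
minimal Python sketch. It is not from the original message: the function
names are hypothetical, and it assumes the input is valid EUC-JP (Python's
built-in "euc_jp" codec is used only to build and display the sample data).

```python
def strip_ascii_specials(data: bytes, specials: bytes = b"&%") -> bytes:
    """Drop the given ASCII bytes from EUC-JP data and keep everything else."""
    # In EUC-JP every byte of a multibyte character is above 0x7f, so an
    # ASCII value such as 0x26 ('&') or 0x25 ('%') can never occur inside
    # a Japanese character. A plain byte-by-byte filter is therefore safe.
    return bytes(b for b in data if b not in specials)


def keep_japanese_only(data: bytes) -> bytes:
    """Keep only the bytes above 0x7f, i.e. the non-ASCII part of the data."""
    return bytes(b for b in data if b > 0x7F)


# Sample input: Japanese text mixed with '&' and '%'.
raw = "日本&語%テキスト".encode("euc_jp")
print(strip_ascii_specials(raw).decode("euc_jp"))
```

The first function removes only the listed special characters; the second
is stricter and discards every ASCII byte, including spaces and digits.
Neither approach carries over to Shift_JIS, where the second byte of a
character can fall in the ASCII range.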

Anyway, I recommend that you study Japanese encodings first.

See:
ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
--
Tatsuo Ishii
