Big5 contains '\'

Started by Limin Liu over 24 years ago, 3 messages
#1 Limin Liu
limin@pumpkinnet.com
1 attachment(s)

Hi,

My project requires Big5/EUC_TW (two bytes per
Chinese character).

Unfortunately, Big5 code contains escape '\'.
For instance,
1: create table "¦¨¥\ªº¤@¥b" (n int, m text);
2: create table n (n int, m text);
3: insert into n values (19,'¦nªº¶}©l¬O¦¨¥\ªº¤@¥b'); -- 10 Chinese characters
4: select * from n;
n | m
----+----------------------
15 | ¦nªº¶}©l¬O¦¨¥ªº¤@¥b

The table name in line 1 is ¦¨¥\ªº¤@¥b. Inside double quotes there is no
problem. Line 3 inserts a value that contains the '\' byte, and I found that
this '\' is dropped. As a result, everything after the 6th character is
complete nonsense.
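
To make the problem above concrete, here is a minimal Python sketch. It is not the PostgreSQL scanner itself: it assumes the garbled text in the post is Big5 bytes displayed as Latin-1, and `naive_unescape` is a hypothetical stand-in for a byte-wise scanner that treats every 0x5C as an escape and drops it.

```python
# Recover the raw bytes of the table name from line 1.
# Assumption: the text in the post is Big5 shown as Latin-1.
name = "¦¨¥\\ªº¤@¥b".encode("latin-1")


def naive_unescape(raw: bytes) -> bytes:
    """Hypothetical byte-wise scan: drop each 0x5C, keep the byte after it."""
    out = bytearray()
    i = 0
    while i < len(raw):
        if raw[i] == 0x5C and i + 1 < len(raw):
            i += 1  # skip the backslash itself
        out.append(raw[i])
        i += 1
    return bytes(out)


damaged = naive_unescape(name)

print(name.hex(" "))     # the fourth byte is 5c, the trail byte of a Big5 character
print(damaged.hex(" "))  # the 5c is gone, so every later byte pair is misaligned
print(name.decode("big5", errors="replace"))
print(damaged.decode("big5", errors="replace"))
```

Once the trail byte is removed, the remaining bytes pair up differently, which is why the text after that point no longer decodes to the intended characters.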

This can be fixed by adding a MULTIBYTE block to scansup.c (see
attachment). Could you put this in the next release if there are no
objections (or bugs)? :-)

I am thinking about writing an introductory book on PostgreSQL in
Chinese (Big5). With this problem, it will be hard to convince readers to
use PostgreSQL.

--
Regards,
Limin Liu

Attachments:

scansup.c (application/octet-stream)
#2 Limin Liu
limin@pumpkinnet.com
In reply to: Limin Liu (#1)
1 attachment(s)
Re: Big5 contains '\'

Unfortunately, Big5 code contains escape '\'.
For instance,
1: create table "¦¨¥\ªº¤@¥b" (n int, m text);
2: create table n (n int, m text);
3: insert into n values (19,'¦nªº¶}©l¬O¦¨¥\ªº¤@¥b'); -- 10 Chinese characters
4: select * from n;
n | m
----+----------------------
15 | ¦nªº¶}©l¬O¦¨¥ªº¤@¥b

My previous approach did not handle an escaped escape character, i.e. "\\t"
would remain as "\\t". Here is an updated version. I tried the following, and
it looks fine in all cases.

insert into n values (21,'¦nªº¶}©l¬O¦¨¥\ªº¤@¥b \t \\t \\\try \\\\test2');
select * from n;
n | m
----+----------------------
21 | ¦nªº¶}©l¬O¦¨¥\ªº¤@¥b \t \ ry \\test2
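
The behaviour tested above can be sketched in Python as follows. This is not the attached scansup.c patch, only an illustration of the idea under stated assumptions: `unescape_big5` and `is_big5_lead` are hypothetical names, the lead-byte range 0x81-0xFE is a permissive assumption, and only a few simple escapes are handled.

```python
# Assumed subset of backslash escapes: \t, \n, \r, \\ and \'.
SIMPLE_ESCAPES = {
    ord("t"): 0x09,
    ord("n"): 0x0A,
    ord("r"): 0x0D,
    ord("\\"): 0x5C,
    ord("'"): 0x27,
}


def is_big5_lead(b: int) -> bool:
    """Assumption: any byte in 0x81-0xFE starts a two-byte Big5 character."""
    return 0x81 <= b <= 0xFE


def unescape_big5(raw: bytes) -> bytes:
    """Process backslash escapes while copying Big5 byte pairs untouched."""
    out = bytearray()
    i, n = 0, len(raw)
    while i < n:
        b = raw[i]
        if is_big5_lead(b) and i + 1 < n:
            out += raw[i:i + 2]  # keep the trail byte even when it is 0x5C
            i += 2
        elif b == 0x5C and i + 1 < n:
            nxt = raw[i + 1]
            out.append(SIMPLE_ESCAPES.get(nxt, nxt))  # unknown escape: keep the char
            i += 2
        else:
            out.append(b)
            i += 1
    return bytes(out)


# The value from the insert statement above, as raw bytes.
big5_part = "¦nªº¶}©l¬O¦¨¥\\ªº¤@¥b".encode("latin-1")
value = big5_part + rb" \t \\t \\\try \\\\test2"
print(unescape_big5(value))
```

The key design choice is to test for a lead byte before testing for a backslash, so a 0x5C that is the second half of a character never reaches the escape logic.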

--
Regards,
Limin Liu

Attachments:

scansup.c (application/octet-stream)
#3 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Limin Liu (#1)
Re: Big5 contains '\'

My project requires to use Big5/EUC_TW (two bytes per
chinese-character).

Unfortunately, Big5 code contains escape '\'.

PostgreSQL has been able to handle Big5 since 6.5. Create a database with the
encoding EUC_TW, and set the client-side encoding to BIG5. For example, set
the PGCLIENTENCODING environment variable to BIG5 before starting psql or
any libpq-based client. With PostgreSQL 7.0 or later, you can also use
the \encoding command in psql.
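
As a rough illustration of the environment-variable approach described above, the following Python sketch launches psql with PGCLIENTENCODING set to BIG5. The helper names `big5_client_env` and `run_psql` are hypothetical, and the sketch assumes psql is on the PATH and that the target database was created with the EUC_TW encoding.

```python
import os
import subprocess


def big5_client_env(base=None):
    """Return a copy of the environment with PGCLIENTENCODING set to BIG5."""
    env = dict(os.environ if base is None else base)
    env["PGCLIENTENCODING"] = "BIG5"
    return env


def run_psql(dbname: str, sql_big5: bytes) -> subprocess.CompletedProcess:
    """Hypothetical helper: feed Big5-encoded SQL to psql on standard input.

    Assumes psql is installed and dbname uses the EUC_TW server encoding.
    """
    return subprocess.run(
        ["psql", "-d", dbname],
        input=sql_big5,
        env=big5_client_env(),
        capture_output=True,
    )
```

The SQL is passed as bytes so that the caller controls the encoding; with the variable set, the server is told to expect Big5 from this client.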
--
Tatsuo Ishii