Re: How to read/write multibyte to database
I have a form that contains data with Chinese characters. When it's
submitted through an HTTP request, how can I write it to the database?
What kind of encoding are you using? There are several encodings for
Chinese.
--
Tatsuo Ishii
I'm using BIG5
Then you lose. Because BIG5 contains byte patterns that conflict with
ASCII special characters (like '\'), I guess your code:
for(i=0; *queryString; i++)
{
    splitword(items.Item, queryString, '&');
    unescape_url(items.Item);
    splitword(items.name, items.Item, '=');
    if(!strcmp(items.name, "Name"))
    {
        strcpy(name, items.Item);
    }
    else if(!strcmp(items.name, "Address"))
    {
        strcpy(address, items.Item);
    }
}
won't work. Change your program to treat BIG5 carefully, or you would
probably be better off using EUC_TW or UTF-8 to write your web contents.
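
A minimal sketch of the conflict described above. It assumes the standard Big5 encoding of the character U+8A31 as the two bytes 0xB3 0x5C; the names big5_sample and naive_find are made up for this illustration and do not come from the original program. A byte-by-byte search for '\' reports a match in the middle of that one character:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed Big5 bytes of U+8A31: lead byte 0xB3, trail byte 0x5C.
   The trail byte has the same value as ASCII '\\'. */
const char big5_sample[] = "\xB3\x5C";

/* Naive search that treats every byte as an independent character.
   Returns the offset of the first byte equal to c, or -1 if none. */
long naive_find(const char *s, char c)
{
    assert(s != NULL);
    const char *hit = strchr(s, c);
    return hit ? (long)(hit - s) : -1;
}
```

Calling naive_find(big5_sample, '\\') returns offset 1, which is the second half of a character, not a real backslash.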
--
Tatsuo Ishii
Ok, if I change to use EUC_TW or UTF-8 encoding, do I need to make any
changes to my code?
No.
What do I need to do differently on the web content?
Nothing, I guess.
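
Why no code change is needed can be sketched as follows: in UTF-8 (and likewise in EUC_TW) every byte of a multibyte character is 0x80 or above, so a byte-wise search for an ASCII delimiter cannot match inside a character. The sketch below compares the assumed Big5 and UTF-8 byte sequences of U+8A31; the names xu_big5, xu_utf8 and count_byte are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The same character (U+8A31) in two encodings (byte values assumed). */
const char xu_big5[] = "\xB3\x5C";      /* trail byte equals ASCII '\\' */
const char xu_utf8[] = "\xE8\xA8\xB1";  /* every byte is >= 0x80 */

/* Counts the bytes of s that are equal to c, comparing raw bytes. */
size_t count_byte(const char *s, char c)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (*s == c)
            n++;
    return n;
}
```

count_byte finds one spurious '\\' in the Big5 bytes and none in the UTF-8 bytes, which is why byte-oriented routines can stay as they are once the data is UTF-8.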
Thanks
--
Tatsuo Ishii
Can you please tell me what I need to do in my program to treat BIG5 so
that it will not conflict with ASCII escape sequences?
Probably splitword and unescape_url need to be reworked; I'm not sure
what you are actually doing inside them, though. For example, if you
need to find '&', you would do something like this:
unsigned char *p = your_string_to_parse;
int len = strlen((char *) p);

while (len > 0)
{
    if (*p == '&')
        break;
    if (*p > 0x7f && len > 1) /* first byte of BIG5 ? */
    {
        p++; /* skip second byte */
        len--;
    }
    p++;
    len--;
}
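
The same idea can be packaged as a reusable function. This is only a sketch under the same simplifying assumption as the loop above, namely that any byte above 0x7F is the first byte of a two-byte Big5 character; the name big5_find is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the offset of the first occurrence of `stop` that is a
   single-byte character, or -1 if there is none. Second bytes of
   Big5 characters are never compared against `stop`. */
long big5_find(const unsigned char *s, unsigned char stop)
{
    assert(s != NULL);
    size_t i = 0;

    while (s[i] != '\0') {
        if (s[i] > 0x7F) {
            /* Lead byte: skip it together with its trail byte,
               unless the string is truncated after the lead byte. */
            if (s[i + 1] == '\0')
                break;
            i += 2;
            continue;
        }
        if (s[i] == stop)
            return (long)i;
        i++;
    }
    return -1;
}
```

For the bytes 0xB3 0x5C '=' '\', big5_find(s, '\\') returns 3 (the real backslash) and skips the 0x5C at offset 1.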
Thanks
--
Tatsuo Ishii
From: "Jeff Lu" <jklcom@mindspring.com>
Subject: RE: [GENERAL] How to read/write multibyte to database
Date: Thu, 15 Mar 2001 11:18:29 -0500
Message-ID: <NDBBIHPECLIGKCCLMACAMEJACLAA.jklcom@mindspring.com>
Another question:
What if the input data stream is a mixture of English & Chinese, such as
How are you? = 你好嗎?
How should I handle this?
You mean ASCII by "English"? Then it's OK, since ASCII characters are
always lower than 0x80.
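
A rough sketch of walking such a mixed string. The Big5 byte values used for the Chinese part of the sample are an assumption, and the names mixed_sample, big5_counts and big5_count are hypothetical. Note that several second bytes in the sample (0x41, 0x6E, 0x48) have ASCII-range values, so only the first byte of each pair can be used to tell the two kinds of character apart.

```c
#include <assert.h>
#include <stddef.h>

/* "How are you? = " followed by four Big5 characters (bytes assumed). */
const unsigned char mixed_sample[] =
    "How are you? = " "\xA7\x41\xA6\x6E\xB6\xDC\xA1\x48";

struct big5_counts {
    size_t single;  /* one-byte (ASCII) characters */
    size_t dbcs;    /* two-byte Big5 characters */
};

/* Counts characters in a mixed ASCII/Big5 string. Any byte above 0x7F
   is assumed to start a two-byte character; a dangling lead byte at
   the end of the string is counted as a single byte. */
struct big5_counts big5_count(const unsigned char *s)
{
    struct big5_counts n = {0, 0};

    assert(s != NULL);
    while (*s != '\0') {
        if (*s > 0x7F && s[1] != '\0') {
            n.dbcs++;
            s += 2;
        } else {
            n.single++;
            s++;
        }
    }
    return n;
}
```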
--
Tatsuo Ishii
From: "Jeff Lu" <jklcom@mindspring.com>
Subject: RE: [GENERAL] How to read/write multibyte to database
Date: Thu, 15 Mar 2001 10:56:13 -0500
Message-ID: <NDBBIHPECLIGKCCLMACAAEJACLAA.jklcom@mindspring.com>
Here are the functions:
What else do I need to watch out for besides '\'?
Thanks for your help.
You should always watch out for the second byte of Big5. What would
happen if "stop" appears in the second byte of a Big5 string passed as
"in" to splitword?
--
Tatsuo Ishii
void splitword(uchar *out, uchar *in, uchar stop)
{
    int i, j;

    while(*in == ' ') in++; /* skip past any spaces */
    for(i = 0; in[i] && (in[i] != stop); i++)
        out[i] = in[i];
    out[i] = '\0'; /* terminate it */
    if(in[i]) ++i; /* position past the stop */
    while(in[i] == ' ') i++; /* skip past any spaces */
    for(j = 0; in[j]; ) /* shift the rest of the in */
        in[j++] = in[i++];
}

uchar x2c(uchar *x)
{
    register uchar c;

    /* note: (x & 0xdf) makes x upper case */
    c = (x[0] >= 'A' ? ((x[0] & 0xdf) - 'A') + 10 : (x[0] - '0'));
    c *= 16;
    c += (x[1] >= 'A' ? ((x[1] & 0xdf) - 'A') + 10 : (x[1] - '0'));
    return(c);
}

void unescape_url(uchar *url)
{
    register int i, j;

    for(i = 0, j = 0; url[j]; ++i, ++j)
    {
        if((url[i] = url[j]) == '%')
        {
            url[i] = x2c(&url[j + 1]);
            j += 2;
        }
        else if (url[i] == '+')
            url[i] = ' ';
    }
    url[i] = '\0'; /* terminate it at the new length */
}
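
One possible Big5-aware variant of splitword, as a sketch only. The name splitword_big5 is hypothetical, the space skipping of the original is left out for brevity, and any byte above 0x7F is assumed to start a two-byte character. The caller must supply an `out` buffer at least as large as `in`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char uchar;

/* Copies the part of `in` before the first single-byte `stop` into
   `out`, then removes that part and the stop byte from `in`.
   The second byte of a Big5 character is copied without being
   compared against `stop`. */
void splitword_big5(uchar *out, uchar *in, uchar stop)
{
    size_t i = 0;

    assert(out != NULL && in != NULL);
    while (in[i] != '\0' && in[i] != stop) {
        if (in[i] > 0x7F && in[i + 1] != '\0') {
            out[i] = in[i];  /* lead byte */
            i++;
        }
        out[i] = in[i];      /* trail byte or single-byte character */
        i++;
    }
    out[i] = '\0';           /* terminate the extracted word */
    if (in[i] != '\0')
        i++;                 /* step past the stop byte */
    /* Shift the remainder, including its terminator, to the front. */
    memmove(in, in + i, strlen((char *)(in + i)) + 1);
}
```

With `in` holding the bytes 0xB3 0x5C '\' "rest" and stop set to '\\', the function leaves the two-byte character in `out` and "rest" in `in`, whereas a byte-wise split would cut the character in half.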