Re: Multi-byte character bug (resend for clarify)

Started by Richard So over 23 years ago, 1 message, bugs
#1 Richard So
richso@i-cable.com

Two bugs have been found in the SQL parser and multibyte character support:

What is the encoding for "chinese char"? You need to give us more info.

By Chinese here, I mean characters in the BIG5 encoding, which is widely
used in HK and Taiwan. My setup:
Db encoding: EUC_TW
Client (JDBC / ODBC) encoding: BIG5
JDBC: I supplied the parameter 'charSet=Big5' in the connection string
ODBC: my locale (Chinese Win2000 machine) is Chinese (Taiwan)
Client application: Tomcat4 JSP page (see the attached)
App / Db server: Redhat 7.3 Linux + postgresql (set) 7.2.1-2PGDG (downloaded binary rpm) + Tomcat4
App / Db server locale: zh_TW.Big5
JDBC driver: pgjdbc2.jar
Client machine: Win2000 Chinese (Taiwan) version with SP2 + I.E. (jsp) + Delphi SQL Explorer (ODBC)
Client machine locale: Chinese (Taiwan)

1. 'Problem connecting to database: java.sql.SQLException: ERROR:
Invalid EUC_TW character sequence found (0xb27a)' was reported when
using the JDBC driver to insert a record. A similar error was reported
when using the ODBC driver and psql. Since auto-conversion from client
to server should convert the character to a valid EUC_TW character,
this is a bug.

How did you set the auto-conversion settings for psql? I suspect you
did something wrong with it.

I've done a new check on it. The JDBC and ODBC drivers still report the
error message, but psql does not (maybe, as you said, I followed a wrong
procedure). However, the problem is still there: why do JDBC and ODBC
still report the error? I only tried a few Chinese words, so there may
be other characters that also cause the problem.
I know Tomcat4 by default returns the request parameters in ISO-8859-1,
so I've added the code
<%@ page contentType="text/html; charset=Big5"%>
<%
request.setCharacterEncoding("BIG5");
%>
to the JSP page and dumped the actual SQL posted to the postgresql
server to make sure the SQL is correct. The dump is attached (please see
the attached file: offence1.zip).
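
One possible reading of the 0xb27a in the error message, offered only as a sketch: the second byte 0x7a is outside the range a two-byte EUC_TW character allows, but inside the Big5 trail-byte range, which would be consistent with the bytes reaching the server still in Big5. The class and method names below are hypothetical, and the Big5 lead-byte range 0xA1-0xF9 is an assumption (vendor extensions differ).

```java
// Hypothetical helper: classifies a two-byte sequence by byte ranges only.
class TwoByteRangeCheck {
    static boolean inRange(int b, int lo, int hi) {
        return b >= lo && b <= hi;
    }

    // Two-byte EUC_TW form: both bytes must be in 0xA1-0xFE.
    static boolean isEucTwPair(int first, int second) {
        return inRange(first, 0xA1, 0xFE) && inRange(second, 0xA1, 0xFE);
    }

    // Big5: lead byte assumed in 0xA1-0xF9, trail byte in 0x40-0x7E or 0xA1-0xFE.
    static boolean isBig5Pair(int first, int second) {
        return inRange(first, 0xA1, 0xF9)
            && (inRange(second, 0x40, 0x7E) || inRange(second, 0xA1, 0xFE));
    }

    public static void main(String[] args) {
        int first = 0xB2, second = 0x7A; // the sequence from the error message
        System.out.println("EUC_TW: " + isEucTwPair(first, second)
            + ", Big5: " + isBig5Pair(first, second));
        // prints "EUC_TW: false, Big5: true"
    }
}
```

This checks byte ranges only; it does not prove where in the JDBC or ODBC path the conversion was skipped.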

2. When inserting a record with xx Chinese char, the SQL parser
reports something like 'Problem connecting to database:
java.sql.SQLException: ERROR: parser: parse error at or near
"4567891"' (similar in JDBC and ODBC), and the error 'unterminated
string' is reported when using psql.

The character code is 0xc05c, in which the second byte is actually a "\"
(backslash) (please see the attached file: offence2.zip)
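
To illustrate why 0xc05c is troublesome, here is a minimal sketch that scans a Big5 byte stream and reports every 0x5C byte that is really the second byte of a two-byte character. A parser that reads the stream byte by byte, without tracking character boundaries, would presumably treat such a byte as an escape character inside a string literal. The class and method names are hypothetical, and the lead-byte range 0xA1-0xF9 is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds 0x5C bytes that are Big5 trail bytes.
class Big5BackslashScan {
    // Assumed Big5 lead-byte range; vendor extensions may use a wider one.
    static boolean isBig5Lead(int b) {
        return b >= 0xA1 && b <= 0xF9;
    }

    // Returns the offsets of every 0x5C byte that is the second byte of a Big5 pair.
    static List<Integer> backslashTrailOffsets(byte[] data) {
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        while (i < data.length) {
            int b = data[i] & 0xFF;
            if (isBig5Lead(b) && i + 1 < data.length) {
                if ((data[i + 1] & 0xFF) == 0x5C) {
                    hits.add(i + 1);
                }
                i += 2; // skip the whole two-byte character
            } else {
                i += 1; // single-byte (ASCII) character
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // 'a', then the character 0xC0 0x5C from this report, then 'b'
        byte[] sample = { 0x61, (byte) 0xC0, (byte) 0x5C, 0x62 };
        System.out.println(backslashTrailOffsets(sample)); // prints [2]
    }
}
```

A real backslash (a lone 0x5C that does not follow a lead byte) is not reported, which is the distinction a multibyte-aware scanner has to make.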


I've found that the problem has existed from 7.1.x through 7.2.*.

Attachments:

offence1.zip (application/x-zip-compressed)
offence2.zip (application/x-zip-compressed)