Inserting Unicode into Postgre

Started by Firestar almost 25 years ago, 4 messages, general
#1Firestar
theebh@yahoo.com

Hi,

I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
strings in Big5 encoding and stores them in PostgreSQL (via JDBC).
However, every time I run an INSERT command, the inserted strings turn
into a run of '?' (question marks), and when I retrieve them via JDBC I
get those question marks back.

Is the problem due to the Unicode encoding that Java Strings use, or must
I enable multibyte support in my PostgreSQL installation? If I enable
multibyte support, should I create my table with Unicode support or Big5?

Thanks in advance.

Firestar

#2Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Firestar (#1)
Re: Inserting Unicode into Postgre

> I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
> strings in Big5 encoding and stores them in PostgreSQL (via JDBC).
> However, every time I run an INSERT command, the inserted strings turn
> into a run of '?' (question marks), and when I retrieve them via JDBC I
> get those question marks back.
>
> Is the problem due to the Unicode encoding that Java Strings use, or must
> I enable multibyte support in my PostgreSQL installation? If I enable
> multibyte support, should I create my table with Unicode support or Big5?

First of all, you cannot store Big5 data in PostgreSQL as is. You need to
convert Big5 to either EUC_TW or UTF-8 before storing it in a
PostgreSQL database. There are several ways to accomplish this.

The easiest way would be to upgrade to 7.1 with multibyte support enabled
and create a database with UNICODE (actually UTF-8) or EUC_TW
encoding. In this environment, 7.1's JDBC driver would recognize the
database encoding correctly and convert automatically between the
database encoding and Unicode, which is what Java strings use internally.
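As a small illustration of why the conversion step matters: when Java encodes a String into a charset that cannot represent its characters, each unmappable character is replaced with '?'. Whether this is exactly what happens inside the 7.0 setup described above is an assumption; the sketch below (class and method names are made up for the example) only shows the general effect, and that a UTF-8 round trip is lossless.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {
    // Encode the text with the given charset, then decode the bytes
    // again with the same charset.
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters

        // ISO-8859-1 has no mapping for these characters, so getBytes()
        // substitutes '?' for each one and the original text is lost.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1)); // prints "??"

        // UTF-8 can represent them, so the round trip is lossless.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // prints "true"
    }
}
```

Once the text has been replaced by '?' on the way in, nothing on the retrieval side can recover it, which would match the symptom of reading question marks back.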

Ask the Java experts on this list for more details.
--
Tatsuo Ishii

#3Firestar
theebh@yahoo.com
In reply to: Tatsuo Ishii (#2)
Re: Inserting Unicode into Postgre

Hi Tatsuo, thanks for your fast reply.

My string (which contains Big5 characters) is originally read from an
InputStream and created by:
    insertStmt = new String(bytes, "big5");

Since all strings in Java are Unicode, if I enable Unicode support with
PostgreSQL 7.1, should JDBC now be able to insert the string correctly
into the database?
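A minimal sketch of the flow being asked about, under several assumptions: the JVM ships the Big5 charset (it is an extended charset, not one of the guaranteed standard ones), the database was created with UNICODE encoding, and a table `messages (body text)` exists. The table, class, and method names are hypothetical. The point is that the Big5 bytes are decoded once into a Java String, and the String is then handed to the driver through a parameter, leaving the encoding conversion to the driver.

```java
import java.nio.charset.Charset;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Big5Insert {
    // Decode raw Big5 bytes into a Java String. Throws an unchecked
    // UnsupportedCharsetException if this JVM lacks the Big5 charset.
    static String decodeBig5(byte[] bytes) {
        return new String(bytes, Charset.forName("Big5"));
    }

    // Insert the decoded text into the hypothetical table
    // messages (body text). The String is passed as a parameter,
    // so the driver is responsible for encoding it for the server.
    static void insert(Connection conn, String body) throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO messages (body) VALUES (?)")) {
            ps.setString(1, body);
            ps.executeUpdate();
        }
    }

    public static void main(String[] args) {
        // Big5 bytes for two Chinese characters (U+4E2D, U+6587).
        byte[] big5 = {(byte) 0xA4, (byte) 0xA4, (byte) 0xA4, (byte) 0xE5};
        String text = decodeBig5(big5);
        System.out.println(text.equals("\u4e2d\u6587")); // prints "true"
        // With a live connection one would then call: insert(conn, text);
    }
}
```

If the database encoding cannot represent the characters, or the driver does not know the database encoding, the decode step above is still correct but the insert can lose data, which is why the reply recommends a UNICODE or EUC_TW database with the 7.1 driver.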

By the way, I can't seem to find the JDBC driver for PostgreSQL 7.1 on the
website. I guess I have to build it myself during the installation (as
suggested by the README file)?

Thanks in advance,
Firestar

#4Weiping He
laser@zhengmai.com.cn
In reply to: Firestar (#1)
Re: Inserting Unicode into Postgre

Firestar wrote:

> Hi,
>
> I'm currently using PostgreSQL 7.0 on Solaris. My Java program receives
> strings in Big5 encoding and stores them in PostgreSQL (via JDBC).
> However, every time I run an INSERT command, the inserted strings turn
> into a run of '?' (question marks), and when I retrieve them via JDBC I
> get those question marks back.
>
> Is the problem due to the Unicode encoding that Java Strings use, or must
> I enable multibyte support in my PostgreSQL installation? If I enable
> multibyte support, should I create my table with Unicode support or Big5?

Upgrade to the just-released 7.1;
PostgreSQL can now do the Unicode conversion for you
(thanks to Mr. Tatsuo Ishii).
I think you should enable both the --enable-multibyte and
--enable-unicode-conversion switches when building PostgreSQL.

regards

Laser