Unicode encoding

Started by William Sweet, almost 22 years ago, 2 messages, general
#1 William Sweet
wsweet@register.com

Hi Guys,

I've been searching the web lately for discussions on PostgreSQL multibyte support and UTF-8 support. The result was similar to the situation of asking 10 doctors the same question and getting 10 different answers. So, I solicit your help on solving this great mystery.

As I understand it, multibyte support has been enabled since v7.3. I am running a RH9 installation that had v7.3 as an included pkg, so I installed it as part of the RH9 installation. So I assume multibyte support is enabled. Now, I'd like to only store Unicode chars in my PostgreSQL dbs. I hear there are 3 ways to accomplish this:

1) during PostgreSQL configure/build (installation level)
2) during initdb (cluster level)
3) CREATE DATABASE (db level)

...but there are some "not-so-happy" stories on the net. For instance, "it's not 'true' Unicode support when implemented at the db level", or "sorting and regex do not work properly with a cluster level implementation", etc. I've read the v7.3 Admin Guide section 7.2 Multibyte support... sounds reasonable. So my question is, what is the official way to enable "true" Unicode storage and retrieval, so that LIKE, sorting, and regex in perl::DBI work properly? I am a tad concerned also that I don't see PostgreSQL mentioned on the Unicode products page; http://www.unicode.org/onlinedat/products.html

Any advice would be greatly appreciated.

Thanks, Will

William Sweet Jr, Web Engineer
Register.com, Inc.
Tel 410.953.7941
Fax 410.953.0122
wsweet@register.com
AIM: WillCSweet
Columbia Corporate Park
8830 Stanford Blvd Suite 402
Columbia MD 21045

#2 Peter Eisentraut
peter_e@gmx.net
In reply to: William Sweet (#1)
Re: Unicode encoding

William Sweet wrote:

support is enabled. Now, I'd like to only store Unicode chars in my
PostgreSQL dbs. I hear there are 3 ways to accomplish this:

1) during PostgreSQL configure/build (installation level)
2) during initdb (cluster level)
3) CREATE DATABASE (db level)

Each one of these only sets the default for the one below it.
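
To make the cascade concrete, here is a minimal Python sketch of "each level only sets the default for the one below it". It is an illustrative model, not PostgreSQL code: the function name `effective_encoding`, its parameters, and the encoding names used as sample values are all assumptions made for the example.

```python
from typing import Optional


def effective_encoding(build_default: str,
                       initdb_encoding: Optional[str] = None,
                       database_encoding: Optional[str] = None) -> str:
    """Hypothetical model of how the three levels cascade.

    Each level only supplies a default for the level below it, so an
    explicit choice at a lower level always wins.
    """
    # The cluster default is whatever initdb was given, else the build default.
    cluster_default = initdb_encoding or build_default
    # The database uses its own explicit encoding, else the cluster default.
    return database_encoding or cluster_default


# Sample values only; the encoding names are assumptions for illustration.
print(effective_encoding("SQL_ASCII"))
print(effective_encoding("SQL_ASCII", initdb_encoding="UNICODE"))
print(effective_encoding("SQL_ASCII", "UNICODE", database_encoding="LATIN1"))
```

Under this model, choosing the encoding at CREATE DATABASE time is no weaker than choosing it at build or initdb time; it is simply the most specific of the three settings.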

...but there are some "not-so-happy" stories on the net. For
instance, "it's not 'true' Unicode support when implemented at the db
level",

That is bogus.

or "sorting and regex do not work properly with a cluster
level implementation",

That is true.

etc. I've read the v7.3 Admin Guide section 7.2 Multibyte support...
sounds reasonable. So my question is, what is the official way to
enable "true" Unicode storage and retrieval, so that LIKE, sorting,
and regex in perl::DBI work properly?

Sorting will not work correctly with Unicode.
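
The reply does not say why sorting fails, so the following Python sketch only illustrates one common cause and is not a description of PostgreSQL internals. It assumes strings are compared by raw byte value (as would happen without a Unicode-aware collation), and the word list is made up for the example.

```python
# Made-up sample data; escapes keep the accented letters unambiguous.
words = ["zebra", "\u00c4pfel", "apple", "\u00e9tude"]  # "Äpfel", "étude"

# Ordering by Unicode code point (Python's default string comparison).
by_code_point = sorted(words)

# Ordering by the raw bytes of the UTF-8 encoding.
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))

# UTF-8 preserves code point order, so both orderings agree ...
print(by_code_point == by_utf8_bytes)

# ... but the accented words land after "zebra", which is not what a
# dictionary in most languages would do.
print(by_code_point)
```

Getting a linguistically sensible order requires collation rules on top of the encoding, which is a separate question from whether the data is stored as Unicode.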

I am a
tad concerned also that I don't see PostgreSQL mentioned on the
Unicode products page; http://www.unicode.org/onlinedat/products.html

Well, we're also not listed on the ISO 8859 products page, but I don't
think that matters. :-)