Encoding and multibyte support

Started by Iain, about 22 years ago | 2 messages | docs
#1 Iain
iain@mst.co.jp

Hi All,

I recently had a slight problem with a development database because I used
the default encoding of SQL_ASCII. When I tried to load the data into an
EUC_JP database, there were of course some problems with invalid EUC_JP
characters. Fortunately they were easy to find and fix.

Anyway, my search on "encoding" or "multibyte" showed up nothing in the 7.4
documentation. Eventually I found a page written by Tatsuo Ishii in the 7.2
documentation.

I think that it's an important area, and is a potential trap for new players
so I'd like to see the documentation updated.

The following came out of a discussion with Tom Lane. I submitted it as a
comment in the interactive documentation. I think it would be a good idea to
check the details and update the docs:
------
The default encoding SQL_ASCII effectively disables any encoding conversion.
This means that your db will accept any kind of data. It's a potential
problem as you may end up with different encodings being used in both
your data and metadata.

It would seem that unless you specifically need to store data in various
encodings, you should select a specific encoding when creating a new
database. Use initdb -E to set the default for all new DBs. This can be
overridden when creating a new DB (for example with createdb -E).
------
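
To illustrate the point above, here is a rough sketch in Python. It is
illustration only and not part of PostgreSQL: a list of raw byte strings
stands in for a SQL_ASCII column, and the function name invalid_rows and the
sample data are hypothetical. It relies on Python's built-in euc_jp and
shift_jis codecs, which may not agree with the server's own validation in
every corner case.

```python
# Illustration only: a Python list of raw bytes stands in for a SQL_ASCII
# column, which keeps whatever bytes each client sends, unconverted.

def invalid_rows(rows, encoding="euc_jp"):
    """Return (index, raw_bytes) for every row that is not valid in `encoding`."""
    bad = []
    for i, raw in enumerate(rows):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            bad.append((i, raw))
    return bad


text = "日本語"
stored = [
    text.encode("euc_jp"),     # hypothetical client A sends EUC_JP
    text.encode("shift_jis"),  # hypothetical client B sends Shift_JIS
    b"plain ascii",            # 7-bit ASCII is valid in either encoding
]

# These are the rows that would be rejected when loading into an EUC_JP database.
for i, raw in invalid_rows(stored):
    print(f"row {i} is not valid EUC_JP: {raw!r}")
```

The same kind of check, run line by line over a plain-text dump, is one way
to locate offending rows before loading into an EUC_JP database.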

Also, the documentation for installation (chapter 14), creating database
clusters (16.2) and creating databases (18.2) doesn't mention encoding at
all. Maybe it should. Also, 16.2 should link to the documentation for initdb
(Server Applications, section III). I think that would be a good idea.

regards
Iain

#2 Iain
iain@mst.co.jp
In reply to: Iain (#1)
Re: Encoding and multibyte support

Actually, I should say that I eventually found the section in chapter 20
(localization) of the 7.4 docs, but I'd like to see this page linked
to from the areas I mentioned, and maybe made easier to find by
searching on words like "encode", "encoding", etc.

Regards
Iain