Changing encoding of a database

Started by Marco Bizzarri | almost 20 years ago | 3 messages | general
#1 Marco Bizzarri
marco.bizzarri@gmail.com

Hi all.

We have PostgreSQL databases with SQL_ASCII or LATIN1 encoding. We would
like to migrate them to UNICODE. Is there a contributed/available
script, or is this something we should do by hand?

Regards
Marco
--
Marco Bizzarri
http://notenotturne.blogspot.com/

#2 Marco Bizzarri
marco.bizzarri@gmail.com
In reply to: Marco Bizzarri (#1)
Re: Changing encoding of a database

Hi Tomi.

Thanks for your answer, I was not aware of such a tool.

The next question at this point is (of course): what is the problem if
I have blobs? Should I recode them as well?

Regards
Marco

On 6/20/06, Tomi NA <hefest@gmail.com> wrote:

On 6/19/06, Marco Bizzarri <marco.bizzarri@gmail.com> wrote:

Hi all.

We have PostgreSQL databases with SQL_ASCII or LATIN1 encoding. We would
like to migrate them to UNICODE. Is there a contributed/available
script, or is this something we should do by hand?

Regards
Marco

If you don't have blobs in your database, dump it as INSERT
statements, use the recode tool to recode your data, create a new
database with UTF8 encoding, and load the data.

t.n.a.
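
The dump / recode / create / load steps above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the function
names are hypothetical, iconv is used in place of the recode tool, and the
sketch assumes the dump contains a "SET client_encoding" line that has to be
rewritten so the server does not convert the already recoded text a second
time.

```shell
# Hypothetical helper functions; database and file names are placeholders.

# Step 1: dump the old database as INSERT statements.
# $1 = source database, $2 = output file
dump_as_inserts() {
    pg_dump --inserts "$1" > "$2"
}

# Step 2: recode the dump to UTF-8 (iconv stands in for recode here).
# $1 = source encoding (e.g. ISO-8859-1), $2 = input file, $3 = output file
recode_dump() {
    # iconv exits non-zero if the input is not valid in the source encoding.
    iconv -f "$1" -t UTF-8 "$2" > "$3.tmp" || return 1
    # Assumption: the dump declares its encoding in a SET client_encoding line;
    # point it at UTF8 so the text is not converted again on load.
    sed "s/^SET client_encoding = .*;/SET client_encoding = 'UTF8';/" "$3.tmp" > "$3"
    rm -f "$3.tmp"
}

# Step 3: create a UTF8 database and load the recoded dump into it.
# $1 = new database, $2 = recoded dump file
load_into_utf8_db() {
    createdb -E UTF8 -T template0 "$1" && psql -d "$1" -f "$2"
}
```

A run might look like "dump_as_inserts olddb dump.sql", then
"recode_dump ISO-8859-1 dump.sql dump.utf8.sql", then
"load_into_utf8_db newdb dump.utf8.sql", where olddb and newdb are
placeholder names. For SQL_ASCII data the source encoding is not recorded
anywhere, so it has to be determined first.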

--
Marco Bizzarri
http://notenotturne.blogspot.com/

#3 TJ O'Donnell
tjo@acm.org
In reply to: Marco Bizzarri (#2)
Re: Changing encoding of a database

We have PostgreSQL databases with SQL_ASCII or LATIN1 encoding. We would
like to migrate them to UNICODE. Is there a contributed/available
script, or is this something we should do by hand?

I had a similar problem migrating from 7.4 to 8.1 and wanting to
go from sql_ascii to utf8. I did the following:

pg_dump -p 5433 --encoding ISO_8859_7 -t cas tj | psql tj

where the dump connected to 7.4 (port 5433) and interpreted the
cas data using ISO_8859_7. psql connected to 8.1.
I had to experiment to find that ISO_8859_7 was the "proper"
encoding - I had some Greek (math and chemistry) letters which
were accommodated by sql_ascii, but not quite "properly".
The output from pg_dump above is properly converted to utf8,
which 8.1 (I set the default encoding to utf8) accepts without complaint.
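
One way to do that kind of experimenting outside the database is to decode a
small sample of the stored bytes under several candidate encodings and see
which result reads correctly. The sketch below is an assumption of mine, not
the procedure actually used: try_encodings is a hypothetical helper, and it
relies on iconv rather than on pg_dump.

```shell
# Hypothetical helper: show how a sample file reads under each candidate encoding.
# $1 = sample file containing raw bytes taken from the SQL_ASCII data
# remaining arguments = candidate encoding names as understood by iconv
try_encodings() {
    sample=$1
    shift
    for enc in "$@"; do
        printf '%s: ' "$enc"
        # Convert to UTF-8 for display; report encodings that cannot decode the sample.
        iconv -f "$enc" -t UTF-8 "$sample" || printf '(conversion failed)\n'
    done
}
```

For example, "try_encodings sample.txt ISO-8859-7 ISO-8859-1" prints the
sample once per encoding on a UTF-8 terminal; the byte 0xE1 shows up as a
Greek alpha under ISO-8859-7 and as an accented a under ISO-8859-1.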

See http://www.postgresql.org/docs/8.1/static/multibyte.html
for all the other encodings.

I don't think the above will convert a table in place, but could be
used to create a copy with changed encoding.
Hope this helps.

TJ