upgrade to 8.4 with new character set

Started by Kevin Duffyover 16 years ago4 messagesgeneral
Jump to latest
#1Kevin Duffy
KD@wrinvestments.com

________________________________

From: Jonathan Schindler
Sent: Wednesday, November 11, 2009 9:58 AM
To: Kevin Duffy
Subject: upgrade to 8.4 with new character set

To those that know,

We originally built our database with a character set of SQL_ASCII. We
now know the error of our foolish ways. We are trying to upgrade to 8.4
and the windows binaries would like us to use UTF8. I would like to use
a backup to restore onto the new installation, however I am getting
warnings and errors about how I am trying to restore a SQL_ASCII
character set to a UTF8 Database. I need to have confidence that my
data is intact before I can continue with the upgrade. I was wondering
what would be the best course of action. I currently have the 8.4
database installed on a separate machine for testing.

Thank you for all your help

Jonathan Schindler

#2Martijn van Oosterhout
kleptog@svana.org
In reply to: Kevin Duffy (#1)
Re: upgrade to 8.4 with new character set

On Wed, Nov 11, 2009 at 10:45:55AM -0500, Kevin Duffy wrote:

We originally built our database with a character set of SQL_ASCII. We
now know the error of our foolish ways. We are trying to upgrade to 8.4
and the windows binaries would like us to use UTF8. I would like to use
a backup to restore onto the new installation, however I am getting
warnings and errors about how I am trying to restore a SQL_ASCII
character set to a UTF8 Database. I need to have confidence that my
data is intact before I can continue with the upgrade. I was wondering
what would be the best course of action. I currently have the 8.4
database installed on a separate machine for testing.

Do you know what character you were using before. If, for example, you
know that all the data is actually in latin1 or win1252, then you can
just change the client_encoding in the dump file and reload it into
your utf-8 database.

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.

#3Howard Cole
howardnews@selestial.com
In reply to: Kevin Duffy (#1)
Re: upgrade to 8.4 with new character set

Kevin Duffy wrote:

We originally built our database with a character set of SQL_ASCII.
We now know the error of our foolish ways. We are trying to upgrade
to 8.4 and the windows binaries would like us to use UTF8. I would
like to use a backup to restore onto the new installation, however I
am getting warnings and errors about how I am trying to restore a
SQL_ASCII character set to a UTF8 Database. I need to have confidence
that my data is intact before I can continue with the upgrade. I was
wondering what would be the best course of action. I currently have
the 8.4 database installed on a separate machine for testing.

Thank you for all your help

Jonathan Schindler

This is what I did:

Download & install iconv for windows from gnuwin32.sourceforge.net

Backup the 8.3 database using the plain format
e.g. pg_dump -F p -f backup.ascii -U postgres mydb

Run Iconv on the backup to convert to your new dataset (you may need to
use a different source encoding - but ISO-8859-1 works for me!)
e.g. c:\program files\gnuwin32\bin\iconv.exe -f ISO-8859-1 -t UTF-8
backup.ascii > backup.utf-8

Create a new db in UTF-8 encoding (newdb)

Now restore to your 8.4 database
e.g. psql -U postgres -f backup.utf-8 newdb

The only problem I've had so far is that some of my data was a streamed
xml file into a text variable, which stupidly had some byte order marks
at the beginning which became nonsense after converting to UTF-8. These
were removed with a SQL query but hopefully this shouldn't happen to
most people.

Hope this helps.

Howard Cole
www.selestial.com

#4Howard Cole
howardnews@selestial.com
In reply to: Kevin Duffy (#1)
Re: upgrade to 8.4 with new character set

Kevin Duffy wrote:

We are trying to upgrade to 8.4 and the windows binaries would like us
to use UTF8.

Jonathan Schindler

I would advise caution upgrading to 8.4 on windows as it seems to cause
more of the random winsock 10061 error. Which in my experience happens
hundreds of times a day.

Howard Cole
www.selestial.com