Covert database from ASCII to UTF-8

Started by Jeff Amielabout 15 years ago6 messagesgeneral
Jump to latest
#1Jeff Amiel
becauseimjeff@yahoo.com

It's come time to bite the bullet and convert a half-terraybyte database from ASCII to UTF8. Have gone through a bit of effort to track down the unclean ascii text and repair it but would like to avoid the outage of a many-many hour dump-restore.

Using Postgres 8.4.X.

Are there any other magic options open to me?
Any way to do an in-place conversion?
I assume slony replication is an option.....
What about some sort of wal log shipping replication?

Any thoughts would be appreciated.

#2Vibhor Kumar
vibhor.kumar@enterprisedb.com
In reply to: Jeff Amiel (#1)
Re: Covert database from ASCII to UTF-8

On Feb 22, 2011, at 10:23 PM, Jeff Amiel wrote:

It's come time to bite the bullet and convert a half-terraybyte database from ASCII to UTF8. Have gone through a bit of effort to track down the unclean ascii text and repair it but would like to avoid the outage of a many-many hour dump-restore.

Using Postgres 8.4.X.

I assume slony replication is an option.....

Right! Replication would help. You can also try Bucardo.

What about some sort of wal log shipping replication?

WAL Log shipping won't help.

Thanks & Regards,
Vibhor Kumar

#3Vick Khera
vivek@khera.org
In reply to: Jeff Amiel (#1)
Re: Covert database from ASCII to UTF-8

On Tue, Feb 22, 2011 at 11:53 AM, Jeff Amiel <becauseimjeff@yahoo.com> wrote:

I assume slony replication is an option.....

this is my plan, once i finish cleaning up the code and the DB data.
you have to ensure that whatever the original DB emits (in the form of
COPY and individual updates later on) will import correctly into the
new DB without error, else you're hosed.

also, I'd do it one table at a time rather than all at once, to
minimize failure cases if there is a problem with one table.

What about some sort of wal log shipping replication?

I don't think you can do that. This is a binary replication that
copies disk pages.

#4Geoffrey Myers
geof@serioustechnology.com
In reply to: Vibhor Kumar (#2)
Re: Covert database from ASCII to UTF-8

Vibhor Kumar wrote:

On Feb 22, 2011, at 10:23 PM, Jeff Amiel wrote:

It's come time to bite the bullet and convert a half-terraybyte database from ASCII to UTF8. Have gone through a bit of effort to track down the unclean ascii text and repair it but would like to avoid the outage of a many-many hour dump-restore.

Using Postgres 8.4.X.

I assume slony replication is an option.....

Right! Replication would help. You can also try Bucardo.

I'm assuming you're saying you can replicate from an ASCII database to
UTF8? What happens to the data that is not UTF8 'friendly?'

What about some sort of wal log shipping replication?

WAL Log shipping won't help.

Thanks & Regards,
Vibhor Kumar

--
Geoffrey Myers
Myers Consulting Inc.
770.592.1651

#5Vick Khera
vivek@khera.org
In reply to: Geoffrey Myers (#4)
Re: Covert database from ASCII to UTF-8

On Wed, Feb 23, 2011 at 9:06 AM, Geoffrey Myers
<geof@serioustechnology.com> wrote:

I'm assuming you're saying you can replicate from an ASCII database to UTF8?
 What happens to the data that is not UTF8 'friendly?'

The assumption up-thread was that the data was already made UTF8
friendly in the US_ASCII database.

#6Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Jeff Amiel (#1)
Re: Covert database from ASCII to UTF-8

Jeff Amiel <becauseimjeff@yahoo.com> writes:

It's come time to bite the bullet and convert a half-terraybyte database
from ASCII to UTF8. Have gone through a bit of effort to track down the
unclean ascii text and repair it but would like to avoid the outage of a
many-many hour dump-restore.

Those blog articles of mine might be of interest to you:

http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_1.html
http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_2.html

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support