Change server encoding after the fact
I would like to change my server_encoding which is currently SQL_ASCII to UTF8.
I have existing data that I would like to keep.
From my understanding of the steps I need to:
1) alter the template1 database encoding via
UPDATE pg_database SET encoding = 6 where datname IN ('template0', 'template1');
2) Dump my current database
pg_dump -Fc foo > foo.db
3) Drop my current database
drop database foo;
4) recreate it with the proper encoding
create database foo with template = template1 encoding = 'UTF-8';
5) restore from backup
pg_restore -d foo foo.db
Are these the correct steps to perform or is there an easier / in-place way?
Also, when I dump my old DB and restore it, will it be converted appropriately (e.g. it came from am SQL_ASCII encoding and its going into a UTF-8 database)?
Thank you
/Cody
On Fri, Sep 30, 2011 at 11:25 AM, Cody Caughlan <toolbag@gmail.com> wrote:
I would like to change my server_encoding which is currently SQL_ASCII to UTF8.
I have existing data that I would like to keep.
From my understanding of the steps I need to:
1) alter the template1 database encoding via
UPDATE pg_database SET encoding = 6 where datname IN ('template0', 'template1');
Just create database using template0 as template and you can skip this step ^^
Are these the correct steps to perform or is there an easier / in-place way?
Also, when I dump my old DB and restore it, will it be converted appropriately (e.g. it came from am SQL_ASCII encoding and its going into a UTF-8 database)?
You might need to set client encoding when restoring. Or use iconv to
convert from one encoding to another, which is what I usually do.
Note that it's VERY likely you'll have data in a SQL_ASCII db that
won't go into a UTF8 database without some lossiness.
Thanks Scott. See below:
On Fri, Sep 30, 2011 at 11:05 AM, Scott Marlowe <scott.marlowe@gmail.com>wrote:
On Fri, Sep 30, 2011 at 11:25 AM, Cody Caughlan <toolbag@gmail.com> wrote:
I would like to change my server_encoding which is currently SQL_ASCII to
UTF8.
I have existing data that I would like to keep.
From my understanding of the steps I need to:
1) alter the template1 database encoding via
UPDATE pg_database SET encoding = 6 where datname IN ('template0',
'template1');
Just create database using template0 as template and you can skip this step
^^
Wouldn't this only work if my template0 was UTF8 itself?
=> select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
----------------------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCII
So it appears both template0 & template1 are SQL_ASCII, so how would
creating from a new DB from template0 be any different than template1?
Are these the correct steps to perform or is there an easier / in-place
way?
Also, when I dump my old DB and restore it, will it be converted
appropriately (e.g. it came from am SQL_ASCII encoding and its going into a
UTF-8 database)?You might need to set client encoding when restoring. Or use iconv to
convert from one encoding to another, which is what I usually do.
Note that it's VERY likely you'll have data in a SQL_ASCII db that
won't go into a UTF8 database without some lossiness.
Yes, I see this might be the case. From my playing around with iconv I
cannot even properly do the conversion:
$ pg_dump -Fp foo > foo.sql
$ file -i foo.sql
foo.sql: text/plain; charset=us-ascii
$ iconv -f utf-8 foo.sql > utf8.sql
iconv: illegal input sequence at position 2512661
Uh oh... I cannot event convert it?
Whats my next step at this point if I cannot even convert my data? I'd be OK
with some lossiness.
Thanks again
/Cody
On Fri, Sep 30, 2011 at 12:26 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Thanks Scott. See below:
On Fri, Sep 30, 2011 at 11:05 AM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 11:25 AM, Cody Caughlan <toolbag@gmail.com> wrote:
I would like to change my server_encoding which is currently SQL_ASCII
to UTF8.I have existing data that I would like to keep.
From my understanding of the steps I need to:
1) alter the template1 database encoding via
UPDATE pg_database SET encoding = 6 where datname IN ('template0',
'template1');Just create database using template0 as template and you can skip this
step ^^Wouldn't this only work if my template0 was UTF8 itself?
=> select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
----------------------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCIISo it appears both template0 & template1 are SQL_ASCII, so how would
creating from a new DB from template0 be any different than template1?
Well, let's try, shall we? From a freshly created cluster on my
laptop, running 8.4:
smarlowe=# select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
-----------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCII
smarlowe | SQL_ASCII
(4 rows)
smarlowe=# create database j template template0 encoding 'UTF8';
CREATE DATABASE
Seems to work.
P.s. I'm not sure why it works, I just know that it does. :)
Are these the correct steps to perform or is there an easier / in-place
way?Also, when I dump my old DB and restore it, will it be converted
appropriately (e.g. it came from am SQL_ASCII encoding and its going into a
UTF-8 database)?You might need to set client encoding when restoring. Or use iconv to
convert from one encoding to another, which is what I usually do.
Note that it's VERY likely you'll have data in a SQL_ASCII db that
won't go into a UTF8 database without some lossiness.Yes, I see this might be the case. From my playing around with iconv I
cannot even properly do the conversion:
$ pg_dump -Fp foo > foo.sql
$ file -i foo.sql
foo.sql: text/plain; charset=us-ascii
$ iconv -f utf-8 foo.sql > utf8.sql
iconv: illegal input sequence at position 2512661
I think you got it backwards, the -f should be somthing other than
utf-8 right? That's what the -t should be right? Try iconv without a
-f switch and a -t of utf-8 and see what happens...
Scott Marlowe <scott.marlowe@gmail.com> writes:
On Fri, Sep 30, 2011 at 12:26 PM, Cody Caughlan <toolbag@gmail.com> wrote:
So it appears both template0 & template1 are SQL_ASCII, so how would
creating from a new DB from template0 be any different than template1?
P.s. I'm not sure why it works, I just know that it does. :)
CREATE DATABASE assumes that template0 cannot contain any non-ASCII
data, so it's okay to clone it and then pretend that the result has some
other encoding. The same assumption cannot be made for template1, since
that's user-modifiable.
regards, tom lane
Please see below.
On Fri, Sep 30, 2011 at 11:38 AM, Scott Marlowe <scott.marlowe@gmail.com>wrote:
On Fri, Sep 30, 2011 at 12:26 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Thanks Scott. See below:
On Fri, Sep 30, 2011 at 11:05 AM, Scott Marlowe <scott.marlowe@gmail.com
wrote:
On Fri, Sep 30, 2011 at 11:25 AM, Cody Caughlan <toolbag@gmail.com>
wrote:
I would like to change my server_encoding which is currently SQL_ASCII
to UTF8.I have existing data that I would like to keep.
From my understanding of the steps I need to:
1) alter the template1 database encoding via
UPDATE pg_database SET encoding = 6 where datname IN ('template0',
'template1');Just create database using template0 as template and you can skip this
step ^^Wouldn't this only work if my template0 was UTF8 itself?
=> select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
----------------------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCIISo it appears both template0 & template1 are SQL_ASCII, so how would
creating from a new DB from template0 be any different than template1?Well, let's try, shall we? From a freshly created cluster on my
laptop, running 8.4:smarlowe=# select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
-----------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCII
smarlowe | SQL_ASCII
(4 rows)smarlowe=# create database j template template0 encoding 'UTF8';
CREATE DATABASESeems to work.
P.s. I'm not sure why it works, I just know that it does. :)
Ok, I see what you mean. This would create a new DB with the proper
encoding. Which is "fine", and probably what I will do. I guess I see an
ideal scenario being one where we permanently convert the template encoding
to UTF8 so going forward I dont have to worry about forgetting to adding the
encoding= 'UTF8' for every new DB I create.
Are these the correct steps to perform or is there an easier /
in-place
way?
Also, when I dump my old DB and restore it, will it be converted
appropriately (e.g. it came from am SQL_ASCII encoding and its goinginto a
UTF-8 database)?
You might need to set client encoding when restoring. Or use iconv to
convert from one encoding to another, which is what I usually do.
Note that it's VERY likely you'll have data in a SQL_ASCII db that
won't go into a UTF8 database without some lossiness.Yes, I see this might be the case. From my playing around with iconv I
cannot even properly do the conversion:
$ pg_dump -Fp foo > foo.sql
$ file -i foo.sql
foo.sql: text/plain; charset=us-ascii
$ iconv -f utf-8 foo.sql > utf8.sql
iconv: illegal input sequence at position 2512661I think you got it backwards, the -f should be somthing other than
utf-8 right? That's what the -t should be right? Try iconv without a
-f switch and a -t of utf-8 and see what happens...
You're right, I had -f when I needed -t. I tried it again with the same
error:
$ iconv -t utf-8 foo.sql > utf.sql
iconv: illegal input sequence at position 2512661
On Fri, Sep 30, 2011 at 12:46 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Please see below.
On Fri, Sep 30, 2011 at 11:38 AM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 12:26 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Thanks Scott. See below:
On Fri, Sep 30, 2011 at 11:05 AM, Scott Marlowe
<scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 11:25 AM, Cody Caughlan <toolbag@gmail.com>
wrote:I would like to change my server_encoding which is currently
SQL_ASCII
to UTF8.I have existing data that I would like to keep.
From my understanding of the steps I need to:
1) alter the template1 database encoding via
UPDATE pg_database SET encoding = 6 where datname IN ('template0',
'template1');Just create database using template0 as template and you can skip this
step ^^Wouldn't this only work if my template0 was UTF8 itself?
=> select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
----------------------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCIISo it appears both template0 & template1 are SQL_ASCII, so how would
creating from a new DB from template0 be any different than template1?Well, let's try, shall we? From a freshly created cluster on my
laptop, running 8.4:smarlowe=# select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
-----------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCII
smarlowe | SQL_ASCII
(4 rows)smarlowe=# create database j template template0 encoding 'UTF8';
CREATE DATABASESeems to work.
P.s. I'm not sure why it works, I just know that it does. :)
Ok, I see what you mean. This would create a new DB with the proper
encoding. Which is "fine", and probably what I will do. I guess I see an
ideal scenario being one where we permanently convert the template encoding
to UTF8 so going forward I dont have to worry about forgetting to adding the
encoding= 'UTF8' for every new DB I create.
Ah ok. The way I fix that is this:
update pg_database set datistemplate = false where datname='template1';
drop database template1;
create database template1 template template0 encoding 'UTF8';
But your way would likely work too.
I think you got it backwards, the -f should be somthing other than
utf-8 right? That's what the -t should be right? Try iconv without a
-f switch and a -t of utf-8 and see what happens...You're right, I had -f when I needed -t. I tried it again with the same
error:
$ iconv -t utf-8 foo.sql > utf.sql
iconv: illegal input sequence at position 2512661
Any idea waht the actual encoding of your source database is?
SQL_ASCII is basically not really ascii, more like anything goes.
Cody Caughlan <toolbag@gmail.com> writes:
Ok, I see what you mean. This would create a new DB with the proper
encoding. Which is "fine", and probably what I will do. I guess I see an
ideal scenario being one where we permanently convert the template encoding
to UTF8 so going forward I dont have to worry about forgetting to adding the
encoding= 'UTF8' for every new DB I create.
Well, if you're feeling brave you can mark template0 as having utf8
encoding via a manual update to pg_database. In theory that should
be safe enough. If you know template1 doesn't, and never has, contained
any non-ASCII data, you could do the same to it ... but it would be a
lot safer to drop it and recreate from template0. See
http://wiki.postgresql.org/wiki/Adventures_in_PostgreSQL,_Episode_1
for some context.
regards, tom lane
On Fri, Sep 30, 2011 at 12:31 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:
On Fri, Sep 30, 2011 at 12:46 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Please see below.
On Fri, Sep 30, 2011 at 11:38 AM, Scott Marlowe <scott.marlowe@gmail.com
wrote:
On Fri, Sep 30, 2011 at 12:26 PM, Cody Caughlan <toolbag@gmail.com>
wrote:
Thanks Scott. See below:
On Fri, Sep 30, 2011 at 11:05 AM, Scott Marlowe
<scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 11:25 AM, Cody Caughlan <toolbag@gmail.com>
wrote:I would like to change my server_encoding which is currently
SQL_ASCII
to UTF8.I have existing data that I would like to keep.
From my understanding of the steps I need to:
1) alter the template1 database encoding via
UPDATE pg_database SET encoding = 6 where datname IN ('template0',
'template1');Just create database using template0 as template and you can skip
this
step ^^
Wouldn't this only work if my template0 was UTF8 itself?
=> select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
----------------------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCIISo it appears both template0 & template1 are SQL_ASCII, so how would
creating from a new DB from template0 be any different than template1?Well, let's try, shall we? From a freshly created cluster on my
laptop, running 8.4:smarlowe=# select datname, pg_encoding_to_char(encoding) from
pg_database;
datname | pg_encoding_to_char
-----------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCII
smarlowe | SQL_ASCII
(4 rows)smarlowe=# create database j template template0 encoding 'UTF8';
CREATE DATABASESeems to work.
P.s. I'm not sure why it works, I just know that it does. :)
Ok, I see what you mean. This would create a new DB with the proper
encoding. Which is "fine", and probably what I will do. I guess I see an
ideal scenario being one where we permanently convert the templateencoding
to UTF8 so going forward I dont have to worry about forgetting to adding
the
encoding= 'UTF8' for every new DB I create.
Ah ok. The way I fix that is this:
update pg_database set datistemplate = false where datname='template1';
drop database template1;
create database template1 template template0 encoding 'UTF8';But your way would likely work too.
I think you got it backwards, the -f should be somthing other than
utf-8 right? That's what the -t should be right? Try iconv without a
-f switch and a -t of utf-8 and see what happens...You're right, I had -f when I needed -t. I tried it again with the same
error:
$ iconv -t utf-8 foo.sql > utf.sql
iconv: illegal input sequence at position 2512661Any idea waht the actual encoding of your source database is?
SQL_ASCII is basically not really ascii, more like anything goes.
How would I find this? pg_database says my DB is SQL_ASCII.
"show all" says
client_encoding = SQL_ASCII
server_encoding = SQL_ASCII
On Fri, Sep 30, 2011 at 1:35 PM, Cody Caughlan <toolbag@gmail.com> wrote:
On Fri, Sep 30, 2011 at 12:31 PM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 12:46 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Please see below.
On Fri, Sep 30, 2011 at 11:38 AM, Scott Marlowe
<scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 12:26 PM, Cody Caughlan <toolbag@gmail.com>
wrote:Thanks Scott. See below:
On Fri, Sep 30, 2011 at 11:05 AM, Scott Marlowe
<scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 11:25 AM, Cody Caughlan <toolbag@gmail.com>
wrote:I would like to change my server_encoding which is currently
SQL_ASCII
to UTF8.I have existing data that I would like to keep.
From my understanding of the steps I need to:
1) alter the template1 database encoding via
UPDATE pg_database SET encoding = 6 where datname IN ('template0',
'template1');Just create database using template0 as template and you can skip
this
step ^^Wouldn't this only work if my template0 was UTF8 itself?
=> select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
----------------------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCIISo it appears both template0 & template1 are SQL_ASCII, so how would
creating from a new DB from template0 be any different than
template1?Well, let's try, shall we? From a freshly created cluster on my
laptop, running 8.4:smarlowe=# select datname, pg_encoding_to_char(encoding) from
pg_database;
datname | pg_encoding_to_char
-----------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCII
smarlowe | SQL_ASCII
(4 rows)smarlowe=# create database j template template0 encoding 'UTF8';
CREATE DATABASESeems to work.
P.s. I'm not sure why it works, I just know that it does. :)
Ok, I see what you mean. This would create a new DB with the proper
encoding. Which is "fine", and probably what I will do. I guess I see an
ideal scenario being one where we permanently convert the template
encoding
to UTF8 so going forward I dont have to worry about forgetting to adding
the
encoding= 'UTF8' for every new DB I create.Ah ok. The way I fix that is this:
update pg_database set datistemplate = false where datname='template1';
drop database template1;
create database template1 template template0 encoding 'UTF8';But your way would likely work too.
I think you got it backwards, the -f should be somthing other than
utf-8 right? That's what the -t should be right? Try iconv without a
-f switch and a -t of utf-8 and see what happens...You're right, I had -f when I needed -t. I tried it again with the same
error:
$ iconv -t utf-8 foo.sql > utf.sql
iconv: illegal input sequence at position 2512661Any idea waht the actual encoding of your source database is?
SQL_ASCII is basically not really ascii, more like anything goes.How would I find this? pg_database says my DB is SQL_ASCII.
"show all" says
client_encoding = SQL_ASCII
server_encoding = SQL_ASCII
It would have been set by the application accessing postgresql and
inserting the data. I.e. was it a windows app using a typical windows
encoding? etc.
Its a Rails app and I do have:
encoding: utf8
Set in my DB configuration.
On Fri, Sep 30, 2011 at 12:38 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:
Show quoted text
On Fri, Sep 30, 2011 at 1:35 PM, Cody Caughlan <toolbag@gmail.com> wrote:
On Fri, Sep 30, 2011 at 12:31 PM, Scott Marlowe <scott.marlowe@gmail.com
wrote:
On Fri, Sep 30, 2011 at 12:46 PM, Cody Caughlan <toolbag@gmail.com>
wrote:
Please see below.
On Fri, Sep 30, 2011 at 11:38 AM, Scott Marlowe
<scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 12:26 PM, Cody Caughlan <toolbag@gmail.com>
wrote:Thanks Scott. See below:
On Fri, Sep 30, 2011 at 11:05 AM, Scott Marlowe
<scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 11:25 AM, Cody Caughlan <
toolbag@gmail.com>
wrote:
I would like to change my server_encoding which is currently
SQL_ASCII
to UTF8.I have existing data that I would like to keep.
From my understanding of the steps I need to:
1) alter the template1 database encoding via
UPDATE pg_database SET encoding = 6 where datname IN
('template0',
'template1');
Just create database using template0 as template and you can skip
this
step ^^Wouldn't this only work if my template0 was UTF8 itself?
=> select datname, pg_encoding_to_char(encoding) from pg_database;
datname | pg_encoding_to_char
----------------------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCIISo it appears both template0 & template1 are SQL_ASCII, so how
would
creating from a new DB from template0 be any different than
template1?Well, let's try, shall we? From a freshly created cluster on my
laptop, running 8.4:smarlowe=# select datname, pg_encoding_to_char(encoding) from
pg_database;
datname | pg_encoding_to_char
-----------+---------------------
template1 | SQL_ASCII
template0 | SQL_ASCII
postgres | SQL_ASCII
smarlowe | SQL_ASCII
(4 rows)smarlowe=# create database j template template0 encoding 'UTF8';
CREATE DATABASESeems to work.
P.s. I'm not sure why it works, I just know that it does. :)
Ok, I see what you mean. This would create a new DB with the proper
encoding. Which is "fine", and probably what I will do. I guess I seean
ideal scenario being one where we permanently convert the template
encoding
to UTF8 so going forward I dont have to worry about forgetting toadding
the
encoding= 'UTF8' for every new DB I create.Ah ok. The way I fix that is this:
update pg_database set datistemplate = false where datname='template1';
drop database template1;
create database template1 template template0 encoding 'UTF8';But your way would likely work too.
I think you got it backwards, the -f should be somthing other than
utf-8 right? That's what the -t should be right? Try iconv withouta
-f switch and a -t of utf-8 and see what happens...
You're right, I had -f when I needed -t. I tried it again with the
same
error:
$ iconv -t utf-8 foo.sql > utf.sql
iconv: illegal input sequence at position 2512661Any idea waht the actual encoding of your source database is?
SQL_ASCII is basically not really ascii, more like anything goes.How would I find this? pg_database says my DB is SQL_ASCII.
"show all" says
client_encoding = SQL_ASCII
server_encoding = SQL_ASCIIIt would have been set by the application accessing postgresql and
inserting the data. I.e. was it a windows app using a typical windows
encoding? etc.
On Fri, Sep 30, 2011 at 1:39 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Its a Rails app and I do have:
encoding: utf8
Hmmm, if you try this does it work (mostly)?
iconv -f utf-8 -t utf-8 -c < infile > outfile
On Fri, Sep 30, 2011 at 1:41 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Fri, Sep 30, 2011 at 1:39 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Its a Rails app and I do have:
encoding: utf8Hmmm, if you try this does it work (mostly)?
iconv -f utf-8 -t utf-8 -c < infile > outfile
If that doesn't work try:
iconv -f utf-8 -t utf-8//IGNORE -c < infile > outfile
That worked, but "file" shows no difference:
$ iconv -f utf-8 -t utf-8 -c foo.sql > utf.sql
$ file -i foo.sql
foo.sql: text/plain; charset=us-ascii
$file -i utf.sql
utf.sql: text/plain; charset=us-ascii
So iconv didnt actually convert the file OR does is the "file" command just
ignorant?
On Fri, Sep 30, 2011 at 12:41 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:
Show quoted text
On Fri, Sep 30, 2011 at 1:39 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Its a Rails app and I do have:
encoding: utf8Hmmm, if you try this does it work (mostly)?
iconv -f utf-8 -t utf-8 -c < infile > outfile
On Fri, Sep 30, 2011 at 1:45 PM, Cody Caughlan <toolbag@gmail.com> wrote:
That worked, but "file" shows no difference:
$ iconv -f utf-8 -t utf-8 -c foo.sql > utf.sql
$ file -i foo.sql
foo.sql: text/plain; charset=us-ascii
$file -i utf.sql
utf.sql: text/plain; charset=us-ascii
So iconv didnt actually convert the file OR does is the "file" command just
ignorant?
Not sure. try loading the dump into the UTF-8 DB in postgres and see
what happens I guess?
Please see below.
On Fri, Sep 30, 2011 at 1:12 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:
On Fri, Sep 30, 2011 at 1:45 PM, Cody Caughlan <toolbag@gmail.com> wrote:
That worked, but "file" shows no difference:
$ iconv -f utf-8 -t utf-8 -c foo.sql > utf.sql
$ file -i foo.sql
foo.sql: text/plain; charset=us-ascii
$file -i utf.sql
utf.sql: text/plain; charset=us-ascii
So iconv didnt actually convert the file OR does is the "file" commandjust
ignorant?
Not sure. try loading the dump into the UTF-8 DB in postgres and see
what happens I guess?
Uh oh.
On the remote machine:
$ pg_dump -Fc -E UTF8 foo > foo.sql
Then I've created a new local DB with UTF8 encoding and I try to restore
this dump into it:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2342; 0 17086 TABLE DATA
wine_books vinosmith
pg_restore: [archiver (db)] COPY failed for table "wine_books": ERROR:
invalid byte sequence for encoding "UTF8": 0xc309
CONTEXT: COPY wine_books, line 1147
WARNING: errors ignored on restore: 1
And sure enough the table "wine_books" is empty. Not good.
On Fri, Sep 30, 2011 at 2:57 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Please see below.
On Fri, Sep 30, 2011 at 1:12 PM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 1:45 PM, Cody Caughlan <toolbag@gmail.com> wrote:
That worked, but "file" shows no difference:
$ iconv -f utf-8 -t utf-8 -c foo.sql > utf.sql
$ file -i foo.sql
foo.sql: text/plain; charset=us-ascii
$file -i utf.sql
utf.sql: text/plain; charset=us-ascii
So iconv didnt actually convert the file OR does is the "file" command
just
ignorant?Not sure. try loading the dump into the UTF-8 DB in postgres and see
what happens I guess?Uh oh.
On the remote machine:
$ pg_dump -Fc -E UTF8 foo > foo.sql
Then I've created a new local DB with UTF8 encoding and I try to restore
this dump into it:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2342; 0 17086 TABLE DATA
wine_books vinosmith
pg_restore: [archiver (db)] COPY failed for table "wine_books": ERROR:
invalid byte sequence for encoding "UTF8": 0xc309
CONTEXT: COPY wine_books, line 1147
WARNING: errors ignored on restore: 1
And sure enough the table "wine_books" is empty. Not good.
You may have to hunt down that one bad line (1147) and chop it out /
edit it so it works.
Thanks y'all for your help on this.
I took this opportunity to upgrade to 9.1.1 which is UTF8 by default and I
ended up manually cleaning up the borked data by hand (there wasn't that
much).
So all is well now.
Thanks again.
/Cody
On Fri, Sep 30, 2011 at 3:37 PM, Scott Marlowe <scott.marlowe@gmail.com>wrote:
Show quoted text
On Fri, Sep 30, 2011 at 2:57 PM, Cody Caughlan <toolbag@gmail.com> wrote:
Please see below.
On Fri, Sep 30, 2011 at 1:12 PM, Scott Marlowe <scott.marlowe@gmail.com>
wrote:On Fri, Sep 30, 2011 at 1:45 PM, Cody Caughlan <toolbag@gmail.com>
wrote:
That worked, but "file" shows no difference:
$ iconv -f utf-8 -t utf-8 -c foo.sql > utf.sql
$ file -i foo.sql
foo.sql: text/plain; charset=us-ascii
$file -i utf.sql
utf.sql: text/plain; charset=us-ascii
So iconv didnt actually convert the file OR does is the "file" command
just
ignorant?Not sure. try loading the dump into the UTF-8 DB in postgres and see
what happens I guess?Uh oh.
On the remote machine:
$ pg_dump -Fc -E UTF8 foo > foo.sql
Then I've created a new local DB with UTF8 encoding and I try to restore
this dump into it:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2342; 0 17086 TABLE DATA
wine_books vinosmith
pg_restore: [archiver (db)] COPY failed for table "wine_books": ERROR:
invalid byte sequence for encoding "UTF8": 0xc309
CONTEXT: COPY wine_books, line 1147
WARNING: errors ignored on restore: 1
And sure enough the table "wine_books" is empty. Not good.You may have to hunt down that one bad line (1147) and chop it out /
edit it so it works.