Proposed TODO: --encoding option for pg_dump
Folks,
There's no time to do this for 8.1, but I'd like to get it on the books for
8.2:
The Problem: Occasionally a DBA needs to dump a database to a new
encoding. In instances where the current encoding (or lack of an
encoding, like SQL_ASCII) is poorly supported on the target database
server, it can be useful to dump into a particular encoding. But,
currently the only way to set the encoding of a pg_dump file is to change
client_encoding in postgresql.conf and restart postmaster. This is more
than a little awkward for production systems.
The TODO: add an --encoding=[encoding name] option to pg_dump. This would
set client_encoding for pg_dump's session(s).
--
--Josh
Josh Berkus
Aglio Database Solutions
San Francisco
Josh Berkus wrote:
currently the only way to set the encoding of a pg_dump file is to
change client_encoding in postgresql.conf and restart postmaster.
Another way is to set the environment variable PGCLIENTENCODING.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
On Tue, 28 Jun 2005, Josh Berkus wrote:
The TODO: add an --encoding=[encoding name] option to pg_dump. This would
set client_encoding for pg_dump's session(s).
What about just using the PGCLIENTENCODING environment variable?
Kris Jurka
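For reference, the environment-variable route suggested in the two replies above can also be driven from a small wrapper program. This is only a sketch, assuming a POSIX system with pg_dump on the PATH; the helper names set_dump_client_encoding and run_pg_dump_with_encoding are made up for illustration and are not part of PostgreSQL.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: export PGCLIENTENCODING so that a pg_dump started
 * from this process inherits it.
 * Returns 0 on success, -1 on failure or if no encoding name was given.
 */
int
set_dump_client_encoding(const char *encoding)
{
    if (encoding == NULL || encoding[0] == '\0')
        return -1;
    /* third argument 1 = overwrite any value already in the environment */
    return setenv("PGCLIENTENCODING", encoding, 1);
}

/*
 * Hypothetical wrapper: set the encoding, then replace this process with
 * pg_dump.  It only returns (with -1) if something went wrong.
 */
int
run_pg_dump_with_encoding(const char *encoding, const char *dbname)
{
    if (set_dump_client_encoding(encoding) != 0)
        return -1;
    execlp("pg_dump", "pg_dump", dbname, (char *) NULL);
    perror("execlp pg_dump");
    return -1;
}
```

A caller would use something like run_pg_dump_with_encoding("UTF8", "mydb"), where "mydb" is a placeholder database name. No server restart is involved, since the variable only affects the client session.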
There's no time to do this for 8.1, but I'd like to get it on the books for
8.2:
The Problem: Occasionally a DBA needs to dump a database to a new
encoding. In instances where the current encoding (or lack of an
encoding, like SQL_ASCII) is poorly supported on the target database
server, it can be useful to dump into a particular encoding. But,
currently the only way to set the encoding of a pg_dump file is to change
client_encoding in postgresql.conf and restart postmaster. This is more
than a little awkward for production systems.
The TODO: add an --encoding=[encoding name] option to pg_dump. This would
set client_encoding for pg_dump's session(s).
I *think* that's easy enough to do in time for 8.1. Trivial patch
attached. I hope it's enough :-) It passed my very quick testing...
(Yup, I read the mails about PGCLIENTENCODING, but an option to pg_dump
is certainly easier)
//Magnus
Attachments:
pg_dump.diff (application/octet-stream)
*** pg_dump.c.orig 2005-06-28 20:49:10.000000000 +0100
--- pg_dump.c 2005-06-28 21:16:21.000000000 +0100
***************
*** 183,188 ****
--- 183,189 ----
const char *pghost = NULL;
const char *pgport = NULL;
const char *username = NULL;
+ const char *dumpencoding = NULL;
bool oids = false;
TableInfo *tblinfo;
int numTables;
***************
*** 229,234 ****
--- 230,236 ----
{"no-privileges", no_argument, NULL, 'x'},
{"no-acl", no_argument, NULL, 'x'},
{"compress", required_argument, NULL, 'Z'},
+ {"encoding", required_argument, NULL, 'E'},
{"help", no_argument, NULL, '?'},
{"version", no_argument, NULL, 'V'},
***************
*** 277,283 ****
}
}
! while ((c = getopt_long(argc, argv, "abcCdDf:F:h:in:oOp:RsS:t:uU:vWxX:Z:",
long_options, &optindex)) != -1)
{
switch (c)
--- 279,285 ----
}
}
! while ((c = getopt_long(argc, argv, "abcCdDE:f:F:h:in:oOp:RsS:t:uU:vWxX:Z:",
long_options, &optindex)) != -1)
{
switch (c)
***************
*** 309,314 ****
--- 311,320 ----
attrNames = true;
break;
+ case 'E': /* Dump encoding */
+ dumpencoding = optarg;
+ break;
+
case 'f':
filename = optarg;
break;
***************
*** 533,538 ****
--- 539,553 ----
/* Set the datestyle to ISO to ensure the dump's portability */
do_sql_command(g_conn, "SET DATESTYLE = ISO");
+ /* Set the client encoding */
+ if (dumpencoding)
+ {
+ char *cmd = malloc(strlen(dumpencoding) + 32);
+ sprintf(cmd,"SET client_encoding='%s'", dumpencoding);
+ do_sql_command(g_conn, cmd);
+ free(cmd);
+ }
+
/*
* If supported, set extra_float_digits so that we can dump float data
* exactly (given correctly implemented float I/O code, anyway)
***************
*** 675,680 ****
--- 690,696 ----
printf(_(" -C, --create include commands to create database in dump\n"));
printf(_(" -d, --inserts dump data as INSERT, rather than COPY, commands\n"));
printf(_(" -D, --column-inserts dump data as INSERT commands with column names\n"));
+ printf(_(" -E, --encoding=ENCODING dump the data in encoding ENCODING\n"));
printf(_(" -n, --schema=SCHEMA dump the named schema only\n"));
printf(_(" -o, --oids include OIDs in dump\n"));
printf(_(" -O, --no-owner skip restoration of object ownership\n"
pg_dump.sgml.diff (application/octet-stream)
*** pg_dump.sgml.orig 2005-06-28 21:19:28.000000000 +0100
--- pg_dump.sgml 2005-06-28 21:21:27.000000000 +0100
***************
*** 206,211 ****
--- 206,222 ----
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>-E <replaceable class="parameter">encoding</replaceable></option></term>
+ <listitem>
+ <para>
+ Create the dump in the specified encoding. By default, the dump is
+ created in the database encoding.
+ </para>
+ </listitem>
+ </varlistentry>
+
+
<varlistentry>
<term><option>-f <replaceable class="parameter">file</replaceable></option></term>
<term><option>--file=<replaceable class="parameter">file</replaceable></option></term>
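The pg_dump.c patch above builds its SET command with malloc and sprintf, without checking the allocation or the encoding name. A more defensive variant might look like the sketch below. The helper name build_set_encoding_cmd is hypothetical, and the restriction to letters, digits, '_' and '-' is a conservative assumption about encoding names, not a rule taken from the server. libpq also provides PQsetClientEncoding(), which may be a simpler route than sending SET by hand.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: write
 * "SET client_encoding = '<name>'" into buf.
 * Returns 0 on success, -1 if the name is empty, contains characters
 * other than letters, digits, '_' and '-', or does not fit in buf.
 */
int
build_set_encoding_cmd(char *buf, size_t buflen, const char *encoding)
{
    const char *p;
    int         n;

    if (buf == NULL || buflen == 0 || encoding == NULL || encoding[0] == '\0')
        return -1;

    /* Reject quotes and anything else that could break out of the literal */
    for (p = encoding; *p != '\0'; p++)
    {
        if (!isalnum((unsigned char) *p) && *p != '_' && *p != '-')
            return -1;
    }

    /* snprintf reports the length it wanted; treat truncation as an error */
    n = snprintf(buf, buflen, "SET client_encoding = '%s'", encoding);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return 0;
}
```

With a fixed-size stack buffer this avoids the malloc/free pair entirely, and the resulting string could be passed to do_sql_command() in the same place as in the patch.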
On Tue, Jun 28, 2005 at 10:24:19PM +0200, Magnus Hagander wrote:
I *think* that's easy enough to do in time for 8.1. Trivial patch
attached. I hope it's enough :-) It passed my very quick testing...
(Yup, I read the mails about PGCLIENTENCODING, but an option to pg_dump
is certainly easier)
You forgot to document the long option, I think.
--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"No necesitamos banderas
No reconocemos fronteras" (Jorge González)
I *think* that's easy enough to do in time for 8.1. Trivial patch
attached. I hope it's enough :-) It passed my very quick testing...
(Yup, I read the mails about PGCLIENTENCODING, but an option to
pg_dump is certainly easier)
You forgot to document the long option, I think.
Oops. Fixed. Thanks.
//Magnus
Attachments:
pg_dump.sgml.diff (application/octet-stream)
*** pg_dump.sgml.orig 2005-06-28 21:19:28.000000000 +0100
--- pg_dump.sgml 2005-06-28 21:41:40.000000000 +0100
***************
*** 206,211 ****
--- 206,223 ----
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>-E <replaceable class="parameter">encoding</replaceable></option></term>
+ <term><option>--encoding=<replaceable class="parameter">encoding</replaceable></option></term>
+ <listitem>
+ <para>
+ Create the dump in the specified encoding. By default, the dump is
+ created in the database encoding.
+ </para>
+ </listitem>
+ </varlistentry>
+
+
<varlistentry>
<term><option>-f <replaceable class="parameter">file</replaceable></option></term>
<term><option>--file=<replaceable class="parameter">file</replaceable></option></term>
Alvaro Herrera wrote:
On Tue, Jun 28, 2005 at 10:24:19PM +0200, Magnus Hagander wrote:
I *think* that's easy enough to do in time for 8.1. Trivial patch
attached. I hope it's enough :-) It passed my very quick testing...
(Yup, I read the mails about PGCLIENTENCODING, but an option to pg_dump
is certainly easier)
You forgot to document the long option, I think.
Are the man pages generated from the sgml docs? Have never had a look at
that.
Best Regards,
Michael Paesold
I *think* that's easy enough to do in time for 8.1. Trivial patch
attached. I hope it's enough :-) It passed my very quick testing...
(Yup, I read the mails about PGCLIENTENCODING, but an option to
pg_dump is certainly easier)
You forgot to document the long option, I think.
Are the man pages generated from the sgml docs? Have never
had a look at that.
Yes - using docbook2man.
//Magnus
I support adding the option, because I've seen too many of
our clients get a 'bad' dump just because they didn't set PGCLIENTENCODING
correctly (mostly because they use UTF8 as the database encoding
but some other encoding, like GBK, as the client encoding; some
characters break the automatic conversion in the present version, and
setting PGCLIENTENCODING to UTF8 fixes the problem).
Adding such a switch would remind DBAs that an encoding
conversion takes place. In fact, I even think that we should use the
database encoding to dump data regardless of the PGCLIENTENCODING setting
(unless the user sets the --encoding switch explicitly), but I think
that might break someone's application somewhere. :(
regards laser
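To make the precedence discussed in this thread concrete, here is a small sketch of how the effective dump encoding would be chosen with the patch applied: an explicit --encoding wins because it issues SET after connecting, otherwise PGCLIENTENCODING applies, otherwise the database encoding is used (as the proposed documentation states). The function name effective_dump_encoding is hypothetical, and treating an empty string as "not set" is an assumption of this sketch.

```c
#include <stddef.h>

/*
 * Hypothetical illustration of the precedence discussed in the thread:
 *   1. the --encoding option, if given
 *   2. the PGCLIENTENCODING environment variable, if set
 *   3. the database encoding
 * Empty strings are treated as "not set" (an assumption of this sketch).
 */
const char *
effective_dump_encoding(const char *option_encoding,
                        const char *env_encoding,
                        const char *database_encoding)
{
    if (option_encoding != NULL && option_encoding[0] != '\0')
        return option_encoding;
    if (env_encoding != NULL && env_encoding[0] != '\0')
        return env_encoding;
    return database_encoding;
}
```

The alternative floated above (ignore PGCLIENTENCODING unless --encoding is given) would amount to dropping the second check.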
Add pg_dump --encoding.
Patch applied. Thanks.
---------------------------------------------------------------------------
Magnus Hagander wrote:
I *think* that's easy enough to do in time for 8.1. Trivial patch
attached. I hope it's enough :-) It passed my very quick testing...
(Yup, I read the mails about PGCLIENTENCODING, but an option to
pg_dump is certainly easier)
You forgot to document the long option, I think.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073