reload-through-the-top-parent switch for the partition table
/messages/by-id/CA+TgmoZFn7TJ7QBsFatnuEE%3DGYGdZSNXqr9489n5JBsdy5rFfA%40mail.gmail.com
In the above thread, adding a reload-through-the-top-parent switch for
partitioned tables was pointed out as an important consideration. One
small step toward making hash partitioning usable is adding a switch to
pg_dump which reloads data through the top parent of a partitioned
table.
The attached patch implements this:
- Added a switch, reload-through-root (open to suggestions for the switch
name).
- Fixed dumpTableData to COPY into the root table when the
reload-through-root switch is given.
- Fixed dumpTableData_insert to generate the proper INSERT statement with
the reload-through-root switch.
- Added the reload-through-root switch to pg_dumpall.
Testing:
- Performed tests with multi-level partitioning and different schema
combinations.
- Tested a case where the parent and child attributes are in a different order.
Attaching the pg_dump output for the test below with --reload-through-root,
for both COPY and INSERTS; a rough sketch of the expected COPY output follows
the test case.
Testcase:
create schema a;
create schema b;
create schema c;
create table a.t1 ( a int, b int ) partition by list(a);
create table b.t1_p1 partition of a.t1 FOR VALUES in ( 1, 2, 3) partition
by list(b);
create table c.t1_p1_p1 partition of b.t1_p1 FOR VALUES in (10, 20 );
insert into a.t1 values ( 2 , 10 );
insert into a.t1 values ( 1 , 20 );
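For orientation only (the actual attached output is not reproduced here), a
hand-written sketch of how the data section of the dump could differ for the
test case above, assuming both sample rows land in c.t1_p1_p1; pg_dump
normally emits the leaf table name and relies on search_path, while the new
switch emits the schema-qualified root:
-- Rough sketch, not the attached pg_dump output.
-- Default behaviour: data is loaded straight into the leaf partition
-- (pg_dump sets search_path to schema c before this statement):
COPY t1_p1_p1 (a, b) FROM stdin;
2	10
1	20
\.
-- With --reload-through-root: data is routed through the qualified root,
-- so the destination server decides which partition each row lands in:
COPY a.t1 (a, b) FROM stdin;
2	10
1	20
\.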
My colleague Robert and I had a doubt about the ordering of TABLE
and TABLE_DATA. We thought earlier that reload-through-root might
not solve the problem discussed in the above-mentioned thread. But
later, looking into the code, I realized the sort order for DO_TABLE
and DO_TABLE_DATA is different, so we don't need to worry about that issue.
TODOs:
- Update documentation for pg_dump & pg_dumpall
Thanks,
Rushabh Lathia
www.EnterpriseDB.com
Attachments:
reload_through_root_v1_0001.patch (text/x-patch)
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 144068a..bfd49a8 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -157,6 +157,7 @@ typedef struct _dumpOptions
int outputNoTablespaces;
int use_setsessauth;
int enable_row_security;
+ int reload_through_root;
/* default, if no "inclusion" switches appear, is to dump everything */
bool include_everything;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index fb964d0..f0eacd2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -269,6 +269,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
const char *prefix, Archive *fout);
static char *get_synchronized_snapshot(Archive *fout);
static void setupDumpWorker(Archive *AHX);
+static TableInfo *getRootTableInfo(TableInfo *tbinfo);
int
@@ -345,6 +346,7 @@ main(int argc, char **argv)
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"reload-through-root", no_argument, &dopt.reload_through_root, 1},
{"role", required_argument, NULL, 3},
{"section", required_argument, NULL, 5},
{"serializable-deferrable", no_argument, &dopt.serializable_deferrable, 1},
@@ -959,6 +961,7 @@ help(const char *progname)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --reload-through-root reload partition table through the root relation\n"));
printf(_(" --section=SECTION dump named section (pre-data, data, or post-data)\n"));
printf(_(" --serializable-deferrable wait until the dump can run without anomalies\n"));
printf(_(" --snapshot=SNAPSHOT use given snapshot for the dump\n"));
@@ -1902,8 +1905,32 @@ dumpTableData_insert(Archive *fout, void *dcontext)
if (insertStmt == NULL)
{
insertStmt = createPQExpBuffer();
+
+ /*
+ * When reload-through-root is set, get the root relation name
+ * for the partition table, so that we can reload data through
+ * the root table.
+ */
+ if (dopt->reload_through_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we load data through the root, we will qualify
+ * the table name. This is needed because the search_path
+ * will have been set for the partition table.
+ */
+ classname = (char *) fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ classname = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
- fmtId(classname));
+ classname);
/* corner case for zero-column table */
if (nfields == 0)
@@ -2025,6 +2052,20 @@ dumpTableData_insert(Archive *fout, void *dcontext)
return 1;
}
+/*
+ * getRootTableInfo:
+ * get the root TableInfo for the given partition table.
+ */
+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+ Assert(tbinfo->ispartition);
+ Assert(tbinfo->numParents == 1);
+ if (tbinfo->parents[0]->ispartition)
+ return getRootTableInfo(tbinfo->parents[0]);
+
+ return tbinfo->parents[0];
+}
/*
* dumpTableData -
@@ -2041,14 +2082,37 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
PQExpBuffer clistBuf = createPQExpBuffer();
DataDumperPtr dumpFn;
char *copyStmt;
+ const char *copyFrom;
if (!dopt->dump_inserts)
{
/* Dump/restore using COPY */
dumpFn = dumpTableData_copy;
- /* must use 2 steps here 'cause fmtId is nonreentrant */
+
+ /*
+ * When reload-through-root is set, get the root relation name for the
+ * partition table, so that we can reload data through the root table.
+ */
+ if (dopt->reload_through_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we load data through the root, we will qualify the
+ * table name. This is needed because the search_path will have
+ * been set for the partition table.
+ */
+ copyFrom = fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ copyFrom = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(copyBuf, "COPY %s ",
- fmtId(tbinfo->dobj.name));
+ copyFrom);
appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
fmtCopyColumnList(tbinfo, clistBuf),
(tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index b14bb8e..1849581 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -80,6 +80,7 @@ static int no_subscriptions = 0;
static int no_unlogged_table_data = 0;
static int no_role_passwords = 0;
static int server_version;
+static int reload_through_root = 0;
static char role_catalog[10];
#define PG_AUTHID "pg_authid"
@@ -128,6 +129,7 @@ main(int argc, char *argv[])
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &no_tablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"reload-through-root", no_argument, &reload_through_root, 1},
{"role", required_argument, NULL, 3},
{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
{"no-publications", no_argument, &no_publications, 1},
@@ -385,6 +387,8 @@ main(int argc, char *argv[])
appendPQExpBufferStr(pgdumpopts, " --no-tablespaces");
if (quote_all_identifiers)
appendPQExpBufferStr(pgdumpopts, " --quote-all-identifiers");
+ if (reload_through_root)
+ appendPQExpBufferStr(pgdumpopts, " --reload-through-root");
if (use_setsessauth)
appendPQExpBufferStr(pgdumpopts, " --use-set-session-authorization");
if (no_publications)
@@ -606,6 +610,7 @@ help(void)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --reload-through-root reload partition table through the root relation\n"));
printf(_(" --use-set-session-authorization\n"
" use SET SESSION AUTHORIZATION commands instead of\n"
" ALTER OWNER commands to set ownership\n"));
On Tue, Aug 1, 2017 at 5:34 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
My colleague Robert and I had a doubt about the ordering of TABLE
and TABLE_DATA. We thought earlier that reload-through-root might
not solve the problem discussed in the above-mentioned thread. But
later, looking into the code, I realized the sort order for DO_TABLE
and DO_TABLE_DATA is different, so we don't need to worry about that issue.
Hmm. Does that mean that table data restoration will *absolutely
always* follow all CREATE TABLE commands, or is that something that's
*normally* true but potentially false if dependency sorting switches
things around?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Aug 2, 2017 at 3:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Aug 1, 2017 at 5:34 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
My colleague Robert and I had a doubt about the ordering of TABLE
and TABLE_DATA. We thought earlier that reload-through-root might
not solve the problem discussed in the above-mentioned thread. But
later, looking into the code, I realized the sort order for DO_TABLE
and DO_TABLE_DATA is different, so we don't need to worry about that issue.
Hmm. Does that mean that table data restoration will *absolutely
always* follow all CREATE TABLE commands, or is that something that's
*normally* true but potentially false if dependency sorting switches
things around?
Looking at the dbObjectTypePriority comments, it seems like data restoration
will *absolutely always* follow all CREATE TABLE commands.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Thanks,
Rushabh Lathia
www.EnterpriseDB.com
On Wed, Aug 2, 2017 at 1:01 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
Looking at the dbObjectTypePriority comments, it seems like data restoration
will *absolutely always* follow all CREATE TABLE commands.
Hmm. I wasn't very convinced by those comments, but Tom's commit
a1ef01fe163b304760088e3e30eb22036910a495 convinces me that it has to
work that way. So I think we are OK on that score.
The patch itself looks just fine on a quick glance, modulo the lack of
documentation, but I think we need to bikeshed the name of the flag.
--reload-through-root is clear as daylight to me, but I'm not sure
users will agree. The lack of the word "partition" is perhaps a
significant flaw, and pg_dump doesn't really reload anything; it just
dumps.
The best thing I can come up with after brief thought is
--partition-data-via-root, but maybe somebody else has a better idea?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
The patch itself looks just fine on a quick glance, modulo the lack of
documentation, but I think we need to bikeshed the name of the flag.
--reload-through-root is clear as daylight to me, but I'm not sure
users will agree. The lack of the word "partition" is perhaps a
significant flaw, and pg_dump doesn't really reload anything; it just
dumps.
The best thing I can come up with after brief thought is
--partition-data-via-root, but maybe somebody else has a better idea?
--restore-via-partition-root ?
regards, tom lane
On Wed, Aug 2, 2017 at 1:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
The patch itself looks just fine on a quick glance, modulo the lack of
documentation, but I think we need to bikeshed the name of the flag.
--reload-through-root is clear as daylight to me, but I'm not sure
users will agree. The lack of the word "partition" is perhaps a
significant flaw, and pg_dump doesn't really reload anything; it just
dumps.
The best thing I can come up with after brief thought is
--partition-data-via-root, but maybe somebody else has a better idea?
--restore-via-partition-root ?
I worry someone will think that pg_dump is now restoring stuff, but it isn't.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Aug 2, 2017 at 1:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
--restore-via-partition-root ?
I worry someone will think that pg_dump is now restoring stuff, but it isn't.
Well, the point is that the commands it emits will cause the eventual
restore to go through the root. Anyway, I think trying to avoid using
a verb altogether is going to result in a very stilted option name.
I notice that the option list already includes some references to
"insert", so maybe "--insert-via-partition-root"? Although you could
argue that that's confusing when we're using COPY.
regards, tom lane
On Wed, Aug 2, 2017 at 10:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Aug 2, 2017 at 1:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
--restore-via-partition-root ?
I worry someone will think that pg_dump is now restoring stuff, but it
isn't.
Well, the point is that the commands it emits will cause the eventual
restore to go through the root. Anyway, I think trying to avoid using
a verb altogether is going to result in a very stilted option name.
I notice that the option list already includes some references to
"insert", so maybe "--insert-via-partition-root"? Although you could
argue that that's confusing when we're using COPY.
--use-partitioned-table [partition-name, ...] # if names are omitted it
defaults to all partitioned tables
I don't know that we need to use "root" in the argument name to communicate
"the top-most if partitioned tables are nested". We have the docs to
describe exactly what it does. "Partitioned Table" is what we are calling
the main routing table in the docs. "Use" seems adequate.
FWIW my first thought was "--insert-via-partition-root" and I didn't mind
that "COPY" commands would be affected implicitly.
David J.
On Wed, Aug 2, 2017 at 1:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Aug 2, 2017 at 1:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
--restore-via-partition-root ?
I worry someone will think that pg_dump is now restoring stuff, but it isn't.
Well, the point is that the commands it emits will cause the eventual
restore to go through the root. Anyway, I think trying to avoid using
a verb altogether is going to result in a very stilted option name.
I notice that the option list already includes some references to
"insert", so maybe "--insert-via-partition-root"? Although you could
argue that that's confusing when we're using COPY.
Yeah, that's definitely confusing. I realize that my verbless version
is a little odd, but there are numerous precedents for it: --inserts,
--column-inserts, --if-exists, --strict-names, ...
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 8/2/17 13:58, Tom Lane wrote:
I notice that the option list already includes some references to
"insert", so maybe "--insert-via-partition-root"? Although you could
argue that that's confusing when we're using COPY.
"load" could be more general. But I'm also OK with "restore".
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Aug 2, 2017 at 11:47 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Wed, Aug 2, 2017 at 10:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Aug 2, 2017 at 1:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
--restore-via-partition-root ?
I worry someone will think that pg_dump is now restoring stuff, but it
isn't.
Well, the point is that the commands it emits will cause the eventual
restore to go through the root. Anyway, I think trying to avoid using
a verb altogether is going to result in a very stilted option name.
I notice that the option list already includes some references to
"insert", so maybe "--insert-via-partition-root"? Although you could
argue that that's confusing when we're using COPY.
--use-partitioned-table [partition-name, ...] # if names are omitted it
defaults to all partitioned tables
I like this idea since it allows using this feature for selected
tables e.g. hash tables. Otherwise, users will be forced to use this
option even when there is only a single hash partitioned table and
many other list/range partitioned tables.
What we are trying to do here is dump the data in a partitioned table
as if it's not partitioned. Combine that with --data-only dumps, and
one could use it to load partitioned data into unpartitioned or
differently partitioned table. So, how about naming the option as
--unpartition-partitioned-table [partitioned-table-name, ....] # if
names are omitted it defaults to all the partitioned tables.
That really says what dump is really doing without focusing on how the
data will be used like restoring/inserting/copying etc.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On Thu, Aug 3, 2017 at 3:39 PM, Ashutosh Bapat <
ashutosh.bapat@enterprisedb.com> wrote:
On Wed, Aug 2, 2017 at 11:47 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wed, Aug 2, 2017 at 10:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Aug 2, 2017 at 1:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
--restore-via-partition-root ?
I worry someone will think that pg_dump is now restoring stuff, but it
isn't.
Well, the point is that the commands it emits will cause the eventual
restore to go through the root. Anyway, I think trying to avoid using
a verb altogether is going to result in a very stilted option name.
I notice that the option list already includes some references to
"insert", so maybe "--insert-via-partition-root"? Although you could
argue that that's confusing when we're using COPY.
--use-partitioned-table [partition-name, ...] # if names are omitted it
defaults to all partitioned tables
I like this idea since it allows using this feature for selected
tables e.g. hash tables. Otherwise, users will be forced to use this
option even when there is only a single hash partitioned table and
many other list/range partitioned tables.
+1.
What we are trying to do here is dump the data in a partitioned table
as if it's not partitioned. Combine that with --data-only dumps, and
one could use it to load partitioned data into unpartitioned or
differently partitioned table. So, how about naming the option as
--unpartition-partitioned-table [partitioned-table-name, ....] # if
names are omitted it defaults to all the partitioned tables.
--unpartition-partitioned-table is very confusing.
I liked the previous option which is --use-partitioned-table
[partition-name, ...].
The only problem with --use-partitioned-table is that a user needs to specify
the partition names in the options list. Imagine someone having hundreds of
partitions; specifying all those names as pg_dump options is a pain.
Rather than that:
--use-partitioned-table [partitioned_name, ...] # if
names are omitted it defaults to all the partitioned tables.
Here the user needs to specify the root relation name in the option - and any
partition table having that as its root will load its data through the
top parent relation.
That really says what dump is really doing without focusing on how the
data will be used like restoring/inserting/copying etc.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Rushabh Lathia
On Thursday, August 3, 2017, Rushabh Lathia <rushabh.lathia@gmail.com>
wrote:
--use-partitioned-table [partitioned_name, ...] # if
names are omitted it defaults to all the partitioned tables.
Here the user needs to specify the root relation name in the option - and any
partition table having that as its root will load its data through the
top parent relation.
This is indeed what I intended to convey. Only specify "root" names.
David J.
On Thu, Aug 3, 2017 at 9:01 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
--unpartition-partitioned-table is very confusing.
+1.
I liked the previous option which is --use-partitioned-table
[partition-name, ...].
The only problem with --use-partitioned-table is that a user needs to specify
the partition names in the options list. Imagine someone having hundreds of
partitions; specifying all those names as pg_dump options is a pain.
Yeah, that won't work.
Rather than that:
--use-partitioned-table [partitioned_name, ...] # if
names are omitted it defaults to all the partitioned tables.
Here the user needs to specify the root relation name in the option - and any
partition table having that as its root will load its data through the
top parent relation.
We could do that, but I'm not sure it's a good idea to use
getopt_long() with optional options. Sometimes that creates confusion
-- is pg_dump --use-partitioned-table salad an attempt to dump the
salad database with the --use-partitioned-table option, or an attempt
to apply --use-partitioned-table only to partitions whose parent is
the salad table? getopt_long() has an answer, but some people may
guess incorrectly about what it is.
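(As an aside, here is a small self-contained sketch, purely illustrative and
not part of the patch, of how getopt_long() resolves that ambiguity for an
option declared with optional_argument: a separate word is never taken as the
option's argument; only the --option=value spelling attaches one. The option
name below is hypothetical.)
#include <getopt.h>
#include <stdio.h>
int
main(int argc, char **argv)
{
	/* hypothetical option, declared the way pg_dump would have to declare it */
	static struct option long_options[] = {
		{"use-partitioned-table", optional_argument, NULL, 'u'},
		{NULL, 0, NULL, 0}
	};
	int			c;
	while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
	{
		if (c == 'u')
			printf("--use-partitioned-table value: %s\n",
				   optarg ? optarg : "(none)");
	}
	/* anything getopt_long did not consume is an operand, e.g. a database name */
	for (; optind < argc; optind++)
		printf("operand: %s\n", argv[optind]);
	return 0;
}
So "--use-partitioned-table salad" prints "(none)" plus "operand: salad", while
"--use-partitioned-table=salad" attaches the value.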
I would be more inclined to make this a global option than something
that modifies the behavior for certain tables; the only per-table
flags we have right now are just to include/exclude individual tables.
You could make --inserts or --no-unlogged-table-data apply to some but
not all tables, but we didn't; why start here?
I don't like the specific name --use-partitioned-table much either.
Use it for what?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Aug 2, 2017 at 4:08 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 8/2/17 13:58, Tom Lane wrote:
I notice that the option list already includes some references to
"insert", so maybe "--insert-via-partition-root"? Although you could
argue that that's confusing when we're using COPY."load" could be more general. But I'm also OK with "restore".
"load" seems better than "restore" to me, both because it's shorter
and because it sounds less like pg_dump will be doing the job of
pg_restore.
So maybe --load-via-partition-root if nobody likes my previous
suggestion of --partition-data-via-root ?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
So maybe --load-via-partition-root if nobody likes my previous
suggestion of --partition-data-via-root ?
WFM.
regards, tom lane
On Thu, Aug 3, 2017 at 8:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
So maybe --load-via-partition-root if nobody likes my previous
suggestion of --partition-data-via-root ?
WFM.
+1
David J.
On 2017/08/04 1:08, David G. Johnston wrote:
On Thu, Aug 3, 2017 at 8:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
So maybe --load-via-partition-root if nobody likes my previous
suggestion of --partition-data-via-root ?
WFM.
+1
+1.
Thanks,
Amit
Here is an updated patch; the switch is now renamed to
--load-via-partition-root,
and documentation for the new switch has been added to pg_dump as well
as pg_dumpall.
On Fri, Aug 4, 2017 at 7:13 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:
On 2017/08/04 1:08, David G. Johnston wrote:
On Thu, Aug 3, 2017 at 8:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
So maybe --load-via-partition-root if nobody likes my previous
suggestion of --partition-data-via-root ?
WFM.
+1
+1.
Thanks,
Amit
Thanks,
Rushabh Lathia
www.EnterpriseDB.com
Attachments:
load_via_partition_root_v2.patch (text/x-patch)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index bafa031..65f7e98 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -889,6 +889,16 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ Dump data via the top-most partitioned table (rather than partition
+ table) when dumping data for the partition table.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
<listitem>
<para>
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index aa944a2..54b8e7e 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -431,6 +431,16 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ Dump data via the top-most partitioned table (rather than partition
+ table) when dumping data for the partition table.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--use-set-session-authorization</></term>
<listitem>
<para>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 144068a..ce3100f 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -157,6 +157,7 @@ typedef struct _dumpOptions
int outputNoTablespaces;
int use_setsessauth;
int enable_row_security;
+ int load_via_partition_root;
/* default, if no "inclusion" switches appear, is to dump everything */
bool include_everything;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 393b9e2..27e23ca 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -269,6 +269,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
const char *prefix, Archive *fout);
static char *get_synchronized_snapshot(Archive *fout);
static void setupDumpWorker(Archive *AHX);
+static TableInfo *getRootTableInfo(TableInfo *tbinfo);
int
@@ -345,6 +346,7 @@ main(int argc, char **argv)
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &dopt.load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"section", required_argument, NULL, 5},
{"serializable-deferrable", no_argument, &dopt.serializable_deferrable, 1},
@@ -959,6 +961,7 @@ help(const char *progname)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partition table via the root relation\n"));
printf(_(" --section=SECTION dump named section (pre-data, data, or post-data)\n"));
printf(_(" --serializable-deferrable wait until the dump can run without anomalies\n"));
printf(_(" --snapshot=SNAPSHOT use given snapshot for the dump\n"));
@@ -1902,8 +1905,32 @@ dumpTableData_insert(Archive *fout, void *dcontext)
if (insertStmt == NULL)
{
insertStmt = createPQExpBuffer();
+
+ /*
+ * When load-via-partition-root is set, get the root relation name
+ * for the partition table, so that we can reload data through
+ * the root table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we load data through the root, we will qualify
+ * the table name. This is needed because the search_path
+ * will have been set for the partition table.
+ */
+ classname = (char *) fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ classname = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
- fmtId(classname));
+ classname);
/* corner case for zero-column table */
if (nfields == 0)
@@ -2025,6 +2052,20 @@ dumpTableData_insert(Archive *fout, void *dcontext)
return 1;
}
+/*
+ * getRootTableInfo:
+ * get the root TableInfo for the given partition table.
+ */
+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+ Assert(tbinfo->ispartition);
+ Assert(tbinfo->numParents == 1);
+ if (tbinfo->parents[0]->ispartition)
+ return getRootTableInfo(tbinfo->parents[0]);
+
+ return tbinfo->parents[0];
+}
/*
* dumpTableData -
@@ -2041,14 +2082,37 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
PQExpBuffer clistBuf = createPQExpBuffer();
DataDumperPtr dumpFn;
char *copyStmt;
+ const char *copyFrom;
if (!dopt->dump_inserts)
{
/* Dump/restore using COPY */
dumpFn = dumpTableData_copy;
- /* must use 2 steps here 'cause fmtId is nonreentrant */
+
+ /*
+ * When load-via-partition-root is set, get the root relation name for the
+ * partition table, so that we can reload data through the root table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we load data through the root, we will qualify the
+ * table name. This is needed because the search_path will have
+ * been set for the partition table.
+ */
+ copyFrom = fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ copyFrom = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(copyBuf, "COPY %s ",
- fmtId(tbinfo->dobj.name));
+ copyFrom);
appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
fmtCopyColumnList(tbinfo, clistBuf),
(tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index b14bb8e..9035332 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -80,6 +80,7 @@ static int no_subscriptions = 0;
static int no_unlogged_table_data = 0;
static int no_role_passwords = 0;
static int server_version;
+static int load_via_partition_root = 0;
static char role_catalog[10];
#define PG_AUTHID "pg_authid"
@@ -128,6 +129,7 @@ main(int argc, char *argv[])
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &no_tablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
{"no-publications", no_argument, &no_publications, 1},
@@ -385,6 +387,8 @@ main(int argc, char *argv[])
appendPQExpBufferStr(pgdumpopts, " --no-tablespaces");
if (quote_all_identifiers)
appendPQExpBufferStr(pgdumpopts, " --quote-all-identifiers");
+ if (load_via_partition_root)
+ appendPQExpBufferStr(pgdumpopts, " --load-via-partition-root");
if (use_setsessauth)
appendPQExpBufferStr(pgdumpopts, " --use-set-session-authorization");
if (no_publications)
@@ -606,6 +610,7 @@ help(void)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partition table via the root relation\n"));
printf(_(" --use-set-session-authorization\n"
" use SET SESSION AUTHORIZATION commands instead of\n"
" ALTER OWNER commands to set ownership\n"));
Hi Rushabh,
While testing latest v2 patch, I got a crash when using
--load-via-partition-root with --schema options. Below are steps to
reproduce.
--create below test data
create schema a;
create schema b;
create schema c;
create table t1 (a int,b text) partition by list(a);
create table a.t1_p1 partition of t1 FOR VALUES in (1,2,3,4) partition by
list(a);
create table b.t1_p1_p1 partition of a.t1_p1 FOR VALUES in (1,2);
create table c.t1_p1_p2 partition of a.t1_p1 FOR VALUES in (3,4);
create table b.t1_p2 partition of t1 FOR VALUES in (5,6,7,8) partition by
list(a);
create table a.t1_p2_p1 partition of b.t1_p2 FOR VALUES in (5,6);
create table t1_p2_p2 partition of b.t1_p2 FOR VALUES in (7,8);
insert into t1 values (8,'t1');
insert into a.t1_p1 values (2,'a.t1_p1');
insert into b.t1_p1_p1 values (1,'b.t1_p1_p1');
insert into c.t1_p1_p2 values (3,'c.t1_p1_p2');
insert into b.t1_p2 values (6,'b.t1_p2');
insert into a.t1_p2_p1 values (5,'a.t1_p2_p1');
insert into t1_p2_p2 values (7,'t1_p2_p1');
insert into t1 values (4 ,'t1');
--trying to take pg_dump
[edb@localhost bin]$ ./pg_dump -d postgres --schema=a -f d1.dump -Fp
[edb@localhost bin]$ ./pg_dump -d postgres --load-via-partition-root -f
d2.dump -Fp
[edb@localhost bin]$ ./pg_dump -d postgres --load-via-partition-root
--schema=a -f d3.dump -Fp
pg_dump: pg_dump.c:2063: getRootTableInfo: Assertion `tbinfo->numParents ==
1' failed.
Aborted (core dumped)
Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation
On Fri, Aug 4, 2017 at 3:01 PM, Rushabh Lathia <rushabh.lathia@gmail.com>
wrote:
Here is an updated patch; the switch is now renamed to
--load-via-partition-root,
and documentation for the new switch has been added to pg_dump as well
as pg_dumpall.
On Fri, Aug 4, 2017 at 7:13 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/08/04 1:08, David G. Johnston wrote:
On Thu, Aug 3, 2017 at 8:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
So maybe --load-via-partition-root if nobody likes my previous
suggestion of --partition-data-via-root ?
WFM.
+1
+1.
Thanks,
Amit
Thanks,
Rushabh Lathia
www.EnterpriseDB.com
Thanks Rajkumar for testing and reporting this.
It seems that we set the numParents and parents only for the
dumpable objects (flagInhTables()). The current patch relies on the numParents
and parents to get the root partition TableInfo, but when --schema is
specified it doesn't load that information for objects which are not
dumpable.
Now one option is:
1) restrict --load-via-partition-root from being used with
the --schema or maybe some other options - where we restrict the
objects.
Consider this: the partition root is in schema 'a' and the partition table is in
schema 'b'. If someone specifies --schema b with
--load-via-partition-root,
I think we should not do "INSERT INTO a.tab" to load the data (because
the user specified --schema b).
2) fix flagInhTables() to load the numParents and the parents information
for the partition table (can be checked using ispartition), irrespective of
whether the object is dumpable or not.
Maybe something like:
@@ -322,7 +322,11 @@ flagInhTables(TableInfo *tblinfo, int numTables,
/* Don't bother computing anything for non-target tables, either */
if (!tblinfo[i].dobj.dump)
+ {
+ if (tblinfo[i].ispartition)
+ findParentsByOid(&tblinfo[i], inhinfo, numInherits);
continue;
+ }
I am still looking into this, meanwhile any inputs are welcome.
On Tue, Aug 8, 2017 at 4:10 PM, Rajkumar Raghuwanshi <
rajkumar.raghuwanshi@enterprisedb.com> wrote:
Hi Rushabh,
While testing latest v2 patch, I got a crash when using
--load-via-partition-root with --schema options. Below are steps to
reproduce.
--create below test data
create schema a;
create schema b;
create schema c;
create table t1 (a int,b text) partition by list(a);
create table a.t1_p1 partition of t1 FOR VALUES in (1,2,3,4) partition by
list(a);
create table b.t1_p1_p1 partition of a.t1_p1 FOR VALUES in (1,2);
create table c.t1_p1_p2 partition of a.t1_p1 FOR VALUES in (3,4);
create table b.t1_p2 partition of t1 FOR VALUES in (5,6,7,8) partition by
list(a);
create table a.t1_p2_p1 partition of b.t1_p2 FOR VALUES in (5,6);
create table t1_p2_p2 partition of b.t1_p2 FOR VALUES in (7,8);
insert into t1 values (8,'t1');
insert into a.t1_p1 values (2,'a.t1_p1');
insert into b.t1_p1_p1 values (1,'b.t1_p1_p1');
insert into c.t1_p1_p2 values (3,'c.t1_p1_p2');
insert into b.t1_p2 values (6,'b.t1_p2');
insert into a.t1_p2_p1 values (5,'a.t1_p2_p1');
insert into t1_p2_p2 values (7,'t1_p2_p1');
insert into t1 values (4 ,'t1');
--trying to take pg_dump
[edb@localhost bin]$ ./pg_dump -d postgres --schema=a -f d1.dump -Fp
[edb@localhost bin]$ ./pg_dump -d postgres --load-via-partition-root -f
d2.dump -Fp
[edb@localhost bin]$ ./pg_dump -d postgres --load-via-partition-root
--schema=a -f d3.dump -Fp
pg_dump: pg_dump.c:2063: getRootTableInfo: Assertion `tbinfo->numParents
== 1' failed.
Aborted (core dumped)
Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation
On Fri, Aug 4, 2017 at 3:01 PM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
Here is an updated patch; the switch is now renamed to
--load-via-partition-root,
and documentation for the new switch has been added to pg_dump as well
as pg_dumpall.
On Fri, Aug 4, 2017 at 7:13 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/08/04 1:08, David G. Johnston wrote:
On Thu, Aug 3, 2017 at 8:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
So maybe --load-via-partition-root if nobody likes my previous
suggestion of --partition-data-via-root ?
WFM.
+1
+1.
Thanks,
Amit
Thanks,
Rushabh Lathia
www.EnterpriseDB.com
--
Rushabh Lathia
On Tue, Aug 8, 2017 at 6:18 PM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
Thanks Rajkumar for testing and reporting this.
It seems that we set the numParents and parents only for the
dumpable objects (flagInhTables()). The current patch relies on the numParents
and parents to get the root partition TableInfo, but when --schema is
specified it doesn't load that information for objects which are not
dumpable.
Now one option is:
1) restrict --load-via-partition-root from being used with
the --schema or maybe some other options - where we restrict the
objects.
Consider this: the partition root is in schema 'a' and the partition table is in
schema 'b'. If someone specifies --schema b with
--load-via-partition-root,
I think we should not do "INSERT INTO a.tab" to load the data (because
the user specified --schema b).
+1, this looks cleaner to me.
2) fix flagInhTables() to load the numParents and the parents information
for the partition table (can be checked using ispartition), irrespective of
whether the object is dumpable or not.
Maybe something like:
@@ -322,7 +322,11 @@ flagInhTables(TableInfo *tblinfo, int numTables,
/* Don't bother computing anything for non-target tables, either */
if (!tblinfo[i].dobj.dump)
+ {
+ if (tblinfo[i].ispartition)
+ findParentsByOid(&tblinfo[i], inhinfo, numInherits);
continue;
+ }
I am still looking into this, meanwhile any inputs are welcome.
See the note given in the pg_dump documentation [1]:
"When -n is specified, pg_dump makes no attempt to dump any other
database objects that the selected schema(s) might depend upon.
Therefore, there is no guarantee that the results of a specific-schema
dump can be successfully restored by themselves into a clean
database."
If we want to follow the same trend, then we could simply dump all
partitions belonging to a specified schema. In Rajkumar's example, we
would need to dump the partitions belonging to schema 'a' (i.e. t1_p1 and
t1_p2_p1), and the target root for the insert query would be the same table
or the parent belonging to the same schema.
Regards,
Amul
[1]: https://www.postgresql.org/docs/devel/static/app-pgdump.html
On Tue, Aug 8, 2017 at 8:48 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
It seems that we set the numParents and parents only for the
dumpable objects (flagInhTables()). The current patch relies on the numParents
and parents to get the root partition TableInfo, but when --schema is
specified it doesn't load that information for objects which are not
dumpable.
Now one option is:
1) restrict --load-via-partition-root from being used with
the --schema or maybe some other options - where we restrict the
objects.
Consider this: the partition root is in schema 'a' and the partition table is in
schema 'b'. If someone specifies --schema b with
--load-via-partition-root,
I think we should not do "INSERT INTO a.tab" to load the data (because
the user specified --schema b).
2) fix flagInhTables() to load the numParents and the parents information
for the partition table (can be checked using ispartition), irrespective of
whether the object is dumpable or not.
(1) seems like a pretty arbitrary restriction, so I don't like that
option. (2) would hurt performance in some use cases. Do we have an
option (3)?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Aug 9, 2017 at 1:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Aug 8, 2017 at 8:48 AM, Rushabh Lathia <rushabh.lathia@gmail.com>
wrote:
It seems that we set the numParents and parents only for the
dumpable objects (flagInhTables()). The current patch relies on the numParents
and parents to get the root partition TableInfo, but when --schema is
specified it doesn't load that information for objects which are not
dumpable.
Now one option is:
1) restrict --load-via-partition-root from being used with
the --schema or maybe some other options - where we restrict the
objects.
Consider this: the partition root is in schema 'a' and the partition table is in
schema 'b'. If someone specifies --schema b with
--load-via-partition-root,
I think we should not do "INSERT INTO a.tab" to load the data (because
the user specified --schema b).
2) fix flagInhTables() to load the numParents and the parents information
for the partition table (can be checked using ispartition), irrespective of
whether the object is dumpable or not.
(1) seems like a pretty arbitrary restriction, so I don't like that
option. (2) would hurt performance in some use cases. Do we have an
option (3)?
How about protecting option 2) with a load-via-partition-root check?
That is, load the parents information even when dump is not set, but only
when load-via-partition-root is given and the table is a partition.
This won't hurt performance for the normal cases.
Attaching the patch.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Rushabh Lathia
Attachments:
load_via_partition_root_v3.patch (text/x-patch)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index bafa031..65f7e98 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -889,6 +889,16 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ Dump data via the top-most partitioned table (rather than partition
+ table) when dumping data for the partition table.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
<listitem>
<para>
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index aa944a2..54b8e7e 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -431,6 +431,16 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ Dump data via the top-most partitioned table (rather than partition
+ table) when dumping data for the partition table.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--use-set-session-authorization</></term>
<listitem>
<para>
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 47191be..65c2d7f 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -66,7 +66,7 @@ static int numExtensions;
static ExtensionMemberId *extmembers;
static int numextmembers;
-static void flagInhTables(TableInfo *tbinfo, int numTables,
+static void flagInhTables(Archive *fout, TableInfo *tbinfo, int numTables,
InhInfo *inhinfo, int numInherits);
static void flagInhAttrs(DumpOptions *dopt, TableInfo *tblinfo, int numTables);
static DumpableObject **buildIndexArray(void *objArray, int numObjs,
@@ -243,7 +243,7 @@ getSchemaData(Archive *fout, int *numTablesPtr)
/* Link tables to parents, mark parents of target tables interesting */
if (g_verbose)
write_msg(NULL, "finding inheritance relationships\n");
- flagInhTables(tblinfo, numTables, inhinfo, numInherits);
+ flagInhTables(fout, tblinfo, numTables, inhinfo, numInherits);
if (g_verbose)
write_msg(NULL, "reading column info for interesting tables\n");
@@ -304,9 +304,10 @@ getSchemaData(Archive *fout, int *numTablesPtr)
* modifies tblinfo
*/
static void
-flagInhTables(TableInfo *tblinfo, int numTables,
+flagInhTables(Archive *fout, TableInfo *tblinfo, int numTables,
InhInfo *inhinfo, int numInherits)
{
+ DumpOptions *dopt = fout->dopt;
int i,
j;
int numParents;
@@ -322,7 +323,17 @@ flagInhTables(TableInfo *tblinfo, int numTables,
/* Don't bother computing anything for non-target tables, either */
if (!tblinfo[i].dobj.dump)
+ {
+ /*
+ * Load the parents information for the partition table when
+ * the load-via-partition-root option is set. As we need the
+ * parents information to get the partition root.
+ */
+ if (dopt->load_via_partition_root &&
+ tblinfo[i].ispartition)
+ findParentsByOid(&tblinfo[i], inhinfo, numInherits);
continue;
+ }
/* Find all the immediate parent tables */
findParentsByOid(&tblinfo[i], inhinfo, numInherits);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 144068a..ce3100f 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -157,6 +157,7 @@ typedef struct _dumpOptions
int outputNoTablespaces;
int use_setsessauth;
int enable_row_security;
+ int load_via_partition_root;
/* default, if no "inclusion" switches appear, is to dump everything */
bool include_everything;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 393b9e2..27e23ca 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -269,6 +269,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
const char *prefix, Archive *fout);
static char *get_synchronized_snapshot(Archive *fout);
static void setupDumpWorker(Archive *AHX);
+static TableInfo *getRootTableInfo(TableInfo *tbinfo);
int
@@ -345,6 +346,7 @@ main(int argc, char **argv)
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &dopt.load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"section", required_argument, NULL, 5},
{"serializable-deferrable", no_argument, &dopt.serializable_deferrable, 1},
@@ -959,6 +961,7 @@ help(const char *progname)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partition table via the root relation\n"));
printf(_(" --section=SECTION dump named section (pre-data, data, or post-data)\n"));
printf(_(" --serializable-deferrable wait until the dump can run without anomalies\n"));
printf(_(" --snapshot=SNAPSHOT use given snapshot for the dump\n"));
@@ -1902,8 +1905,32 @@ dumpTableData_insert(Archive *fout, void *dcontext)
if (insertStmt == NULL)
{
insertStmt = createPQExpBuffer();
+
+ /*
+ * When load-via-partition-root is set, get the root relation name
+ * for the partition table, so that we can reload data through
+ * the root table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we load data through the root, we will qualify
+ * the table name. This is needed because the search_path
+ * will have been set for the partition table.
+ */
+ classname = (char *) fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ classname = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
- fmtId(classname));
+ classname);
/* corner case for zero-column table */
if (nfields == 0)
@@ -2025,6 +2052,20 @@ dumpTableData_insert(Archive *fout, void *dcontext)
return 1;
}
+/*
+ * getRootTableInfo:
+ * get the root TableInfo for the given partition table.
+ */
+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+ Assert(tbinfo->ispartition);
+ Assert(tbinfo->numParents == 1);
+ if (tbinfo->parents[0]->ispartition)
+ return getRootTableInfo(tbinfo->parents[0]);
+
+ return tbinfo->parents[0];
+}
/*
* dumpTableData -
@@ -2041,14 +2082,37 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
PQExpBuffer clistBuf = createPQExpBuffer();
DataDumperPtr dumpFn;
char *copyStmt;
+ const char *copyFrom;
if (!dopt->dump_inserts)
{
/* Dump/restore using COPY */
dumpFn = dumpTableData_copy;
- /* must use 2 steps here 'cause fmtId is nonreentrant */
+
+ /*
+ * When load-via-partition-root is set, get the root relation name for the
+ * partition table, so that we can reload data through the root table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we load data through the root, we will qualify the
+ * table name. This is needed because the search_path will have
+ * been set for the partition table.
+ */
+ copyFrom = fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ copyFrom = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(copyBuf, "COPY %s ",
- fmtId(tbinfo->dobj.name));
+ copyFrom);
appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
fmtCopyColumnList(tbinfo, clistBuf),
(tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index b14bb8e..9035332 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -80,6 +80,7 @@ static int no_subscriptions = 0;
static int no_unlogged_table_data = 0;
static int no_role_passwords = 0;
static int server_version;
+static int load_via_partition_root = 0;
static char role_catalog[10];
#define PG_AUTHID "pg_authid"
@@ -128,6 +129,7 @@ main(int argc, char *argv[])
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &no_tablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
{"no-publications", no_argument, &no_publications, 1},
@@ -385,6 +387,8 @@ main(int argc, char *argv[])
appendPQExpBufferStr(pgdumpopts, " --no-tablespaces");
if (quote_all_identifiers)
appendPQExpBufferStr(pgdumpopts, " --quote-all-identifiers");
+ if (load_via_partition_root)
+ appendPQExpBufferStr(pgdumpopts, " --load-via-partition-root");
if (use_setsessauth)
appendPQExpBufferStr(pgdumpopts, " --use-set-session-authorization");
if (no_publications)
@@ -606,6 +610,7 @@ help(void)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partition table via the root relation\n"));
printf(_(" --use-set-session-authorization\n"
" use SET SESSION AUTHORIZATION commands instead of\n"
" ALTER OWNER commands to set ownership\n"));
On Thu, Aug 10, 2017 at 3:47 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
(1) seems like a pretty arbitrary restriction, so I don't like that
option. (2) would hurt performance in some use cases. Do we have an
option (3)?
How about protecting option 2) with a load-via-partition-root check?
That is, load the parents information even when dump is not set, but only
when load-via-partition-root is given and the table is a partition.
This won't hurt performance for the normal cases.
Yes, that seems like the right approach.
+ Dump data via the top-most partitioned table (rather than partition
+ table) when dumping data for the partition table.
I think we should phrase this a bit more clearly, something like this:
When dumping a COPY or INSERT statement for a partitioned table,
target the root of the partitioning hierarchy which contains it rather
than the partition itself. This may be useful when reloading data on
a server where rows do not always fall into the same partitions as
they did on the original server. This could happen, for example, if
the partitioning column is of type text and the two systems have
different definitions of the collation used to partition the data.
+ printf(_(" --load-via-partition-root load partition table via
the root relation\n"));
"relation" seems odd to me here. root table?
/* Don't bother computing anything for non-target tables, either */
if (!tblinfo[i].dobj.dump)
+ {
+ /*
+ * Load the parents information for the partition table when
+ * the load-via-partition-root option is set. As we need the
+ * parents information to get the partition root.
+ */
+ if (dopt->load_via_partition_root &&
+ tblinfo[i].ispartition)
+ findParentsByOid(&tblinfo[i], inhinfo, numInherits);
continue;
+ }
Duplicating the call to findParentsByOid seems less than ideal. How
about doing something like this:
if (tblinfo[i].dobj.dump)
{
find_parents = true;
mark_parents = true;
}
else if (dopt->load_via_partition_root && tblinfo[i].ispartition)
find_parents = true;
if (find_parents)
findParentsByOid(&tblinfo[i], inhinfo, numInherits);
etc.
The comments for this function also need some work - e.g. the function
header comment deserves some kind of update for these changes.
+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+ Assert(tbinfo->ispartition);
+ Assert(tbinfo->numParents == 1);
+ if (tbinfo->parents[0]->ispartition)
+ return getRootTableInfo(tbinfo->parents[0]);
+
+ return tbinfo->parents[0];
+}
This code should iterate, not recurse, to avoid any risk of blowing
out the stack.
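For illustration, an iterative version could look roughly like this (a sketch
only, not the committed code, reusing the TableInfo fields the patch already
relies on):
static TableInfo *
getRootTableInfo(TableInfo *tbinfo)
{
	Assert(tbinfo->ispartition);
	Assert(tbinfo->numParents == 1);
	/* walk up parents[0] until the parent is no longer itself a partition */
	while (tbinfo->parents[0]->ispartition)
	{
		tbinfo = tbinfo->parents[0];
		Assert(tbinfo->numParents == 1);
	}
	return tbinfo->parents[0];
}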
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Aug 10, 2017 at 8:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Aug 10, 2017 at 3:47 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:

(1) seems like a pretty arbitrary restriction, so I don't like that
option. (2) would hurt performance in some use cases. Do we have an
option (3)?

How about protecting option 2) with the load-via-partition-root protection.
Means load the parents information even dump is not set, only when
load-via-partition-root & ispartition. This won't hurt performance for
the normal cases.

Yes, that seems like the right approach.

+ Dump data via the top-most partitioned table (rather than partition
+ table) when dumping data for the partition table.

I think we should phrase this a bit more clearly, something like this:

When dumping a COPY or INSERT statement for a partitioned table,
target the root of the partitioning hierarchy which contains it rather
than the partition itself. This may be useful when reloading data on
a server where rows do not always fall into the same partitions as
they did on the original server. This could happen, for example, if
the partitioning column is of type text and the two systems have
different definitions of the collation used to partition the data.

Done.

+ printf(_(" --load-via-partition-root load partition table via the root relation\n"));

"relation" seems odd to me here. root table?

Done.

/* Don't bother computing anything for non-target tables, either */
if (!tblinfo[i].dobj.dump)
+ {
+ /*
+ * Load the parents information for the partition table when
+ * the load-via-partition-root option is set. As we need the
+ * parents information to get the partition root.
+ */
+ if (dopt->load_via_partition_root &&
+ tblinfo[i].ispartition)
+ findParentsByOid(&tblinfo[i], inhinfo, numInherits);
continue;
+ }

Duplicating the call to findParentsByOid seems less than ideal. How
about doing something like this:

if (tblinfo[i].dobj.dump)
{
find_parents = true;
mark_parents = true;
}
else if (dopt->load_via_partition_root && tblinfo[i].ispartition)
find_parents = true;

if (find_parents)
findParentsByOid(&tblinfo[i], inhinfo, numInherits);

etc.

Done changes to avoid the duplicate call to findParentsByOid().

The comments for this function also need some work - e.g. the function
header comment deserves some kind of update for these changes.

Done.

+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+ Assert(tbinfo->ispartition);
+ Assert(tbinfo->numParents == 1);
+ if (tbinfo->parents[0]->ispartition)
+ return getRootTableInfo(tbinfo->parents[0]);
+
+ return tbinfo->parents[0];
+}

This code should iterate, not recurse, to avoid any risk of blowing
out the stack.
Done.
Please find the attached patch with the changes.
Thanks,
Rushabh Lathia
www.EnterpriseDB.com
Attachments:
load_via_partition_root_v4.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index bafa031..de8297a 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -889,6 +889,21 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ When dumping a COPY or INSERT statement for a partitioned table,
+ target the root of the partitioning hierarchy which contains it rather
+ than the partition itself. This will be useful when reloading data on
+ a server where rows do not always fall into the same partitions as
+ they did on the original server. This could happen, for example, if
+ the partitioning column is of type text and the two system have
+ different definitions of the collation used to partition the data.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
<listitem>
<para>
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index aa944a2..dc1d3cc 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -431,6 +431,21 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ When dumping a COPY or INSERT statement for a partitioned table,
+ target the root of the partitioning hierarchy which contains it rather
+ than the partition itself. This will be useful when reloading data on
+ a server where rows do not always fall into the same partitions as
+ they did on the original server. This could happen, for example, if
+ the partitioning column is of type text and the two system have
+ different definitions of the collation used to partition the data.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--use-set-session-authorization</></term>
<listitem>
<para>
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 47191be..e7ee069 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -66,7 +66,7 @@ static int numExtensions;
static ExtensionMemberId *extmembers;
static int numextmembers;
-static void flagInhTables(TableInfo *tbinfo, int numTables,
+static void flagInhTables(Archive *fout, TableInfo *tbinfo, int numTables,
InhInfo *inhinfo, int numInherits);
static void flagInhAttrs(DumpOptions *dopt, TableInfo *tblinfo, int numTables);
static DumpableObject **buildIndexArray(void *objArray, int numObjs,
@@ -243,7 +243,7 @@ getSchemaData(Archive *fout, int *numTablesPtr)
/* Link tables to parents, mark parents of target tables interesting */
if (g_verbose)
write_msg(NULL, "finding inheritance relationships\n");
- flagInhTables(tblinfo, numTables, inhinfo, numInherits);
+ flagInhTables(fout, tblinfo, numTables, inhinfo, numInherits);
if (g_verbose)
write_msg(NULL, "reading column info for interesting tables\n");
@@ -301,12 +301,16 @@ getSchemaData(Archive *fout, int *numTablesPtr)
* This is sufficient; we don't much care whether they inherited their
* attributes or not.
*
+ * When load-via-partition-root is set, load the parents information for the
+ * partition table.
+ *
* modifies tblinfo
*/
static void
-flagInhTables(TableInfo *tblinfo, int numTables,
+flagInhTables(Archive *fout, TableInfo *tblinfo, int numTables,
InhInfo *inhinfo, int numInherits)
{
+ DumpOptions *dopt = fout->dopt;
int i,
j;
int numParents;
@@ -314,6 +318,8 @@ flagInhTables(TableInfo *tblinfo, int numTables,
for (i = 0; i < numTables; i++)
{
+ bool only_find_parents = false;
+
/* Some kinds never have parents */
if (tblinfo[i].relkind == RELKIND_SEQUENCE ||
tblinfo[i].relkind == RELKIND_VIEW ||
@@ -322,11 +328,26 @@ flagInhTables(TableInfo *tblinfo, int numTables,
/* Don't bother computing anything for non-target tables, either */
if (!tblinfo[i].dobj.dump)
- continue;
+ {
+ /*
+ * Load the parents information for the partition table when the
+ * load-via-partition-root option is set. As we need the parents
+ * information to get the partition root.
+ */
+ if (dopt->load_via_partition_root &&
+ tblinfo[i].ispartition)
+ only_find_parents = true;
+ else
+ continue;
+ }
/* Find all the immediate parent tables */
findParentsByOid(&tblinfo[i], inhinfo, numInherits);
+ /* Skip mark the parents */
+ if (only_find_parents)
+ continue;
+
/* Mark the parents as interesting for getTableAttrs */
numParents = tblinfo[i].numParents;
parents = tblinfo[i].parents;
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 144068a..ce3100f 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -157,6 +157,7 @@ typedef struct _dumpOptions
int outputNoTablespaces;
int use_setsessauth;
int enable_row_security;
+ int load_via_partition_root;
/* default, if no "inclusion" switches appear, is to dump everything */
bool include_everything;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 393b9e2..1703de8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -269,6 +269,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
const char *prefix, Archive *fout);
static char *get_synchronized_snapshot(Archive *fout);
static void setupDumpWorker(Archive *AHX);
+static TableInfo *getRootTableInfo(TableInfo *tbinfo);
int
@@ -345,6 +346,7 @@ main(int argc, char **argv)
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &dopt.load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"section", required_argument, NULL, 5},
{"serializable-deferrable", no_argument, &dopt.serializable_deferrable, 1},
@@ -959,6 +961,7 @@ help(const char *progname)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partition table via the root table\n"));
printf(_(" --section=SECTION dump named section (pre-data, data, or post-data)\n"));
printf(_(" --serializable-deferrable wait until the dump can run without anomalies\n"));
printf(_(" --snapshot=SNAPSHOT use given snapshot for the dump\n"));
@@ -1902,8 +1905,32 @@ dumpTableData_insert(Archive *fout, void *dcontext)
if (insertStmt == NULL)
{
insertStmt = createPQExpBuffer();
+
+ /*
+ * When load-via-partition-root is set, get the root table
+ * name for the partition table, so that we can reload data
+ * through the root table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we loading data through the root, we will qualify
+ * the table name. This is needed because earlier
+ * search_path will be set for the partition table.
+ */
+ classname = (char *) fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ classname = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
- fmtId(classname));
+ classname);
/* corner case for zero-column table */
if (nfields == 0)
@@ -2025,6 +2052,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
return 1;
}
+/*
+ * getRootTableInfo:
+ * get the root TableInfo for the given partition table.
+ */
+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+ TableInfo *parentTbinfo;
+
+ Assert(tbinfo->ispartition);
+ Assert(tbinfo->numParents == 1);
+
+ parentTbinfo = tbinfo->parents[0];
+ while (parentTbinfo->ispartition)
+ {
+ Assert(parentTbinfo->numParents == 1);
+ parentTbinfo = parentTbinfo->parents[0];
+ }
+
+ return parentTbinfo;
+}
/*
* dumpTableData -
@@ -2041,14 +2089,38 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
PQExpBuffer clistBuf = createPQExpBuffer();
DataDumperPtr dumpFn;
char *copyStmt;
+ const char *copyFrom;
if (!dopt->dump_inserts)
{
/* Dump/restore using COPY */
dumpFn = dumpTableData_copy;
- /* must use 2 steps here 'cause fmtId is nonreentrant */
+
+ /*
+ * When load-via-partition-root is set, get the root table name for
+ * the partition table, so that we can reload data through the root
+ * table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we loading data through the root, we will qualify the
+ * table name. This is needed because earlier search_path will be
+ * set for the partition table.
+ */
+ copyFrom = fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ copyFrom = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(copyBuf, "COPY %s ",
- fmtId(tbinfo->dobj.name));
+ copyFrom);
appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
fmtCopyColumnList(tbinfo, clistBuf),
(tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index b14bb8e..c4741c2 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -80,6 +80,7 @@ static int no_subscriptions = 0;
static int no_unlogged_table_data = 0;
static int no_role_passwords = 0;
static int server_version;
+static int load_via_partition_root = 0;
static char role_catalog[10];
#define PG_AUTHID "pg_authid"
@@ -128,6 +129,7 @@ main(int argc, char *argv[])
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &no_tablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
{"no-publications", no_argument, &no_publications, 1},
@@ -385,6 +387,8 @@ main(int argc, char *argv[])
appendPQExpBufferStr(pgdumpopts, " --no-tablespaces");
if (quote_all_identifiers)
appendPQExpBufferStr(pgdumpopts, " --quote-all-identifiers");
+ if (load_via_partition_root)
+ appendPQExpBufferStr(pgdumpopts, " --load-via-partition-root");
if (use_setsessauth)
appendPQExpBufferStr(pgdumpopts, " --use-set-session-authorization");
if (no_publications)
@@ -606,6 +610,7 @@ help(void)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partition table via the root table\n"));
printf(_(" --use-set-session-authorization\n"
" use SET SESSION AUTHORIZATION commands instead of\n"
" ALTER OWNER commands to set ownership\n"));
On Fri, Aug 11, 2017 at 5:36 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
Please find attach patch with the changes.
I found the way that you had the logic structured in flagInhTables()
to be quite hard to follow, so I rewrote it in the attached version.
This version also has a bunch of minor cosmetic changes. Barring
objections, I'll commit it once the tree opens for v11 development.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
load-via-partition-root-rmh.patch (application/octet-stream)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index bafa031e1a..ad5b6fc703 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -889,6 +889,21 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ When dumping a COPY or INSERT statement for a partitioned table,
+ target the root of the partitioning hierarchy which contains it rather
+ than the partition itself. This may be useful when reloading data on
+ a server where rows do not always fall into the same partitions as
+ they did on the original server. This could happen, for example, if
+ the partitioning column is of type text and the two system have
+ different definitions of the collation used to partition the data.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
<listitem>
<para>
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index aa944a2e92..f8a2521743 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -431,6 +431,21 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
+ <term><option>--load-via-partition-root</></term>
+ <listitem>
+ <para>
+ When dumping a COPY or INSERT statement for a partitioned table,
+ target the root of the partitioning hierarchy which contains it rather
+ than the partition itself. This may be useful when reloading data on
+ a server where rows do not always fall into the same partitions as
+ they did on the original server. This could happen, for example, if
+ the partitioning column is of type text and the two system have
+ different definitions of the collation used to partition the data.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>--use-set-session-authorization</></term>
<listitem>
<para>
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 47191be86a..4b47951de1 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -66,7 +66,7 @@ static int numExtensions;
static ExtensionMemberId *extmembers;
static int numextmembers;
-static void flagInhTables(TableInfo *tbinfo, int numTables,
+static void flagInhTables(Archive *fout, TableInfo *tbinfo, int numTables,
InhInfo *inhinfo, int numInherits);
static void flagInhAttrs(DumpOptions *dopt, TableInfo *tblinfo, int numTables);
static DumpableObject **buildIndexArray(void *objArray, int numObjs,
@@ -243,7 +243,7 @@ getSchemaData(Archive *fout, int *numTablesPtr)
/* Link tables to parents, mark parents of target tables interesting */
if (g_verbose)
write_msg(NULL, "finding inheritance relationships\n");
- flagInhTables(tblinfo, numTables, inhinfo, numInherits);
+ flagInhTables(fout, tblinfo, numTables, inhinfo, numInherits);
if (g_verbose)
write_msg(NULL, "reading column info for interesting tables\n");
@@ -294,8 +294,8 @@ getSchemaData(Archive *fout, int *numTablesPtr)
}
/* flagInhTables -
- * Fill in parent link fields of every target table, and mark
- * parents of target tables as interesting
+ * Fill in parent link fields of tables for which we need that information,
+ * and mark parents of target tables as interesting
*
* Note that only direct ancestors of targets are marked interesting.
* This is sufficient; we don't much care whether they inherited their
@@ -304,34 +304,53 @@ getSchemaData(Archive *fout, int *numTablesPtr)
* modifies tblinfo
*/
static void
-flagInhTables(TableInfo *tblinfo, int numTables,
+flagInhTables(Archive *fout, TableInfo *tblinfo, int numTables,
InhInfo *inhinfo, int numInherits)
{
+ DumpOptions *dopt = fout->dopt;
int i,
j;
- int numParents;
- TableInfo **parents;
for (i = 0; i < numTables; i++)
{
+ bool find_parents = true;
+ bool mark_parents = true;
+
/* Some kinds never have parents */
if (tblinfo[i].relkind == RELKIND_SEQUENCE ||
tblinfo[i].relkind == RELKIND_VIEW ||
tblinfo[i].relkind == RELKIND_MATVIEW)
continue;
- /* Don't bother computing anything for non-target tables, either */
+ /*
+ * Normally, we don't bother computing anything for non-target tables,
+ * but if load-via-partition-root is specified, we gather information
+ * on every partition in the system so that getRootTableInfo can trace
+ * from any given to leaf partition all the way up to the root. (We
+ * don't need to mark them as interesting for getTableAttrs, though.)
+ */
if (!tblinfo[i].dobj.dump)
- continue;
+ {
+ mark_parents = false;
- /* Find all the immediate parent tables */
- findParentsByOid(&tblinfo[i], inhinfo, numInherits);
+ if (!dopt->load_via_partition_root ||
+ !tblinfo[i].ispartition)
+ find_parents = false;
+ }
+
+ /* If needed, find all the immediate parent tables. */
+ if (find_parents)
+ findParentsByOid(&tblinfo[i], inhinfo, numInherits);
- /* Mark the parents as interesting for getTableAttrs */
- numParents = tblinfo[i].numParents;
- parents = tblinfo[i].parents;
- for (j = 0; j < numParents; j++)
- parents[j]->interesting = true;
+ /* If needed, mark the parents as interesting for getTableAttrs. */
+ if (mark_parents)
+ {
+ int numParents = tblinfo[i].numParents;
+ TableInfo **parents = tblinfo[i].parents;
+
+ for (j = 0; j < numParents; j++)
+ parents[j]->interesting = true;
+ }
}
}
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 144068ac49..ce3100f09d 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -157,6 +157,7 @@ typedef struct _dumpOptions
int outputNoTablespaces;
int use_setsessauth;
int enable_row_security;
+ int load_via_partition_root;
/* default, if no "inclusion" switches appear, is to dump everything */
bool include_everything;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 37cb7cd986..cb1ade954c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -269,6 +269,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
const char *prefix, Archive *fout);
static char *get_synchronized_snapshot(Archive *fout);
static void setupDumpWorker(Archive *AHX);
+static TableInfo *getRootTableInfo(TableInfo *tbinfo);
int
@@ -345,6 +346,7 @@ main(int argc, char **argv)
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &dopt.load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"section", required_argument, NULL, 5},
{"serializable-deferrable", no_argument, &dopt.serializable_deferrable, 1},
@@ -959,6 +961,7 @@ help(const char *progname)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partitions via the root table\n"));
printf(_(" --section=SECTION dump named section (pre-data, data, or post-data)\n"));
printf(_(" --serializable-deferrable wait until the dump can run without anomalies\n"));
printf(_(" --snapshot=SNAPSHOT use given snapshot for the dump\n"));
@@ -1902,8 +1905,32 @@ dumpTableData_insert(Archive *fout, void *dcontext)
if (insertStmt == NULL)
{
insertStmt = createPQExpBuffer();
+
+ /*
+ * When load-via-partition-root is set, get the root table
+ * name for the partition table, so that we can reload data
+ * through the root table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we loading data through the root, we will qualify
+ * the table name. This is needed because earlier
+ * search_path will be set for the partition table.
+ */
+ classname = (char *) fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ classname = fmtId(tbinfo->dobj.name);
+
appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
- fmtId(classname));
+ classname);
/* corner case for zero-column table */
if (nfields == 0)
@@ -2025,6 +2052,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
return 1;
}
+/*
+ * getRootTableInfo:
+ * get the root TableInfo for the given partition table.
+ */
+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+ TableInfo *parentTbinfo;
+
+ Assert(tbinfo->ispartition);
+ Assert(tbinfo->numParents == 1);
+
+ parentTbinfo = tbinfo->parents[0];
+ while (parentTbinfo->ispartition)
+ {
+ Assert(parentTbinfo->numParents == 1);
+ parentTbinfo = parentTbinfo->parents[0];
+ }
+
+ return parentTbinfo;
+}
/*
* dumpTableData -
@@ -2041,14 +2089,38 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
PQExpBuffer clistBuf = createPQExpBuffer();
DataDumperPtr dumpFn;
char *copyStmt;
+ const char *copyFrom;
if (!dopt->dump_inserts)
{
/* Dump/restore using COPY */
dumpFn = dumpTableData_copy;
+
+ /*
+ * When load-via-partition-root is set, get the root table name for
+ * the partition table, so that we can reload data through the root
+ * table.
+ */
+ if (dopt->load_via_partition_root && tbinfo->ispartition)
+ {
+ TableInfo *parentTbinfo;
+
+ parentTbinfo = getRootTableInfo(tbinfo);
+
+ /*
+ * When we load data through the root, we will qualify the table
+ * name, because search_path is set for the partition.
+ */
+ copyFrom = fmtQualifiedId(fout->remoteVersion,
+ parentTbinfo->dobj.namespace->dobj.name,
+ parentTbinfo->dobj.name);
+ }
+ else
+ copyFrom = fmtId(tbinfo->dobj.name);
+
/* must use 2 steps here 'cause fmtId is nonreentrant */
appendPQExpBuffer(copyBuf, "COPY %s ",
- fmtId(tbinfo->dobj.name));
+ copyFrom);
appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
fmtCopyColumnList(tbinfo, clistBuf),
(tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index b14bb8e963..c0a0346cd9 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -80,6 +80,7 @@ static int no_subscriptions = 0;
static int no_unlogged_table_data = 0;
static int no_role_passwords = 0;
static int server_version;
+static int load_via_partition_root = 0;
static char role_catalog[10];
#define PG_AUTHID "pg_authid"
@@ -128,6 +129,7 @@ main(int argc, char *argv[])
{"lock-wait-timeout", required_argument, NULL, 2},
{"no-tablespaces", no_argument, &no_tablespaces, 1},
{"quote-all-identifiers", no_argument, "e_all_identifiers, 1},
+ {"load-via-partition-root", no_argument, &load_via_partition_root, 1},
{"role", required_argument, NULL, 3},
{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
{"no-publications", no_argument, &no_publications, 1},
@@ -385,6 +387,8 @@ main(int argc, char *argv[])
appendPQExpBufferStr(pgdumpopts, " --no-tablespaces");
if (quote_all_identifiers)
appendPQExpBufferStr(pgdumpopts, " --quote-all-identifiers");
+ if (load_via_partition_root)
+ appendPQExpBufferStr(pgdumpopts, " --load-via-partition-root");
if (use_setsessauth)
appendPQExpBufferStr(pgdumpopts, " --use-set-session-authorization");
if (no_publications)
@@ -606,6 +610,7 @@ help(void)
printf(_(" --no-tablespaces do not dump tablespace assignments\n"));
printf(_(" --no-unlogged-table-data do not dump unlogged table data\n"));
printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n"));
+ printf(_(" --load-via-partition-root load partitions via the root table\n"));
printf(_(" --use-set-session-authorization\n"
" use SET SESSION AUTHORIZATION commands instead of\n"
" ALTER OWNER commands to set ownership\n"));
On Fri, Aug 11, 2017 at 10:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Aug 11, 2017 at 5:36 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
Please find the attached patch with the changes.
I found the way that you had the logic structured in flagInhTables()
to be quite hard to follow, so I rewrote it in the attached version.
This version also has a bunch of minor cosmetic changes. Barring
objections, I'll commit it once the tree opens for v11 development.
Thanks Robert.
Patch changes look good to me.
regards,
Rushabh Lathia
www.EnterpriseDB.com
On Mon, Aug 14, 2017 at 12:40 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
On Fri, Aug 11, 2017 at 10:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Aug 11, 2017 at 5:36 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
Please find the attached patch with the changes.
I found the way that you had the logic structured in flagInhTables()
to be quite hard to follow, so I rewrote it in the attached version.
This version also has a bunch of minor cosmetic changes. Barring
objections, I'll commit it once the tree opens for v11 development.

Thanks Robert.
Patch changes look good to me.
Thanks. Committed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company