pg_dump multi VALUES INSERT

Fabien COELHO

coelho@cri.ensmp.fr

about 7 years ago

In reply to: Surafel Temesgen (#1)

Re: pg_dump multi VALUES INSERT

The attached patch adds an option for multi-row VALUES INSERT statements,
with a default of 100 rows per statement, so that an error during reload
loses at most 100 rows rather than the entire table.

Patch does not seem to apply anymore, could you rebase?

--
Fabien.

Michael Paquier

michael@paquier.xyz

about 7 years ago

In reply to: Stephen Frost (#3)

Re: pg_dump multi VALUES INSERT

On Wed, Oct 17, 2018 at 03:05:28PM -0400, Stephen Frost wrote:

The point of it is that it makes loading into other RDBMS faster. Yes,
it has many of the same issues as our COPY does, but we support it
because it's much faster. The same is true here, just for other
databases, so I'm +1 on the general idea.

Well, the patch author has mentioned that he also cares about being able
to detect errors when processing the dump, which multi-row inserts make
easier to check for. However, even if you specify --data-only you still
need to worry about the SET commands at the start of the dump, which
requires manual handling of the dump...

I am honestly not convinced that it is worth complicating pg_dump for
that, as there is no guarantee either that the DDLs generated by pg_dump
will be compatible with what other systems expect. This kind of
compatibility for fetch and reload can also be kind of tricky with
portability issues, so I'd rather let this stuff be handled correctly
by other tools like pgloader or others rather than expecting to get this
stuff half-baked within Postgres core tools.
--
Michael

Stephen Frost

sfrost@snowman.net

about 7 years ago

In reply to: Michael Paquier (#5)

Re: pg_dump multi VALUES INSERT

Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:

On Wed, Oct 17, 2018 at 03:05:28PM -0400, Stephen Frost wrote:

The point of it is that it makes loading into other RDBMS faster. Yes,
it has many of the same issues as our COPY does, but we support it
because it's much faster. The same is true here, just for other
databases, so I'm +1 on the general idea.

Well, the patch author has mentioned that he also cares about being able
to detect errors when processing the dump, which multi-row inserts make
easier to check for. However, even if you specify --data-only you still
need to worry about the SET commands at the start of the dump, which
requires manual handling of the dump...

That's hardly a serious complication.

I am honestly not convinced that it is worth complicating pg_dump for
that, as there is no guarantee either that the DDLs generated by pg_dump
will be compatible with what other systems expect. This kind of
compatibility for fetch and reload can also be kind of tricky with
portability issues, so I'd rather let this stuff be handled correctly
by other tools like pgloader or others rather than expecting to get this
stuff half-baked within Postgres core tools.

I can see an argument for not wanting to complicate pg_dump, but we've
explicitly stated that the purpose of --inserts is to facilitate
restoring into other database systems and I don't agree that we should
just punt on that entirely. For better or worse, there's an awful lot
of weight put behind things which are in core and we should take care to
do what we can to make those things better, especially when someone is
proposing a patch to improve the situation.

Sure, the patch might need work or have other issues, but that's an
opportunity for us to provide feedback to the author and encourage them
to improve the patch.

As for the other things that make it difficult to use pg_dump to get a
result out that can be loaded into other database systems- let's try to
improve on that too. Having an option to skip the 'setup' bits, such as
the SET commands, certainly wouldn't be hard.

I certainly don't see us adding code to pg_dump to handle 'fetch and
reload' into some non-PG system, or, really, even into a PG system, and
that certainly isn't something the patch does, so I don't think it's a
particularly interesting argument. Users can handle that as needed
themselves.

In other words, none of the arguments put forth really seem to be a
reason to reject the general idea of this patch, so I'm still +1 on
that. Having just glanced over the patch quickly, I think I would have
done something like '--inserts=100' as the way to enable it instead of
adding a new option though. Not that I feel very strongly about it.

Thanks!

Stephen
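
To illustrate the '--inserts=100' idea, here is a minimal sketch of attaching an optional numeric value to an existing long option with getopt_long. The function name rows_from_args and its return convention are hypothetical and are not taken from the patch; only the {"inserts", optional_argument, NULL, 8} table entry mirrors what the later patch versions in this thread use. Validation of the number is deliberately left out.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns 0 when --inserts is absent, 1 for a bare
 * --inserts, the atoi() result for --inserts=N, and -1 for anything that
 * getopt_long does not recognize.
 */
static int
rows_from_args(int argc, char **argv)
{
    static const struct option long_options[] = {
        {"inserts", optional_argument, NULL, 8},
        {NULL, 0, NULL, 0}
    };
    int rows = 0;
    int c;

    assert(argv != NULL);

    optind = 1;     /* restart scanning so the helper can be called again */
    opterr = 0;     /* keep getopt_long quiet; errors are reported via -1 */

    while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
    {
        if (c != 8)
            return -1;
        /* optarg is NULL when the option was given without "=N" */
        rows = optarg ? atoi(optarg) : 1;
    }
    return rows;
}
```

One consequence of optional_argument, at least with GNU getopt_long, is that the value is only picked up in the attached form (--inserts=100). Written as "--inserts 100", the 100 is treated as a separate non-option argument.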

Surafel Temesgen

surafel3000@gmail.com

about 7 years ago

In reply to: Fabien COELHO (#4)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

hi,

On Sun, Nov 4, 2018 at 1:18 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Patch does not seem to apply anymore, could you rebase?

The attached patch is a rebased version; the feature is now enabled with
'--inserts=100', as Stephen suggested.

regards
Surafel

Attachments:

multi_values_inserts_dum_v2.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 790e81c32c..6cc15de2d8 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -789,6 +789,17 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--inserts=100</option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands with 100 rows each (rather than <command>COPY</command>).
+        This makes the dump file smaller than with <option>--inserts</option> and faster to reload, but an error
+        while reloading loses all 100 rows of the affected statement rather than a single row.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--load-via-partition-root</option></term>
       <listitem>
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index c51a130f43..c08cc80732 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -326,6 +326,17 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--inserts=100</option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands with 100 rows each (rather than <command>COPY</command>).
+        This makes the dump file smaller than with <option>--inserts</option> and faster to reload, but an error
+        while reloading loses all 100 rows of the affected statement rather than a single row.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--load-via-partition-root</option></term>
       <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index ba798213be..2fd48cf2f2 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_hundred;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -145,6 +146,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_hundred;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c8d01ed4a4..6df1fc2409 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -359,7 +359,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, &dopt.enable_row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", optional_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -562,6 +562,21 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* inserts option */
+				if (optarg)
+				{
+					if (atoi(optarg) != 100)
+						{
+						write_msg(NULL, "insert values must be 100\n");
+						exit_nicely(1);
+						}
+					dopt.dump_inserts_hundred = 1;
+
+				}
+				else
+					dopt.dump_inserts = 1;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -609,9 +624,9 @@ main(int argc, char **argv)
 		exit_nicely(1);
 	}
 
-	if (dopt.dump_inserts && dopt.oids)
+	if ((dopt.dump_inserts || dopt.dump_inserts_hundred) && dopt.oids)
 	{
-		write_msg(NULL, "options --inserts/--column-inserts and -o/--oids cannot be used together\n");
+		write_msg(NULL, "options --inserts --column-inserts --inserts=100 and -o/--oids cannot be used together\n");
 		write_msg(NULL, "(The INSERT command cannot set OIDs.)\n");
 		exit_nicely(1);
 	}
@@ -619,8 +634,9 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_hundred))
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --column-inserts or --inserts=100\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -889,6 +905,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_hundred = dopt.dump_inserts_hundred;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -980,6 +997,7 @@ help(const char *progname)
 	printf(_("  --exclude-table-data=TABLE   do NOT dump data for the named table(s)\n"));
 	printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
 	printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
+	printf(_("  --inserts=100                dump data as 100 values INSERT commands, rather than COPY\n"));
 	printf(_("  --load-via-partition-root    load partitions via the root table\n"));
 	printf(_("  --no-comments                do not dump comments\n"));
 	printf(_("  --no-publications            do not dump publications\n"));
@@ -2073,6 +2091,191 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	return 1;
 }
 
+/*
+ * Dump table data using hundred values INSERT commands.
+ */
+static int
+dumpTableData_insert_hundred(Archive *fout, void *dcontext)
+{
+	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
+	TableInfo  *tbinfo = tdinfo->tdtable;
+	DumpOptions *dopt = fout->dopt;
+	PQExpBuffer q = createPQExpBuffer();
+	PQExpBuffer insertStmt = NULL;
+	PGresult   *res;
+	int			tuple;
+	int			nfields;
+	int			field;
+	int			ntuple;
+	int			ltuple;
+
+	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
+					  "SELECT * FROM ONLY %s",
+					  fmtQualifiedDumpable(tbinfo));
+	if (tdinfo->filtercond)
+		appendPQExpBuffer(q, " %s", tdinfo->filtercond);
+
+	ExecuteSqlStatement(fout, q->data);
+
+	while (1)
+	{
+		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
+							  PGRES_TUPLES_OK);
+		nfields = PQnfields(res);
+		ntuple = PQntuples(res);
+		ltuple = ntuple-1;
+
+			if (insertStmt == NULL)
+			{
+				TableInfo  *targettab;
+
+				insertStmt = createPQExpBuffer();
+
+				/*
+				 * When load-via-partition-root is set, get the root table
+				 * name for the partition table, so that we can reload data
+				 * through the root table.
+				 */
+				if (dopt->load_via_partition_root && tbinfo->ispartition)
+					targettab = getRootTableInfo(tbinfo);
+				else
+					targettab = tbinfo;
+
+				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+								  fmtQualifiedDumpable(targettab));
+
+				/* corner case for zero-column table */
+				if (nfields == 0)
+				{
+					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+				}
+				else
+				{
+					/* append the list of column names if required */
+					if (dopt->column_inserts)
+					{
+						appendPQExpBufferChar(insertStmt, '(');
+						for (field = 0; field < nfields; field++)
+						{
+							if (field > 0)
+								appendPQExpBufferStr(insertStmt, ", ");
+							appendPQExpBufferStr(insertStmt,
+												 fmtId(PQfname(res, field)));
+						}
+						appendPQExpBufferStr(insertStmt, ") ");
+					}
+
+					if (tbinfo->needs_override)
+						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+					appendPQExpBufferStr(insertStmt, "VALUES ");
+				}
+			}
+
+			archputs(insertStmt->data, fout);
+			for (tuple = 0; tuple < ntuple ; tuple++)
+			{
+
+			/* if it is zero-column table then we're done */
+			if (nfields == 0)
+				continue;
+			if (tuple == 0)
+				archputs("(", fout);
+			else
+				archputs(", (", fout);
+
+			for (field = 0; field < nfields; field++)
+			{
+				if (field > 0)
+					archputs(", ", fout);
+				if (PQgetisnull(res, tuple, field))
+				{
+					archputs("NULL", fout);
+					continue;
+				}
+
+				/* XXX This code is partially duplicated in ruleutils.c */
+				switch (PQftype(res, field))
+				{
+					case INT2OID:
+					case INT4OID:
+					case INT8OID:
+					case OIDOID:
+					case FLOAT4OID:
+					case FLOAT8OID:
+					case NUMERICOID:
+						{
+							/*
+							 * These types are printed without quotes unless
+							 * they contain values that aren't accepted by the
+							 * scanner unquoted (e.g., 'NaN').  Note that
+							 * strtod() and friends might accept NaN, so we
+							 * can't use that to test.
+							 *
+							 * In reality we only need to defend against
+							 * infinity and NaN, so we need not get too crazy
+							 * about pattern matching here.
+							 */
+							const char *s = PQgetvalue(res, tuple, field);
+
+							if (strspn(s, "0123456789 +-eE.") == strlen(s))
+								archputs(s, fout);
+							else
+								archprintf(fout, "'%s'", s);
+						}
+						break;
+
+					case BITOID:
+					case VARBITOID:
+						archprintf(fout, "B'%s'",
+								   PQgetvalue(res, tuple, field));
+						break;
+
+					case BOOLOID:
+						if (strcmp(PQgetvalue(res, tuple, field), "t") == 0)
+							archputs("true", fout);
+						else
+							archputs("false", fout);
+						break;
+
+					default:
+						/* All other types are printed as string literals. */
+						resetPQExpBuffer(q);
+						appendStringLiteralAH(q,
+											  PQgetvalue(res, tuple, field),
+											  fout);
+						archputs(q->data, fout);
+						break;
+				}
+			}
+			if (tuple < ltuple)
+				archputs(")\n", fout);
+
+		}
+		if (!dopt->do_nothing)
+			archputs(");\n", fout);
+		else
+			archputs(") ON CONFLICT DO NOTHING;\n", fout);
+
+		if (PQntuples(res) <= 0)
+		{
+			PQclear(res);
+			break;
+		}
+		PQclear(res);
+	}
+
+	archputs("\n\n", fout);
+
+	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
+
+	destroyPQExpBuffer(q);
+	if (insertStmt != NULL)
+		destroyPQExpBuffer(insertStmt);
+
+	return 1;
+}
+
 /*
  * getRootTableInfo:
  *     get the root TableInfo for the given partition table.
@@ -2112,7 +2315,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	char	   *copyStmt;
 	const char *copyFrom;
 
-	if (!dopt->dump_inserts)
+	if (!dopt->dump_inserts && !dopt->dump_inserts_hundred)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
@@ -2140,6 +2343,12 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 						  (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
 		copyStmt = copyBuf->data;
 	}
+	else if (dopt->dump_inserts_hundred)
+	{
+		/* Restore using hundred values INSERT */
+		dumpFn = dumpTableData_insert_hundred;
+		copyStmt = NULL;
+	}
 	else
 	{
 		/* Restore using INSERT */
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index 5176626476..082d453da3 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -68,6 +68,7 @@ static int	disable_dollar_quoting = 0;
 static int	disable_triggers = 0;
 static int	if_exists = 0;
 static int	inserts = 0;
+static int	inserts_hundred = 0;
 static int	no_tablespaces = 0;
 static int	use_setsessauth = 0;
 static int	no_comments = 0;
@@ -125,6 +126,7 @@ main(int argc, char *argv[])
 		{"disable-triggers", no_argument, &disable_triggers, 1},
 		{"if-exists", no_argument, &if_exists, 1},
 		{"inserts", no_argument, &inserts, 1},
+		{"inserts-hundred", no_argument, &inserts_hundred, 1},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &no_tablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -390,6 +392,8 @@ main(int argc, char *argv[])
 		appendPQExpBufferStr(pgdumpopts, " --disable-triggers");
 	if (inserts)
 		appendPQExpBufferStr(pgdumpopts, " --inserts");
+	if (inserts_hundred)
+		appendPQExpBufferStr(pgdumpopts, " --inserts=100");
 	if (no_tablespaces)
 		appendPQExpBufferStr(pgdumpopts, " --no-tablespaces");
 	if (quote_all_identifiers)
@@ -616,6 +620,7 @@ help(void)
 	printf(_("  --disable-triggers           disable triggers during data-only restore\n"));
 	printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
 	printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
+	printf(_("  --inserts=100                dump data as 100 values INSERT commands, rather than COPY\n"));
 	printf(_("  --load-via-partition-root    load partitions via the root table\n"));
 	printf(_("  --no-comments                do not dump comments\n"));
 	printf(_("  --no-publications            do not dump publications\n"));
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index 17edf444b2..f409a1030a 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -73,8 +73,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--inserts', '-o' ],
-	qr/\Qpg_dump: options --inserts\/--column-inserts and -o\/--oids cannot be used together\E/,
-	'pg_dump: options --inserts/--column-inserts and -o/--oids cannot be used together'
+	qr/\Qpg_dump: options --inserts --column-inserts --inserts=100 and -o\/--oids cannot be used together\E/,
+	'pg_dump: options --inserts and -o/--oids cannot be used together'
 );
 
 command_fails_like(
@@ -124,8 +124,9 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --column-inserts or --inserts=100\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --column-inserts or --inserts=100');
+
 
 # pg_dumpall command-line argument checks
 command_fails_like(
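
The numeric branch of dumpTableData_insert_hundred in the patch above decides whether a value can be written without quotes. Isolated as a small sketch (the helper name numeric_needs_quotes is hypothetical; the character set is copied from the patch), the test looks like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true when the text form of a numeric value contains
 * anything other than digits, space, sign, exponent marker or decimal
 * point, in which case it has to be emitted as a quoted literal.
 */
static bool
numeric_needs_quotes(const char *s)
{
    assert(s != NULL);
    return strspn(s, "0123456789 +-eE.") != strlen(s);
}
```

Ordinary values such as 42 or -1.5e+10 pass through unquoted, while NaN and Infinity are quoted, which is the case the comment in the patch says it needs to defend against.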

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Surafel Temesgen (#7)

Re: pg_dump multi VALUES INSERT

On 2018-Nov-06, Surafel Temesgen wrote:

hi,

On Sun, Nov 4, 2018 at 1:18 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Patch does not seem to apply anymore, could you rebase?

The attached patch is a rebased version and work by ‘inserts=100’ as
Stephen suggest

I thought the suggestion was that the number could be any positive
integer, not hardcoded 100. It shouldn't take much more code to handle
it that way, which makes more sense to me.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
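
As a rough illustration of accepting any positive integer instead of a hardcoded 100, the value could be validated with strtol rather than atoi, so that trailing garbage, zero, negative numbers and overflow are all rejected. The helper name parse_rows_per_insert is hypothetical and does not appear in any version of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the value of --inserts=N.  Returns true and
 * stores N in *rows only when arg is a positive decimal integer that fits
 * in an int; otherwise returns false and leaves *rows untouched.
 */
static bool
parse_rows_per_insert(const char *arg, int *rows)
{
    char   *end;
    long    val;

    assert(rows != NULL);

    if (arg == NULL || *arg == '\0')
        return false;

    errno = 0;
    val = strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0' || val <= 0 || val > INT_MAX)
        return false;

    *rows = (int) val;
    return true;
}
```

The check is val <= 0 rather than val < 0 because a batch size of zero cannot describe a usable INSERT statement either.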

Pavel Stehule

pavel.stehule@gmail.com

about 7 years ago

In reply to: Alvaro Herrera (#8)

Re: pg_dump multi VALUES INSERT

On Tue, Nov 6, 2018 at 18:18, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

On 2018-Nov-06, Surafel Temesgen wrote:

hi,

On Sun, Nov 4, 2018 at 1:18 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Patch does not seem to apply anymore, could you rebase?

The attached patch is a rebased version and work by ‘inserts=100’ as
Stephen suggest

I thought the suggestion was that the number could be any positive
integer, not hardcoded 100. It shouldn't take much more code to handle
it that way, which makes more sense to me.

100 looks really strange

Pavel

Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10

Surafel Temesgen

surafel3000@gmail.com

about 7 years ago

In reply to: Alvaro Herrera (#8)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Tue, Nov 6, 2018 at 8:18 PM Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

On 2018-Nov-06, Surafel Temesgen wrote:

hi,

On Sun, Nov 4, 2018 at 1:18 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Patch does not seem to apply anymore, could you rebase?

The attached patch is a rebased version and work by ‘inserts=100’ as
Stephen suggest

I thought the suggestion was that the number could be any positive
integer, not hardcoded 100.

It shouldn't take much more code to handle
it that way, which makes more sense to me

Yes, it is not many lines of code. Attached is a patch that optionally
accepts the number of rows per INSERT statement; if it is not specified,
one row per statement is used.

regards

Surafel
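
The batching behaviour described above can be sketched independently of pg_dump. This is only an illustration under stated assumptions: emit_batched_inserts, its parameters, and the idea of passing already formatted row literals are all hypothetical, and the real patch instead fetches rows through a cursor and formats each field by type.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write nrows pre-formatted row literals (for example
 * "1, 'a'") as INSERT statements holding at most rows_per_stmt rows each.
 * Returns the number of statements written.
 */
static int
emit_batched_inserts(FILE *out, const char *table,
                     const char *const *rows, int nrows, int rows_per_stmt)
{
    int nstmts = 0;

    assert(out != NULL && table != NULL && rows_per_stmt > 0);

    for (int i = 0; i < nrows; i++)
    {
        if (i % rows_per_stmt == 0)
        {
            /* first row of a batch: open a new statement */
            fprintf(out, "INSERT INTO %s VALUES (%s)", table, rows[i]);
            nstmts++;
        }
        else
            fprintf(out, ",\n\t(%s)", rows[i]);

        /* last row of a batch, or last row overall: close the statement */
        if ((i + 1) % rows_per_stmt == 0 || i + 1 == nrows)
            fputs(";\n", out);
    }
    return nstmts;
}
```

With rows_per_stmt set to 1 this degenerates to one INSERT per row, which matches the default described in the message. In this sketch a table with no rows produces no statement at all.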

Attachments:

multi_values_inserts_dum_v3.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index b5fa4fb85c..70411cb6ac 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -780,7 +780,10 @@ PostgreSQL documentation
         non-<productname>PostgreSQL</productname> databases.
         However, since this option generates a separate command for each row,
         an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        than the entire table contents. The number of rows per
+        <command>INSERT</command> statement can also be specified, which makes the dump file
+        smaller and faster to reload, but an error while reloading then loses every row
+        of the affected <command>INSERT</command> statement rather than a single row.
         Note that
         the restore might fail altogether if you have rearranged column order.
         The <option>--column-inserts</option> option is safe against column
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index ba798213be..c5108ff6d8 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -145,6 +146,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c8d01ed4a4..a5fa4917a8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -359,7 +359,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, &dopt.enable_row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", optional_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -562,6 +562,20 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* inserts values number */
+				if (optarg)
+				{
+					dopt.dump_inserts_multiple = atoi(optarg);
+					if (dopt.dump_inserts_multiple <= 0)
+					{
+						write_msg(NULL, "insert values must be positive number\n");
+						exit_nicely(1);
+					}
+				}
+				else
+					dopt.dump_inserts = 1;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -609,7 +623,7 @@ main(int argc, char **argv)
 		exit_nicely(1);
 	}
 
-	if (dopt.dump_inserts && dopt.oids)
+	if ((dopt.dump_inserts || dopt.dump_inserts_multiple) && dopt.oids)
 	{
 		write_msg(NULL, "options --inserts/--column-inserts and -o/--oids cannot be used together\n");
 		write_msg(NULL, "(The INSERT command cannot set OIDs.)\n");
@@ -619,7 +633,8 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
 		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
 
 	/* Identify archive format to emit */
@@ -889,6 +904,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -2073,6 +2089,193 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	return 1;
 }
 
+/*
+ * Dump table data using multiple values INSERT commands.
+ */
+static int
+dumpTableData_insert_multiple(Archive *fout, void *dcontext)
+{
+	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
+	TableInfo  *tbinfo = tdinfo->tdtable;
+	DumpOptions *dopt = fout->dopt;
+	PQExpBuffer q = createPQExpBuffer();
+	PQExpBuffer i = createPQExpBuffer();
+	PQExpBuffer insertStmt = NULL;
+	PGresult   *res;
+	int			tuple;
+	int			nfields;
+	int			field;
+	int			ntuple;
+	int			ltuple;
+
+	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
+					  "SELECT * FROM ONLY %s",
+					  fmtQualifiedDumpable(tbinfo));
+	if (tdinfo->filtercond)
+		appendPQExpBuffer(q, " %s", tdinfo->filtercond);
+
+	ExecuteSqlStatement(fout, q->data);
+	appendPQExpBuffer(i, "FETCH %d FROM _pg_dump_cursor",
+					  dopt->dump_inserts_multiple);
+	while (1)
+	{
+		res = ExecuteSqlQuery(fout, i->data, PGRES_TUPLES_OK);
+		nfields = PQnfields(res);
+		ntuple = PQntuples(res);
+		ltuple = ntuple-1;
+			if (ntuple  > 0)
+			{
+				if (insertStmt == NULL)
+				{
+					TableInfo  *targettab;
+
+					insertStmt = createPQExpBuffer();
+
+					/*
+					 * When load-via-partition-root is set, get the root table
+					 * name for the partition table, so that we can reload data
+					 * through the root table.
+					 */
+					if (dopt->load_via_partition_root && tbinfo->ispartition)
+						targettab = getRootTableInfo(tbinfo);
+					else
+						targettab = tbinfo;
+
+					appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+									  fmtQualifiedDumpable(targettab));
+
+					/* corner case for zero-column table */
+					if (nfields == 0)
+					{
+						appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+					}
+					else
+					{
+						/* append the list of column names if required */
+					if (dopt->column_inserts)
+					{
+						appendPQExpBufferChar(insertStmt, '(');
+						for (field = 0; field < nfields; field++)
+						{
+							if (field > 0)
+								appendPQExpBufferStr(insertStmt, ", ");
+							appendPQExpBufferStr(insertStmt,
+												 fmtId(PQfname(res, field)));
+						}
+						appendPQExpBufferStr(insertStmt, ") ");
+					}
+
+					if (tbinfo->needs_override)
+						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+					appendPQExpBufferStr(insertStmt, "VALUES ");
+					}
+				}
+			archputs(insertStmt->data, fout);
+			for (tuple = 0; tuple < ntuple ; tuple++)
+			{
+
+			/* if it is zero-column table then we're done */
+			if (nfields == 0)
+				continue;
+			if (tuple == 0)
+				archputs("(", fout);
+			else
+				archputs(", (", fout);
+
+			for (field = 0; field < nfields; field++)
+			{
+				if (field > 0)
+					archputs(", ", fout);
+				if (PQgetisnull(res, tuple, field))
+				{
+					archputs("NULL", fout);
+					continue;
+				}
+
+				/* XXX This code is partially duplicated in ruleutils.c */
+				switch (PQftype(res, field))
+				{
+					case INT2OID:
+					case INT4OID:
+					case INT8OID:
+					case OIDOID:
+					case FLOAT4OID:
+					case FLOAT8OID:
+					case NUMERICOID:
+						{
+							/*
+							 * These types are printed without quotes unless
+							 * they contain values that aren't accepted by the
+							 * scanner unquoted (e.g., 'NaN').  Note that
+							 * strtod() and friends might accept NaN, so we
+							 * can't use that to test.
+							 *
+							 * In reality we only need to defend against
+							 * infinity and NaN, so we need not get too crazy
+							 * about pattern matching here.
+							 */
+							const char *s = PQgetvalue(res, tuple, field);
+
+							if (strspn(s, "0123456789 +-eE.") == strlen(s))
+								archputs(s, fout);
+							else
+								archprintf(fout, "'%s'", s);
+						}
+						break;
+
+					case BITOID:
+					case VARBITOID:
+						archprintf(fout, "B'%s'",
+								   PQgetvalue(res, tuple, field));
+						break;
+
+					case BOOLOID:
+						if (strcmp(PQgetvalue(res, tuple, field), "t") == 0)
+							archputs("true", fout);
+						else
+							archputs("false", fout);
+						break;
+
+					default:
+						/* All other types are printed as string literals. */
+						resetPQExpBuffer(q);
+						appendStringLiteralAH(q,
+											  PQgetvalue(res, tuple, field),
+											  fout);
+						archputs(q->data, fout);
+						break;
+				}
+					}
+			if (tuple < ltuple)
+				archputs(")\n", fout);
+
+			}
+			if (!dopt->do_nothing)
+				archputs(");\n", fout);
+			else
+				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			}
+		if (PQntuples(res) <= 0)
+		{
+			PQclear(res);
+			break;
+		}
+		PQclear(res);
+	}
+
+	archputs("\n\n", fout);
+
+	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
+
+	destroyPQExpBuffer(q);
+	destroyPQExpBuffer(i);
+	if (insertStmt != NULL)
+		destroyPQExpBuffer(insertStmt);
+
+	return 1;
+}
+
 /*
  * getRootTableInfo:
  *     get the root TableInfo for the given partition table.
@@ -2112,7 +2315,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	char	   *copyStmt;
 	const char *copyFrom;
 
-	if (!dopt->dump_inserts)
+	if (!dopt->dump_inserts && !dopt->dump_inserts_multiple)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
@@ -2140,6 +2343,12 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 						  (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
 		copyStmt = copyBuf->data;
 	}
+	else if (dopt->dump_inserts_multiple)
+	{
+		/* Restore using multiple values INSERT */
+		dumpFn = dumpTableData_insert_multiple;
+		copyStmt = NULL;
+	}
 	else
 	{
 		/* Restore using INSERT */

#11

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Surafel Temesgen (#10)

Re: pg_dump multi VALUES INSERT

On Thu, Nov 8, 2018 at 2:03 PM Surafel Temesgen <surafel3000@gmail.com> wrote:

Yes, it is not many lines of code. Attached is a patch that optionally accepts the number of rows per INSERT statement; if it is not specified, one row per statement is used.

Hi,

Unfortunately, patch needs to be rebased, could you please post an updated
version?

#12

Surafel Temesgen

surafel3000@gmail.com

about 7 years ago

In reply to: Dmitry Dolgov (#11)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Fri, Nov 30, 2018 at 7:16 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

Unfortunately, the patch needs to be rebased. Could you please post an
updated version?

Thank you for letting me know. Here is an updated patch against current master.
Regards
Surafel

Attachments:

multi_values_inserts_dum_v4.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2015410a42..d93d8fc0c2 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -766,7 +766,10 @@ PostgreSQL documentation
         non-<productname>PostgreSQL</productname> databases.
         However, since this option generates a separate command for each row,
         an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        than the entire table contents. The number of row per insert statement
+        can also be specified to make the dump file smaller and faster
+        to reload but lack single row data lost on error while reloading rather entire affected
+        insert statement data lost.
         Note that
         the restore might fail altogether if you have rearranged column order.
         The <option>--column-inserts</option> option is safe against column
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..73a243ecb0 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -144,6 +145,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 637c79af48..6b1440e173 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -358,7 +358,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, &dopt.enable_row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", optional_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -557,6 +557,20 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* inserts values number */
+				if (optarg)
+				{
+					dopt.dump_inserts_multiple = atoi(optarg);
+					if (dopt.dump_inserts_multiple < 0)
+					{
+						write_msg(NULL, "insert values must be positive number\n");
+						exit_nicely(1);
+					}
+				}
+				else
+					dopt.dump_inserts = 1;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -607,7 +621,8 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
 		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
 
 	/* Identify archive format to emit */
@@ -877,6 +892,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -2052,6 +2068,193 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	return 1;
 }
 
+/*
+ * Dump table data using multiple values INSERT commands.
+ */
+static int
+dumpTableData_insert_multiple(Archive *fout, void *dcontext)
+{
+	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
+	TableInfo  *tbinfo = tdinfo->tdtable;
+	DumpOptions *dopt = fout->dopt;
+	PQExpBuffer q = createPQExpBuffer();
+	PQExpBuffer i = createPQExpBuffer();
+	PQExpBuffer insertStmt = NULL;
+	PGresult   *res;
+	int			tuple;
+	int			nfields;
+	int			field;
+	int			ntuple;
+	int			ltuple;
+
+	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
+					  "SELECT * FROM ONLY %s",
+					  fmtQualifiedDumpable(tbinfo));
+	if (tdinfo->filtercond)
+		appendPQExpBuffer(q, " %s", tdinfo->filtercond);
+
+	ExecuteSqlStatement(fout, q->data);
+	appendPQExpBuffer(i, "FETCH %d FROM _pg_dump_cursor",
+					  dopt->dump_inserts_multiple);
+	while (1)
+	{
+		res = ExecuteSqlQuery(fout, i->data, PGRES_TUPLES_OK);
+		nfields = PQnfields(res);
+		ntuple = PQntuples(res);
+		ltuple = ntuple-1;
+			if (ntuple  > 0)
+			{
+				if (insertStmt == NULL)
+				{
+					TableInfo  *targettab;
+
+					insertStmt = createPQExpBuffer();
+
+					/*
+					 * When load-via-partition-root is set, get the root table
+					 * name for the partition table, so that we can reload data
+					 * through the root table.
+					 */
+					if (dopt->load_via_partition_root && tbinfo->ispartition)
+						targettab = getRootTableInfo(tbinfo);
+					else
+						targettab = tbinfo;
+
+					appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+									  fmtQualifiedDumpable(targettab));
+
+					/* corner case for zero-column table */
+					if (nfields == 0)
+					{
+						appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+					}
+					else
+					{
+						/* append the list of column names if required */
+					if (dopt->column_inserts)
+					{
+						appendPQExpBufferChar(insertStmt, '(');
+						for (field = 0; field < nfields; field++)
+						{
+							if (field > 0)
+								appendPQExpBufferStr(insertStmt, ", ");
+							appendPQExpBufferStr(insertStmt,
+												 fmtId(PQfname(res, field)));
+						}
+						appendPQExpBufferStr(insertStmt, ") ");
+					}
+
+					if (tbinfo->needs_override)
+						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+					appendPQExpBufferStr(insertStmt, "VALUES ");
+					}
+				}
+			archputs(insertStmt->data, fout);
+			for (tuple = 0; tuple < ntuple ; tuple++)
+			{
+
+			/* if it is zero-column table then we're done */
+			if (nfields == 0)
+				continue;
+			if (tuple == 0)
+				archputs("(", fout);
+			else
+				archputs(", (", fout);
+
+			for (field = 0; field < nfields; field++)
+			{
+				if (field > 0)
+					archputs(", ", fout);
+				if (PQgetisnull(res, tuple, field))
+				{
+					archputs("NULL", fout);
+					continue;
+				}
+
+				/* XXX This code is partially duplicated in ruleutils.c */
+				switch (PQftype(res, field))
+				{
+					case INT2OID:
+					case INT4OID:
+					case INT8OID:
+					case OIDOID:
+					case FLOAT4OID:
+					case FLOAT8OID:
+					case NUMERICOID:
+						{
+							/*
+							 * These types are printed without quotes unless
+							 * they contain values that aren't accepted by the
+							 * scanner unquoted (e.g., 'NaN').  Note that
+							 * strtod() and friends might accept NaN, so we
+							 * can't use that to test.
+							 *
+							 * In reality we only need to defend against
+							 * infinity and NaN, so we need not get too crazy
+							 * about pattern matching here.
+							 */
+							const char *s = PQgetvalue(res, tuple, field);
+
+							if (strspn(s, "0123456789 +-eE.") == strlen(s))
+								archputs(s, fout);
+							else
+								archprintf(fout, "'%s'", s);
+						}
+						break;
+
+					case BITOID:
+					case VARBITOID:
+						archprintf(fout, "B'%s'",
+								   PQgetvalue(res, tuple, field));
+						break;
+
+					case BOOLOID:
+						if (strcmp(PQgetvalue(res, tuple, field), "t") == 0)
+							archputs("true", fout);
+						else
+							archputs("false", fout);
+						break;
+
+					default:
+						/* All other types are printed as string literals. */
+						resetPQExpBuffer(q);
+						appendStringLiteralAH(q,
+											  PQgetvalue(res, tuple, field),
+											  fout);
+						archputs(q->data, fout);
+						break;
+				}
+					}
+			if (tuple < ltuple)
+				archputs(")\n", fout);
+
+			}
+			if (!dopt->do_nothing)
+				archputs(");\n", fout);
+			else
+				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			}
+		if (PQntuples(res) <= 0)
+		{
+			PQclear(res);
+			break;
+		}
+		PQclear(res);
+	}
+
+	archputs("\n\n", fout);
+
+	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
+
+	destroyPQExpBuffer(q);
+	destroyPQExpBuffer(i);
+	if (insertStmt != NULL)
+		destroyPQExpBuffer(insertStmt);
+
+	return 1;
+}
+
 /*
  * getRootTableInfo:
  *     get the root TableInfo for the given partition table.
@@ -2091,7 +2294,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	char	   *copyStmt;
 	const char *copyFrom;
 
-	if (!dopt->dump_inserts)
+	if (!dopt->dump_inserts && !dopt->dump_inserts_multiple)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
@@ -2118,6 +2321,12 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 						  fmtCopyColumnList(tbinfo, clistBuf));
 		copyStmt = copyBuf->data;
 	}
+	else if (dopt->dump_inserts_multiple)
+	{
+		/* Restore using multiple values INSERT */
+		dumpFn = dumpTableData_insert_multiple;
+		copyStmt = NULL;
+	}
 	else
 	{
 		/* Restore using INSERT */

#13

coelho@cri.ensmp.fr

about 7 years ago

In reply to: Surafel Temesgen (#12)

Re: pg_dump multi VALUES INSERT

Hello Surafel,

Thank you for letting me know. Here is an updated patch against current master.

The patch applies cleanly and compiles, and "make check" is okay, but that
says little given that the feature is not tested...

The feature should be tested somewhere.

ISTM that command-line switches with optional arguments should be avoided.
This feature is seldom used (hmmm... 2 existing instances) because it
interferes with argument processing when such a switch is the last one
given. It is only okay for commands which do not expect arguments. For
backward compatibility, this suggests adding another switch, e.g.
--insert-multi=100 or whatever, which would possibly default to 100. The
alternative is to break compatibility by adding a mandatory argument,
but I guess that would not be acceptable to committers.
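
To illustrate the concern about optional arguments, here is a small sketch
(not taken from the patch) of how a getopt_long() option declared with
optional_argument behaves. With GNU getopt_long the value is only picked up
when it is attached with "="; a separate word is left behind as a non-option
argument, which a command such as pg_dump would then presumably treat as a
positional argument. The function name parse_demo and the option code 8 are
illustrative assumptions.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/*
 * Hypothetical demo, not pg_dump code: scan argv for a single long option
 * "--inserts" declared with optional_argument.  Stores the option's value
 * (or NULL if none was attached) in *value and returns the index of the
 * first non-option argument.
 */
static int
parse_demo(int argc, char **argv, const char **value)
{
	static const struct option long_options[] = {
		{"inserts", optional_argument, NULL, 8},
		{NULL, 0, NULL, 0}
	};
	int			c;

	assert(argv != NULL && value != NULL);

	*value = NULL;

	/*
	 * optind = 0 asks getopt to reinitialize; this is documented for glibc
	 * and assumed to work for other C libraries.
	 */
	optind = 0;

	while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
	{
		if (c == 8)
			*value = optarg;	/* NULL unless written as --inserts=N */
	}

	return optind;
}
```

Called with "--inserts=100" the value is "100"; called with "--inserts 100"
the value is NULL and "100" remains as the first non-option argument.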

The function "atoi" parses "1zzz" as 1, which is debatable, so I'd suggest
avoiding it in favor of a stricter parser that errors out on malformed
integers.
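
A stricter parser along these lines could be sketched with strtol() as
follows. This is only an illustration, not code from the patch: the helper
name parse_positive_int is hypothetical, and rejecting zero is an assumption
about what a sensible row count is.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of pg_dump: parse str as a strictly
 * positive int.  Returns true and stores the value in *result on success;
 * returns false on empty input, trailing garbage, overflow, or values < 1.
 * Leading whitespace and a leading sign are accepted, as strtol() does.
 */
static bool
parse_positive_int(const char *str, int *result)
{
	char	   *end;
	long		val;

	assert(str != NULL && result != NULL);

	errno = 0;
	val = strtol(str, &end, 10);

	/* no digits consumed, or leftover characters such as "zzz" in "1zzz" */
	if (end == str || *end != '\0')
		return false;

	/* out of range for long, or for int where long is wider than int */
	if (errno == ERANGE || val > INT_MAX)
		return false;

	/* zero or negative row counts are rejected (an assumption here) */
	if (val < 1)
		return false;

	*result = (int) val;
	return true;
}
```

With this helper "100" is accepted, while "1zzz", "", "0" and "-5" are all
rejected instead of being silently truncated the way atoi() would do.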

Neither the --help output nor the documentation describes the --inserts
argument.

There is an indentation issue within the while loop.

Given that the implementation is largely a copy-paste of the preceding
function, I'd suggest simply extending it so that it takes the
"multi insert" setting into account and defaults to the previous behavior
if it is not set.
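
The idea of one function covering both cases can be sketched as below. This
is a simplified illustration, not pg_dump code: the names emit_inserts and
append_str are hypothetical, rows are assumed to be already formatted value
lists, and output goes to a caller-supplied buffer rather than an archive.
With rows_per_stmt set to 1 the output is one INSERT per row, i.e. the
previous behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append s to the NUL-terminated string in buf, truncating if needed. */
static void
append_str(char *buf, size_t buflen, const char *s)
{
	size_t		used = strlen(buf);

	if (used + 1 < buflen)
		strncat(buf, s, buflen - used - 1);
}

/*
 * Hypothetical sketch: write INSERT statements for nrows pre-formatted
 * value lists into buf, grouping up to rows_per_stmt rows per statement.
 * Returns the number of statements started.
 */
static int
emit_inserts(char *buf, size_t buflen, const char *table,
			 const char *const *rows, int nrows, int rows_per_stmt)
{
	int			nstmts = 0;
	int			i;

	assert(buf != NULL && buflen > 0 && rows_per_stmt > 0);
	buf[0] = '\0';

	for (i = 0; i < nrows; i++)
	{
		if (i % rows_per_stmt == 0)
		{
			/* first row of a batch: start a new statement */
			append_str(buf, buflen, "INSERT INTO ");
			append_str(buf, buflen, table);
			append_str(buf, buflen, " VALUES (");
			nstmts++;
		}
		else
			append_str(buf, buflen, ",\n\t(");

		append_str(buf, buflen, rows[i]);

		/* close the statement at the end of a batch or of the data */
		if (i % rows_per_stmt == rows_per_stmt - 1 || i == nrows - 1)
			append_str(buf, buflen, ");\n");
		else
			append_str(buf, buflen, ")");
	}

	return nstmts;
}
```

The only difference between the single-row and multi-row output is where a
statement is opened and closed, which is why a single loop can serve both.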

--
Fabien.

#14

surafel3000@gmail.com

about 7 years ago

In reply to: Fabien COELHO (#13)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Tue, Dec 25, 2018 at 2:47 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Thank you for looking into it

Hello Surafel,

Thank you for letting me know. Here is an updated patch against current master.

The patch applies cleanly and compiles, and "make check" is okay, but that
says little given that the feature is not tested...

The feature should be tested somewhere.

ISTM that command-line switches with optional arguments should be avoided.
This feature is seldom used (hmmm... 2 existing instances) because it
interferes with argument processing when such a switch is the last one
given. It is only okay for commands which do not expect arguments. For
backward compatibility, this suggests adding another switch, e.g.
--insert-multi=100 or whatever, which would possibly default to 100. The
alternative is to break compatibility by adding a mandatory argument,
but I guess that would not be acceptable to committers.

The function "atoi" parses "1zzz" as 1, which is debatable, so I'd suggest
avoiding it in favor of a stricter parser that errors out on malformed
integers.

Neither the --help output nor the documentation describes the --inserts
argument.

done

There is an indentation issue within the while loop.

Given that the implementation is largely a copy-paste of the preceding
function, I'd suggest simply extending it so that it takes the
"multi insert" setting into account and defaults to the previous behavior
if it is not set.

At first I also tried to do it like that, but it seems to me the function
would become long and more complex.

Regards

Surafel

Attachments:

multi_values_inserts_dum_v5.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2015410a42..ee94d1d293 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -775,6 +775,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--insert-multi</option></term>
+      <listitem>
+       <para>
+        Specify the number of values per <command>INSERT</command> command.
+        This will make the dump file smaller than <option>--inserts</option>
+        and it is faster to reload but lack per row data lost on error
+        instead entire affected insert statement data lost.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--load-via-partition-root</option></term>
       <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..73a243ecb0 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -144,6 +145,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 637c79af48..d433c3882a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -313,6 +313,7 @@ main(int argc, char **argv)
 	int			plainText = 0;
 	ArchiveFormat archiveFormat = archUnknown;
 	ArchiveMode archiveMode;
+	char       *p;
 
 	static DumpOptions dopt;
 
@@ -359,6 +360,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
 		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"insert-multi", required_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -557,6 +559,27 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* inserts values number */
+				errno = 0;
+				dopt.dump_inserts_multiple = strtol(optarg, &p, 10);
+				if (p == optarg || *p != '\0')
+				{
+					write_msg(NULL, "argument of --insert-multi must be a number\n");
+					exit_nicely(1);
+				}
+				if (errno == ERANGE)
+				{
+					write_msg(NULL, "argument of --insert-multi exceeds integer range.\n");
+					exit_nicely(1);
+				}
+				if (dopt.dump_inserts_multiple < 0)
+				{
+					write_msg(NULL, "argument of --insert-multi must be positive number\n");
+					exit_nicely(1);
+				}
+
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -607,8 +630,9 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --insert-multi or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -877,6 +901,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -967,6 +992,7 @@ help(const char *progname)
 	printf(_("  --exclude-table-data=TABLE   do NOT dump data for the named table(s)\n"));
 	printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
 	printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
+	printf(_("  --insert-multi               number of values per INSERT command\n"));
 	printf(_("  --load-via-partition-root    load partitions via the root table\n"));
 	printf(_("  --no-comments                do not dump comments\n"));
 	printf(_("  --no-publications            do not dump publications\n"));
@@ -2052,6 +2078,193 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	return 1;
 }
 
+/*
+ * Dump table data using multiple values INSERT commands.
+ */
+static int
+dumpTableData_insert_multiple(Archive *fout, void *dcontext)
+{
+	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
+	TableInfo  *tbinfo = tdinfo->tdtable;
+	DumpOptions *dopt = fout->dopt;
+	PQExpBuffer q = createPQExpBuffer();
+	PQExpBuffer i = createPQExpBuffer();
+	PQExpBuffer insertStmt = NULL;
+	PGresult   *res;
+	int			tuple;
+	int			nfields;
+	int			field;
+	int			ntuple;
+	int			ltuple;
+
+	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
+					  "SELECT * FROM ONLY %s",
+					  fmtQualifiedDumpable(tbinfo));
+	if (tdinfo->filtercond)
+		appendPQExpBuffer(q, " %s", tdinfo->filtercond);
+
+	ExecuteSqlStatement(fout, q->data);
+	appendPQExpBuffer(i, "FETCH %d FROM _pg_dump_cursor",
+					  dopt->dump_inserts_multiple);
+	while (1)
+	{
+		res = ExecuteSqlQuery(fout, i->data, PGRES_TUPLES_OK);
+		nfields = PQnfields(res);
+		ntuple = PQntuples(res);
+		ltuple = ntuple-1;
+		if (ntuple  > 0)
+		{
+			if (insertStmt == NULL)
+			{
+				TableInfo  *targettab;
+
+				insertStmt = createPQExpBuffer();
+
+				/*
+				 * When load-via-partition-root is set, get the root table
+				 * name for the partition table, so that we can reload data
+				 * through the root table.
+				 */
+				if (dopt->load_via_partition_root && tbinfo->ispartition)
+					targettab = getRootTableInfo(tbinfo);
+				else
+					targettab = tbinfo;
+
+				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+								  fmtQualifiedDumpable(targettab));
+
+				/* corner case for zero-column table */
+				if (nfields == 0)
+				{
+					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+				}
+				else
+				{
+					/* append the list of column names if required */
+					if (dopt->column_inserts)
+					{
+						appendPQExpBufferChar(insertStmt, '(');
+						for (field = 0; field < nfields; field++)
+						{
+							if (field > 0)
+								appendPQExpBufferStr(insertStmt, ", ");
+							appendPQExpBufferStr(insertStmt,
+												 fmtId(PQfname(res, field)));
+						}
+						appendPQExpBufferStr(insertStmt, ") ");
+					}
+
+					if (tbinfo->needs_override)
+						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+					appendPQExpBufferStr(insertStmt, "VALUES ");
+				}
+			}
+			archputs(insertStmt->data, fout);
+			for (tuple = 0; tuple < ntuple ; tuple++)
+			{
+
+				/* if it is zero-column table then we're done */
+				if (nfields == 0)
+					continue;
+				if (tuple == 0)
+					archputs("(", fout);
+				else
+					archputs(", (", fout);
+
+				for (field = 0; field < nfields; field++)
+				{
+					if (field > 0)
+						archputs(", ", fout);
+					if (PQgetisnull(res, tuple, field))
+					{
+						archputs("NULL", fout);
+						continue;
+					}
+
+					/* XXX This code is partially duplicated in ruleutils.c */
+					switch (PQftype(res, field))
+					{
+						case INT2OID:
+						case INT4OID:
+						case INT8OID:
+						case OIDOID:
+						case FLOAT4OID:
+						case FLOAT8OID:
+						case NUMERICOID:
+							{
+								/*
+								 * These types are printed without quotes unless
+								 * they contain values that aren't accepted by the
+								 * scanner unquoted (e.g., 'NaN').  Note that
+								 * strtod() and friends might accept NaN, so we
+								 * can't use that to test.
+								 *
+								 * In reality we only need to defend against
+								 * infinity and NaN, so we need not get too crazy
+								 * about pattern matching here.
+								 */
+								const char *s = PQgetvalue(res, tuple, field);
+
+								if (strspn(s, "0123456789 +-eE.") == strlen(s))
+									archputs(s, fout);
+								else
+									archprintf(fout, "'%s'", s);
+							}
+							break;
+
+						case BITOID:
+						case VARBITOID:
+							archprintf(fout, "B'%s'",
+									   PQgetvalue(res, tuple, field));
+							break;
+
+						case BOOLOID:
+							if (strcmp(PQgetvalue(res, tuple, field), "t") == 0)
+								archputs("true", fout);
+							else
+								archputs("false", fout);
+							break;
+
+						default:
+							/* All other types are printed as string literals. */
+							resetPQExpBuffer(q);
+							appendStringLiteralAH(q,
+												  PQgetvalue(res, tuple, field),
+												  fout);
+							archputs(q->data, fout);
+							break;
+					}
+				}
+				if (tuple < ltuple)
+					archputs(")\n", fout);
+
+			}
+			if (!dopt->do_nothing)
+				archputs(");\n", fout);
+			else
+				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+		}
+		if (PQntuples(res) <= 0)
+		{
+			PQclear(res);
+			break;
+		}
+		PQclear(res);
+	}
+
+	archputs("\n\n", fout);
+
+	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
+
+	destroyPQExpBuffer(q);
+	destroyPQExpBuffer(i);
+	if (insertStmt != NULL)
+		destroyPQExpBuffer(insertStmt);
+
+	return 1;
+}
+
 /*
  * getRootTableInfo:
  *     get the root TableInfo for the given partition table.
@@ -2091,7 +2304,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	char	   *copyStmt;
 	const char *copyFrom;
 
-	if (!dopt->dump_inserts)
+	if (!dopt->dump_inserts && !dopt->dump_inserts_multiple)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
@@ -2118,6 +2331,12 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 						  fmtCopyColumnList(tbinfo, clistBuf));
 		copyStmt = copyBuf->data;
 	}
+	else if (dopt->dump_inserts_multiple)
+	{
+		/* Restore using multiple values INSERT */
+		dumpFn = dumpTableData_insert_multiple;
+		copyStmt = NULL;
+	}
 	else
 	{
 		/* Restore using INSERT */
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..e1ca5416b7 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --insert-multi or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --insert-multi or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(

#15

coelho@cri.ensmp.fr

about 7 years ago

In reply to: Surafel Temesgen (#14)

Re: pg_dump multi VALUES INSERT

At first I also tried to do it like that, but it seems to me the function
would become long and more complex.

Probably. But calling it with size 100 should result in the same behavior,
so it is really just an extension of the preceding one? Or am I missing
something?

--
Fabien.

#16

surafel3000@gmail.com

about 7 years ago

In reply to: Fabien COELHO (#15)

Re: pg_dump multi VALUES INSERT

On Fri, Dec 28, 2018 at 6:46 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

At first I also tried to do it like that, but it seems to me the function
would become long and more complex.

Probably. But calling it with size 100 should result in the same behavior,
so it is really just an extension of the preceding one? Or am I missing
something?

Dumping table data with single-row insert statements and dumping it with a
user-specified number of rows per insert statement are different enough
to warrant separate functions; they are not the same thing and should not
be implemented by the same function. Regarding code duplication, I think
the solution is to move the shared code into a separate function and call
it at the appropriate places.

Regards
Surafel

#17

david.rowley@2ndquadrant.com

about 7 years ago

In reply to: Surafel Temesgen (#14)

Re: pg_dump multi VALUES INSERT

Just looking at the v5 patch, it seems not to handle 0 column tables correctly.

For example:

# create table t();
# insert into t default values;
# insert into t default values;

$ pg_dump --table t --inserts --insert-multi=100 postgres > dump.sql

# \i dump.sql
[...]
INSERT 0 1
psql:dump.sql:35: ERROR: syntax error at or near ")"
LINE 1: );
^

I'm not aware of a valid way to insert multiple 0 column rows in a
single INSERT statement, so not sure how you're going to handle that
case.
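
One possible way to handle this case, sketched here purely as an
illustration and not taken from any version of the patch, is to fall back
to one INSERT ... DEFAULT VALUES statement per row when the table has no
columns. The function name emit_zero_column_rows is hypothetical, and the
output goes to a caller-supplied buffer for simplicity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: for a zero-column table, write one
 * "INSERT INTO ... DEFAULT VALUES;" statement per row into buf, since the
 * rows cannot be combined into a single multi-row VALUES list.
 * Returns the number of complete statements written.
 */
static int
emit_zero_column_rows(char *buf, size_t buflen, const char *table, int nrows)
{
	int			written = 0;
	int			i;

	assert(buf != NULL && buflen > 0 && table != NULL);
	buf[0] = '\0';

	for (i = 0; i < nrows; i++)
	{
		size_t		used = strlen(buf);
		int			n = snprintf(buf + used, buflen - used,
								 "INSERT INTO %s DEFAULT VALUES;\n", table);

		if (n < 0 || (size_t) n >= buflen - used)
		{
			buf[used] = '\0';	/* drop the statement that did not fit */
			break;
		}
		written++;
	}

	return written;
}
```

For the two-row table t in the example above, this would produce two
separate DEFAULT VALUES statements regardless of the requested batch size.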

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#18

surafel3000@gmail.com

about 7 years ago

In reply to: David Rowley (#17)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

Hi,
Thank you for looking at it
On Mon, Dec 31, 2018 at 12:38 PM David Rowley <david.rowley@2ndquadrant.com>
wrote:

Just looking at the v5 patch, it seems not to handle 0 column tables
correctly.

For example:

# create table t();
# insert into t default values;
# insert into t default values;

$ pg_dump --table t --inserts --insert-multi=100 postgres > dump.sql

# \i dump.sql
[...]
INSERT 0 1
psql:dump.sql:35: ERROR: syntax error at or near ")"
LINE 1: );
^

The attached patch contains a fix for it.
Regards
Surafel

Attachments:

multi_values_inserts_dum_v6.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2015410a42..ee94d1d293 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -775,6 +775,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--insert-multi</option></term>
+      <listitem>
+       <para>
+        Specify the number of values per <command>INSERT</command> command.
+        This will make the dump file smaller than <option>--inserts</option>
+        and it is faster to reload but lack per row data lost on error
+        instead entire affected insert statement data lost.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--load-via-partition-root</option></term>
       <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..73a243ecb0 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -144,6 +145,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 341b1a51f2..3176a71262 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -313,6 +313,7 @@ main(int argc, char **argv)
 	int			plainText = 0;
 	ArchiveFormat archiveFormat = archUnknown;
 	ArchiveMode archiveMode;
+	char       *p;
 
 	static DumpOptions dopt;
 
@@ -359,6 +360,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
 		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"insert-multi", required_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -557,6 +559,27 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* inserts values number */
+				errno = 0;
+				dopt.dump_inserts_multiple = strtol(optarg, &p, 10);
+				if (p == optarg || *p != '\0')
+				{
+					write_msg(NULL, "argument of --insert-multi must be a number\n");
+					exit_nicely(1);
+				}
+				if (errno == ERANGE)
+				{
+					write_msg(NULL, "argument of --insert-multi exceeds integer range.\n");
+					exit_nicely(1);
+				}
+				if (dopt.dump_inserts_multiple < 0)
+				{
+					write_msg(NULL, "argument of --insert-multi must be positive number\n");
+					exit_nicely(1);
+				}
+
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -607,8 +630,9 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --insert-multi or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -877,6 +901,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -967,6 +992,7 @@ help(const char *progname)
 	printf(_("  --exclude-table-data=TABLE   do NOT dump data for the named table(s)\n"));
 	printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
 	printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
+	printf(_("  --insert-multi               number of values per INSERT command\n"));
 	printf(_("  --load-via-partition-root    load partitions via the root table\n"));
 	printf(_("  --no-comments                do not dump comments\n"));
 	printf(_("  --no-publications            do not dump publications\n"));
@@ -2052,6 +2078,199 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	return 1;
 }
 
+/*
+ * Dump table data using multiple values INSERT commands.
+ */
+static int
+dumpTableData_insert_multiple(Archive *fout, void *dcontext)
+{
+	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
+	TableInfo  *tbinfo = tdinfo->tdtable;
+	DumpOptions *dopt = fout->dopt;
+	PQExpBuffer q = createPQExpBuffer();
+	PQExpBuffer i = createPQExpBuffer();
+	PQExpBuffer insertStmt = NULL;
+	PGresult   *res;
+	int			tuple;
+	int			nfields;
+	int			field;
+	int			ntuple;
+	int			ltuple;
+
+	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
+					  "SELECT * FROM ONLY %s",
+					  fmtQualifiedDumpable(tbinfo));
+	if (tdinfo->filtercond)
+		appendPQExpBuffer(q, " %s", tdinfo->filtercond);
+
+	ExecuteSqlStatement(fout, q->data);
+	appendPQExpBuffer(i, "FETCH %d FROM _pg_dump_cursor",
+					  dopt->dump_inserts_multiple);
+	while (1)
+	{
+		res = ExecuteSqlQuery(fout, i->data, PGRES_TUPLES_OK);
+		nfields = PQnfields(res);
+		ntuple = PQntuples(res);
+		ltuple = ntuple-1;
+		if (ntuple  > 0)
+		{
+			if (insertStmt == NULL)
+			{
+				TableInfo  *targettab;
+
+				insertStmt = createPQExpBuffer();
+
+				/*
+				 * When load-via-partition-root is set, get the root table
+				 * name for the partition table, so that we can reload data
+				 * through the root table.
+				 */
+				if (dopt->load_via_partition_root && tbinfo->ispartition)
+					targettab = getRootTableInfo(tbinfo);
+				else
+					targettab = tbinfo;
+
+				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+								  fmtQualifiedDumpable(targettab));
+
+				/* corner case for zero-column table */
+				if (nfields == 0)
+				{
+					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+				}
+				else
+				{
+					/* append the list of column names if required */
+					if (dopt->column_inserts)
+					{
+						appendPQExpBufferChar(insertStmt, '(');
+						for (field = 0; field < nfields; field++)
+						{
+							if (field > 0)
+								appendPQExpBufferStr(insertStmt, ", ");
+							appendPQExpBufferStr(insertStmt,
+												 fmtId(PQfname(res, field)));
+						}
+						appendPQExpBufferStr(insertStmt, ") ");
+					}
+
+					if (tbinfo->needs_override)
+						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+					appendPQExpBufferStr(insertStmt, "VALUES ");
+				}
+			}
+
+			if (nfields != 0)
+				archputs(insertStmt->data, fout);
+
+			for (tuple = 0; tuple < ntuple ; tuple++)
+			{
+
+				/* if it is zero-column table then we're done */
+				if (nfields == 0)
+				{
+					archputs(insertStmt->data, fout);
+					continue;
+				}
+				if (tuple == 0)
+					archputs("(", fout);
+				else
+					archputs(", (", fout);
+
+				for (field = 0; field < nfields; field++)
+				{
+					if (field > 0)
+						archputs(", ", fout);
+					if (PQgetisnull(res, tuple, field))
+					{
+						archputs("NULL", fout);
+						continue;
+					}
+
+					/* XXX This code is partially duplicated in ruleutils.c */
+					switch (PQftype(res, field))
+					{
+						case INT2OID:
+						case INT4OID:
+						case INT8OID:
+						case OIDOID:
+						case FLOAT4OID:
+						case FLOAT8OID:
+						case NUMERICOID:
+							{
+								/*
+								 * These types are printed without quotes unless
+								 * they contain values that aren't accepted by the
+								 * scanner unquoted (e.g., 'NaN').  Note that
+								 * strtod() and friends might accept NaN, so we
+								 * can't use that to test.
+								 *
+								 * In reality we only need to defend against
+								 * infinity and NaN, so we need not get too crazy
+								 * about pattern matching here.
+								 */
+								const char *s = PQgetvalue(res, tuple, field);
+
+								if (strspn(s, "0123456789 +-eE.") == strlen(s))
+									archputs(s, fout);
+								else
+									archprintf(fout, "'%s'", s);
+							}
+							break;
+
+						case BITOID:
+						case VARBITOID:
+							archprintf(fout, "B'%s'",
+									   PQgetvalue(res, tuple, field));
+							break;
+
+						case BOOLOID:
+							if (strcmp(PQgetvalue(res, tuple, field), "t") == 0)
+								archputs("true", fout);
+							else
+								archputs("false", fout);
+							break;
+
+						default:
+							/* All other types are printed as string literals. */
+							resetPQExpBuffer(q);
+							appendStringLiteralAH(q,
+												  PQgetvalue(res, tuple, field),
+												  fout);
+							archputs(q->data, fout);
+							break;
+					}
+				}
+				if (tuple < ltuple)
+					archputs(")\n", fout);
+
+			}
+			if (!dopt->do_nothing && nfields != 0)
+				archputs(");\n", fout);
+			if (dopt->do_nothing && nfields != 0)
+				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+		}
+		if (PQntuples(res) <= 0)
+		{
+			PQclear(res);
+			break;
+		}
+		PQclear(res);
+	}
+
+	archputs("\n\n", fout);
+
+	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
+
+	destroyPQExpBuffer(q);
+	destroyPQExpBuffer(i);
+	if (insertStmt != NULL)
+		destroyPQExpBuffer(insertStmt);
+
+	return 1;
+}
+
 /*
  * getRootTableInfo:
  *     get the root TableInfo for the given partition table.
@@ -2091,7 +2310,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	char	   *copyStmt;
 	const char *copyFrom;
 
-	if (!dopt->dump_inserts)
+	if (!dopt->dump_inserts && !dopt->dump_inserts_multiple)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
@@ -2118,6 +2337,12 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 						  fmtCopyColumnList(tbinfo, clistBuf));
 		copyStmt = copyBuf->data;
 	}
+	else if (dopt->dump_inserts_multiple)
+	{
+		/* Restore using multiple values INSERT */
+		dumpFn = dumpTableData_insert_multiple;
+		copyStmt = NULL;
+	}
 	else
 	{
 		/* Restore using INSERT */
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..e1ca5416b7 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --insert-multi or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --insert-multi or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(

#19

david.rowley@2ndquadrant.com

about 7 years ago

In reply to: Surafel Temesgen (#18)

Re: pg_dump multi VALUES INSERT

On Thu, 3 Jan 2019 at 01:50, Surafel Temesgen <surafel3000@gmail.com> wrote:

On Mon, Dec 31, 2018 at 12:38 PM David Rowley <david.rowley@2ndquadrant.com> wrote:

Just looking at the v5 patch, it seems not to handle 0 column tables correctly.

The attached patch contains a fix for it

+ /* if it is zero-column table then we're done */
+ if (nfields == 0)
+ {
+ archputs(insertStmt->data, fout);
+ continue;
+ }

So it looks like you're falling back on one INSERT per row for this case.
Given that this function is meant to be emitting 'dump_inserts_multiple'
rows per INSERT, I think the comment should give some details of why
we can't do multi-inserts, and explain the reason for it. "we're done"
is just not enough detail.

I've not looked at the rest of the patch.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#20

surafel3000@gmail.com

about 7 years ago

In reply to: David Rowley (#19)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Thu, Jan 3, 2019 at 1:38 AM David Rowley <david.rowley@2ndquadrant.com> wrote:

On Thu, 3 Jan 2019 at 01:50, Surafel Temesgen <surafel3000@gmail.com> wrote:

On Mon, Dec 31, 2018 at 12:38 PM David Rowley <david.rowley@2ndquadrant.com> wrote:

Just looking at the v5 patch, it seems not to handle 0 column tables correctly.

The attached patch contains a fix for it

+ /* if it is zero-column table then we're done */
+ if (nfields == 0)
+ {
+ archputs(insertStmt->data, fout);
+ continue;
+ }

So it looks like you're falling back on one INSERT per row for this case.
Given that this function is meant to be emitting 'dump_inserts_multiple'
rows per INSERT, I think the comment should give some details of why
we can't do multi-inserts, and explain the reason for it. "we're done"
is just not enough detail.

Right, the attached patch adds a more detailed comment.

regards
Surafel

Attachments:

multi_values_inserts_dum_v6.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2015410a42..ee94d1d293 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -775,6 +775,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--insert-multi</option></term>
+      <listitem>
+       <para>
+        Specify the number of values per <command>INSERT</command> command.
+        This will make the dump file smaller than <option>--inserts</option>
+        and it is faster to reload but lack per row data lost on error
+        instead entire affected insert statement data lost.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--load-via-partition-root</option></term>
       <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..73a243ecb0 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -144,6 +145,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 341b1a51f2..050ef89650 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -313,6 +313,7 @@ main(int argc, char **argv)
 	int			plainText = 0;
 	ArchiveFormat archiveFormat = archUnknown;
 	ArchiveMode archiveMode;
+	char       *p;
 
 	static DumpOptions dopt;
 
@@ -359,6 +360,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
 		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"insert-multi", required_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -557,6 +559,27 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* inserts values number */
+				errno = 0;
+				dopt.dump_inserts_multiple = strtol(optarg, &p, 10);
+				if (p == optarg || *p != '\0')
+				{
+					write_msg(NULL, "argument of --insert-multi must be a number\n");
+					exit_nicely(1);
+				}
+				if (errno == ERANGE)
+				{
+					write_msg(NULL, "argument of --insert-multi exceeds integer range.\n");
+					exit_nicely(1);
+				}
+				if (dopt.dump_inserts_multiple < 0)
+				{
+					write_msg(NULL, "argument of --insert-multi must be positive number\n");
+					exit_nicely(1);
+				}
+
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -607,8 +630,9 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --insert-multi or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -877,6 +901,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -967,6 +992,7 @@ help(const char *progname)
 	printf(_("  --exclude-table-data=TABLE   do NOT dump data for the named table(s)\n"));
 	printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
 	printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
+	printf(_("  --insert-multi               number of values per INSERT command\n"));
 	printf(_("  --load-via-partition-root    load partitions via the root table\n"));
 	printf(_("  --no-comments                do not dump comments\n"));
 	printf(_("  --no-publications            do not dump publications\n"));
@@ -2052,6 +2078,203 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	return 1;
 }
 
+/*
+ * Dump table data using multiple values INSERT commands.
+ */
+static int
+dumpTableData_insert_multiple(Archive *fout, void *dcontext)
+{
+	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
+	TableInfo  *tbinfo = tdinfo->tdtable;
+	DumpOptions *dopt = fout->dopt;
+	PQExpBuffer q = createPQExpBuffer();
+	PQExpBuffer i = createPQExpBuffer();
+	PQExpBuffer insertStmt = NULL;
+	PGresult   *res;
+	int			tuple;
+	int			nfields;
+	int			field;
+	int			ntuple;
+	int			ltuple;
+
+	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
+					  "SELECT * FROM ONLY %s",
+					  fmtQualifiedDumpable(tbinfo));
+	if (tdinfo->filtercond)
+		appendPQExpBuffer(q, " %s", tdinfo->filtercond);
+
+	ExecuteSqlStatement(fout, q->data);
+	appendPQExpBuffer(i, "FETCH %d FROM _pg_dump_cursor",
+					  dopt->dump_inserts_multiple);
+	while (1)
+	{
+		res = ExecuteSqlQuery(fout, i->data, PGRES_TUPLES_OK);
+		nfields = PQnfields(res);
+		ntuple = PQntuples(res);
+		ltuple = ntuple-1;
+		if (ntuple  > 0)
+		{
+			if (insertStmt == NULL)
+			{
+				TableInfo  *targettab;
+
+				insertStmt = createPQExpBuffer();
+
+				/*
+				 * When load-via-partition-root is set, get the root table
+				 * name for the partition table, so that we can reload data
+				 * through the root table.
+				 */
+				if (dopt->load_via_partition_root && tbinfo->ispartition)
+					targettab = getRootTableInfo(tbinfo);
+				else
+					targettab = tbinfo;
+
+				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+								  fmtQualifiedDumpable(targettab));
+
+				/* corner case for zero-column table */
+				if (nfields == 0)
+				{
+					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+				}
+				else
+				{
+					/* append the list of column names if required */
+					if (dopt->column_inserts)
+					{
+						appendPQExpBufferChar(insertStmt, '(');
+						for (field = 0; field < nfields; field++)
+						{
+							if (field > 0)
+								appendPQExpBufferStr(insertStmt, ", ");
+							appendPQExpBufferStr(insertStmt,
+												 fmtId(PQfname(res, field)));
+						}
+						appendPQExpBufferStr(insertStmt, ") ");
+					}
+
+					if (tbinfo->needs_override)
+						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+					appendPQExpBufferStr(insertStmt, "VALUES ");
+				}
+			}
+
+			if (nfields != 0)
+				archputs(insertStmt->data, fout);
+
+			for (tuple = 0; tuple < ntuple ; tuple++)
+			{
+
+				/*
+				 * if it is zero-column table there are no suitable multi values
+				 * insert statement for default values so we dump it using single
+				 * values insert statement.
+				 */
+				if (nfields == 0)
+				{
+					archputs(insertStmt->data, fout);
+					continue;
+				}
+				if (tuple == 0)
+					archputs("(", fout);
+				else
+					archputs(", (", fout);
+
+				for (field = 0; field < nfields; field++)
+				{
+					if (field > 0)
+						archputs(", ", fout);
+					if (PQgetisnull(res, tuple, field))
+					{
+						archputs("NULL", fout);
+						continue;
+					}
+
+					/* XXX This code is partially duplicated in ruleutils.c */
+					switch (PQftype(res, field))
+					{
+						case INT2OID:
+						case INT4OID:
+						case INT8OID:
+						case OIDOID:
+						case FLOAT4OID:
+						case FLOAT8OID:
+						case NUMERICOID:
+							{
+								/*
+								 * These types are printed without quotes unless
+								 * they contain values that aren't accepted by the
+								 * scanner unquoted (e.g., 'NaN').  Note that
+								 * strtod() and friends might accept NaN, so we
+								 * can't use that to test.
+								 *
+								 * In reality we only need to defend against
+								 * infinity and NaN, so we need not get too crazy
+								 * about pattern matching here.
+								 */
+								const char *s = PQgetvalue(res, tuple, field);
+
+								if (strspn(s, "0123456789 +-eE.") == strlen(s))
+									archputs(s, fout);
+								else
+									archprintf(fout, "'%s'", s);
+							}
+							break;
+
+						case BITOID:
+						case VARBITOID:
+							archprintf(fout, "B'%s'",
+									   PQgetvalue(res, tuple, field));
+							break;
+
+						case BOOLOID:
+							if (strcmp(PQgetvalue(res, tuple, field), "t") == 0)
+								archputs("true", fout);
+							else
+								archputs("false", fout);
+							break;
+
+						default:
+							/* All other types are printed as string literals. */
+							resetPQExpBuffer(q);
+							appendStringLiteralAH(q,
+												  PQgetvalue(res, tuple, field),
+												  fout);
+							archputs(q->data, fout);
+							break;
+					}
+				}
+				if (tuple < ltuple)
+					archputs(")\n", fout);
+
+			}
+			if (!dopt->do_nothing && nfields != 0)
+				archputs(");\n", fout);
+			if (dopt->do_nothing && nfields != 0)
+				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+		}
+		if (PQntuples(res) <= 0)
+		{
+			PQclear(res);
+			break;
+		}
+		PQclear(res);
+	}
+
+	archputs("\n\n", fout);
+
+	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
+
+	destroyPQExpBuffer(q);
+	destroyPQExpBuffer(i);
+	if (insertStmt != NULL)
+		destroyPQExpBuffer(insertStmt);
+
+	return 1;
+}
+
 /*
  * getRootTableInfo:
  *     get the root TableInfo for the given partition table.
@@ -2091,7 +2314,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	char	   *copyStmt;
 	const char *copyFrom;
 
-	if (!dopt->dump_inserts)
+	if (!dopt->dump_inserts && !dopt->dump_inserts_multiple)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
@@ -2118,6 +2341,12 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 						  fmtCopyColumnList(tbinfo, clistBuf));
 		copyStmt = copyBuf->data;
 	}
+	else if (dopt->dump_inserts_multiple)
+	{
+		/* Restore using multiple values INSERT */
+		dumpFn = dumpTableData_insert_multiple;
+		copyStmt = NULL;
+	}
 	else
 	{
 		/* Restore using INSERT */
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..e1ca5416b7 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --insert-multi or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --insert-multi or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(

#21

david.rowley@2ndquadrant.com

about 7 years ago

In reply to: Surafel Temesgen (#16)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Mon, 31 Dec 2018 at 18:58, Surafel Temesgen <surafel3000@gmail.com> wrote:

On Fri, Dec 28, 2018 at 6:46 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

At first I also tried to do it like that, but it seemed to me that the
function would become long and more complex.

Probably. But calling it with size 100 should result in the same behavior,
so it is really just an extension of the preceding one? Or am I missing
something?

Dumping table data with single-row INSERT statements and dumping it with a
user-specified number of rows per INSERT statement are different enough to
demand separate functions; they are not the same thing and should not be
implemented by the same function. Regarding code duplication, I think the
solution is to move the shared code into a separate function and call it
at the appropriate places.

I don't really buy this. I've just hacked up a version of
dumpTableData_insert() which supports a variable number of rows per
statement. It seems fairly clean and easy to me. Likely the fact that
this is very possible greatly increases the chances of this getting in
since it gets rid of the code duplication. I did also happen to move
the row building code out of the function into its own function, but
that's not really required. I just did that so I could see all the
code in charge of terminating each statement on my screen without
having to scroll. I've not touched any of the plumbing work to plug
the rows_per_statement variable into the command line argument. So
it'll need a bit of merge work with the existing patch. There will
need to be some code that ensures that the user does not attempt to
have 0 rows per statement. The code I wrote won't behave well if that
happens.

... Checks existing patch ...

I see you added a test, but missed checking for 0. That needs to be fixed.

+ if (dopt.dump_inserts_multiple < 0)
+ {
+ write_msg(NULL, "argument of --insert-multi must be positive number\n");
+ exit_nicely(1);
+ }
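A minimal sketch of the stricter check being asked for here, written as a standalone helper. The name parse_rows_per_insert() and its return convention are assumptions made for illustration and do not appear in either patch; only the strtol()/errno pattern is taken from the quoted option handling. The one behavioural difference from the v6 code is that zero is rejected along with negative values.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * parse_rows_per_insert (hypothetical helper, not part of pg_dump)
 *		Parse 'arg' as a base-10 integer in the range 1..INT_MAX.
 *		Returns 1 and stores the value in *result on success,
 *		returns 0 on any error and leaves *result untouched.
 */
int
parse_rows_per_insert(const char *arg, int *result)
{
	char	   *endptr;
	long		val;

	assert(arg != NULL && result != NULL);

	errno = 0;
	val = strtol(arg, &endptr, 10);

	/* no digits at all, or trailing garbage after the number */
	if (endptr == arg || *endptr != '\0')
		return 0;

	/* out of range for long, or too large to fit in an int */
	if (errno == ERANGE || val > INT_MAX)
		return 0;

	/* zero and negative values are both rejected */
	if (val <= 0)
		return 0;

	*result = (int) val;
	return 1;
}
```

With this shape the caller only has to print one error message when the helper returns 0, and the row-building loop never sees a rows-per-statement value of 0.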

I also didn't adopt passing the rows-per-statement into the FETCH. I
think that's a very bad idea and we should keep that strictly at 100.
I don't see any reason to tie the two together. If a user wants 10
rows per statement, do we really want to FETCH 10 times more often?
And what happens when they want 1 million rows per statement? We've no
reason to run out of memory from this since we're just dumping the
rows out to the archive on each row.
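To illustrate why the two numbers need not be tied together, here is a small simulation of the loop structure, not pg_dump code. The function count_statements() and the FETCH_SIZE macro are hypothetical names; the sketch only counts fetches and statements instead of talking to a server, and it does not count the final empty fetch that ends the real loop.

```c
#include <assert.h>

#define FETCH_SIZE 100			/* fixed cursor batch size, independent of the option */

/*
 * count_statements (hypothetical, for illustration only)
 *		Walk 'total_rows' rows in batches of FETCH_SIZE, the way a cursor
 *		loop would, and return how many INSERT statements are emitted when
 *		each statement holds at most 'rows_per_statement' rows.  The number
 *		of fetches that returned rows is stored in *nfetches.
 */
long
count_statements(long total_rows, int rows_per_statement, long *nfetches)
{
	long		remaining = total_rows;
	long		statements = 0;
	int			rows_this_statement = 0;

	assert(rows_per_statement > 0);
	*nfetches = 0;

	while (remaining > 0)
	{
		/* one "FETCH 100": the batch size never depends on the option */
		long		batch = remaining < FETCH_SIZE ? remaining : FETCH_SIZE;
		long		tuple;

		(*nfetches)++;

		for (tuple = 0; tuple < batch; tuple++)
		{
			rows_this_statement++;

			/* statement is full: terminate it and reset the counter */
			if (rows_this_statement == rows_per_statement)
			{
				statements++;
				rows_this_statement = 0;
			}
		}
		remaining -= batch;
	}

	/* terminate any partially written statement */
	if (rows_this_statement > 0)
		statements++;

	return statements;
}
```

The row counter lives outside the fetch loop, so a statement can span several fetches and a fetch can span several statements. Memory use is bounded by the fetch size either way, because each row is written out as soon as it is formatted.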

+        Specify the number of values per <command>INSERT</command> command.
+        This will make the dump file smaller than <option>--inserts</option>
+        and it is faster to reload but lack per row data lost on error
+        instead entire affected insert statement data lost.

Unsure what you mean about "data lost". It also controls the number
of "rows" per <command>INSERT</command> statement, not the number of
values.

I think it would be fine just to say:

+        When using <option>--inserts</option>, this allows the maximum number
+        of rows per <command>INSERT</command> statement to be specified.
+        This setting defaults to 1.

I've used "maximum" there as the final statement on each table can
have less and also 0-column tables will always be 1 row per statement.

2. Is --insert-multi a good name? What if they do --insert-multi=1?
That's not very "multi". Is --rows-per-insert better?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

multi-row-inserts_for_pg_dump_drowley.patch (application/octet-stream)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0e129f9654..ed0076573e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1866,6 +1866,91 @@ dumpTableData_copy(Archive *fout, void *dcontext)
 	return 1;
 }
 
+/*
+ * putArchiveRow
+ *		Write the 'tuple'th row from 'res' to 'fout'.
+ *		'nfields' must match PQnfields(res) and not be 0.
+ *
+ * 'buf' must be a pre-allocated PQExpBuffer which is used as a temporary
+ * buffer in the function. The contents of which should not be expected to
+ * remain after this function returns.
+ */
+static void
+putArchiveRow(Archive *fout, PQExpBuffer buf, PGresult *res, int tuple,
+			  int nfields)
+{
+	int			field;
+
+	Assert(nfields > 0);
+
+	archputs("(", fout);
+
+	for (field = 0; field < nfields; field++)
+	{
+		if (field > 0)
+			archputs(", ", fout);
+		if (PQgetisnull(res, tuple, field))
+		{
+			archputs("NULL", fout);
+			continue;
+		}
+
+		/* XXX This code is partially duplicated in ruleutils.c */
+		switch (PQftype(res, field))
+		{
+			case INT2OID:
+			case INT4OID:
+			case INT8OID:
+			case OIDOID:
+			case FLOAT4OID:
+			case FLOAT8OID:
+			case NUMERICOID:
+				{
+					/*
+					 * These types are printed without quotes unless they
+					 * contain values that aren't accepted by the scanner
+					 * unquoted (e.g., 'NaN').  Note that strtod() and friends
+					 * might accept NaN, so we can't use that to test.
+					 *
+					 * In reality we only need to defend against infinity and
+					 * NaN, so we need not get too crazy about pattern
+					 * matching here.
+					 */
+					const char *s = PQgetvalue(res, tuple, field);
+
+					if (strspn(s, "0123456789 +-eE.") == strlen(s))
+						archputs(s, fout);
+					else
+						archprintf(fout, "'%s'", s);
+				}
+				break;
+
+			case BITOID:
+			case VARBITOID:
+				archprintf(fout, "B'%s'",
+						   PQgetvalue(res, tuple, field));
+				break;
+
+			case BOOLOID:
+				if (strcmp(PQgetvalue(res, tuple, field), "t") == 0)
+					archputs("true", fout);
+				else
+					archputs("false", fout);
+				break;
+
+			default:
+				/* All other types are printed as string literals. */
+				resetPQExpBuffer(buf);
+				appendStringLiteralAH(buf,
+									  PQgetvalue(res, tuple, field),
+									  fout);
+				archputs(buf->data, fout);
+				break;
+		}
+	}
+	archputs(")", fout);
+}
+
 /*
  * Dump table data using INSERT commands.
  *
@@ -1886,6 +1971,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int			rows_per_statement = 3;
+	int			rows_this_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,137 +1987,97 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
-		for (tuple = 0; tuple < PQntuples(res); tuple++)
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will be a
+		 * complete statement, otherwise it will end in "VALUES " and be ready
+		 * to have the row's column values appended.
+		 */
+		if (insertStmt == NULL)
 		{
-			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
-			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
+			TableInfo  *targettab;
 
-				insertStmt = createPQExpBuffer();
+			insertStmt = createPQExpBuffer();
 
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
+			/*
+			 * When load-via-partition-root is set, get the root table name
+			 * for the partition table, so that we can reload data through the
+			 * root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
 
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
 
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
 				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
 					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
 					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
 
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
 
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
+				appendPQExpBufferStr(insertStmt, "VALUES ");
 			}
+		}
 
-			archputs(insertStmt->data, fout);
+		for (tuple = 0; tuple < PQntuples(res); tuple++)
+		{
+			/*
+			 * If we've not written the initial part of the statement yet then
+			 * do so now.
+			 */
+			if (rows_this_statement == 0)
+				archputs(insertStmt->data, fout);
 
-			/* if it is zero-column table then we're done */
+			/*
+			 * If it is zero-column table then we've written all we need to.
+			 * We're unable to do multi-inserts for this case due to lack of a
+			 * valid syntax, so continue to use single row statements
+			 */
 			if (nfields == 0)
 				continue;
 
-			for (field = 0; field < nfields; field++)
-			{
-				if (field > 0)
-					archputs(", ", fout);
-				if (PQgetisnull(res, tuple, field))
-				{
-					archputs("NULL", fout);
-					continue;
-				}
-
-				/* XXX This code is partially duplicated in ruleutils.c */
-				switch (PQftype(res, field))
-				{
-					case INT2OID:
-					case INT4OID:
-					case INT8OID:
-					case OIDOID:
-					case FLOAT4OID:
-					case FLOAT8OID:
-					case NUMERICOID:
-						{
-							/*
-							 * These types are printed without quotes unless
-							 * they contain values that aren't accepted by the
-							 * scanner unquoted (e.g., 'NaN').  Note that
-							 * strtod() and friends might accept NaN, so we
-							 * can't use that to test.
-							 *
-							 * In reality we only need to defend against
-							 * infinity and NaN, so we need not get too crazy
-							 * about pattern matching here.
-							 */
-							const char *s = PQgetvalue(res, tuple, field);
-
-							if (strspn(s, "0123456789 +-eE.") == strlen(s))
-								archputs(s, fout);
-							else
-								archprintf(fout, "'%s'", s);
-						}
-						break;
+			if (rows_this_statement > 0)
+				archputs(", ", fout);
 
-					case BITOID:
-					case VARBITOID:
-						archprintf(fout, "B'%s'",
-								   PQgetvalue(res, tuple, field));
-						break;
+			putArchiveRow(fout, q, res, tuple, nfields);
+			rows_this_statement++;
 
-					case BOOLOID:
-						if (strcmp(PQgetvalue(res, tuple, field), "t") == 0)
-							archputs("true", fout);
-						else
-							archputs("false", fout);
-						break;
-
-					default:
-						/* All other types are printed as string literals. */
-						resetPQExpBuffer(q);
-						appendStringLiteralAH(q,
-											  PQgetvalue(res, tuple, field),
-											  fout);
-						archputs(q->data, fout);
-						break;
-				}
+			/*
+			 * If we've put the target number of rows onto this statement then
+			 * we can terminate it now.
+			 */
+			if (rows_this_statement == rows_per_statement)
+			{
+				/* reset the row counter */
+				rows_this_statement = 0;
+				if (!dopt->do_nothing)
+					archputs(";\n", fout);
+				else
+					archputs(" ON CONFLICT DO NOTHING;\n", fout);
 			}
-
-			if (!dopt->do_nothing)
-				archputs(");\n", fout);
-			else
-				archputs(") ON CONFLICT DO NOTHING;\n", fout);
 		}
 
 		if (PQntuples(res) <= 0)
@@ -2041,6 +2088,16 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		PQclear(res);
 	}
 
+	/* terminate any partially written statement */
+	if (rows_this_statement > 0)
+	{
+		if (!dopt->do_nothing)
+			archputs(";\n", fout);
+		else
+			archputs(" ON CONFLICT DO NOTHING;\n", fout);
+	}
+
+
 	archputs("\n\n", fout);
 
 	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");

#22

alvherre@2ndquadrant.com

about 7 years ago

In reply to: David Rowley (#21)

Re: pg_dump multi VALUES INSERT

FWIW you can insert multiple zero-column rows with "insert into ..
select union all select union all select".
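
A minimal sketch of how such a statement could be assembled, assuming a hypothetical helper named build_zero_column_insert() that is not part of pg_dump or of any patch in this thread. It only illustrates the "select union all select" form mentioned above for a zero-column table:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not pg_dump code): write a single INSERT that adds
 * nrows rows to a zero-column table, using the SELECT ... UNION ALL form.
 * Returns 0 on success, -1 if the arguments are unusable or the buffer is
 * too small.
 */
static int
build_zero_column_insert(char *buf, size_t bufsize,
                         const char *table, int nrows)
{
    size_t used;
    int    n;
    int    i;

    assert(buf != NULL && table != NULL);

    if (nrows < 1 || strlen(table) == 0)
        return -1;

    /* the first row is a bare SELECT with an empty target list */
    n = snprintf(buf, bufsize, "INSERT INTO %s SELECT", table);
    if (n < 0 || (size_t) n >= bufsize)
        return -1;
    used = (size_t) n;

    /* every further row is appended with UNION ALL SELECT */
    for (i = 1; i < nrows; i++)
    {
        n = snprintf(buf + used, bufsize - used, " UNION ALL SELECT");
        if (n < 0 || (size_t) n >= bufsize - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(buf + used, bufsize - used, ";\n");
    if (n < 0 || (size_t) n >= bufsize - used)
        return -1;

    return 0;
}
```

Called with the table name "public.t" and nrows = 3, this writes one statement containing three SELECT arms joined by UNION ALL.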

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#23

surafel3000@gmail.com

almost 7 years ago

In reply to: David Rowley (#21)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Fri, Jan 4, 2019 at 3:08 PM David Rowley <david.rowley@2ndquadrant.com> wrote:

On Mon, 31 Dec 2018 at 18:58, Surafel Temesgen <surafel3000@gmail.com> wrote:

On Fri, Dec 28, 2018 at 6:46 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

At first I also tried to do it like that, but it seemed to me that the
function would become long and more complex.

Probably. But calling it with size 100 should result in the same behavior,
so it is really just an extension of the preceding one? Or am I missing
something?

Dumping table data with single-row INSERT statements and dumping it with a
user-specified number of rows per INSERT statement differ enough to demand
separate functions; they are not the same thing and should not be
implemented in the same function. Regarding code duplication, I think the
solution is to move the shared code into a separate function and call it at
the appropriate places.

I don't really buy this. I've just hacked up a version of
dumpTableData_insert() which supports a variable number rows per
statement. It seems fairly clean and easy to me. Likely the fact that
this is very possible greatly increases the chances of this getting in
since it gets rid of the code duplication. I did also happen to move
the row building code out of the function into its own function, but
that's not really required. I just did that so I could see all the
code in charge of terminating each statement on my screen without
having to scroll. I've not touched any of the plumbing work to plug
the rows_per_statement variable into the command line argument. So
it'll need a bit of merge work with the existing patch. There will
need to be some code that ensures that the user does not attempt to
have 0 rows per statement. The code I wrote won't behave well if that
happens.
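
A minimal, self-contained sketch of the counter logic described above. The names OutBuf, out_puts() and dump_rows() are stand-ins invented for illustration (the real code writes to the archive with archputs()), and rows are reduced to single integers:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Small fixed-size buffer standing in for the archive (an assumption). */
typedef struct
{
    char   data[1024];
    size_t len;
} OutBuf;

/* Append a string; output is silently truncated if the buffer is full. */
static void
out_puts(OutBuf *out, const char *s)
{
    size_t n = strlen(s);
    size_t room = sizeof(out->data) - 1 - out->len;

    if (n > room)
        n = room;
    memcpy(out->data + out->len, s, n);
    out->len += n;
    out->data[out->len] = '\0';
}

/* Write the values as INSERTs holding at most rows_per_statement rows. */
static void
dump_rows(OutBuf *out, const char *insert_prefix,
          const int *values, int nvalues, int rows_per_statement)
{
    int rows_this_statement = 0;
    int i;

    /* with 0 the counter never matches and no statement ends in the loop */
    assert(rows_per_statement > 0);

    for (i = 0; i < nvalues; i++)
    {
        char row[32];

        /* start a new statement, or separate from the previous row */
        if (rows_this_statement == 0)
            out_puts(out, insert_prefix);
        else
            out_puts(out, ", ");

        snprintf(row, sizeof(row), "(%d)", values[i]);
        out_puts(out, row);
        rows_this_statement++;

        /* terminate the statement once it holds the target row count */
        if (rows_this_statement == rows_per_statement)
        {
            out_puts(out, ";\n");
            rows_this_statement = 0;
        }
    }

    /* terminate any partially written statement */
    if (rows_this_statement > 0)
        out_puts(out, ";\n");
}
```

With nine values and rows_per_statement = 4 this produces two four-row statements followed by a one-row statement. The assertion reflects the remark above that 0 rows per statement has to be rejected before this loop is reached.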

The attached patch mostly uses your method.

... Checks existing patch ...

I see you added a test, but missed checking for 0. That needs to be fixed.
+ if (dopt.dump_inserts_multiple < 0)
+ {
+ write_msg(NULL, "argument of --insert-multi must be positive number\n");
+ exit_nicely(1);
+ }

fixed
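
For illustration only, the argument check could be sketched as a standalone helper. The name parse_rows_per_insert() is hypothetical and the range handling is an assumption, not what any version of the patch does: the value is kept in a long until it is known to fit in an int.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a --rows-per-insert argument.
 * Accepts only a whole decimal number in the range 1..INT_MAX and
 * stores it in *result; returns false for anything else.
 */
static bool
parse_rows_per_insert(const char *arg, int *result)
{
    char *end;
    long  val;

    assert(arg != NULL && result != NULL);

    errno = 0;
    val = strtol(arg, &end, 10);

    if (end == arg || *end != '\0')
        return false;           /* empty string or trailing garbage */
    if (errno == ERANGE || val > INT_MAX)
        return false;           /* does not fit in an int */
    if (val <= 0)
        return false;           /* zero and negative counts are rejected */

    *result = (int) val;
    return true;
}
```

The point of the `val <= 0` test is the one raised above: a check for `< 0` alone lets 0 through.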

I also didn't adopt passing the rows-per-statement into the FETCH. I
think that's a very bad idea and we should keep that strictly at 100.
I don't see any reason to tie the two together. If a user wants 10
rows per statement, do we really want to FETCH 10 times more often?
And what happens when they want 1 million rows per statement? We've no
reason to run out of memory from this since we're just dumping the
rows out to the archive on each row.
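
As a rough worked example of the arithmetic behind this argument, the two counts can be computed independently. The helper names below are made up for illustration, and only fetches that actually return rows are counted:

```c
#include <assert.h>

/* Number of fetches that return rows for a table holding nrows rows. */
static long
fetches_needed(long nrows, long fetch_size)
{
    assert(nrows >= 0 && fetch_size > 0);
    return (nrows + fetch_size - 1) / fetch_size;   /* ceiling division */
}

/* Number of INSERT statements written for the same table. */
static long
statements_needed(long nrows, long rows_per_insert)
{
    assert(nrows >= 0 && rows_per_insert > 0);
    return (nrows + rows_per_insert - 1) / rows_per_insert;
}
```

For a table of one million rows, a fixed FETCH size of 100 needs 10,000 fetches whatever the statement size is, whereas tying the FETCH size to 10 rows per statement would need 100,000.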

okay

+        Specify the number of values per <command>INSERT</command> command.
+        This will make the dump file smaller than <option>--inserts</option>
+        and it is faster to reload but lack per row data lost on error
+        instead entire affected insert statement data lost.

Unsure what you mean about "data lost". It also controls the number
of "rows" per <command>INSERT</command> statement, not the number of
values.

I think it would be fine just to say:

+        When using <option>--inserts</option>, this allows the maximum number
+        of rows per <command>INSERT</command> statement to be specified.
+        This setting defaults to 1.

I changed it too, except for "This setting defaults to 1", because it
doesn't have a default value.
1 row per statement means the --inserts option.

2. Is --insert-multi a good name? What if they do --insert-multi=1?
That's not very "multi". Is --rows-per-insert better?

--rows-per-insert is better for me.

regards
Surafel

Attachments:

multi_values_inserts_dum_v7.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..4195fb81a2 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -775,6 +775,16 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert</option></term>
+      <listitem>
+       <para>
+        When using <option>--rows-per-insert</option>, this allows the maximum number
+        of rows per <command>INSERT</command> statement to be specified.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--load-via-partition-root</option></term>
       <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..73a243ecb0 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -144,6 +145,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0e129f9654..e49e2206e7 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -313,6 +313,7 @@ main(int argc, char **argv)
 	int			plainText = 0;
 	ArchiveFormat archiveFormat = archUnknown;
 	ArchiveMode archiveMode;
+	char       *p;
 
 	static DumpOptions dopt;
 
@@ -359,6 +360,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
 		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"rows-per-insert", required_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -557,6 +559,27 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* inserts row number */
+				errno = 0;
+				dopt.dump_inserts_multiple = strtol(optarg, &p, 10);
+				if (p == optarg || *p != '\0')
+				{
+					write_msg(NULL, "argument of --rows-per-insert must be a number\n");
+					exit_nicely(1);
+				}
+				if (errno == ERANGE)
+				{
+					write_msg(NULL, "argument of --rows-per-insert exceeds integer range.\n");
+					exit_nicely(1);
+				}
+				if (dopt.dump_inserts_multiple <= 0)
+				{
+					write_msg(NULL, "argument of --rows-per-insert must be positive number\n");
+					exit_nicely(1);
+				}
+
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -607,8 +630,9 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -877,6 +901,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -967,6 +992,7 @@ help(const char *progname)
 	printf(_("  --exclude-table-data=TABLE   do NOT dump data for the named table(s)\n"));
 	printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
 	printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
+	printf(_("  --rows-per-insert            number of row per INSERT command\n"));
 	printf(_("  --load-via-partition-root    load partitions via the root table\n"));
 	printf(_("  --no-comments                do not dump comments\n"));
 	printf(_("  --no-publications            do not dump publications\n"));
@@ -1886,6 +1912,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int number_of_row = 1;
+	int end_of_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,67 +1928,82 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES(" and
+		 * be ready to have the row's column values appended.
+		 */
+		if (insertStmt == NULL)
+		{
+			TableInfo  *targettab;
+
+			insertStmt = createPQExpBuffer();
+
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
+
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
+
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
+				{
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
+					{
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
+					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
+
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+				appendPQExpBufferStr(insertStmt, "VALUES ");
+			}
+		}
+
 		for (tuple = 0; tuple < PQntuples(res); tuple++)
 		{
-			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
-			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
-
-				insertStmt = createPQExpBuffer();
-
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
-
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
-
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
-				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
-					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
-					}
-
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
-
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
-			}
-
-			archputs(insertStmt->data, fout);
 
 			/* if it is zero-column table then we're done */
 			if (nfields == 0)
+			{
+				archputs(insertStmt->data, fout);
 				continue;
+			}
+
+			if (number_of_row == 1 || dopt->dump_inserts || end_of_statement)
+			{
+				archputs(insertStmt->data, fout);
+				archputs("(", fout);
+				end_of_statement = 0;
+			}
+
+			if (number_of_row > 1)
+			{
+				archputs(", ( ", fout);
+			}
 
 			for (field = 0; field < nfields; field++)
 			{
@@ -2027,12 +2070,40 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
+			if (dopt->dump_inserts)
+			{
 			if (!dopt->do_nothing)
 				archputs(");\n", fout);
 			else
 				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			}
+
+			if (dopt->dump_inserts_multiple)
+			{
+				if (number_of_row == dopt->dump_inserts_multiple)
+				{
+					number_of_row = 1;
+					end_of_statement = 1;
+					if (!dopt->do_nothing)
+						archputs(");\n", fout);
+					else
+						archputs(") ON CONFLICT DO NOTHING;\n", fout);
+				}
+				else
+				{
+					archputs(")\n", fout);
+					number_of_row++;
+				}
+			}
 		}
 
+		if (number_of_row > 1 && PQntuples(res) == 0)
+		{
+			if (!dopt->do_nothing)
+				archputs(";\n", fout);
+			else
+				archputs(" ON CONFLICT DO NOTHING;\n", fout);
+		}
 		if (PQntuples(res) <= 0)
 		{
 			PQclear(res);
@@ -2091,7 +2162,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	char	   *copyStmt;
 	const char *copyFrom;
 
-	if (!dopt->dump_inserts)
+	if (!dopt->dump_inserts && !dopt->dump_inserts_multiple)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(

#24

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Surafel Temesgen (#23)

Re: pg_dump multi VALUES INSERT

On Fri, 18 Jan 2019 at 01:15, Surafel Temesgen <surafel3000@gmail.com> wrote:

The attached patch mostly uses your method.

I disagree with the "mostly" part. As far as I can see, you took the
idea and then made a series of changes to completely break it. For
bonus points, you put back my comment change to make it incorrect
again.

Here's what I got after applying your latest patch:

$ pg_dump --table=t --inserts --rows-per-insert=4 postgres

[...]
INSERT INTO public.t VALUES (1);
)
INSERT INTO public.t VALUES (, ( 2);
)
INSERT INTO public.t VALUES (, ( 3);
)
INSERT INTO public.t VALUES (, ( 4);
);
INSERT INTO public.t VALUES (5);
)
INSERT INTO public.t VALUES (, ( 6);
)
INSERT INTO public.t VALUES (, ( 7);
)
INSERT INTO public.t VALUES (, ( 8);
);
INSERT INTO public.t VALUES (9);
)
;

I didn't test, but I'm pretty sure that's not valid INSERT syntax.

I'd suggest taking my changes and doing the plumbing work to tie the
rows_per_statement into the command line arg instead of how I left it
hardcoded as 3.

+        When using <option>--inserts</option>, this allows the maximum number
+        of rows per <command>INSERT</command> statement to be specified.
+        This setting defaults to 1.
I changed it too, except for "This setting defaults to 1", because it doesn't have a default value.
1 row per statement means the --inserts option.

If it does not default to 1 then what happens when the option is not
specified?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#25

David G. Johnston

david.g.johnston@gmail.com

almost 7 years ago

In reply to: Surafel Temesgen (#23)

Re: pg_dump multi VALUES INSERT

On Thu, Jan 17, 2019 at 5:15 AM Surafel Temesgen <surafel3000@gmail.com> wrote:

On Fri, Jan 4, 2019 at 3:08 PM David Rowley <david.rowley@2ndquadrant.com> wrote:

On Mon, 31 Dec 2018 at 18:58, Surafel Temesgen <surafel3000@gmail.com> wrote:

2. Is --insert-multi a good name? What if they do --insert-multi=1?
That's not very "multi". Is --rows-per-insert better?

--rows-per-insert is better for me.

Some thoughts/suggestions:

+ int dump_inserts_multiple;

The option name uses rows, seems like this should mirror that and be
named "dump_inserts_max_rows"

+     <varlistentry>
+      <term><option>--rows-per-insert</option></term>
+      <listitem>
+       <para>
+        When using <option>--rows-per-insert</option>, this allows the maximum number
+        of rows per <command>INSERT</command> statement to be specified.
+       </para>
+      </listitem>
+     </varlistentry>

"When using <repeat option name from 20 characters ago>..." - no other
option description uses this redundant language and this should not
either. Just say what it does.

This specifies the maximum number of rows (default 1) that will be
attached to each <command>INSERT</command> command generated by the
<option>--inserts</option> or <option>--column-inserts</option>
options; exactly one of which must be specified as well. (see my note
at the end)

+ printf(_(" --rows-per-insert number of row per INSERT command\n"));

(maximum?) number of row[s] per INSERT command

+ qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+ 'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');

You don't put spaces on both sides of the comma, just after; and add a
comma before the "or" (I think... notwithstanding the below)

I suggest we require that --rows-per-insert be dependent upon exactly
one of --inserts or --column-inserts being set and not let it be set
by itself (in which case the original message for
--on-conflict-do-nothing is OK).

David J.

#26

surafel3000@gmail.com

almost 7 years ago

In reply to: David Rowley (#24)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Fri, Jan 18, 2019 at 7:14 AM David Rowley <david.rowley@2ndquadrant.com> wrote:

On Fri, 18 Jan 2019 at 01:15, Surafel Temesgen <surafel3000@gmail.com> wrote:

The attached patch mostly uses your method.

I disagree with the "mostly" part. As far as I can see, you took the
idea and then made a series of changes to completely break it. For
bonus points, you put back my comment change to make it incorrect
again.

Here's what I got after applying your latest patch:

$ pg_dump --table=t --inserts --rows-per-insert=4 postgres

[...]
INSERT INTO public.t VALUES (1);
)
INSERT INTO public.t VALUES (, ( 2);
)
INSERT INTO public.t VALUES (, ( 3);
)
INSERT INTO public.t VALUES (, ( 4);
);
INSERT INTO public.t VALUES (5);
)
INSERT INTO public.t VALUES (, ( 6);
)
INSERT INTO public.t VALUES (, ( 7);
)
INSERT INTO public.t VALUES (, ( 8);
);
INSERT INTO public.t VALUES (9);
)
;

I didn't test, but I'm pretty sure that's not valid INSERT syntax.

This happens because I didn't disallow using the --inserts and
--rows-per-insert options together. It should error out in that case.
I corrected it in the attached patch.

I'd suggest taking my changes and doing the plumbing work to tie the
rows_per_statement into the command line arg instead of how I left it
hardcoded as 3.

+        When using <option>--inserts</option>, this allows the maximum number
+        of rows per <command>INSERT</command> statement to be specified.
+        This setting defaults to 1.

I changed it too, except for "This setting defaults to 1", because it
doesn't have a default value.
1 row per statement means the --inserts option.

If it does not default to 1 then what happens when the option is not
specified?

If the --inserts option is specified, it uses single-row INSERT
statements; otherwise it uses the COPY command.

regards
Surafel

Attachments:

multi_values_inserts_dum_v8.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..4195fb81a2 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -775,6 +775,16 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert</option></term>
+      <listitem>
+       <para>
+        When using <option>--rows-per-insert</option>, this allows the maximum number
+        of rows per <command>INSERT</command> statement to be specified.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--load-via-partition-root</option></term>
       <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..73a243ecb0 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -144,6 +145,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0e129f9654..7a2a9789f9 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -313,6 +313,7 @@ main(int argc, char **argv)
 	int			plainText = 0;
 	ArchiveFormat archiveFormat = archUnknown;
 	ArchiveMode archiveMode;
+	char       *p;
 
 	static DumpOptions dopt;
 
@@ -359,6 +360,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
 		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"rows-per-insert", required_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -557,6 +559,27 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* inserts row number */
+				errno = 0;
+				dopt.dump_inserts_multiple = strtol(optarg, &p, 10);
+				if (p == optarg || *p != '\0')
+				{
+					write_msg(NULL, "argument of --rows-per-insert must be a number\n");
+					exit_nicely(1);
+				}
+				if (errno == ERANGE)
+				{
+					write_msg(NULL, "argument of --rows-per-insert exceeds integer range.\n");
+					exit_nicely(1);
+				}
+				if (dopt.dump_inserts_multiple <= 0)
+				{
+					write_msg(NULL, "argument of --rows-per-insert must be positive number\n");
+					exit_nicely(1);
+				}
+
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -580,6 +603,12 @@ main(int argc, char **argv)
 		exit_nicely(1);
 	}
 
+	if (dopt.dump_inserts && dopt.dump_inserts_multiple)
+	{
+		write_msg(NULL, "options --inserts and --rows-per-insert cannot be used together\n");
+		exit_nicely(1);
+	}
+
 	/* --column-inserts implies --inserts */
 	if (dopt.column_inserts)
 		dopt.dump_inserts = 1;
@@ -607,8 +636,9 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -877,6 +907,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -967,6 +998,7 @@ help(const char *progname)
 	printf(_("  --exclude-table-data=TABLE   do NOT dump data for the named table(s)\n"));
 	printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
 	printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
+	printf(_("  --rows-per-insert            number of row per INSERT command\n"));
 	printf(_("  --load-via-partition-root    load partitions via the root table\n"));
 	printf(_("  --no-comments                do not dump comments\n"));
 	printf(_("  --no-publications            do not dump publications\n"));
@@ -1886,6 +1918,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int number_of_row = 1;
+	int end_of_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,67 +1934,86 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES(" and
+		 * be ready to have the row's column values appended.
+		 */
+		if (insertStmt == NULL)
+		{
+			TableInfo  *targettab;
+
+			insertStmt = createPQExpBuffer();
+
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
+
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
+
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
+				{
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
+					{
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
+					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
+
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+				appendPQExpBufferStr(insertStmt, "VALUES ");
+			}
+		}
+
 		for (tuple = 0; tuple < PQntuples(res); tuple++)
 		{
+
 			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
+			 * If it is zero-column table then we've written all we need to.
+			 * We're unable to do multi-inserts for this case due to lack of a
+			 * valid syntax, so continue to use single row statements
 			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
-
-				insertStmt = createPQExpBuffer();
-
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
-
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
-
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
-				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
-					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
-					}
-
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
-
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
-			}
-
-			archputs(insertStmt->data, fout);
-
-			/* if it is zero-column table then we're done */
 			if (nfields == 0)
+			{
+				archputs(insertStmt->data, fout);
 				continue;
+			}
+
+			if (number_of_row == 1 || dopt->dump_inserts || end_of_statement)
+			{
+				archputs(insertStmt->data, fout);
+				archputs("(", fout);
+				end_of_statement = 0;
+			}
+
+			if (number_of_row > 1)
+			{
+				archputs(", ( ", fout);
+			}
 
 			for (field = 0; field < nfields; field++)
 			{
@@ -2027,12 +2080,40 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
+			if (dopt->dump_inserts)
+			{
 			if (!dopt->do_nothing)
 				archputs(");\n", fout);
 			else
 				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			}
+
+			if (dopt->dump_inserts_multiple)
+			{
+				if (number_of_row == dopt->dump_inserts_multiple)
+				{
+					number_of_row = 1;
+					end_of_statement = 1;
+					if (!dopt->do_nothing)
+						archputs(");\n", fout);
+					else
+						archputs(") ON CONFLICT DO NOTHING;\n", fout);
+				}
+				else
+				{
+					archputs(")\n", fout);
+					number_of_row++;
+				}
+			}
 		}
 
+		if (number_of_row > 1 && PQntuples(res) == 0)
+		{
+			if (!dopt->do_nothing)
+				archputs(";\n", fout);
+			else
+				archputs(" ON CONFLICT DO NOTHING;\n", fout);
+		}
 		if (PQntuples(res) <= 0)
 		{
 			PQclear(res);
@@ -2091,7 +2172,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	char	   *copyStmt;
 	const char *copyFrom;
 
-	if (!dopt->dump_inserts)
+	if (!dopt->dump_inserts && !dopt->dump_inserts_multiple)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(

#27

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Surafel Temesgen (#26)

Re: pg_dump multi VALUES INSERT

On Fri, 18 Jan 2019 at 19:29, Surafel Temesgen <surafel3000@gmail.com> wrote:

This happens because I don't disallow using --inserts and --rows-per-insert
together. It should error out in that case. I have corrected it in the
attached patch.

I don't think it should be an error. It's not like the two options
conflict. I imagined that you'd need to specify you want --inserts and
optionally could control how many rows per statement that would be put
in those commands. I'd be surprised to be confronted with an error for
asking for that.

It might be worth doing the same as what we do if --column-inserts is
specified without --inserts. In this case we just do:

/* --column-inserts implies --inserts */
if (dopt.column_inserts)
dopt.dump_inserts = 1;

If you do it that way you'll not need to modify the code much from how
I wrote it. We can likely debate if we want --rows-per-insert to imply
--inserts once there's a working patch.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#28

surafel3000@gmail.com

almost 7 years ago

In reply to: David Rowley (#27)

Re: pg_dump multi VALUES INSERT

On Fri, Jan 18, 2019 at 2:29 PM David Rowley <david.rowley@2ndquadrant.com>
wrote:

On Fri, 18 Jan 2019 at 19:29, Surafel Temesgen <surafel3000@gmail.com> wrote:

This happens because I don't disallow using --inserts and --rows-per-insert
together. It should error out in that case. I have corrected it in the
attached patch.

I don't think it should be an error. It's not like the two options
conflict. I imagined that you'd need to specify you want --inserts and
optionally could control how many rows per statement that would be put
in those commands. I'd be surprised to be confronted with an error for
asking for that.

If you specify the --inserts option, you have already specified the number
of rows per statement, which is 1. If more than one row per statement is
needed, it must be specified using --rows-per-insert. Specifying one row per
statement with --inserts while at the same time specifying a different
number of rows per statement with --rows-per-insert seems conflicting to me.

It might be worth doing the same as what we do if --column-inserts is
specified without --inserts. In this case we just do:

/* --column-inserts implies --inserts */
if (dopt.column_inserts)
dopt.dump_inserts = 1;

If you do it that way you'll not need to modify the code much from how
I wrote it. We can likely debate if we want --rows-per-insert to imply
--inserts once there's a working patch.

Version 3 of the patch works in a similar way, except that it doesn't have
two options.

regards
Surafel

#29

David G. Johnston

david.g.johnston@gmail.com

almost 7 years ago

In reply to: Surafel Temesgen (#28)

Re: pg_dump multi VALUES INSERT

On Fri, Jan 18, 2019 at 5:02 AM Surafel Temesgen <surafel3000@gmail.com> wrote:

On Fri, Jan 18, 2019 at 2:29 PM David Rowley <david.rowley@2ndquadrant.com> wrote:

On Fri, 18 Jan 2019 at 19:29, Surafel Temesgen <surafel3000@gmail.com> wrote:

This happens because I don't disallow using --inserts and --rows-per-insert
together. It should error out in that case. I have corrected it in the
attached patch.

I don't think it should be an error. It's not like the two options
conflict. I imagined that you'd need to specify you want --inserts and
optionally could control how many rows per statement that would be put
in those commands. I'd be surprised to be confronted with an error for
asking for that.

If you specify the --inserts option, you have already specified the number
of rows per statement, which is 1. If more than one row per statement is
needed, it must be specified using --rows-per-insert. Specifying one row per
statement with --inserts while at the same time specifying a different
number of rows per statement with --rows-per-insert seems conflicting to me.

So, the other way of looking at it: why do we even need an entirely
new option? Modify --inserts to accept an optional integer value that
defaults to 1 (I'm not sure how tricky dealing with optional option
values is though...).

--column-inserts implies --inserts, but if you want to change the
number of rows you need to specify both (or add the same optional
integer to --column-inserts).

David J.

#30

David G. Johnston

david.g.johnston@gmail.com

almost 7 years ago

In reply to: Fabien COELHO (#13)

Re: pg_dump multi VALUES INSERT

On Tue, Dec 25, 2018 at 4:47 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

ISTM that command-line switches with optional arguments should be avoided:
This feature is seldom used (hmmm... 2 existing instances), because it
interferes with argument processing if such switches are used as the last
one.

Excellent point, though avoiding yet another limited-use option
seems like a fair trade-off here. Maybe we would also need to add
the traditional "--" option. I'm not married to the idea, but it's
also not like misinterpreting the final argument as an integer
instead of a database name is going to be a silent error.

David J.

#31

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Surafel Temesgen (#28)

Re: pg_dump multi VALUES INSERT

On Sat, 19 Jan 2019 at 01:01, Surafel Temesgen <surafel3000@gmail.com> wrote:

If you specify the --inserts option, you have already specified the number
of rows per statement, which is 1. If more than one row per statement is
needed, it must be specified using --rows-per-insert. Specifying one row per
statement with --inserts while at the same time specifying a different
number of rows per statement with --rows-per-insert seems conflicting to me.

So you're saying an INSERT, where you insert multiple rows in a single
statement is not an insert? That logic surprises me. --inserts makes
pg_dump use INSERTs rather than COPY. --rows-per-insert still uses
INSERTs. I'd love to know why you think there's some conflict with
that.

By your logic, you could say --column-inserts and --inserts should
also conflict, but they don't. --column-inserts happens to be coded to
imply --inserts. I really suggest we follow the lead from that. Doing
it this way reduces the complexity of the code where we build the
INSERT statement. Remember that a patch that is overly complex has
much less chance of making it. I'd really suggest you keep this as
simple as possible.

It also seems perfectly logical to me to default --rows-per-insert to
1 so that when --inserts is specified we do 1 row per INSERT. If the
user changes that value to something higher then nothing special needs
to happen as the function building the INSERT statement will always be
paying attention to whatever the --rows-per-insert value is set to.
That simplifies the logic meaning you don't need to check if --inserts
was specified.
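
To make the suggestion above concrete, here is a minimal sketch in C of a
row loop that only consults a rows-per-insert value. It is not the pg_dump
code: emit_inserts(), append_str() and the plain character buffer are
hypothetical stand-ins for dumpTableData_insert() and archputs(), and each
row is assumed to arrive as a preformatted value list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append s to out without overflowing outsz bytes. */
static void
append_str(char *out, size_t outsz, const char *s)
{
	size_t		used = strlen(out);

	if (used + 1 < outsz)
		strncat(out, s, outsz - used - 1);
}

/*
 * Hypothetical sketch, not pg_dump code: write nrows rows into "out" as
 * INSERT statements holding at most rows_per_insert rows each.  With
 * rows_per_insert = 1 this degenerates to one INSERT per row, so no
 * separate "was --inserts given?" branch is needed.
 * Returns the number of statements emitted.
 */
static int
emit_inserts(char *out, size_t outsz, const char *table,
			 const char **rows, int nrows, int rows_per_insert)
{
	int			rows_this_statement = 0;
	int			nstatements = 0;
	int			i;

	assert(rows_per_insert > 0);
	out[0] = '\0';

	for (i = 0; i < nrows; i++)
	{
		if (rows_this_statement == 0)
		{
			/* start a new statement */
			append_str(out, outsz, "INSERT INTO ");
			append_str(out, outsz, table);
			append_str(out, outsz, " VALUES ");
			nstatements++;
		}
		else
			append_str(out, outsz, ", ");

		append_str(out, outsz, "(");
		append_str(out, outsz, rows[i]);
		append_str(out, outsz, ")");

		/* close the statement once it is full */
		if (++rows_this_statement == rows_per_insert)
		{
			append_str(out, outsz, ";\n");
			rows_this_statement = 0;
		}
	}

	/* close a partially filled final statement */
	if (rows_this_statement > 0)
		append_str(out, outsz, ";\n");

	return nstatements;
}
```

Calling emit_inserts() with three rows and rows_per_insert = 2 yields one
two-row statement followed by a one-row statement; with rows_per_insert = 1
the same loop produces three single-row statements.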

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#32

surafel3000@gmail.com

almost 7 years ago

In reply to: David Rowley (#31)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Tue, Jan 22, 2019 at 3:35 PM David Rowley <david.rowley@2ndquadrant.com>
wrote:

On Sat, 19 Jan 2019 at 01:01, Surafel Temesgen <surafel3000@gmail.com> wrote:

If you specify the --inserts option, you have already specified the number
of rows per statement, which is 1. If more than one row per statement is
needed, it must be specified using --rows-per-insert. Specifying one row per
statement with --inserts while at the same time specifying a different
number of rows per statement with --rows-per-insert seems conflicting to me.

So you're saying an INSERT, where you insert multiple rows in a single
statement is not an insert? That logic surprises me. --inserts makes
pg_dump use INSERTs rather than COPY. --rows-per-insert still uses
INSERTs. I'd love to know why you think there's some conflict with
that.

By your logic, you could say --column-inserts and --inserts should
also conflict, but they don't. --column-inserts happens to be coded to
imply --inserts. I really suggest we follow the lead from that. Doing
it this way reduces the complexity of the code where we build the
INSERT statement. Remember that a patch that is overly complex has
much less chance of making it. I'd really suggest you keep this as
simple as possible.

Okay, I understand it now. Fabien also commented about it upthread; I
misunderstood it as using a separate new option.

It also seems perfectly logical to me to default --rows-per-insert to
1 so that when --inserts is specified we do 1 row per INSERT. If the
user changes that value to something higher then nothing special needs
to happen as the function building the INSERT statement will always be
paying attention to whatever the --rows-per-insert value is set to.
That simplifies the logic meaning you don't need to check if --inserts
was specified.

Okay, thank you for explaining. I attach a patch corrected accordingly.

Attachments:

multi_values_inserts_dum_v9.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..4195fb81a2 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -775,6 +775,16 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert</option></term>
+      <listitem>
+       <para>
+        When using <option>--rows-per-insert</option>, this allows the maximum number
+        of rows per <command>INSERT</command> statement to be specified.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--load-via-partition-root</option></term>
       <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..73a243ecb0 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -144,6 +145,7 @@ typedef struct _dumpOptions
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			dump_inserts_multiple;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2b1a94733b..e23f5cc70f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -313,6 +313,7 @@ main(int argc, char **argv)
 	int			plainText = 0;
 	ArchiveFormat archiveFormat = archUnknown;
 	ArchiveMode archiveMode;
+	char       *p;
 
 	static DumpOptions dopt;
 
@@ -359,6 +360,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
 		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"rows-per-insert", required_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -557,6 +559,27 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* inserts row number */
+				errno = 0;
+				dopt.dump_inserts_multiple = strtol(optarg, &p, 10);
+				if (p == optarg || *p != '\0')
+				{
+					write_msg(NULL, "argument of --rows-per-insert must be a number\n");
+					exit_nicely(1);
+				}
+				if (errno == ERANGE)
+				{
+					write_msg(NULL, "argument of --rows-per-insert exceeds integer range.\n");
+					exit_nicely(1);
+				}
+				if (dopt.dump_inserts_multiple <= 0)
+				{
+					write_msg(NULL, "argument of --rows-per-insert must be positive number\n");
+					exit_nicely(1);
+				}
+
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -584,6 +607,9 @@ main(int argc, char **argv)
 	if (dopt.column_inserts)
 		dopt.dump_inserts = 1;
 
+	if (dopt.dump_inserts && !dopt.dump_inserts_multiple)
+		dopt.dump_inserts_multiple = 1;
+
 	/*
 	 * Binary upgrade mode implies dumping sequence data even in schema-only
 	 * mode.  This is not exposed as a separate option, but kept separate
@@ -607,8 +633,9 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts ||
+		dopt.dump_inserts_multiple))
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -877,6 +904,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->dump_inserts_multiple = dopt.dump_inserts_multiple;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -967,6 +995,7 @@ help(const char *progname)
 	printf(_("  --exclude-table-data=TABLE   do NOT dump data for the named table(s)\n"));
 	printf(_("  --if-exists                  use IF EXISTS when dropping objects\n"));
 	printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
+	printf(_("  --rows-per-insert            number of row per INSERT command\n"));
 	printf(_("  --load-via-partition-root    load partitions via the root table\n"));
 	printf(_("  --no-comments                do not dump comments\n"));
 	printf(_("  --no-publications            do not dump publications\n"));
@@ -1886,6 +1915,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int number_of_row = 1;
+	int end_of_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,67 +1931,86 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES(" and
+		 * be ready to have the row's column values appended.
+		 */
+		if (insertStmt == NULL)
+		{
+			TableInfo  *targettab;
+
+			insertStmt = createPQExpBuffer();
+
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
+
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
+
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
+				{
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
+					{
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
+					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
+
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+				appendPQExpBufferStr(insertStmt, "VALUES ");
+			}
+		}
+
 		for (tuple = 0; tuple < PQntuples(res); tuple++)
 		{
+
 			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
+			 * If it is zero-column table then we've written all we need to.
+			 * We're unable to do multi-inserts for this case due to lack of a
+			 * valid syntax, so continue to use single row statements
 			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
-
-				insertStmt = createPQExpBuffer();
-
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
-
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
-
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
-				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
-					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
-					}
-
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
-
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
-			}
-
-			archputs(insertStmt->data, fout);
-
-			/* if it is zero-column table then we're done */
 			if (nfields == 0)
+			{
+				archputs(insertStmt->data, fout);
 				continue;
+			}
+
+			if (number_of_row == 1 || end_of_statement)
+			{
+				archputs(insertStmt->data, fout);
+				archputs("(", fout);
+				end_of_statement = 0;
+			}
+
+			if (number_of_row > 1)
+			{
+				archputs(", ( ", fout);
+			}
 
 			for (field = 0; field < nfields; field++)
 			{
@@ -2027,12 +2077,44 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
+			if (dopt->dump_inserts_multiple == 1)
+			{
 			if (!dopt->do_nothing)
 				archputs(");\n", fout);
 			else
 				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			}
+
+			if (dopt->dump_inserts_multiple !=1)
+			{
+				if (number_of_row == dopt->dump_inserts_multiple)
+				{
+					number_of_row = 1;
+					end_of_statement = 1;
+					if (!dopt->do_nothing)
+						archputs(");\n", fout);
+					else
+						archputs(") ON CONFLICT DO NOTHING;\n", fout);
+				}
+				else
+				{
+					archputs(")\n", fout);
+					number_of_row++;
+				}
+			}
 		}
 
+		/*
+		 * If number of tuples returned is less than specified rows count
+		 * we close the statment after last row.
+		 */
+		if (number_of_row > 1 && PQntuples(res) == 0)
+		{
+			if (!dopt->do_nothing)
+				archputs(";\n", fout);
+			else
+				archputs(" ON CONFLICT DO NOTHING;\n", fout);
+		}
 		if (PQntuples(res) <= 0)
 		{
 			PQclear(res);
@@ -2091,7 +2173,7 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	char	   *copyStmt;
 	const char *copyFrom;
 
-	if (!dopt->dump_inserts)
+	if (!dopt->dump_inserts && !dopt->dump_inserts_multiple)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(

#33

alvherre@2ndquadrant.com

almost 7 years ago

In reply to: Surafel Temesgen (#32)

Re: pg_dump multi VALUES INSERT

Nice stuff.

Is it possible to avoid the special case for 0 columns by using the
UNION ALL syntax I showed?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#34

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Surafel Temesgen (#32)

Re: pg_dump multi VALUES INSERT

Hello Surafel,

Okay, thank you for explaining. I attach a patch corrected accordingly.

About this v9: applies cleanly, compiles, global and local "make check"
ok.

The option is not exercised in the TAP tests. I'd suggest testing it on a
small table with zero rows, one row, and more rows than the configured
value. Maybe use -t and other options to reduce the output to the minimum.

About the documentation:

  +   When using <option>--rows-per-insert</option>, this allows the maximum number
  +   of rows per <command>INSERT</command> statement to be specified.

I'd suggest a more direct and simple style, something like:

Set the maximum number of rows per INSERT statement.
This option implies --inserts.
Default to 1.

About the help message, the new option expects an argument, but it does
not show:

+ printf(_(" --rows-per-insert number of row per INSERT command\n"));

About the code, maybe avoid using an int as a bool, eg:

... && !dopt.dump_inserts_multiple)
-> ... && dopt.dump_inserts_multiple == 0)

Spacing around operators, eg: "!=1)" -> "!= 1)"

ISTM that the "dump_inserts_multiple" field is useless, you can reuse
"dump_inserts" instead, i.e. --inserts sets it to 1 *if zero*, and
--rows-per-insert=XXX sets it to XXX. That would simplify the code
significantly.
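
The single-field idea can be sketched as follows. The struct and function
names (SketchOptions, set_inserts, set_rows_per_insert, uses_copy) are
hypothetical and only illustrate the proposal; they do not appear in the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of DumpOptions. */
typedef struct
{
	int			dump_inserts;	/* 0 = use COPY, N > 0 = N rows per INSERT */
} SketchOptions;

/* --inserts: ask for INSERTs, keeping any row count already chosen */
static void
set_inserts(SketchOptions *opt)
{
	if (opt->dump_inserts == 0)
		opt->dump_inserts = 1;
}

/* --rows-per-insert=N: the row count doubles as the "use INSERTs" flag */
static void
set_rows_per_insert(SketchOptions *opt, int nrows)
{
	assert(nrows > 0);
	opt->dump_inserts = nrows;
}

/* COPY is used only when neither option was given */
static bool
uses_copy(const SketchOptions *opt)
{
	return opt->dump_inserts == 0;
}
```

With this layout the result does not depend on the order in which the two
switches appear on the command line: set_inserts() never lowers a row count
that set_rows_per_insert() has already stored.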

ISTM that there are indentation issues, eg on added

if (dopt->dump_inserts_multiple == 1) {

The old code is not indented properly.

--
Fabien.

#35

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Alvaro Herrera (#33)

Re: pg_dump multi VALUES INSERT

On Wed, 23 Jan 2019 at 04:08, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Is it possible to avoid the special case for 0 columns by using the
UNION ALL syntax I showed?

It would be possible, but my thoughts are that we're moving away from
the SQL standard by doing so.

Looking at the standard I see:

<row value constructor element list> ::=
<row value constructor element> [ { <comma> <row value constructor element> }... ]

so it appears that multirow VALUES clauses are allowed.

INSERT INTO ... DEFAULT VALUES; is standard too, but looking at
SELECT, neither the target list nor the FROM clause is optional.

You could maybe argue that 0-column tables are not standard anyway.
Going by DROP COLUMN I see "4) C shall be a column of T and C shall
not be the only column of T.". Are we the only database to break that?

I think since pg_dump --inserts is meant to be for importing data into
other databases then we should likely keep it as standard as possible.

Another argument against is that we've only supported empty SELECT
clauses since 9.4, so it may not help anyone who mysteriously wanted
to import data into an old version. Maybe that's a corner case, but
I'm sure 0 column tables are too.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#36

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Surafel Temesgen (#32)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Wed, 23 Jan 2019 at 02:13, Surafel Temesgen <surafel3000@gmail.com> wrote:

Okay, thank you for explaining. I attach a patch corrected accordingly.

I did a bit of work to this to fix a bunch of things:

1. Docs for --rows-per-insert didn't mention anything about a parameter.
2. You'd not followed the alphabetical order of how the parameters are
documented.
3. Various parts of the docs claimed that --inserts just inserted 1
row per statement. Those needed to be updated.
4. New options out of order in --help. The rest were in alphabetical order.
5. DumpOptions struct variable was not in the right place. It was
grouped in with some parameterless options.
6. Code in dumpTableData_insert() was convoluted. Not sure what you
had added end_of_statement for or why you were checking PQntuples(res)
== 0. You'd also made the number_of_row variable 1-based and set it
to 1 when we had added 0 rows. You then checked for the existence of 1
row by checking the variable was > 1... That made very little sense to
me. I've pretty much put back the code that I had sent to you
previously, just without the part where I split the row building code
out into another function.
7. A comment in dumpTableData_insert() claimed that the insertStmt
would end in "VALUES(", but it'll end in "VALUES ". I had updated that
in my last version, but you must have missed that.
8. I've made it so --rows-per-insert implies --inserts. This is
aligned with how --column-inserts behaves.

I changed a few other things. I simplified the condition that raises
an error when someone does --on-conflict-do-nothing without the
--inserts option. There was no need to check for the other options
that imply --inserts since that will already be enabled if one of the
other options has.

I also removed most of the error checking you'd added to ensure that
the --rows-per-insert parameter was a number. I'd have kept this but
I saw that we do nothing that special for the compression option. I
didn't see why --rows-per-insert was any more special. It was quite a
bit of code for very little reward.
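
For comparison, the stricter validation that the earlier patch version
attempted can be written fairly compactly with strtol(). This is only a
sketch: parse_rows_per_insert() is a hypothetical helper, not a function in
pg_dump. Unlike the atoi() approach, which accepts an argument such as
"12abc" as 12, it rejects trailing garbage and out-of-range values.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse arg as a positive int.  On success store the
 * value in *result and return true; on any failure leave *result untouched
 * and return false.
 */
static bool
parse_rows_per_insert(const char *arg, int *result)
{
	char	   *end;
	long		val;

	assert(arg != NULL && result != NULL);

	errno = 0;
	val = strtol(arg, &end, 10);

	/* no digits at all, or trailing characters after the number */
	if (end == arg || *end != '\0')
		return false;

	/* does not fit in a long, or does not fit in an int */
	if (errno == ERANGE || val > INT_MAX)
		return false;

	/* zero and negative row counts make no sense */
	if (val <= 0)
		return false;

	*result = (int) val;
	return true;
}
```

Whether the extra checks are worth carrying in pg_dump is exactly the
judgment call described above; the sketch only shows what they would cost.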

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

pg_dump-rows-per-insert-option_v10.patch (application/octet-stream)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..bf10d012e4 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -661,9 +661,9 @@ PostgreSQL documentation
         ...</literal>).  This will make restoration very slow; it is mainly
         useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        However, since, by default this option generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.
        </para>
       </listitem>
      </varlistentry>
@@ -764,11 +764,10 @@ PostgreSQL documentation
         than <command>COPY</command>).  This will make restoration very slow;
         it is mainly useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
-        Note that
-        the restore might fail altogether if you have rearranged column order.
+        However, since this option, by default, generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.  Note that the restore
+        might fail altogether if you have rearranged column order.
         The <option>--column-inserts</option> option is safe against column
         order changes, though even slower.
        </para>
@@ -914,8 +913,9 @@ PostgreSQL documentation
        <para>
         Add <literal>ON CONFLICT DO NOTHING</literal> to
         <command>INSERT</command> commands.
-        This option is not valid unless <option>--inserts</option> or
-        <option>--column-inserts</option> is also specified.
+        This option is not valid unless <option>--inserts</option>,
+        <option>--column-inserts</option> or
+        <option>--rows-per-insert</option> is also specified.
        </para>
       </listitem>
      </varlistentry>
@@ -938,6 +938,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert=<replaceable class="parameter">nrows</replaceable></option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands (rather than
+        <command>COPY</command>).  Controls the maximum number of rows per
+        <command>INSERT</command> statement. The value specified must be a
+        number greater than zero.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
        <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
        <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..646b8ae3ba 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			rows_per_insert;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -140,6 +141,7 @@ typedef struct _dumpOptions
 	int			dumpSections;	/* bitmask of chosen sections */
 	bool		aclsSkip;
 	const char *lockWaitTimeout;
+	int			rows_per_insert;
 
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 2c2f6fb4a9..decb2846d2 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -169,6 +169,7 @@ InitDumpOptions(DumpOptions *opts)
 	/* set any fields that shouldn't default to zeroes */
 	opts->include_everything = true;
 	opts->dumpSections = DUMP_UNSECTIONED;
+	opts->rows_per_insert = 1;
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2b1a94733b..900fb9e0c4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -313,6 +313,7 @@ main(int argc, char **argv)
 	int			plainText = 0;
 	ArchiveFormat archiveFormat = archUnknown;
 	ArchiveMode archiveMode;
+	bool		got_rows_per_insert = false;
 
 	static DumpOptions dopt;
 
@@ -359,6 +360,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
 		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"rows-per-insert", required_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -557,6 +559,16 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* rows per insert */
+				dopt.rows_per_insert = atoi(optarg);
+				if (dopt.rows_per_insert <= 0)
+				{
+					write_msg(NULL, "rows-per-insert must be a positive number\n");
+					exit_nicely(1);
+				}
+				got_rows_per_insert = true;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -584,6 +596,10 @@ main(int argc, char **argv)
 	if (dopt.column_inserts)
 		dopt.dump_inserts = 1;
 
+	/* --rows-per-insert implies --inserts */
+	if (got_rows_per_insert)
+		dopt.dump_inserts = 1;
+
 	/*
 	 * Binary upgrade mode implies dumping sequence data even in schema-only
 	 * mode.  This is not exposed as a separate option, but kept separate
@@ -607,8 +623,12 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	/*
+	 * --inserts are already implied above if --column-inserts or
+	 * --rows-per-insert were specified.
+	 */
+	if (dopt.do_nothing && !dopt.dump_inserts)
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -877,6 +897,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->rows_per_insert = dopt.rows_per_insert;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -977,6 +998,7 @@ help(const char *progname)
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --on-conflict-do-nothing     add ON CONFLICT DO NOTHING to INSERT commands\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --rows-per-insert            number of row per INSERT command\n"));
 	printf(_("  --section=SECTION            dump named section (pre-data, data, or post-data)\n"));
 	printf(_("  --serializable-deferrable    wait until the dump can run without anomalies\n"));
 	printf(_("  --snapshot=SNAPSHOT          use given snapshot for the dump\n"));
@@ -1886,6 +1908,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int			rows_per_statement = dopt->rows_per_insert;
+	int			rows_this_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,68 +1924,86 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
-		for (tuple = 0; tuple < PQntuples(res); tuple++)
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES " and
+		 * be ready to have the row's column values printed.
+		 */
+		if (insertStmt == NULL)
 		{
-			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
-			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
+			TableInfo  *targettab;
 
-				insertStmt = createPQExpBuffer();
+			insertStmt = createPQExpBuffer();
 
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
 
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
 
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
 				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
 					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
 					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
 
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
 
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
+				appendPQExpBufferStr(insertStmt, "VALUES ");
 			}
+		}
 
-			archputs(insertStmt->data, fout);
+		for (tuple = 0; tuple < PQntuples(res); tuple++)
+		{
+			/* Write the INSERT if not in the middle of a multi-row INSERT. */
+			if (rows_this_statement == 0)
+				archputs(insertStmt->data, fout);
 
-			/* if it is zero-column table then we're done */
+
+			/*
+			 * If it is zero-column table then we've aleady written the
+			 * complete statement, which will mean we've disobeyed
+			 * --rows-per-insert when it's set greater than 1.  We do support
+			 * a way to make this multi-row with:
+			 * SELECT UNION ALL SELECT UNION ALL ... but that's non-standard
+			 * so likely we should avoid it given that using INSERTs is
+			 * mostly only ever needed for cross-database exports.
+			 */
 			if (nfields == 0)
 				continue;
 
+			if (rows_this_statement > 0)
+				archputs(", (", fout);
+			else
+				archputs("(", fout);
+
+
 			for (field = 0; field < nfields; field++)
 			{
 				if (field > 0)
@@ -2027,10 +2069,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
-			if (!dopt->do_nothing)
-				archputs(");\n", fout);
+			rows_this_statement++;
+
+			/*
+			 * If we've put the target number of rows onto this statement then
+			 * we can terminate it now.
+			 */
+			if (rows_this_statement == rows_per_statement)
+			{
+				/* Reset the row counter */
+				rows_this_statement = 0;
+				if (!dopt->do_nothing)
+					archputs(");\n", fout);
+				else
+					archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			}
 			else
-				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			{
+				/* Otherwise, get ready for the next row. */
+				archputs(")", fout);
+			}
+
 		}
 
 		if (PQntuples(res) <= 0)
@@ -2041,6 +2100,15 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		PQclear(res);
 	}
 
+	/* Terminate any statements that didn't make the row count.*/
+	if (rows_this_statement > 0)
+	{
+		if (!dopt->do_nothing)
+			archputs(";\n", fout);
+		else
+			archputs(" ON CONFLICT DO NOTHING;\n", fout);
+	}
+
 	archputs("\n\n", fout);
 
 	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(

#37

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: David Rowley (#36)

Re: pg_dump multi VALUES INSERT

Hello David & Surafel,

About this v10:

Patch applies and compiles cleanly, local & global "make check" ok.

A few comments, possibly redundant with some already in the thread.

Out of abc-order rows-per-insert option in getopt struct.

help string does not specify the expected argument.

I still think that the added rows_per_insert field is useless, ISTM that
"int dump_inserts" can be reused, and you could also drop boolean
got_rows_per_insert.

The feature is not tested anywhere. I still think that there should be a
test on empty/small/larger-than-rows-per-insert tables, possibly added to
existing TAP-tests.
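
As a rough illustration of the three cases worth covering (empty, smaller than, and larger than rows-per-insert), here is a minimal standalone sketch of the batching logic. It is not the patch code: emit_inserts(), append_str(), the table name "t" and the single integer column are all hypothetical, and the output goes to a caller-supplied buffer rather than through archputs().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append s to buf, asserting that it fits. */
static void
append_str(char *buf, size_t buflen, const char *s)
{
	assert(strlen(buf) + strlen(s) < buflen);
	strcat(buf, s);
}

/*
 * Hypothetical stand-in for the patched loop in dumpTableData_insert():
 * writes nrows single-column rows (values 1..nrows) into buf as INSERT
 * statements holding at most rows_per_insert rows each.  Returns the
 * number of statements written.
 */
static int
emit_inserts(char *buf, size_t buflen, int nrows, int rows_per_insert)
{
	int			rows_this_statement = 0;
	int			nstatements = 0;
	char		value[32];

	assert(buflen > 0 && rows_per_insert > 0);
	buf[0] = '\0';

	for (int row = 1; row <= nrows; row++)
	{
		/* Start a new statement unless we are in the middle of one. */
		if (rows_this_statement == 0)
		{
			append_str(buf, buflen, "INSERT INTO t VALUES ");
			nstatements++;
		}
		else
			append_str(buf, buflen, ", ");

		snprintf(value, sizeof(value), "(%d)", row);
		append_str(buf, buflen, value);

		/* Terminate the statement once it holds the target row count. */
		if (++rows_this_statement == rows_per_insert)
		{
			append_str(buf, buflen, ";\n");
			rows_this_statement = 0;
		}
	}

	/* Terminate a trailing statement that did not reach the row count. */
	if (rows_this_statement > 0)
		append_str(buf, buflen, ";\n");

	return nstatements;
}
```

With rows_per_insert = 3, an empty table should produce no statement at all, a 2-row table one short statement, and a 7-row table two full statements followed by a final one-row statement.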

--
Fabien.

#38

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Fabien COELHO (#37)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Wed, 23 Jan 2019 at 22:08, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Out of abc-order rows-per-insert option in getopt struct.

I missed that. Thanks. Fixed in attached.

help string does not specify the expected argument.

Also fixed.

I still think that the added rows_per_insert field is useless, ISTM that
"int dump_inserts" can be reused, and you could also drop boolean
got_rows_per_insert.

I thought about this and looked into it, but I decided it didn't look
like a smart thing to do. The reason is that if --inserts sets
dump_inserts to 1 then --rows-per-insert sets it to something else
that's likely fine, but if that happens in the opposite order then the
--rows-per-insert gets overwritten with 1. The bad news is the order
that happens is defined by the order of the command line args. It
might be possible to make it work by having --inserts set some other
variable, then set dump_inserts to 1 if it's set to 0 and the other
variable is set to >= 1... but that requires another variable, which
is what you want to avoid... I think it's best to have a variable per
argument.

I could get rid of the got_rows_per_insert variable, but it would
require setting the default value for rows_per_insert in the main()
function rather than in InitDumpOptions(). I thought
InitDumpOptions() looked like just the place to do this, so went with
that option. To make it work without got_rows_per_insert,
rows_per_insert would have to be 0 by default and we'd know we saw a
--rows-per-insert command line arg by the fact that rows_per_insert
was non-zero. Would you rather have it that way?

The feature is not tested anywhere. I still think that there should be a
test on empty/small/larger-than-rows-per-insert tables, possibly added to
existing TAP-tests.

I was hoping to get away with not having to do that... mainly because
I've no idea how. Please have at it if you know how it's done.

FWIW I looked at 002_pg_dump.pl and did add a test, but I was unable
to make it pass because I couldn't really figure out how the regex
matching is meant to work. It does not seem to be explained very well
in the comments and I lack patience for Perl.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

pg_dump-rows-per-insert-option_v11.patch (application/octet-stream)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..bf10d012e4 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -661,9 +661,9 @@ PostgreSQL documentation
         ...</literal>).  This will make restoration very slow; it is mainly
         useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        However, since, by default this option generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.
        </para>
       </listitem>
      </varlistentry>
@@ -764,11 +764,10 @@ PostgreSQL documentation
         than <command>COPY</command>).  This will make restoration very slow;
         it is mainly useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
-        Note that
-        the restore might fail altogether if you have rearranged column order.
+        However, since this option, by default, generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.  Note that the restore
+        might fail altogether if you have rearranged column order.
         The <option>--column-inserts</option> option is safe against column
         order changes, though even slower.
        </para>
@@ -914,8 +913,9 @@ PostgreSQL documentation
        <para>
         Add <literal>ON CONFLICT DO NOTHING</literal> to
         <command>INSERT</command> commands.
-        This option is not valid unless <option>--inserts</option> or
-        <option>--column-inserts</option> is also specified.
+        This option is not valid unless <option>--inserts</option>,
+        <option>--column-inserts</option> or
+        <option>--rows-per-insert</option> is also specified.
        </para>
       </listitem>
      </varlistentry>
@@ -938,6 +938,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert=<replaceable class="parameter">nrows</replaceable></option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands (rather than
+        <command>COPY</command>).  Controls the maximum number of rows per
+        <command>INSERT</command> statement. The value specified must be a
+        number greater than zero.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
        <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
        <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..646b8ae3ba 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -72,6 +72,7 @@ typedef struct _restoreOptions
 	int			dropSchema;
 	int			disable_dollar_quoting;
 	int			dump_inserts;
+	int			rows_per_insert;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;	/* Skip comments */
@@ -140,6 +141,7 @@ typedef struct _dumpOptions
 	int			dumpSections;	/* bitmask of chosen sections */
 	bool		aclsSkip;
 	const char *lockWaitTimeout;
+	int			rows_per_insert;
 
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 2c2f6fb4a9..decb2846d2 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -169,6 +169,7 @@ InitDumpOptions(DumpOptions *opts)
 	/* set any fields that shouldn't default to zeroes */
 	opts->include_everything = true;
 	opts->dumpSections = DUMP_UNSECTIONED;
+	opts->rows_per_insert = 1;
 }
 
 /*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2b1a94733b..65a54271a9 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -313,6 +313,7 @@ main(int argc, char **argv)
 	int			plainText = 0;
 	ArchiveFormat archiveFormat = archUnknown;
 	ArchiveMode archiveMode;
+	bool		got_rows_per_insert = false;
 
 	static DumpOptions dopt;
 
@@ -377,6 +378,7 @@ main(int argc, char **argv)
 		{"no-subscriptions", no_argument, &dopt.no_subscriptions, 1},
 		{"no-sync", no_argument, NULL, 7},
 		{"on-conflict-do-nothing", no_argument, &dopt.do_nothing, 1},
+		{"rows-per-insert", required_argument, NULL, 8},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -557,6 +559,16 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:			/* rows per insert */
+				dopt.rows_per_insert = atoi(optarg);
+				if (dopt.rows_per_insert <= 0)
+				{
+					write_msg(NULL, "rows-per-insert must be a positive number\n");
+					exit_nicely(1);
+				}
+				got_rows_per_insert = true;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -584,6 +596,10 @@ main(int argc, char **argv)
 	if (dopt.column_inserts)
 		dopt.dump_inserts = 1;
 
+	/* --rows-per-insert implies --inserts */
+	if (got_rows_per_insert)
+		dopt.dump_inserts = 1;
+
 	/*
 	 * Binary upgrade mode implies dumping sequence data even in schema-only
 	 * mode.  This is not exposed as a separate option, but kept separate
@@ -607,8 +623,12 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	/*
+	 * --inserts are already implied above if --column-inserts or
+	 * --rows-per-insert were specified.
+	 */
+	if (dopt.do_nothing && !dopt.dump_inserts)
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -877,6 +897,7 @@ main(int argc, char **argv)
 	ropt->use_setsessauth = dopt.use_setsessauth;
 	ropt->disable_dollar_quoting = dopt.disable_dollar_quoting;
 	ropt->dump_inserts = dopt.dump_inserts;
+	ropt->rows_per_insert = dopt.rows_per_insert;
 	ropt->no_comments = dopt.no_comments;
 	ropt->no_publications = dopt.no_publications;
 	ropt->no_security_labels = dopt.no_security_labels;
@@ -977,6 +998,7 @@ help(const char *progname)
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --on-conflict-do-nothing     add ON CONFLICT DO NOTHING to INSERT commands\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --rows-per-insert=NROWS      number of row per INSERT command\n"));
 	printf(_("  --section=SECTION            dump named section (pre-data, data, or post-data)\n"));
 	printf(_("  --serializable-deferrable    wait until the dump can run without anomalies\n"));
 	printf(_("  --snapshot=SNAPSHOT          use given snapshot for the dump\n"));
@@ -1886,6 +1908,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int			rows_per_statement = dopt->rows_per_insert;
+	int			rows_this_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,68 +1924,86 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
-		for (tuple = 0; tuple < PQntuples(res); tuple++)
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES " and
+		 * be ready to have the row's column values printed.
+		 */
+		if (insertStmt == NULL)
 		{
-			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
-			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
+			TableInfo  *targettab;
 
-				insertStmt = createPQExpBuffer();
+			insertStmt = createPQExpBuffer();
 
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
 
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
 
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
 				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
 					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
 					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
 
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
 
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
+				appendPQExpBufferStr(insertStmt, "VALUES ");
 			}
+		}
 
-			archputs(insertStmt->data, fout);
+		for (tuple = 0; tuple < PQntuples(res); tuple++)
+		{
+			/* Write the INSERT if not in the middle of a multi-row INSERT. */
+			if (rows_this_statement == 0)
+				archputs(insertStmt->data, fout);
 
-			/* if it is zero-column table then we're done */
+
+			/*
+			 * If it is zero-column table then we've aleady written the
+			 * complete statement, which will mean we've disobeyed
+			 * --rows-per-insert when it's set greater than 1.  We do support
+			 * a way to make this multi-row with:
+			 * SELECT UNION ALL SELECT UNION ALL ... but that's non-standard
+			 * so likely we should avoid it given that using INSERTs is
+			 * mostly only ever needed for cross-database exports.
+			 */
 			if (nfields == 0)
 				continue;
 
+			if (rows_this_statement > 0)
+				archputs(", (", fout);
+			else
+				archputs("(", fout);
+
+
 			for (field = 0; field < nfields; field++)
 			{
 				if (field > 0)
@@ -2027,10 +2069,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
-			if (!dopt->do_nothing)
-				archputs(");\n", fout);
+			rows_this_statement++;
+
+			/*
+			 * If we've put the target number of rows onto this statement then
+			 * we can terminate it now.
+			 */
+			if (rows_this_statement == rows_per_statement)
+			{
+				/* Reset the row counter */
+				rows_this_statement = 0;
+				if (!dopt->do_nothing)
+					archputs(");\n", fout);
+				else
+					archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			}
 			else
-				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			{
+				/* Otherwise, get ready for the next row. */
+				archputs(")", fout);
+			}
+
 		}
 
 		if (PQntuples(res) <= 0)
@@ -2041,6 +2100,15 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		PQclear(res);
 	}
 
+	/* Terminate any statements that didn't make the row count.*/
+	if (rows_this_statement > 0)
+	{
+		if (!dopt->do_nothing)
+			archputs(";\n", fout);
+		else
+			archputs(" ON CONFLICT DO NOTHING;\n", fout);
+	}
+
 	archputs("\n\n", fout);
 
 	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(

#39

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: David Rowley (#38)

Re: pg_dump multi VALUES INSERT

Hello David,

I thought about this and looked into it, but I decided it didn't look
like a smart thing to do. The reason is that if --inserts sets
dump_inserts to 1 then --rows-per-insert sets it to something else
that's likely fine, but if that happens in the opposite order then the
--rows-per-insert gets overwritten with 1.

You can test before doing that!

  case X:
    if (opt.dump_inserts == 0)
      opt.dump_inserts = 1;
    // otherwise option is already set

The bad news is the order that happens is defined by the order of the
command line args.

It might be possible to make it work by having --inserts set some other
variable,

ISTM that it is enough to test whether the variable is zero.

then set dump_inserts to 1 if it's set to 0 and the other variable is
set to >= 1... but that requires another variable, which is what you
want to avoid...

I still do not understand the need for another variable.

  int ninserts = 0; // default is to use copy
  while (getopt...)
  {
    switch (...) {
      case "--inserts":
        if (ninserts == 0) ninserts = 1;
        break;
      case "--rows-per-insert":
        ninserts = arg_value;
        checks...
        break;
      ...

I think it's best to have a variable per argument.

I disagree, because it adds complexity where none is needed: here the new
option is an extension of a previous one, so the previous one just becomes
a particular case. It seems simpler to manage it as the particular case it
is rather than as a special case, which would create the need for
consistency checks and so on if two variables are used.
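
To make the single-variable idea concrete, here is a minimal sketch using getopt_long(). It is only an illustration under assumptions: parse_dump_inserts() and the option codes 9 and 10 are hypothetical, not taken from any version of the patch, and atoi() is used for the conversion only because the posted patch does the same.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not from the patch: parse argv and return the
 * number of rows per INSERT (0 means "use COPY"), or -1 on invalid input.
 */
static int
parse_dump_inserts(int argc, char **argv)
{
	static const struct option long_options[] = {
		{"inserts", no_argument, NULL, 9},
		{"rows-per-insert", required_argument, NULL, 10},
		{NULL, 0, NULL, 0}
	};
	int			dump_inserts = 0;	/* 0 = COPY, otherwise rows per INSERT */
	int			c;

	assert(argc >= 1 && argv != NULL);
	optind = 1;					/* restart scanning on repeated calls */

	while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
	{
		switch (c)
		{
			case 9:				/* --inserts */
				/* Only set the default if no row count was given yet. */
				if (dump_inserts == 0)
					dump_inserts = 1;
				break;
			case 10:			/* --rows-per-insert */
				dump_inserts = atoi(optarg);
				if (dump_inserts <= 0)
					return -1;
				break;
			default:
				return -1;
		}
	}

	return dump_inserts;
}
```

Because --inserts only touches the variable while it is still zero, "--inserts --rows-per-insert=100" and "--rows-per-insert=100 --inserts" should both end up with 100, so the result does not depend on the order of the command line arguments.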

I could get rid of the got_rows_per_insert variable, but it would
require setting the default value for rows_per_insert in the main()
function rather than in InitDumpOptions(). I thought
InitDumpOptions() looked like just the place to do this, so went with
that option. To make it work without got_rows_per_insert,
rows_per_insert would have to be 0 by default and we'd know we saw a
--rows-per-insert command line arg by the fact that rows_per_insert
was non-zero. Would you rather have it that way?

Yep, especially as rows_per_insert and dump_inserts could be the same variable.

The feature is not tested anywhere. I still think that there should be a
test on empty/small/larger-than-rows-per-insert tables, possibly added to
existing TAP-tests.

I was hoping to get away with not having to do that... mainly because
I've no idea how.

Hmmm. That is another question! Maybe someone will help.

--
Fabien.

#40

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Fabien COELHO (#39)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Thu, 24 Jan 2019 at 04:45, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

I still do not understand the need for another variable.

  int ninserts = 0; // default is to use copy
  while (getopt...)
  {
    switch (...) {
      case "--inserts":
        if (ninserts == 0) ninserts = 1;
        break;
      case "--rows-per-insert":
        ninserts = arg_value;
        checks...
        break;
      ...

I didn't think of that. Attached is a version that changes it to work
along those lines.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

pg_dump-rows-per-insert-option_v12.patch (application/octet-stream)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..bf10d012e4 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -661,9 +661,9 @@ PostgreSQL documentation
         ...</literal>).  This will make restoration very slow; it is mainly
         useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        However, since, by default this option generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.
        </para>
       </listitem>
      </varlistentry>
@@ -764,11 +764,10 @@ PostgreSQL documentation
         than <command>COPY</command>).  This will make restoration very slow;
         it is mainly useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
-        Note that
-        the restore might fail altogether if you have rearranged column order.
+        However, since this option, by default, generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.  Note that the restore
+        might fail altogether if you have rearranged column order.
         The <option>--column-inserts</option> option is safe against column
         order changes, though even slower.
        </para>
@@ -914,8 +913,9 @@ PostgreSQL documentation
        <para>
         Add <literal>ON CONFLICT DO NOTHING</literal> to
         <command>INSERT</command> commands.
-        This option is not valid unless <option>--inserts</option> or
-        <option>--column-inserts</option> is also specified.
+        This option is not valid unless <option>--inserts</option>,
+        <option>--column-inserts</option> or
+        <option>--rows-per-insert</option> is also specified.
        </para>
       </listitem>
      </varlistentry>
@@ -938,6 +938,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert=<replaceable class="parameter">nrows</replaceable></option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands (rather than
+        <command>COPY</command>).  Controls the maximum number of rows per
+        <command>INSERT</command> statement. The value specified must be a
+        number greater than zero.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
        <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
        <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..7ab27391fb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -140,10 +140,10 @@ typedef struct _dumpOptions
 	int			dumpSections;	/* bitmask of chosen sections */
 	bool		aclsSkip;
 	const char *lockWaitTimeout;
+	int			dump_inserts;	/* 0 = COPY, otherwise rows per INSERT */
 
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
-	int			dump_inserts;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2b1a94733b..ba37f29c2e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -358,7 +358,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, &dopt.enable_row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", no_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -377,6 +377,7 @@ main(int argc, char **argv)
 		{"no-subscriptions", no_argument, &dopt.no_subscriptions, 1},
 		{"no-sync", no_argument, NULL, 7},
 		{"on-conflict-do-nothing", no_argument, &dopt.do_nothing, 1},
+		{"rows-per-insert", required_argument, NULL, 9},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -557,6 +558,24 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:				/* inserts */
+				/*
+				 * dump_inserts also stores --rows-per-insert, careful not to
+				 * overwrite that.
+				 */
+				if (dopt.dump_inserts == 0)
+					dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
+				break;
+
+			case 9:			/* rows per insert */
+				dopt.dump_inserts = atoi(optarg);
+				if (dopt.dump_inserts <= 0)
+				{
+					write_msg(NULL, "rows-per-insert must be a positive number\n");
+					exit_nicely(1);
+				}
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -581,8 +600,8 @@ main(int argc, char **argv)
 	}
 
 	/* --column-inserts implies --inserts */
-	if (dopt.column_inserts)
-		dopt.dump_inserts = 1;
+	if (dopt.column_inserts && dopt.dump_inserts == 0)
+		dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
 
 	/*
 	 * Binary upgrade mode implies dumping sequence data even in schema-only
@@ -607,8 +626,12 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	/*
+	 * --inserts are already implied above if --column-inserts or
+	 * --rows-per-insert were specified.
+	 */
+	if (dopt.do_nothing && dopt.dump_inserts == 0)
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -977,6 +1000,7 @@ help(const char *progname)
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --on-conflict-do-nothing     add ON CONFLICT DO NOTHING to INSERT commands\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --rows-per-insert=NROWS      number of row per INSERT command\n"));
 	printf(_("  --section=SECTION            dump named section (pre-data, data, or post-data)\n"));
 	printf(_("  --serializable-deferrable    wait until the dump can run without anomalies\n"));
 	printf(_("  --snapshot=SNAPSHOT          use given snapshot for the dump\n"));
@@ -1886,6 +1910,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int			rows_per_statement = dopt->dump_inserts;
+	int			rows_this_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,68 +1926,86 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
-		for (tuple = 0; tuple < PQntuples(res); tuple++)
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES " and
+		 * be ready to have the row's column values printed.
+		 */
+		if (insertStmt == NULL)
 		{
-			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
-			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
+			TableInfo  *targettab;
 
-				insertStmt = createPQExpBuffer();
+			insertStmt = createPQExpBuffer();
 
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
 
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
 
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
 				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
 					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
 					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
 
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
 
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
+				appendPQExpBufferStr(insertStmt, "VALUES ");
 			}
+		}
 
-			archputs(insertStmt->data, fout);
+		for (tuple = 0; tuple < PQntuples(res); tuple++)
+		{
+			/* Write the INSERT if not in the middle of a multi-row INSERT. */
+			if (rows_this_statement == 0)
+				archputs(insertStmt->data, fout);
 
-			/* if it is zero-column table then we're done */
+
+			/*
+			 * If it is zero-column table then we've aleady written the
+			 * complete statement, which will mean we've disobeyed
+			 * --rows-per-insert when it's set greater than 1.  We do support
+			 * a way to make this multi-row with:
+			 * SELECT UNION ALL SELECT UNION ALL ... but that's non-standard
+			 * so likely we should avoid it given that using INSERTs is
+			 * mostly only ever needed for cross-database exports.
+			 */
 			if (nfields == 0)
 				continue;
 
+			if (rows_this_statement > 0)
+				archputs(", (", fout);
+			else
+				archputs("(", fout);
+
+
 			for (field = 0; field < nfields; field++)
 			{
 				if (field > 0)
@@ -2027,10 +2071,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
-			if (!dopt->do_nothing)
-				archputs(");\n", fout);
+			rows_this_statement++;
+
+			/*
+			 * If we've put the target number of rows onto this statement then
+			 * we can terminate it now.
+			 */
+			if (rows_this_statement == rows_per_statement)
+			{
+				/* Reset the row counter */
+				rows_this_statement = 0;
+				if (!dopt->do_nothing)
+					archputs(");\n", fout);
+				else
+					archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			}
 			else
-				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			{
+				/* Otherwise, get ready for the next row. */
+				archputs(")", fout);
+			}
+
 		}
 
 		if (PQntuples(res) <= 0)
@@ -2041,6 +2102,15 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		PQclear(res);
 	}
 
+	/* Terminate any statements that didn't make the row count.*/
+	if (rows_this_statement > 0)
+	{
+		if (!dopt->do_nothing)
+			archputs(";\n", fout);
+		else
+			archputs(" ON CONFLICT DO NOTHING;\n", fout);
+	}
+
 	archputs("\n\n", fout);
 
 	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 21d2ab05b0..59ac3d096e 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -126,6 +126,12 @@ typedef uint32 DumpComponents;	/* a bitmask of dump object components */
 		DUMP_COMPONENT_DATA |\
 		DUMP_COMPONENT_POLICY)
 
+/*
+ * The default number of rows per INSERT statement when
+ * --inserts is specified without --rows-per-insert
+ */
+#define DUMP_DEFAULT_ROWS_PER_INSERT 1
+
 typedef struct _dumpableObject
 {
 	DumpableObjectType objType;
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(

#41

alvherre@2ndquadrant.com

almost 7 years ago

In reply to: David Rowley (#35)

Re: pg_dump multi VALUES INSERT

On 2019-Jan-23, David Rowley wrote:

On Wed, 23 Jan 2019 at 04:08, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Is it possible to avoid the special case for 0 columns by using the
UNION ALL syntax I showed?

It would be possible, but my thoughts are that we're moving away from
the SQL standard by doing so.

Ah, that's a good point that I missed -- I agree with your reasoning.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#42

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Fabien COELHO (#39)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Thu, 24 Jan 2019 at 04:45, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

The feature is not tested anywhere. I still think that there should be a
test on empty/small/larger-than-rows-per-insert tables, possibly added to
existing TAP-tests.

I was hoping to get away with not having to do that... mainly because
I've no idea how.

Hmmm. That is another question! Maybe someone will help.

Here's another version, same as before but with tests this time.
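For readers following the thread, the batching that the patch adds to dumpTableData_insert can be sketched in isolation as below. This is only an illustration under simplifying assumptions, not code from the patch: emit_inserts and append are hypothetical helpers, each row is a single integer, and the output goes to a caller-supplied string buffer instead of through archputs. It covers the cases Fabien asked to see tested (empty, small, and larger-than-rows-per-insert tables).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append s to out, truncating if out is full. */
static void
append(char *out, size_t outlen, const char *s)
{
	size_t		used = strlen(out);

	if (used + 1 < outlen)
		snprintf(out + used, outlen - used, "%s", s);
}

/*
 * Hypothetical helper: write INSERT statements for nrows one-column rows,
 * packing at most rows_per_insert rows into each statement.
 */
static void
emit_inserts(char *out, size_t outlen, const char *table,
			 const int *rows, int nrows, int rows_per_insert)
{
	int			rows_this_statement = 0;
	char		value[32];

	out[0] = '\0';
	for (int i = 0; i < nrows; i++)
	{
		if (rows_this_statement == 0)
		{
			/* not in the middle of a statement: start a new one */
			append(out, outlen, "INSERT INTO ");
			append(out, outlen, table);
			append(out, outlen, " VALUES (");
		}
		else
			append(out, outlen, ", (");

		snprintf(value, sizeof(value), "%d", rows[i]);
		append(out, outlen, value);
		append(out, outlen, ")");

		/* terminate the statement once it holds the target row count */
		if (++rows_this_statement == rows_per_insert)
		{
			append(out, outlen, ";\n");
			rows_this_statement = 0;
		}
	}

	/* terminate a final statement that did not reach the row count */
	if (rows_this_statement > 0)
		append(out, outlen, ";\n");
}
```

With five rows and rows_per_insert set to 3, this produces one three-row statement followed by one two-row statement; with no rows it produces nothing.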

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

pg_dump-rows-per-insert-option_v13.patch (application/octet-stream)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..bf10d012e4 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -661,9 +661,9 @@ PostgreSQL documentation
         ...</literal>).  This will make restoration very slow; it is mainly
         useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        However, since, by default this option generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.
        </para>
       </listitem>
      </varlistentry>
@@ -764,11 +764,10 @@ PostgreSQL documentation
         than <command>COPY</command>).  This will make restoration very slow;
         it is mainly useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
-        Note that
-        the restore might fail altogether if you have rearranged column order.
+        However, since this option, by default, generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.  Note that the restore
+        might fail altogether if you have rearranged column order.
         The <option>--column-inserts</option> option is safe against column
         order changes, though even slower.
        </para>
@@ -914,8 +913,9 @@ PostgreSQL documentation
        <para>
         Add <literal>ON CONFLICT DO NOTHING</literal> to
         <command>INSERT</command> commands.
-        This option is not valid unless <option>--inserts</option> or
-        <option>--column-inserts</option> is also specified.
+        This option is not valid unless <option>--inserts</option>,
+        <option>--column-inserts</option> or
+        <option>--rows-per-insert</option> is also specified.
        </para>
       </listitem>
      </varlistentry>
@@ -938,6 +938,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert=<replaceable class="parameter">nrows</replaceable></option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands (rather than
+        <command>COPY</command>).  Controls the maximum number of rows per
+        <command>INSERT</command> statement. The value specified must be a
+        number greater than zero.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
        <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
        <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..7ab27391fb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -140,10 +140,10 @@ typedef struct _dumpOptions
 	int			dumpSections;	/* bitmask of chosen sections */
 	bool		aclsSkip;
 	const char *lockWaitTimeout;
+	int			dump_inserts;	/* 0 = COPY, otherwise rows per INSERT */
 
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
-	int			dump_inserts;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2b1a94733b..ba37f29c2e 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -358,7 +358,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, &dopt.enable_row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", no_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -377,6 +377,7 @@ main(int argc, char **argv)
 		{"no-subscriptions", no_argument, &dopt.no_subscriptions, 1},
 		{"no-sync", no_argument, NULL, 7},
 		{"on-conflict-do-nothing", no_argument, &dopt.do_nothing, 1},
+		{"rows-per-insert", required_argument, NULL, 9},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -557,6 +558,24 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:				/* inserts */
+				/*
+				 * dump_inserts also stores --rows-per-insert, careful not to
+				 * overwrite that.
+				 */
+				if (dopt.dump_inserts == 0)
+					dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
+				break;
+
+			case 9:			/* rows per insert */
+				dopt.dump_inserts = atoi(optarg);
+				if (dopt.dump_inserts <= 0)
+				{
+					write_msg(NULL, "rows-per-insert must be a positive number\n");
+					exit_nicely(1);
+				}
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -581,8 +600,8 @@ main(int argc, char **argv)
 	}
 
 	/* --column-inserts implies --inserts */
-	if (dopt.column_inserts)
-		dopt.dump_inserts = 1;
+	if (dopt.column_inserts && dopt.dump_inserts == 0)
+		dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
 
 	/*
 	 * Binary upgrade mode implies dumping sequence data even in schema-only
@@ -607,8 +626,12 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	/*
+	 * --inserts are already implied above if --column-inserts or
+	 * --rows-per-insert were specified.
+	 */
+	if (dopt.do_nothing && dopt.dump_inserts == 0)
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -977,6 +1000,7 @@ help(const char *progname)
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --on-conflict-do-nothing     add ON CONFLICT DO NOTHING to INSERT commands\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --rows-per-insert=NROWS      number of row per INSERT command\n"));
 	printf(_("  --section=SECTION            dump named section (pre-data, data, or post-data)\n"));
 	printf(_("  --serializable-deferrable    wait until the dump can run without anomalies\n"));
 	printf(_("  --snapshot=SNAPSHOT          use given snapshot for the dump\n"));
@@ -1886,6 +1910,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int			rows_per_statement = dopt->dump_inserts;
+	int			rows_this_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,68 +1926,86 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
-		for (tuple = 0; tuple < PQntuples(res); tuple++)
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES " and
+		 * be ready to have the row's column values printed.
+		 */
+		if (insertStmt == NULL)
 		{
-			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
-			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
+			TableInfo  *targettab;
 
-				insertStmt = createPQExpBuffer();
+			insertStmt = createPQExpBuffer();
 
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
 
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
 
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
 				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
 					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
 					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
 
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
 
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
+				appendPQExpBufferStr(insertStmt, "VALUES ");
 			}
+		}
 
-			archputs(insertStmt->data, fout);
+		for (tuple = 0; tuple < PQntuples(res); tuple++)
+		{
+			/* Write the INSERT if not in the middle of a multi-row INSERT. */
+			if (rows_this_statement == 0)
+				archputs(insertStmt->data, fout);
 
-			/* if it is zero-column table then we're done */
+
+			/*
+			 * If it is zero-column table then we've aleady written the
+			 * complete statement, which will mean we've disobeyed
+			 * --rows-per-insert when it's set greater than 1.  We do support
+			 * a way to make this multi-row with:
+			 * SELECT UNION ALL SELECT UNION ALL ... but that's non-standard
+			 * so likely we should avoid it given that using INSERTs is
+			 * mostly only ever needed for cross-database exports.
+			 */
 			if (nfields == 0)
 				continue;
 
+			if (rows_this_statement > 0)
+				archputs(", (", fout);
+			else
+				archputs("(", fout);
+
+
 			for (field = 0; field < nfields; field++)
 			{
 				if (field > 0)
@@ -2027,10 +2071,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
-			if (!dopt->do_nothing)
-				archputs(");\n", fout);
+			rows_this_statement++;
+
+			/*
+			 * If we've put the target number of rows onto this statement then
+			 * we can terminate it now.
+			 */
+			if (rows_this_statement == rows_per_statement)
+			{
+				/* Reset the row counter */
+				rows_this_statement = 0;
+				if (!dopt->do_nothing)
+					archputs(");\n", fout);
+				else
+					archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			}
 			else
-				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			{
+				/* Otherwise, get ready for the next row. */
+				archputs(")", fout);
+			}
+
 		}
 
 		if (PQntuples(res) <= 0)
@@ -2041,6 +2102,15 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		PQclear(res);
 	}
 
+	/* Terminate any statements that didn't make the row count.*/
+	if (rows_this_statement > 0)
+	{
+		if (!dopt->do_nothing)
+			archputs(";\n", fout);
+		else
+			archputs(" ON CONFLICT DO NOTHING;\n", fout);
+	}
+
 	archputs("\n\n", fout);
 
 	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 21d2ab05b0..59ac3d096e 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -126,6 +126,12 @@ typedef uint32 DumpComponents;	/* a bitmask of dump object components */
 		DUMP_COMPONENT_DATA |\
 		DUMP_COMPONENT_POLICY)
 
+/*
+ * The default number of rows per INSERT statement when
+ * --inserts is specified without --rows-per-insert
+ */
+#define DUMP_DEFAULT_ROWS_PER_INSERT 1
+
 typedef struct _dumpableObject
 {
 	DumpableObjectType objType;
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 245fcbf5ce..4cf028de30 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -289,6 +289,16 @@ my %pgdump_runs = (
 			"$tempdir/role_parallel",
 		],
 	},
+	rows_per_insert => {
+		dump_cmd => [
+			'pg_dump',
+			'--no-sync',
+			"--file=$tempdir/rows_per_insert.sql", '-a',
+			'--rows-per-insert=3',
+			'--table=dump_test.test_table',
+			'postgres',
+		],
+	},
 	schema_only => {
 		dump_cmd => [
 			'pg_dump',                         '--format=plain',
@@ -1287,6 +1297,13 @@ my %tests = (
 		like => { column_inserts => 1, },
 	},
 
+	'INSERT INTO test_table' => {
+		regexp => qr/^
+			(?:INSERT\ INTO\ dump_test.test_table\ VALUES\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\);\n){3}
+			/xm,
+		like => { rows_per_insert => 1, },
+	},
+
 	'INSERT INTO test_second_table' => {
 		regexp => qr/^
 			(?:INSERT\ INTO\ dump_test.test_second_table\ \(col1,\ col2\)

#43

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: David Rowley (#42)

Re: pg_dump multi VALUES INSERT

On Thu, 31 Jan 2019 at 11:49, David Rowley <david.rowley@2ndquadrant.com> wrote:

Here's another version, same as before but with tests this time.

Hi Fabien,

Wondering if you have anything else here? I'm happy for the v13
version to be marked as ready for committer.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#44

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: David Rowley (#43)

Re: pg_dump multi VALUES INSERT

Hello David,

Wondering if you have anything else here? I'm happy for the v13
version to be marked as ready for committer.

I still have a few comments.

Patch applies cleanly, compiles, global & local make check are ok.

Typos and style in the doc:

"However, since, by default this option generates ..."
"However, since this option, by default, generates ..."

I'd suggest something more straightforward, to my mind and ear: "However, since by
default the option generates ..., ....", although beware that I'm not a
native English speaker.

I do not understand why dump_inserts declaration has left the "flags for
options" section.

I'd suggest not relying on "atoi" because it does not check the argument
syntax, so basically anything is accepted, e.g. "1O" is read as 1.
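To make the concern concrete, here is a minimal sketch of stricter parsing based on strtol. It is not taken from the patch or from pg_dump; parse_positive_int is a hypothetical helper name, and the policy it applies (reject empty input, trailing junk, non-positive values, and values above INT_MAX) is an assumption about what "checking the argument syntax" would mean here.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse arg as a strictly positive decimal int.
 * Returns true and stores the value in *result on success.
 */
static bool
parse_positive_int(const char *arg, int *result)
{
	char	   *end;
	long		val;

	errno = 0;
	val = strtol(arg, &end, 10);

	/* no digits consumed, or something left over after the number */
	if (end == arg || *end != '\0')
		return false;

	/* out of range for long, not positive, or too large for int */
	if (errno == ERANGE || val <= 0 || val > INT_MAX)
		return false;

	*result = (int) val;
	return true;
}
```

With this helper, "100" is accepted, while "1O" (digit one, letter O), "0", "-5" and the empty string are all rejected. Note that strtol itself still skips leading whitespace and accepts a leading "+", so a stricter check would need to look at the first character as well.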

On "if (!dopt->do_nothing) $1 else $2;", I'd rather use a straight
condition "if (dopt->do_nothing) $2 else $1;" (two instances).

There is a test, that is good! The character "." should be backslashed in the
regexp. I'd also consider introducing limit cases: empty table, empty
columns, by creating corresponding tables and using -t repeatedly.
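The effect of the unescaped "." can be shown with a small sketch. The test in the patch is written in Perl; the example below uses the POSIX regex API from C instead, purely to illustrate the same point, and regex_matches is a hypothetical helper, not something from the PostgreSQL tree.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true if the POSIX extended regex "pattern"
 * matches anywhere in "text".
 */
static bool
regex_matches(const char *pattern, const char *text)
{
	regex_t		re;
	bool		matched;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;			/* treat a bad pattern as "no match" */

	matched = (regexec(&re, text, 0, NULL, 0) == 0);
	regfree(&re);
	return matched;
}
```

The unescaped pattern "dump_test.test_table" matches the intended "dump_test.test_table" but also "dump_testXtest_table", because "." matches any character. Written as "dump_test\\.test_table" in C source, the pattern only matches a literal dot.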

--
Fabien.

#45

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Fabien COELHO (#44)

Re: pg_dump multi VALUES INSERT

On Sat, 2 Feb 2019 at 21:26, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

I do not understand why dump_inserts declaration has left the "flags for
options" section.

I moved that because it's no longer just a flag. It now stores an int value.
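As a rough sketch of what that change means for option parsing: a plain flag can be set by getopt_long directly through the flag pointer, but a field that also carries a row count has to be handled in the switch. The code below is a reduced illustration, not the patch itself; parse_dump_inserts and DEFAULT_ROWS_PER_INSERT are hypothetical names, only the two options are modelled, and atoi is kept as in the patch under review.

```c
#include <getopt.h>
#include <stdlib.h>

#define DEFAULT_ROWS_PER_INSERT 1	/* hypothetical stand-in for the patch's macro */

/*
 * Hypothetical helper: returns 0 when neither option is given (COPY),
 * the number of rows per INSERT otherwise, or -1 on a bad argument.
 */
static int
parse_dump_inserts(int argc, char **argv)
{
	static const struct option long_options[] = {
		{"inserts", no_argument, NULL, 8},
		{"rows-per-insert", required_argument, NULL, 9},
		{NULL, 0, NULL, 0}
	};
	int			dump_inserts = 0;
	int			failed = 0;
	int			c;

	optind = 1;					/* rescan from the first argument */
	opterr = 0;					/* keep getopt_long quiet in this sketch */

	while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
	{
		switch (c)
		{
			case 8:				/* --inserts: do not overwrite a row count */
				if (dump_inserts == 0)
					dump_inserts = DEFAULT_ROWS_PER_INSERT;
				break;
			case 9:				/* --rows-per-insert=N */
				dump_inserts = atoi(optarg);
				if (dump_inserts <= 0)
					failed = 1;
				break;
			default:
				failed = 1;
				break;
		}
	}

	return failed ? -1 : dump_inserts;
}
```

Because --inserts only fills in the default when the field is still zero, the result is the same whichever order the two options appear in on the command line.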

I'd suggest not to rely on "atoi" because it does not check the argument
syntax, so basically anything is accepted, eg "1O" is 1;

Seems like it's good enough for --jobs and --compress. Do you think
those should be changed too? or what's the reason to hold
--rows-per-insert to a different standard?

There is a test, that is good! The character "." should be backslashed in the
regexp.

Yeah, you're right. I wonder if we should fix the test of them in
another patch.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#46

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: David Rowley (#45)

Re: pg_dump multi VALUES INSERT

Hello David,

I do not understand why dump_inserts declaration has left the "flags for
options" section.

I moved that because it's no longer just a flag. It now stores an int value.

Hmmm. Indeed, all the "int"s of this section should be "bool" instead. Now,
some "flags" do not appear although they could (clear, createdb, blobs), so
the logic is kinda fuzzy anyway. Do as you wish.

I'd suggest not to rely on "atoi" because it does not check the argument
syntax, so basically anything is accepted, eg "1O" is 1;

Seems like it's good enough for --jobs and --compress. Do you think
those should be changed too? or what's the reason to hold
--rows-per-insert to a different standard?

I think that there is a case for avoiding sloppy "good enough" programming
practices:-) Alas, as you point out, "atoi" is widely used. I'm campaigning
to avoid adding more of them. There has been some push to actually remove
"atoi" when not appropriate, eg from "libpq". I'd suggest to consider
starting doing the right thing, and leave fixing old patterns to another
patch.

There is a test, that is good! Character "." should be backslashed in the
regexp.

Yeah, you're right. I wonder if we should fix the test of them in
another patch.

From a software engineering perspective, I'd say that a feature and its
tests really belong to the same patch.

--
Fabien.

#47

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Fabien COELHO (#46)

Re: pg_dump multi VALUES INSERT

On Sun, 3 Feb 2019 at 21:00, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

There is a test, that is good! Character "." should be backslashed in the
regexp.

Yeah, you're right. I wonder if we should fix the test of them in
another patch.

From a software engineering perspective, I'd say that a feature and its
tests really belong to the same patch.

I meant to say "fix the rest" of them, not "the test of them".

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#48

surafel3000@gmail.com

almost 7 years ago

In reply to: Fabien COELHO (#46)

Re: pg_dump multi VALUES INSERT

On Sun, Feb 3, 2019 at 11:00 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello David,

I do not understand why dump_inserts declaration has left the "flags for
options" section.

I moved that because it's no longer just a flag. It now stores an int value.

Hmmm. Indeed, all the "int"s of this section should be "bool" instead. Now,
some "flags" do not appear although they could (clear, createdb, blobs), so
the logic is kinda fuzzy anyway. Do as you wish.

I'd suggest not to rely on "atoi" because it does not check the argument
syntax, so basically anything is accepted, eg "1O" is 1;

Seems like it's good enough for --jobs and --compress. Do you think
those should be changed too? or what's the reason to hold
--rows-per-insert to a different standard?

I think that there is a case for avoiding sloppy "good enough" programming
practices:-) Alas, as you point out, "atoi" is widely used. I'm campaigning
to avoid adding more of them. There has been some push to actually remove
"atoi" when not appropriate, eg from "libpq". I'd suggest to consider
starting doing the right thing, and leave fixing old patterns to another
patch.

At least for processing user arguments, I think it is better to use strtol or
another function that has better error handling. I can make a patch that
changes the usage of atoi for user argument processing after getting feedback
from here, or I will do it simultaneously.

regards
Surafel

#49

Michael Paquier

michael@paquier.xyz

almost 7 years ago

In reply to: Surafel Temesgen (#48)

Re: pg_dump multi VALUES INSERT

On Sun, Feb 03, 2019 at 01:21:45PM +0300, Surafel Temesgen wrote:

At least for processing user arguments, I think it is better to use strtol or
another function that has better error handling. I can make a patch that
changes the usage of atoi for user argument processing after getting feedback
from here, or I will do it simultaneously.

Moved the patch to next CF for now, the discussion is going on.
--
Michael

#50

surafel3000@gmail.com

almost 7 years ago

In reply to: Fabien COELHO (#44)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Sat, Feb 2, 2019 at 11:26 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello David,

Wondering if you have anything else here? I'm happy for the v13
version to be marked as ready for committer.

I still have a few comments.

Patch applies cleanly, compiles, global & local make check are ok.

Typos and style in the doc:

"However, since, by default this option generates ..."
"However, since this option, by default, generates ..."

I'd suggest something more straightforward to my mind and ear: "However,
since by default the option generates ..., ....", although beware that I'm
not a native English speaker.

fixed

I'd suggest not to rely on "atoi" because it does not check the argument
syntax, so basically anything is accepted, eg "1O" is 1;

I changed it to strtol.

On "if (!dopt->do_nothing) $1 else $2;", I'd rather use a straight
condition "if (dopt->do_nothing) $2 else $1;" (two instances).

fixed

regards
Surafel

Attachments:

pg_dump-rows-per-insert-option_v14.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..0ab57067a8 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -661,9 +661,9 @@ PostgreSQL documentation
         ...</literal>).  This will make restoration very slow; it is mainly
         useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        However, since by default the option generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.
        </para>
       </listitem>
      </varlistentry>
@@ -764,11 +764,10 @@ PostgreSQL documentation
         than <command>COPY</command>).  This will make restoration very slow;
         it is mainly useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
-        Note that
-        the restore might fail altogether if you have rearranged column order.
+        However, since by default the option generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.  Note that the restore
+        might fail altogether if you have rearranged column order.
         The <option>--column-inserts</option> option is safe against column
         order changes, though even slower.
        </para>
@@ -914,8 +913,9 @@ PostgreSQL documentation
        <para>
         Add <literal>ON CONFLICT DO NOTHING</literal> to
         <command>INSERT</command> commands.
-        This option is not valid unless <option>--inserts</option> or
-        <option>--column-inserts</option> is also specified.
+        This option is not valid unless <option>--inserts</option>,
+        <option>--column-inserts</option> or
+        <option>--rows-per-insert</option> is also specified.
        </para>
       </listitem>
      </varlistentry>
@@ -938,6 +938,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert=<replaceable class="parameter">nrows</replaceable></option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands (rather than
+        <command>COPY</command>).  Controls the maximum number of rows per
+        <command>INSERT</command> statement. The value specified must be a
+        number greater than zero.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
        <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
        <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..7ab27391fb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -140,10 +140,10 @@ typedef struct _dumpOptions
 	int			dumpSections;	/* bitmask of chosen sections */
 	bool		aclsSkip;
 	const char *lockWaitTimeout;
+	int			dump_inserts;	/* 0 = COPY, otherwise rows per INSERT */
 
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
-	int			dump_inserts;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3a89ad846a..957687db0f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -307,6 +307,7 @@ main(int argc, char **argv)
 	const char *dumpencoding = NULL;
 	const char *dumpsnapshot = NULL;
 	char	   *use_role = NULL;
+	char       *rowPerInsertEndPtr;
 	int			numWorkers = 1;
 	trivalue	prompt_password = TRI_DEFAULT;
 	int			compressLevel = -1;
@@ -358,7 +359,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, &dopt.enable_row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", no_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -377,6 +378,7 @@ main(int argc, char **argv)
 		{"no-subscriptions", no_argument, &dopt.no_subscriptions, 1},
 		{"no-sync", no_argument, NULL, 7},
 		{"on-conflict-do-nothing", no_argument, &dopt.do_nothing, 1},
+		{"rows-per-insert", required_argument, NULL, 9},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -557,6 +559,36 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:				/* inserts */
+				/*
+				 * dump_inserts also stores --rows-per-insert, careful not to
+				 * overwrite that.
+				 */
+				if (dopt.dump_inserts == 0)
+					dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
+				break;
+
+			case 9:			/* rows per insert */
+				errno = 0;
+				dopt.dump_inserts = strtol(optarg, &rowPerInsertEndPtr, 10);
+
+				if (rowPerInsertEndPtr == optarg || *rowPerInsertEndPtr != '\0')
+				{
+					write_msg(NULL, "argument of --rows-per-insert must be a number\n");
+					exit_nicely(1);
+				}
+				if (errno == ERANGE)
+				{
+					write_msg(NULL, "argument of --rows-per-insert exceeds integer range.\n");
+					exit_nicely(1);
+				}
+				if (dopt.dump_inserts <= 0)
+				{
+					write_msg(NULL, "rows-per-insert must be a positive number\n");
+					exit_nicely(1);
+				}
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -581,8 +613,8 @@ main(int argc, char **argv)
 	}
 
 	/* --column-inserts implies --inserts */
-	if (dopt.column_inserts)
-		dopt.dump_inserts = 1;
+	if (dopt.column_inserts && dopt.dump_inserts == 0)
+		dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
 
 	/*
 	 * Binary upgrade mode implies dumping sequence data even in schema-only
@@ -607,8 +639,12 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	/*
+	 * --inserts are already implied above if --column-inserts or
+	 * --rows-per-insert were specified.
+	 */
+	if (dopt.do_nothing && dopt.dump_inserts == 0)
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -977,6 +1013,7 @@ help(const char *progname)
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --on-conflict-do-nothing     add ON CONFLICT DO NOTHING to INSERT commands\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --rows-per-insert=NROWS      number of row per INSERT command\n"));
 	printf(_("  --section=SECTION            dump named section (pre-data, data, or post-data)\n"));
 	printf(_("  --serializable-deferrable    wait until the dump can run without anomalies\n"));
 	printf(_("  --snapshot=SNAPSHOT          use given snapshot for the dump\n"));
@@ -1886,6 +1923,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int			rows_per_statement = dopt->dump_inserts;
+	int			rows_this_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,68 +1939,86 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES " and
+		 * be ready to have the row's column values printed.
+		 */
+		if (insertStmt == NULL)
+		{
+			TableInfo  *targettab;
+
+			insertStmt = createPQExpBuffer();
+
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
+
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
+
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
+				{
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
+					{
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
+					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
+
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+				appendPQExpBufferStr(insertStmt, "VALUES ");
+			}
+		}
+
 		for (tuple = 0; tuple < PQntuples(res); tuple++)
 		{
+			/* Write the INSERT if not in the middle of a multi-row INSERT. */
+			if (rows_this_statement == 0)
+				archputs(insertStmt->data, fout);
+
+
 			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
+			 * If it is zero-column table then we've aleady written the
+			 * complete statement, which will mean we've disobeyed
+			 * --rows-per-insert when it's set greater than 1.  We do support
+			 * a way to make this multi-row with:
+			 * SELECT UNION ALL SELECT UNION ALL ... but that's non-standard
+			 * so likely we should avoid it given that using INSERTs is
+			 * mostly only ever needed for cross-database exports.
 			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
-
-				insertStmt = createPQExpBuffer();
-
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
-
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
-
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
-				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
-					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
-					}
-
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
-
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
-			}
-
-			archputs(insertStmt->data, fout);
-
-			/* if it is zero-column table then we're done */
 			if (nfields == 0)
 				continue;
 
+			if (rows_this_statement > 0)
+				archputs(", (", fout);
+			else
+				archputs("(", fout);
+
+
 			for (field = 0; field < nfields; field++)
 			{
 				if (field > 0)
@@ -2027,10 +2084,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
-			if (!dopt->do_nothing)
-				archputs(");\n", fout);
+			rows_this_statement++;
+
+			/*
+			 * If we've put the target number of rows onto this statement then
+			 * we can terminate it now.
+			 */
+			if (rows_this_statement == rows_per_statement)
+			{
+				/* Reset the row counter */
+				rows_this_statement = 0;
+				if (dopt->do_nothing)
+					archputs(") ON CONFLICT DO NOTHING;\n", fout);
+				else
+					archputs(");\n", fout);
+			}
 			else
-				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			{
+				/* Otherwise, get ready for the next row. */
+				archputs(")", fout);
+			}
+
 		}
 
 		if (PQntuples(res) <= 0)
@@ -2041,6 +2115,15 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		PQclear(res);
 	}
 
+	/* Terminate any statements that didn't make the row count.*/
+	if (rows_this_statement > 0)
+	{
+		if (dopt->do_nothing)
+			archputs(" ON CONFLICT DO NOTHING;\n", fout);
+		else
+			archputs(";\n", fout);
+	}
+
 	archputs("\n\n", fout);
 
 	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 21d2ab05b0..59ac3d096e 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -126,6 +126,12 @@ typedef uint32 DumpComponents;	/* a bitmask of dump object components */
 		DUMP_COMPONENT_DATA |\
 		DUMP_COMPONENT_POLICY)
 
+/*
+ * The default number of rows per INSERT statement when
+ * --inserts is specified without --rows-per-insert
+ */
+#define DUMP_DEFAULT_ROWS_PER_INSERT 1
+
 typedef struct _dumpableObject
 {
 	DumpableObjectType objType;
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 245fcbf5ce..4cf028de30 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -289,6 +289,16 @@ my %pgdump_runs = (
 			"$tempdir/role_parallel",
 		],
 	},
+	rows_per_insert => {
+		dump_cmd => [
+			'pg_dump',
+			'--no-sync',
+			"--file=$tempdir/rows_per_insert.sql", '-a',
+			'--rows-per-insert=3',
+			'--table=dump_test.test_table',
+			'postgres',
+		],
+	},
 	schema_only => {
 		dump_cmd => [
 			'pg_dump',                         '--format=plain',
@@ -1287,6 +1297,13 @@ my %tests = (
 		like => { column_inserts => 1, },
 	},
 
+	'INSERT INTO test_table' => {
+		regexp => qr/^
+			(?:INSERT\ INTO\ dump_test.test_table\ VALUES\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\);\n){3}
+			/xm,
+		like => { rows_per_insert => 1, },
+	},
+
 	'INSERT INTO test_second_table' => {
 		regexp => qr/^
 			(?:INSERT\ INTO\ dump_test.test_second_table\ \(col1,\ col2\)
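
The batching logic that the patch adds to dumpTableData_insert() can be summarised with a small standalone sketch. This is not the patch's code: the names `append_str` and `emit_inserts` are hypothetical, rows are assumed to arrive as pre-formatted value lists, and the ON CONFLICT DO NOTHING suffix and the zero-column corner case are left out. It only shows the counter pattern (`rows_this_statement` against a per-statement limit) plus the final flush of a partially filled statement.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append "s" to the NUL-terminated string in "buf", truncating if full. */
static void
append_str(char *buf, size_t size, const char *s)
{
	size_t		len = strlen(buf);

	if (len + 1 < size)
		snprintf(buf + len, size - len, "%s", s);
}

/*
 * Hypothetical sketch: write "nrows" pre-formatted row value lists
 * (e.g. "1, NULL") into "buf" as multi-row INSERT statements with at
 * most "rows_per_insert" rows each.
 */
void
emit_inserts(char *buf, size_t size, const char *table,
			 const char *const *rows, int nrows, int rows_per_insert)
{
	int			rows_this_statement = 0;
	int			i;

	if (size == 0)
		return;
	buf[0] = '\0';

	for (i = 0; i < nrows; i++)
	{
		/* start a new statement, or continue the current VALUES list */
		if (rows_this_statement == 0)
		{
			append_str(buf, size, "INSERT INTO ");
			append_str(buf, size, table);
			append_str(buf, size, " VALUES (");
		}
		else
			append_str(buf, size, ", (");

		append_str(buf, size, rows[i]);
		append_str(buf, size, ")");
		rows_this_statement++;

		/* terminate the statement once the target row count is reached */
		if (rows_this_statement == rows_per_insert)
		{
			append_str(buf, size, ";\n");
			rows_this_statement = 0;
		}
	}

	/* terminate a final statement that did not reach the row count */
	if (rows_this_statement > 0)
		append_str(buf, size, ";\n");
}
```

Called with five single-column rows and a limit of 2, this produces two full statements followed by one statement holding the remaining row, which is the behaviour the trailing "Terminate any statements that didn't make the row count" hunk is there to guarantee.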

#51

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Surafel Temesgen (#50)

Re: pg_dump multi VALUES INSERT

Reviewing pg_dump-rows-per-insert-option-v14.

Mostly going back over things that Fabien mentioned:

On Sat, 2 Feb 2019 at 21:26, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

There is a test, that is good! Character "." should be backslashed in the
regexp. I'd consider also introducing limit cases: empty table, empty
columns by creating corresponding tables and using -t repeatedly.

+ (?:INSERT\ INTO\ dump_test.test_table\ VALUES\ \(\d,\ NULL,\ NULL,\
NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\
NULL\);\n){3}

the . here before the table name needs to be escaped. The ones missing
in the existing tests should have been fixed by d07fb6810e.
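
Why the escape matters can be shown outside the TAP tests. The sketch below uses POSIX extended regular expressions from C rather than Perl, which is an assumption made purely for illustration; the meaning of an unescaped "." (match any single character) is the same in both. The helper name `regex_matches` is hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true when "text" matches the POSIX extended
 * regular expression "pattern" anywhere in the string.
 */
bool
regex_matches(const char *pattern, const char *text)
{
	regex_t		re;
	bool		matched;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	matched = (regexec(&re, text, 0, NULL, 0) == 0);
	regfree(&re);
	return matched;
}
```

With the unescaped pattern `dump_test.test_table`, a string such as `dump_testXtest_table` also matches, so the test would pass on output it should reject. Escaping the dot restricts the match to the literal schema-qualified name.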

There's also the additional tests that Fabien mentions.

Also, maybe one for Fabien (because he seems keen on keeping the
--rows-per-insert validation code)

strtol() returns a long. dump_inserts is an int, so on machines where
sizeof(long) == 8 and sizeof(int) == 4 (most machines, these days) the
validation is not bulletproof. This could lead to:

$ pg_dump --rows-per-insert=2147483648
pg_dump: rows-per-insert must be a positive number

For me, I was fine with the atoi() code that the other options use,
but maybe Fabien has a problem with the long vs int?

It would be simple to work around by assigning the strtol() result to a long
and making the ERANGE test check for ERANGE or ... > PG_INT_MAX
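
A minimal sketch of that workaround, assuming a hypothetical helper named `parse_positive_int` and using the standard `INT_MAX` from `<limits.h>` for the upper bound. It is not the code from any version of the patch; it only combines the checks discussed here: keep the strtol() result in a long, test errno for ERANGE, and compare against the int range before narrowing.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a strictly positive int from a string.
 * Returns true and stores the value in *result on success.
 */
bool
parse_positive_int(const char *arg, int *result)
{
	char	   *end;
	long		val;

	errno = 0;
	val = strtol(arg, &end, 10);

	/* reject empty input and trailing garbage */
	if (end == arg || *end != '\0')
		return false;

	/* reject values outside the range of long (ERANGE) or of int */
	if (errno == ERANGE || val > INT_MAX)
		return false;

	/* zero and negative values are not valid row counts */
	if (val <= 0)
		return false;

	*result = (int) val;
	return true;
}
```

Because the comparison happens on the long before the assignment to an int, an argument of 2147483648 is rejected as out of range on both 32-bit and 64-bit long platforms, instead of being narrowed first and then misreported as non-positive.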

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#52

surafel3000@gmail.com

almost 7 years ago

In reply to: David Rowley (#51)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Mon, Feb 11, 2019 at 10:20 AM David Rowley <david.rowley@2ndquadrant.com>
wrote:

Reviewing pg_dump-rows-per-insert-option-v14.

Also, maybe one for Fabien (because he seems keen on keeping the
--rows-per-insert validation code)

strtol() returns a long. dump_inserts is an int, so on machines where
sizeof(long) == 8 and sizeof(int) == 4 (most machines, these days) the
validation is not bulletproof. This could lead to:

$ pg_dump --rows-per-insert=2147483648
pg_dump: rows-per-insert must be a positive number

fixed

For me, I was fine with the atoi() code that the other options use,
but maybe Fabien has a problem with the long vs int?

The main issue with atoi() is that it does not detect errors and returns 0 for
both invalid input and the input value 0. In our case that doesn't cause a
problem, because we error out for the value 0. But for the compression level,
for example, if invalid input is supplied it silently proceeds as if the value
0 had been supplied.
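
The ambiguity can be illustrated with a short sketch. The helper name `atoi_hides_error` is hypothetical; it simply reports the cases where atoi() returns 0 although the argument is not a valid integer at all.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: true when atoi() returns 0 for a string that is
 * not a complete base-10 integer, i.e. the 0 is a swallowed parse error
 * rather than a real zero.
 */
bool
atoi_hides_error(const char *arg)
{
	char	   *end;
	bool		valid;

	(void) strtol(arg, &end, 10);
	valid = (end != arg && *end == '\0');

	return atoi(arg) == 0 && !valid;
}
```

For an option where 0 is a legal value, a caller relying on atoi() alone cannot tell "abc" from "0"; for --rows-per-insert the later "must be positive" check happens to reject both.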

regards
Surafel

Attachments:

pg_dump-rows-per-insert-option-v15.patch (text/x-patch; charset=US-ASCII)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..0ab57067a8 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -661,9 +661,9 @@ PostgreSQL documentation
         ...</literal>).  This will make restoration very slow; it is mainly
         useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        However, since by default the option generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.
        </para>
       </listitem>
      </varlistentry>
@@ -764,11 +764,10 @@ PostgreSQL documentation
         than <command>COPY</command>).  This will make restoration very slow;
         it is mainly useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
-        Note that
-        the restore might fail altogether if you have rearranged column order.
+        However, since by default the option generates a separate command
+        for each row, an error in reloading a row causes only that row to be
+        lost rather than the entire table contents.  Note that the restore
+        might fail altogether if you have rearranged column order.
         The <option>--column-inserts</option> option is safe against column
         order changes, though even slower.
        </para>
@@ -914,8 +913,9 @@ PostgreSQL documentation
        <para>
         Add <literal>ON CONFLICT DO NOTHING</literal> to
         <command>INSERT</command> commands.
-        This option is not valid unless <option>--inserts</option> or
-        <option>--column-inserts</option> is also specified.
+        This option is not valid unless <option>--inserts</option>,
+        <option>--column-inserts</option> or
+        <option>--rows-per-insert</option> is also specified.
        </para>
       </listitem>
      </varlistentry>
@@ -938,6 +938,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert=<replaceable class="parameter">nrows</replaceable></option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands (rather than
+        <command>COPY</command>).  Controls the maximum number of rows per
+        <command>INSERT</command> statement. The value specified must be a
+        number greater than zero.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
        <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
        <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..7ab27391fb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -140,10 +140,10 @@ typedef struct _dumpOptions
 	int			dumpSections;	/* bitmask of chosen sections */
 	bool		aclsSkip;
 	const char *lockWaitTimeout;
+	int			dump_inserts;	/* 0 = COPY, otherwise rows per INSERT */
 
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
-	int			dump_inserts;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3a89ad846a..c7403f4e40 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -307,6 +307,8 @@ main(int argc, char **argv)
 	const char *dumpencoding = NULL;
 	const char *dumpsnapshot = NULL;
 	char	   *use_role = NULL;
+	char       *rowPerInsertEndPtr;
+	long			rowPerInsert;
 	int			numWorkers = 1;
 	trivalue	prompt_password = TRI_DEFAULT;
 	int			compressLevel = -1;
@@ -358,7 +360,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, &dopt.enable_row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", no_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -377,6 +379,7 @@ main(int argc, char **argv)
 		{"no-subscriptions", no_argument, &dopt.no_subscriptions, 1},
 		{"no-sync", no_argument, NULL, 7},
 		{"on-conflict-do-nothing", no_argument, &dopt.do_nothing, 1},
+		{"rows-per-insert", required_argument, NULL, 9},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -557,6 +560,37 @@ main(int argc, char **argv)
 				dosync = false;
 				break;
 
+			case 8:				/* inserts */
+				/*
+				 * dump_inserts also stores --rows-per-insert, careful not to
+				 * overwrite that.
+				 */
+				if (dopt.dump_inserts == 0)
+					dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
+				break;
+
+			case 9:			/* rows per insert */
+				errno = 0;
+				rowPerInsert = strtol(optarg, &rowPerInsertEndPtr, 10);
+
+				if (rowPerInsertEndPtr == optarg || *rowPerInsertEndPtr != '\0')
+				{
+					write_msg(NULL, "argument of --rows-per-insert must be a number\n");
+					exit_nicely(1);
+				}
+				if (rowPerInsert > INT_MAX)
+				{
+					write_msg(NULL, "argument of --rows-per-insert exceeds integer range.\n");
+					exit_nicely(1);
+				}
+				if (rowPerInsert <= 0)
+				{
+					write_msg(NULL, "rows-per-insert must be a positive number\n");
+					exit_nicely(1);
+				}
+				dopt.dump_inserts = rowPerInsert;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -581,8 +615,8 @@ main(int argc, char **argv)
 	}
 
 	/* --column-inserts implies --inserts */
-	if (dopt.column_inserts)
-		dopt.dump_inserts = 1;
+	if (dopt.column_inserts && dopt.dump_inserts == 0)
+		dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
 
 	/*
 	 * Binary upgrade mode implies dumping sequence data even in schema-only
@@ -607,8 +641,12 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	/*
+	 * --inserts are already implied above if --column-inserts or
+	 * --rows-per-insert were specified.
+	 */
+	if (dopt.do_nothing && dopt.dump_inserts == 0)
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -977,6 +1015,7 @@ help(const char *progname)
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --on-conflict-do-nothing     add ON CONFLICT DO NOTHING to INSERT commands\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --rows-per-insert=NROWS      number of row per INSERT command\n"));
 	printf(_("  --section=SECTION            dump named section (pre-data, data, or post-data)\n"));
 	printf(_("  --serializable-deferrable    wait until the dump can run without anomalies\n"));
 	printf(_("  --snapshot=SNAPSHOT          use given snapshot for the dump\n"));
@@ -1886,6 +1925,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int			rows_per_statement = dopt->dump_inserts;
+	int			rows_this_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1900,68 +1941,86 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES " and
+		 * be ready to have the row's column values printed.
+		 */
+		if (insertStmt == NULL)
+		{
+			TableInfo  *targettab;
+
+			insertStmt = createPQExpBuffer();
+
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
+
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
+
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
+				{
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
+					{
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
+					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
+
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+
+				appendPQExpBufferStr(insertStmt, "VALUES ");
+			}
+		}
+
 		for (tuple = 0; tuple < PQntuples(res); tuple++)
 		{
+			/* Write the INSERT if not in the middle of a multi-row INSERT. */
+			if (rows_this_statement == 0)
+				archputs(insertStmt->data, fout);
+
+
 			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
+			 * If it is zero-column table then we've already written the
+			 * complete statement, which will mean we've disobeyed
+			 * --rows-per-insert when it's set greater than 1.  We do support
+			 * a way to make this multi-row with:
+			 * SELECT UNION ALL SELECT UNION ALL ... but that's non-standard
+			 * so likely we should avoid it given that using INSERTs is
+			 * mostly only ever needed for cross-database exports.
 			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
-
-				insertStmt = createPQExpBuffer();
-
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
-
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
-
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
-				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
-					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
-					}
-
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
-
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
-			}
-
-			archputs(insertStmt->data, fout);
-
-			/* if it is zero-column table then we're done */
 			if (nfields == 0)
 				continue;
 
+			if (rows_this_statement > 0)
+				archputs(", (", fout);
+			else
+				archputs("(", fout);
+
+
 			for (field = 0; field < nfields; field++)
 			{
 				if (field > 0)
@@ -2027,10 +2086,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
-			if (!dopt->do_nothing)
-				archputs(");\n", fout);
+			rows_this_statement++;
+
+			/*
+			 * If we've put the target number of rows onto this statement then
+			 * we can terminate it now.
+			 */
+			if (rows_this_statement == rows_per_statement)
+			{
+				/* Reset the row counter */
+				rows_this_statement = 0;
+				if (dopt->do_nothing)
+					archputs(") ON CONFLICT DO NOTHING;\n", fout);
+				else
+					archputs(");\n", fout);
+			}
 			else
-				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			{
+				/* Otherwise, get ready for the next row. */
+				archputs(")", fout);
+			}
+
 		}
 
 		if (PQntuples(res) <= 0)
@@ -2041,6 +2117,15 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		PQclear(res);
 	}
 
+	/* Terminate any statements that didn't make the row count.*/
+	if (rows_this_statement > 0)
+	{
+		if (dopt->do_nothing)
+			archputs(" ON CONFLICT DO NOTHING;\n", fout);
+		else
+			archputs(";\n", fout);
+	}
+
 	archputs("\n\n", fout);
 
 	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 21d2ab05b0..59ac3d096e 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -126,6 +126,12 @@ typedef uint32 DumpComponents;	/* a bitmask of dump object components */
 		DUMP_COMPONENT_DATA |\
 		DUMP_COMPONENT_POLICY)
 
+/*
+ * The default number of rows per INSERT statement when
+ * --inserts is specified without --rows-per-insert
+ */
+#define DUMP_DEFAULT_ROWS_PER_INSERT 1
+
 typedef struct _dumpableObject
 {
 	DumpableObjectType objType;
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 245fcbf5ce..4cf028de30 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -289,6 +289,16 @@ my %pgdump_runs = (
 			"$tempdir/role_parallel",
 		],
 	},
+	rows_per_insert => {
+		dump_cmd => [
+			'pg_dump',
+			'--no-sync',
+			"--file=$tempdir/rows_per_insert.sql", '-a',
+			'--rows-per-insert=3',
+			'--table=dump_test.test_table',
+			'postgres',
+		],
+	},
 	schema_only => {
 		dump_cmd => [
 			'pg_dump',                         '--format=plain',
@@ -1287,6 +1297,13 @@ my %tests = (
 		like => { column_inserts => 1, },
 	},
 
+	'INSERT INTO test_table' => {
+		regexp => qr/^
+			(?:INSERT\ INTO\ dump_test.test_table\ VALUES\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\);\n){3}
+			/xm,
+		like => { rows_per_insert => 1, },
+	},
+
 	'INSERT INTO test_second_table' => {
 		regexp => qr/^
 			(?:INSERT\ INTO\ dump_test.test_second_table\ \(col1,\ col2\)

#53

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Surafel Temesgen (#52)

Re: pg_dump multi VALUES INSERT

On Wed, 13 Feb 2019 at 19:36, Surafel Temesgen <surafel3000@gmail.com> wrote:

On Mon, Feb 11, 2019 at 10:20 AM David Rowley <david.rowley@2ndquadrant.com> wrote:

Reviewing pg_dump-rows-per-insert-option-v14.

Also, maybe one for Fabien (because he seems keen on keeping the
--rows-per-insert validation code)

strtol() returns a long. dump_inserts is an int, so on machines where
sizeof(long) == 8 and sizeof(int) == 4 (most machines, these days) the
validation is not bulletproof. This could lead to:

$ pg_dump --rows-per-insert=2147483648
pg_dump: rows-per-insert must be a positive number

fixed

Thanks.

I see you didn't touch the tests yet, so I'll set this back to waiting
on author.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#54

surafel3000@gmail.com

almost 7 years ago

In reply to: Fabien COELHO (#44)

Re: pg_dump multi VALUES INSERT

On Sat, Feb 2, 2019 at 11:26 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

There is a test, that is good! Character "." should be backslashed in the
regexp. I'd also consider introducing limit cases: empty table, empty
columns, by creating corresponding tables and using -t repeatedly.

I see that there is already a test for a zero-column table in
test_fourth_table_zero_col and, if I am not wrong, table_index_stats is an
empty table.

regards
Surafel

#55

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Surafel Temesgen (#54)

Re: pg_dump multi VALUES INSERT

On Tue, 19 Feb 2019 at 02:34, Surafel Temesgen <surafel3000@gmail.com> wrote:

I see that there is already a test for a zero-column table in test_fourth_table_zero_col
and, if I am not wrong, table_index_stats is an empty table.

Maybe Fabien would like to see a test that dumps that table with
--rows-per-insert=<something above one> to ensure the output remains
the same as in the other test. I think it might be a good idea too.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#56

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: David Rowley (#55)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

On Fri, 22 Feb 2019 at 14:40, David Rowley <david.rowley@2ndquadrant.com> wrote:

On Tue, 19 Feb 2019 at 02:34, Surafel Temesgen <surafel3000@gmail.com> wrote:

I see that there is already a test for a zero-column table in test_fourth_table_zero_col
and, if I am not wrong, table_index_stats is an empty table.

Maybe Fabien would like to see a test that dumps that table with
--rows-per-insert=<something above one> to ensure the output remains
the same as in the other test. I think it might be a good idea too.

This patch was failing to build due to the new extra_float_digits
option that's been added to pg_dump. It was adding an additional case
for 8 in the getopt_long switch statement. In the attached, I've
changed it to use values 9 and 10 for the new options.

I also went ahead and added the zero column test that Fabien
mentioned. Also added the missing backslash to the other test that
had been added.
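
For reference, the zero-column behaviour that the new test covers can be
sketched roughly as below. This is a standalone illustration, not code from
the patch: emit_zero_column_inserts() and its parameters are made-up names.
It assumes, as the comment in dumpTableData_insert() says, that each row of a
zero-column table gets its own complete INSERT ... DEFAULT VALUES statement
no matter what --rows-per-insert is set to.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical sketch (not part of the patch): write one complete
 * "INSERT ... DEFAULT VALUES" statement per row of a zero-column table.
 * rows_per_insert is only sanity-checked; it has no effect on the output,
 * which is exactly the behaviour the zero-column test is meant to pin down.
 * Returns the number of statements written.
 */
int
emit_zero_column_inserts(FILE *out, const char *table,
						 int nrows, int rows_per_insert)
{
	int			i;

	assert(rows_per_insert > 0);

	for (i = 0; i < nrows; i++)
		fprintf(out, "INSERT INTO %s DEFAULT VALUES;\n", table);

	return nrows;
}
```

So dumping two rows with a rows-per-insert value of 3 would still give two
separate statements, the same as with plain --inserts.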

Fabien also complained about some wording in the docs. I ended up
changing this a little bit as I thought the change was a little
uninformative about what rows won't be restored when an INSERT fails.
I've changed this so that it mentions that all rows which are part of
the same INSERT command will fail in the restore.

I think this can be marked as ready for committer now, but I'll defer
to Fabien to see if he has any other comments.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

pg_dump-rows-per-insert-option-v16.patch (application/octet-stream)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 033eae9b46..e0e65f9c21 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -661,9 +661,9 @@ PostgreSQL documentation
         ...</literal>).  This will make restoration very slow; it is mainly
         useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        Any error during reloading will cause only rows that are part of the
+        problematic <command>INSERT</command> to be lost, rather than the
+        entire table contents.
        </para>
       </listitem>
      </varlistentry>
@@ -775,13 +775,12 @@ PostgreSQL documentation
         than <command>COPY</command>).  This will make restoration very slow;
         it is mainly useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
-        Note that
-        the restore might fail altogether if you have rearranged column order.
-        The <option>--column-inserts</option> option is safe against column
-        order changes, though even slower.
+        Any error during reloading will cause only rows that are part of the
+        problematic <command>INSERT</command> to be lost, rather than the
+        entire table contents.  Note that the restore might fail altogether if
+        you have rearranged column order.  The
+        <option>--column-inserts</option> option is safe against column order
+        changes, though even slower.
        </para>
       </listitem>
      </varlistentry>
@@ -925,8 +924,9 @@ PostgreSQL documentation
        <para>
         Add <literal>ON CONFLICT DO NOTHING</literal> to
         <command>INSERT</command> commands.
-        This option is not valid unless <option>--inserts</option> or
-        <option>--column-inserts</option> is also specified.
+        This option is not valid unless <option>--inserts</option>,
+        <option>--column-inserts</option> or
+        <option>--rows-per-insert</option> is also specified.
        </para>
       </listitem>
      </varlistentry>
@@ -949,6 +949,20 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert=<replaceable class="parameter">nrows</replaceable></option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands (rather than
+        <command>COPY</command>).  Controls the maximum number of rows per
+        <command>INSERT</command> command. The value specified must be a
+        number greater than zero.  Any error during reloading will cause only
+        rows that are part of the problematic <command>INSERT</command> to be
+        lost, rather than the entire table contents.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
        <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
        <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..7ab27391fb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -140,10 +140,10 @@ typedef struct _dumpOptions
 	int			dumpSections;	/* bitmask of chosen sections */
 	bool		aclsSkip;
 	const char *lockWaitTimeout;
+	int			dump_inserts;	/* 0 = COPY, otherwise rows per INSERT */
 
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
-	int			dump_inserts;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a08bc4ecae..1adade6b4c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -311,6 +311,8 @@ main(int argc, char **argv)
 	const char *dumpencoding = NULL;
 	const char *dumpsnapshot = NULL;
 	char	   *use_role = NULL;
+	char       *rowPerInsertEndPtr;
+	long			rowPerInsert;
 	int			numWorkers = 1;
 	trivalue	prompt_password = TRI_DEFAULT;
 	int			compressLevel = -1;
@@ -363,7 +365,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"extra-float-digits", required_argument, NULL, 8},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", no_argument, NULL, 9},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -382,6 +384,7 @@ main(int argc, char **argv)
 		{"no-subscriptions", no_argument, &dopt.no_subscriptions, 1},
 		{"no-sync", no_argument, NULL, 7},
 		{"on-conflict-do-nothing", no_argument, &dopt.do_nothing, 1},
+		{"rows-per-insert", required_argument, NULL, 10},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -572,6 +575,37 @@ main(int argc, char **argv)
 				}
 				break;
 
+			case 9:				/* inserts */
+				/*
+				 * dump_inserts also stores --rows-per-insert, careful not to
+				 * overwrite that.
+				 */
+				if (dopt.dump_inserts == 0)
+					dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
+				break;
+
+			case 10:			/* rows per insert */
+				errno = 0;
+				rowPerInsert = strtol(optarg, &rowPerInsertEndPtr, 10);
+
+				if (rowPerInsertEndPtr == optarg || *rowPerInsertEndPtr != '\0')
+				{
+					write_msg(NULL, "argument of --rows-per-insert must be a number\n");
+					exit_nicely(1);
+				}
+				if (rowPerInsert > INT_MAX)
+				{
+					write_msg(NULL, "argument of --rows-per-insert exceeds integer range.\n");
+					exit_nicely(1);
+				}
+				if (rowPerInsert <= 0)
+				{
+					write_msg(NULL, "rows-per-insert must be a positive number\n");
+					exit_nicely(1);
+				}
+				dopt.dump_inserts = rowPerInsert;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -596,8 +630,8 @@ main(int argc, char **argv)
 	}
 
 	/* --column-inserts implies --inserts */
-	if (dopt.column_inserts)
-		dopt.dump_inserts = 1;
+	if (dopt.column_inserts && dopt.dump_inserts == 0)
+		dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
 
 	/*
 	 * Binary upgrade mode implies dumping sequence data even in schema-only
@@ -622,8 +656,12 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	/*
+	 * --inserts are already implied above if --column-inserts or
+	 * --rows-per-insert were specified.
+	 */
+	if (dopt.do_nothing && dopt.dump_inserts == 0)
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -993,6 +1031,7 @@ help(const char *progname)
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --on-conflict-do-nothing     add ON CONFLICT DO NOTHING to INSERT commands\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --rows-per-insert=NROWS      number of row per INSERT command\n"));
 	printf(_("  --section=SECTION            dump named section (pre-data, data, or post-data)\n"));
 	printf(_("  --serializable-deferrable    wait until the dump can run without anomalies\n"));
 	printf(_("  --snapshot=SNAPSHOT          use given snapshot for the dump\n"));
@@ -1911,6 +1950,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int			rows_per_statement = dopt->dump_inserts;
+	int			rows_this_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1925,68 +1966,86 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
-		for (tuple = 0; tuple < PQntuples(res); tuple++)
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES " and
+		 * be ready to have the row's column values printed.
+		 */
+		if (insertStmt == NULL)
 		{
-			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
-			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
+			TableInfo  *targettab;
 
-				insertStmt = createPQExpBuffer();
+			insertStmt = createPQExpBuffer();
 
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
 
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
 
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
 				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
 					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
 					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
 
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
 
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
+				appendPQExpBufferStr(insertStmt, "VALUES ");
 			}
+		}
+
+		for (tuple = 0; tuple < PQntuples(res); tuple++)
+		{
+			/* Write the INSERT if not in the middle of a multi-row INSERT. */
+			if (rows_this_statement == 0)
+				archputs(insertStmt->data, fout);
 
-			archputs(insertStmt->data, fout);
 
-			/* if it is zero-column table then we're done */
+			/*
+			 * If it is zero-column table then we've already written the
+			 * complete statement, which will mean we've disobeyed
+			 * --rows-per-insert when it's set greater than 1.  We do support
+			 * a way to make this multi-row with:
+			 * SELECT UNION ALL SELECT UNION ALL ... but that's non-standard
+			 * so likely we should avoid it given that using INSERTs is
+			 * mostly only ever needed for cross-database exports.
+			 */
 			if (nfields == 0)
 				continue;
 
+			if (rows_this_statement > 0)
+				archputs(", (", fout);
+			else
+				archputs("(", fout);
+
+
 			for (field = 0; field < nfields; field++)
 			{
 				if (field > 0)
@@ -2052,10 +2111,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
-			if (!dopt->do_nothing)
-				archputs(");\n", fout);
+			rows_this_statement++;
+
+			/*
+			 * If we've put the target number of rows onto this statement then
+			 * we can terminate it now.
+			 */
+			if (rows_this_statement == rows_per_statement)
+			{
+				/* Reset the row counter */
+				rows_this_statement = 0;
+				if (dopt->do_nothing)
+					archputs(") ON CONFLICT DO NOTHING;\n", fout);
+				else
+					archputs(");\n", fout);
+			}
 			else
-				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			{
+				/* Otherwise, get ready for the next row. */
+				archputs(")", fout);
+			}
+
 		}
 
 		if (PQntuples(res) <= 0)
@@ -2066,6 +2142,15 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		PQclear(res);
 	}
 
+	/* Terminate any statements that didn't make the row count.*/
+	if (rows_this_statement > 0)
+	{
+		if (dopt->do_nothing)
+			archputs(" ON CONFLICT DO NOTHING;\n", fout);
+		else
+			archputs(";\n", fout);
+	}
+
 	archputs("\n\n", fout);
 
 	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 21d2ab05b0..59ac3d096e 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -126,6 +126,12 @@ typedef uint32 DumpComponents;	/* a bitmask of dump object components */
 		DUMP_COMPONENT_DATA |\
 		DUMP_COMPONENT_POLICY)
 
+/*
+ * The default number of rows per INSERT statement when
+ * --inserts is specified without --rows-per-insert
+ */
+#define DUMP_DEFAULT_ROWS_PER_INSERT 1
+
 typedef struct _dumpableObject
 {
 	DumpableObjectType objType;
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a875d540b8..ebd83922dd 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 0233fcb47f..06f4ab6ecd 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -289,6 +289,26 @@ my %pgdump_runs = (
 			"$tempdir/role_parallel",
 		],
 	},
+	rows_per_insert => {
+		dump_cmd => [
+			'pg_dump',
+			'--no-sync',
+			"--file=$tempdir/rows_per_insert.sql", '-a',
+			'--rows-per-insert=3',
+			'--table=dump_test.test_table',
+			'postgres',
+		],
+	},
+	rows_per_insert_zero_col => {
+		dump_cmd => [
+			'pg_dump',
+			'--no-sync',
+			"--file=$tempdir/rows_per_insert_zero_col.sql", '-a',
+			'--rows-per-insert=3',
+			'--table=dump_test.dump_test.test_fourth_table',
+			'postgres',
+		],
+	},
 	schema_only => {
 		dump_cmd => [
 			'pg_dump',                         '--format=plain',
@@ -1287,6 +1307,13 @@ my %tests = (
 		like => { column_inserts => 1, },
 	},
 
+	'INSERT INTO test_table' => {
+		regexp => qr/^
+			(?:INSERT\ INTO\ dump_test\.test_table\ VALUES\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\),\ \(\d,\ NULL,\ NULL,\ NULL\);\n){3}
+			/xm,
+		like => { rows_per_insert => 1, },
+	},
+
 	'INSERT INTO test_second_table' => {
 		regexp => qr/^
 			(?:INSERT\ INTO\ dump_test\.test_second_table\ \(col1,\ col2\)
@@ -1297,7 +1324,7 @@ my %tests = (
 	'INSERT INTO test_fourth_table' => {
 		regexp =>
 		  qr/^\QINSERT INTO dump_test.test_fourth_table DEFAULT VALUES;\E/m,
-		like => { column_inserts => 1, },
+		like => { column_inserts => 1, rows_per_insert_zero_col => 1 },
 	},
 
 	'INSERT INTO test_fifth_table' => {

#57

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: David Rowley (#56)

Re: pg_dump multi VALUES INSERT

Hello David & Surafel,

I think this can be marked as ready for committer now, but I'll defer
to Fabien to see if he has any other comments.

Patch v16 applies and compiles cleanly, local and global "make check"
are ok. Doc build is ok.

I did some manual testing with limit cases which did work. Good.

Although I'm all in favor of checking the int associated with the option, I
do not think that it warrants three checks and messages. I would suggest
factoring them all into just one check and one (terse) message.

Option "--help" line: number of row*s* ?

About the output: I'd suggest to indent one line per row, something like:

INSERT INTO foo VALUES
(..., ..., ..., ...),
(..., ..., ..., ...),
(..., ..., ..., ...);

so as to avoid very very very very very very very very very very very very
very very very very long lines in the output.

I'd suggest to add test tables with (1) no rows and (2) no columns but a
few rows, with multiple --table options.

--
Fabien.

#58

Michael Paquier

michael@paquier.xyz

almost 7 years ago

In reply to: Fabien COELHO (#57)

Re: pg_dump multi VALUES INSERT

On Sat, Mar 02, 2019 at 08:01:50AM +0100, Fabien COELHO wrote:

About the output: I'd suggest to indent one line per row, something like:

INSERT INTO foo VALUES
(..., ..., ..., ...),
(..., ..., ..., ...),
(..., ..., ..., ...);

so as to avoid very very very very very very very very very very very very
very very very very long lines in the output.

Note: folks sometimes manually edit the dump file generated. So
having one row/SQL query/VALUE per line really brings a lot of value.
--
Michael

#59

alvherre@2ndquadrant.com

almost 7 years ago

In reply to: Fabien COELHO (#57)

Re: pg_dump multi VALUES INSERT

On 2019-Mar-02, Fabien COELHO wrote:

Although I'm all in favor of checking the int associated with the option, I do
not think that it warrants three checks and messages. I would suggest
factoring them all into just one check and one (terse) message.

I suggest ("rows-per-insert must be in range 1..%d", INT_MAX), like
extra_float_digits and compression level.

About the output: I'd suggest to indent one line per row, something like:

INSERT INTO foo VALUES
(..., ..., ..., ...),
(..., ..., ..., ...),
(..., ..., ..., ...);

+1.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#60

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Fabien COELHO (#57)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

Thanks for looking at this again.

On Sat, 2 Mar 2019 at 20:01, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Although I'm all in favor of checking the int associated with the option, I
do not think that it warrants three checks and messages. I would suggest
factoring them all into just one check and one (terse) message.

Yeah. I've been trying to keep that area sane for a while, so I agree
that one message is fine. Done that way in the attached and put back
the missing ERANGE check.
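
To illustrate the idea of a single combined check, here is a minimal
standalone sketch. It is not the code in the attached patch:
parse_rows_per_insert() is a hypothetical helper, and the only assumption
carried over from the thread is that the accepted range is 1..INT_MAX. The
point is that the range test is done on the long returned by strtol(),
before the value is narrowed to int, and that ERANGE is folded into the same
test.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not taken from the patch): parse the argument of
 * --rows-per-insert.  Returns true and stores the value in *rows only if
 * the whole string is a decimal number in the range 1..INT_MAX.
 */
bool
parse_rows_per_insert(const char *arg, int *rows)
{
	char	   *endptr;
	long		val;

	assert(arg != NULL && rows != NULL);

	errno = 0;
	val = strtol(arg, &endptr, 10);

	/*
	 * One combined test covering: no digits at all, trailing junk, overflow
	 * of long (ERANGE), and values outside 1..INT_MAX.  The comparison uses
	 * the long value, so 2147483648 is rejected rather than wrapping.
	 */
	if (endptr == arg || *endptr != '\0' || errno == ERANGE ||
		val <= 0 || val > INT_MAX)
		return false;

	*rows = (int) val;
	return true;
}
```

A caller would then need only one message on failure, along the lines of the
"rows-per-insert must be in range 1..%d" wording suggested upthread.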

Option "--help" line: number of row*s* ?

Oops. Fixed.

About the output: I'd suggest to indent one line per row, something like:

INSERT INTO foo VALUES
(..., ..., ..., ...),
(..., ..., ..., ...),
(..., ..., ..., ...);

Reasonable. Changed it to that. I put a tab at the start of those
lines. There's still the possibility that only 1 final row makes up
the final INSERT. These will still span multiple lines. I don't think
there's anything that can reasonably be done about that.
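
The layout can be sketched with a small standalone function. This is only an
illustration under stated assumptions, not the patch code: emit_inserts() and
its arguments are made-up names, the rows are passed in as preformatted
strings, and no special case is made for a rows-per-insert value of 1. It
shows the counter logic (start a statement when the counter is zero, close it
when the target is reached, and close any partial statement at the end) with
each row on its own tab-indented line.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical sketch (not the patch itself): write nrows preformatted rows
 * to "out", packing at most rows_per_insert rows into each INSERT, one row
 * per tab-indented line.  Returns the number of statements written.
 */
int
emit_inserts(FILE *out, const char *table, const char *const *rows,
			 int nrows, int rows_per_insert)
{
	int			rows_this_statement = 0;
	int			nstatements = 0;
	int			i;

	assert(rows_per_insert > 0);

	for (i = 0; i < nrows; i++)
	{
		/* Start a new statement, or separate from the previous row. */
		if (rows_this_statement == 0)
			fprintf(out, "INSERT INTO %s VALUES\n", table);
		else
			fputs(",\n", out);

		fprintf(out, "\t(%s)", rows[i]);
		rows_this_statement++;

		/* Terminate the statement once it holds the target row count. */
		if (rows_this_statement == rows_per_insert)
		{
			fputs(";\n", out);
			rows_this_statement = 0;
			nstatements++;
		}
	}

	/* Terminate a final statement that did not reach the row count. */
	if (rows_this_statement > 0)
	{
		fputs(";\n", out);
		nstatements++;
	}

	return nstatements;
}
```

With three rows and two rows per INSERT, the second statement holds the
single leftover row, which is the case mentioned above where a one-row
INSERT still spans more than one line.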

I'd suggest to add test tables with (1) no rows and (2) no columns but a
few rows, with multiple --table options.

I didn't do that, partly because I think you're asking for tests of
existing behaviour and partly because Perl gives me a sore head.
Maybe Surafel wants to do that?

v17 attached.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

pg_dump-rows-per-insert-option-v17.patch (application/octet-stream)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 033eae9b46..e0e65f9c21 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -661,9 +661,9 @@ PostgreSQL documentation
         ...</literal>).  This will make restoration very slow; it is mainly
         useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
+        Any error during reloading will cause only rows that are part of the
+        problematic <command>INSERT</command> to be lost, rather than the
+        entire table contents.
        </para>
       </listitem>
      </varlistentry>
@@ -775,13 +775,12 @@ PostgreSQL documentation
         than <command>COPY</command>).  This will make restoration very slow;
         it is mainly useful for making dumps that can be loaded into
         non-<productname>PostgreSQL</productname> databases.
-        However, since this option generates a separate command for each row,
-        an error in reloading a row causes only that row to be lost rather
-        than the entire table contents.
-        Note that
-        the restore might fail altogether if you have rearranged column order.
-        The <option>--column-inserts</option> option is safe against column
-        order changes, though even slower.
+        Any error during reloading will cause only rows that are part of the
+        problematic <command>INSERT</command> to be lost, rather than the
+        entire table contents.  Note that the restore might fail altogether if
+        you have rearranged column order.  The
+        <option>--column-inserts</option> option is safe against column order
+        changes, though even slower.
        </para>
       </listitem>
      </varlistentry>
@@ -925,8 +924,9 @@ PostgreSQL documentation
        <para>
         Add <literal>ON CONFLICT DO NOTHING</literal> to
         <command>INSERT</command> commands.
-        This option is not valid unless <option>--inserts</option> or
-        <option>--column-inserts</option> is also specified.
+        This option is not valid unless <option>--inserts</option>,
+        <option>--column-inserts</option> or
+        <option>--rows-per-insert</option> is also specified.
        </para>
       </listitem>
      </varlistentry>
@@ -949,6 +949,20 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert=<replaceable class="parameter">nrows</replaceable></option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands (rather than
+        <command>COPY</command>).  Controls the maximum number of rows per
+        <command>INSERT</command> command. The value specified must be a
+        number greater than zero.  Any error during reloading will cause only
+        rows that are part of the problematic <command>INSERT</command> to be
+        lost, rather than the entire table contents.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
        <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
        <listitem>
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..7ab27391fb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -140,10 +140,10 @@ typedef struct _dumpOptions
 	int			dumpSections;	/* bitmask of chosen sections */
 	bool		aclsSkip;
 	const char *lockWaitTimeout;
+	int			dump_inserts;	/* 0 = COPY, otherwise rows per INSERT */
 
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
-	int			dump_inserts;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5d83038348..9d6e25aee1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -311,6 +311,8 @@ main(int argc, char **argv)
 	const char *dumpencoding = NULL;
 	const char *dumpsnapshot = NULL;
 	char	   *use_role = NULL;
+	char       *rowPerInsertEndPtr;
+	long			rowPerInsert;
 	int			numWorkers = 1;
 	trivalue	prompt_password = TRI_DEFAULT;
 	int			compressLevel = -1;
@@ -363,7 +365,7 @@ main(int argc, char **argv)
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"extra-float-digits", required_argument, NULL, 8},
 		{"if-exists", no_argument, &dopt.if_exists, 1},
-		{"inserts", no_argument, &dopt.dump_inserts, 1},
+		{"inserts", no_argument, NULL, 9},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
@@ -382,6 +384,7 @@ main(int argc, char **argv)
 		{"no-subscriptions", no_argument, &dopt.no_subscriptions, 1},
 		{"no-sync", no_argument, NULL, 7},
 		{"on-conflict-do-nothing", no_argument, &dopt.do_nothing, 1},
+		{"rows-per-insert", required_argument, NULL, 10},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -572,6 +575,29 @@ main(int argc, char **argv)
 				}
 				break;
 
+			case 9:				/* inserts */
+				/*
+				 * dump_inserts also stores --rows-per-insert, careful not to
+				 * overwrite that.
+				 */
+				if (dopt.dump_inserts == 0)
+					dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
+				break;
+
+			case 10:			/* rows per insert */
+				errno = 0;
+				rowPerInsert = strtol(optarg, &rowPerInsertEndPtr, 10);
+
+				if (rowPerInsertEndPtr == optarg || *rowPerInsertEndPtr != '\0' ||
+					rowPerInsert > INT_MAX || rowPerInsert <= 0 || errno == ERANGE)
+				{
+					write_msg(NULL, "rows-per-insert must be in range %d..%d\n",
+							  1, INT_MAX);
+					exit_nicely(1);
+				}
+				dopt.dump_inserts = rowPerInsert;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -596,8 +622,8 @@ main(int argc, char **argv)
 	}
 
 	/* --column-inserts implies --inserts */
-	if (dopt.column_inserts)
-		dopt.dump_inserts = 1;
+	if (dopt.column_inserts && dopt.dump_inserts == 0)
+		dopt.dump_inserts = DUMP_DEFAULT_ROWS_PER_INSERT;
 
 	/*
 	 * Binary upgrade mode implies dumping sequence data even in schema-only
@@ -622,8 +648,12 @@ main(int argc, char **argv)
 	if (dopt.if_exists && !dopt.outputClean)
 		exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
 
-	if (dopt.do_nothing && !(dopt.dump_inserts || dopt.column_inserts))
-		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts or --column-inserts\n");
+	/*
+	 * --inserts are already implied above if --column-inserts or
+	 * --rows-per-insert were specified.
+	 */
+	if (dopt.do_nothing && dopt.dump_inserts == 0)
+		exit_horribly(NULL, "option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\n");
 
 	/* Identify archive format to emit */
 	archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -993,6 +1023,7 @@ help(const char *progname)
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --on-conflict-do-nothing     add ON CONFLICT DO NOTHING to INSERT commands\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --rows-per-insert=NROWS      number of rows per INSERT command\n"));
 	printf(_("  --section=SECTION            dump named section (pre-data, data, or post-data)\n"));
 	printf(_("  --serializable-deferrable    wait until the dump can run without anomalies\n"));
 	printf(_("  --snapshot=SNAPSHOT          use given snapshot for the dump\n"));
@@ -1912,6 +1943,8 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	int			tuple;
 	int			nfields;
 	int			field;
+	int			rows_per_statement = dopt->dump_inserts;
+	int			rows_this_statement = 0;
 
 	appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR "
 					  "SELECT * FROM ONLY %s",
@@ -1926,68 +1959,88 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		res = ExecuteSqlQuery(fout, "FETCH 100 FROM _pg_dump_cursor",
 							  PGRES_TUPLES_OK);
 		nfields = PQnfields(res);
-		for (tuple = 0; tuple < PQntuples(res); tuple++)
+
+		/*
+		 * First time through, we build as much of the INSERT statement as
+		 * possible in "insertStmt", which we can then just print for each
+		 * line. If the table happens to have zero columns then this will
+		 * be a complete statement, otherwise it will end in "VALUES" and
+		 * be ready to have the row's column values printed.
+		 */
+		if (insertStmt == NULL)
 		{
-			/*
-			 * First time through, we build as much of the INSERT statement as
-			 * possible in "insertStmt", which we can then just print for each
-			 * line. If the table happens to have zero columns then this will
-			 * be a complete statement, otherwise it will end in "VALUES(" and
-			 * be ready to have the row's column values appended.
-			 */
-			if (insertStmt == NULL)
-			{
-				TableInfo  *targettab;
+			TableInfo  *targettab;
 
-				insertStmt = createPQExpBuffer();
+			insertStmt = createPQExpBuffer();
 
-				/*
-				 * When load-via-partition-root is set, get the root table
-				 * name for the partition table, so that we can reload data
-				 * through the root table.
-				 */
-				if (dopt->load_via_partition_root && tbinfo->ispartition)
-					targettab = getRootTableInfo(tbinfo);
-				else
-					targettab = tbinfo;
+			/*
+			 * When load-via-partition-root is set, get the root table
+			 * name for the partition table, so that we can reload data
+			 * through the root table.
+			 */
+			if (dopt->load_via_partition_root && tbinfo->ispartition)
+				targettab = getRootTableInfo(tbinfo);
+			else
+				targettab = tbinfo;
 
-				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtQualifiedDumpable(targettab));
+			appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
+							  fmtQualifiedDumpable(targettab));
 
-				/* corner case for zero-column table */
-				if (nfields == 0)
-				{
-					appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
-				}
-				else
+			/* corner case for zero-column table */
+			if (nfields == 0)
+			{
+				appendPQExpBufferStr(insertStmt, "DEFAULT VALUES;\n");
+			}
+			else
+			{
+				/* append the list of column names if required */
+				if (dopt->column_inserts)
 				{
-					/* append the list of column names if required */
-					if (dopt->column_inserts)
+					appendPQExpBufferChar(insertStmt, '(');
+					for (field = 0; field < nfields; field++)
 					{
-						appendPQExpBufferChar(insertStmt, '(');
-						for (field = 0; field < nfields; field++)
-						{
-							if (field > 0)
-								appendPQExpBufferStr(insertStmt, ", ");
-							appendPQExpBufferStr(insertStmt,
-												 fmtId(PQfname(res, field)));
-						}
-						appendPQExpBufferStr(insertStmt, ") ");
+						if (field > 0)
+							appendPQExpBufferStr(insertStmt, ", ");
+						appendPQExpBufferStr(insertStmt,
+											 fmtId(PQfname(res, field)));
 					}
+					appendPQExpBufferStr(insertStmt, ") ");
+				}
 
-					if (tbinfo->needs_override)
-						appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
+				if (tbinfo->needs_override)
+					appendPQExpBufferStr(insertStmt, "OVERRIDING SYSTEM VALUE ");
 
-					appendPQExpBufferStr(insertStmt, "VALUES (");
-				}
+				appendPQExpBufferStr(insertStmt, "VALUES");
 			}
+		}
 
-			archputs(insertStmt->data, fout);
+		for (tuple = 0; tuple < PQntuples(res); tuple++)
+		{
+			/* Write the INSERT if not in the middle of a multi-row INSERT. */
+			if (rows_this_statement == 0)
+				archputs(insertStmt->data, fout);
 
-			/* if it is zero-column table then we're done */
+
+			/*
+			 * If it is zero-column table then we've already written the
+			 * complete statement, which will mean we've disobeyed
+			 * --rows-per-insert when it's set greater than 1.  We do support
+			 * a way to make this multi-row with:
+			 * SELECT UNION ALL SELECT UNION ALL ... but that's non-standard
+			 * so likely we should avoid it given that using INSERTs is
+			 * mostly only ever needed for cross-database exports.
+			 */
 			if (nfields == 0)
 				continue;
 
+			if (rows_this_statement > 0)
+				archputs(",\n\t(", fout);
+			else if (rows_per_statement == 1)
+				archputs(" (", fout);
+			else
+				archputs("\n\t(", fout);
+
+
 			for (field = 0; field < nfields; field++)
 			{
 				if (field > 0)
@@ -2053,10 +2106,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 				}
 			}
 
-			if (!dopt->do_nothing)
-				archputs(");\n", fout);
+			rows_this_statement++;
+
+			/*
+			 * If we've put the target number of rows onto this statement then
+			 * we can terminate it now.
+			 */
+			if (rows_this_statement == rows_per_statement)
+			{
+				/* Reset the row counter */
+				rows_this_statement = 0;
+				if (dopt->do_nothing)
+					archputs(") ON CONFLICT DO NOTHING;\n", fout);
+				else
+					archputs(");\n", fout);
+			}
 			else
-				archputs(") ON CONFLICT DO NOTHING;\n", fout);
+			{
+				/* Otherwise, get ready for the next row. */
+				archputs(")", fout);
+			}
+
 		}
 
 		if (PQntuples(res) <= 0)
@@ -2067,6 +2137,15 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 		PQclear(res);
 	}
 
+	/* Terminate any statements that didn't make the row count.*/
+	if (rows_this_statement > 0)
+	{
+		if (dopt->do_nothing)
+			archputs(" ON CONFLICT DO NOTHING;\n", fout);
+		else
+			archputs(";\n", fout);
+	}
+
 	archputs("\n\n", fout);
 
 	ExecuteSqlStatement(fout, "CLOSE _pg_dump_cursor");
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 21d2ab05b0..59ac3d096e 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -126,6 +126,12 @@ typedef uint32 DumpComponents;	/* a bitmask of dump object components */
 		DUMP_COMPONENT_DATA |\
 		DUMP_COMPONENT_POLICY)
 
+/*
+ * The default number of rows per INSERT statement when
+ * --inserts is specified without --rows-per-insert
+ */
+#define DUMP_DEFAULT_ROWS_PER_INSERT 1
+
 typedef struct _dumpableObject
 {
 	DumpableObjectType objType;
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index 1dad553739..45dd57f2bf 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -118,8 +118,8 @@ command_fails_like(
 
 command_fails_like(
 	[ 'pg_dump', '--on-conflict-do-nothing' ],
-	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts\E/,
-	'pg_dump: option --on-conflict-do-nothing requires option --inserts or --column-inserts');
+	qr/\Qpg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts\E/,
+	'pg_dump: option --on-conflict-do-nothing requires option --inserts , --rows-per-insert or --column-inserts');
 
 # pg_dumpall command-line argument checks
 command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 8fa7f0f61f..c5a8f763af 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -295,6 +295,26 @@ my %pgdump_runs = (
 			"$tempdir/role_parallel",
 		],
 	},
+	rows_per_insert => {
+		dump_cmd => [
+			'pg_dump',
+			'--no-sync',
+			"--file=$tempdir/rows_per_insert.sql", '-a',
+			'--rows-per-insert=3',
+			'--table=dump_test.test_table',
+			'postgres',
+		],
+	},
+	rows_per_insert_zero_col => {
+		dump_cmd => [
+			'pg_dump',
+			'--no-sync',
+			"--file=$tempdir/rows_per_insert_zero_col.sql", '-a',
+			'--rows-per-insert=3',
+			'--table=dump_test.dump_test.test_fourth_table',
+			'postgres',
+		],
+	},
 	schema_only => {
 		dump_cmd => [
 			'pg_dump',                         '--format=plain',
@@ -1295,6 +1315,13 @@ my %tests = (
 		like => { column_inserts => 1, },
 	},
 
+	'INSERT INTO test_table' => {
+		regexp => qr/^
+			(?:INSERT\ INTO\ dump_test\.test_table\ VALUES\n\t\(\d,\ NULL,\ NULL,\ NULL\),\n\t\(\d,\ NULL,\ NULL,\ NULL\),\n\t\(\d,\ NULL,\ NULL,\ NULL\);\n){3}
+			/xm,
+		like => { rows_per_insert => 1, },
+	},
+
 	'INSERT INTO test_second_table' => {
 		regexp => qr/^
 			(?:INSERT\ INTO\ dump_test\.test_second_table\ \(col1,\ col2\)
@@ -1305,7 +1332,7 @@ my %tests = (
 	'INSERT INTO test_fourth_table' => {
 		regexp =>
 		  qr/^\QINSERT INTO dump_test.test_fourth_table DEFAULT VALUES;\E/m,
-		like => { column_inserts => 1, },
+		like => { column_inserts => 1, rows_per_insert_zero_col => 1 },
 	},
 
 	'INSERT INTO test_fifth_table' => {

#61

alvherre@2ndquadrant.com

almost 7 years ago

In reply to: David Rowley (#60)

Re: pg_dump multi VALUES INSERT

Pushed, thanks!

If anyone is feeling inspired, one additional test we could use is
--rows-per-insert together with --on-conflict-do-nothing.

I made a couple of edits to v17 before pushing,

* rename strtol endptr variable so that it can be used by other strtol
calls, if we ever have them

* use pre-increment in if() test rather than separate line with
post-increment; reduces line count by 2.

* reworded --help output to: "number of rows per INSERT; implies --inserts"

* added one row-ending archputs(")") which makes the repeated
statement-ending archputs() match exactly. (Negligible slowdown, I
expect)

* moved DUMP_DEFAULT_ROWS_PER_INSERT to pg_dump.c from pg_dump.h

* there was a space-before-comma in an error message, even immortalized
in the test expected output.

* remove the rows_per_insert_zero_col dump output file; the test can be
done by adding the table to the rows_per_insert file. Add one more row
to that zero-column table, so that the INSERT .. DEFAULT VALUES test
verifies the case with more than one row.

* changed the rows_per_insert to use 4 rows per insert rather than
three; that improves coverage (table had 9 rows so it was always hitting
the case where a full statement is emitted.)
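
The coverage argument in the last item is simple arithmetic, sketched below. The InsertLayout type and insert_layout function are hypothetical names used only for this illustration. The 9-row table size is taken from the message above.

```c
#include <assert.h>

/* Hypothetical type: how a table's rows split into INSERT statements. */
typedef struct
{
	int			full_statements;	/* statements with exactly rows_per_insert rows */
	int			trailing_rows;		/* rows in the final, shorter statement (0 if none) */
} InsertLayout;

/*
 * Hypothetical helper: compute how nrows rows are distributed when each
 * INSERT holds at most rows_per_insert rows.
 */
InsertLayout
insert_layout(int nrows, int rows_per_insert)
{
	InsertLayout layout;

	assert(nrows >= 0 && rows_per_insert > 0);

	layout.full_statements = nrows / rows_per_insert;
	layout.trailing_rows = nrows % rows_per_insert;
	return layout;
}
```

For 9 rows, 3 rows per insert gives three full statements and no trailing rows, so the code path that terminates a partially filled statement is never exercised. With 4 rows per insert there are two full statements plus a final statement holding 1 row, which covers both paths.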

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#62

david.rowley@2ndquadrant.com

almost 7 years ago

In reply to: Alvaro Herrera (#61)

Re: pg_dump multi VALUES INSERT

On Fri, 8 Mar 2019 at 01:46, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Pushed, thanks!

I made a couple of edits to v17 before pushing,

Thank you for making those changes and for pushing it.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#63

Peter Eisentraut

peter.eisentraut@2ndquadrant.com

over 6 years ago

In reply to: David Rowley (#62)

Re: pg_dump multi VALUES INSERT

Shouldn't the --rows-per-insert option also be available via pg_dumpall?
All the other options for switching between COPY and INSERT are
settable in pg_dumpall.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#64

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Peter Eisentraut (#63)

Re: pg_dump multi VALUES INSERT

On 2019-Jun-14, Peter Eisentraut wrote:

Shouldn't the --rows-per-insert option also be available via pg_dumpall?
All the other options for switching between COPY and INSERT are
settable in pg_dumpall.

Uh, yeah, absolutely.

Surafel, are you in a position to provide a patch for that quickly?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#65

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Alvaro Herrera (#64)

1 attachment(s)

Re: pg_dump multi VALUES INSERT

Hello Alvaro,

Shouldn't the --rows-per-insert option also be available via pg_dumpall?
All the other options for switching between COPY and INSERT are
settable in pg_dumpall.

Uh, yeah, absolutely.

Surafel, are you in a position to provide a patch for that quickly?

It's the end of the week, so I have more time. This is easy enough, and
I should have seen the issue while reviewing. Patch attached.

BTW, is the libpq hostaddr fix ok?

--
Fabien.

Attachments:

dumpall-missing-opt-1.patch (text/x-diff; charset=us-ascii)

diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index b35c702f99..ac8d039bf4 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -514,6 +514,20 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--rows-per-insert=<replaceable class="parameter">nrows</replaceable></option></term>
+      <listitem>
+       <para>
+        Dump data as <command>INSERT</command> commands (rather than
+        <command>COPY</command>).  Controls the maximum number of rows per
+        <command>INSERT</command> command. The value specified must be a
+        number greater than zero.  Any error during reloading will cause only
+        rows that are part of the problematic <command>INSERT</command> to be
+        lost, rather than the entire table contents.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
        <term><option>-?</option></term>
        <term><option>--help</option></term>
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index ea4ac91c00..fd1244fa5d 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -146,6 +146,7 @@ main(int argc, char *argv[])
 		{"no-sync", no_argument, NULL, 4},
 		{"no-unlogged-table-data", no_argument, &no_unlogged_table_data, 1},
 		{"on-conflict-do-nothing", no_argument, &on_conflict_do_nothing, 1},
+		{"rows-per-insert", required_argument, NULL, 7},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -329,6 +330,11 @@ main(int argc, char *argv[])
 				simple_string_list_append(&database_exclude_patterns, optarg);
 				break;
 
+			case 7:
+				appendPQExpBufferStr(pgdumpopts, " --rows-per-insert ");
+				appendShellString(pgdumpopts, optarg);
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit_nicely(1);
@@ -651,6 +657,7 @@ help(void)
 	printf(_("  --use-set-session-authorization\n"
 			 "                               use SET SESSION AUTHORIZATION commands instead of\n"
 			 "                               ALTER OWNER commands to set ownership\n"));
+	printf(_("  --rows-per-insert=NROWS      number of rows per INSERT; implies --inserts\n"));
 
 	printf(_("\nConnection options:\n"));
 	printf(_("  -d, --dbname=CONNSTR     connect using connection string\n"));

#66