[PoC PATCH] Parallel dump to /dev/null

Started by Michael Banck almost 8 years ago; 28 messages
#1 Michael Banck
michael.banck@credativ.de
1 attachment(s)

Hi,

dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.

I had a look at this, and it appears it would suffice to override the
few spots in pg_backup_directory.c that assemble filenames in the
target directory so that they fall back to '/dev/null' (or 'NUL' on
Windows).

The attached proof-of-concept patch does that, and it seems to work; I'm
getting some nice speedups on a 170 GB test database:

$ time pg_dump -Fc -Z0 -f /dev/null TESTDB
real 32m45.120s
[...]
$ time pg_dump -Fd -j8 -Z0 -f /dev/null TESTDB
real 9m28.357s

Thoughts?

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachments:

parallel-pgdump-dev-null.patch (text/x-diff)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index a96da15dc1..82aeb01f91 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -512,7 +512,10 @@ cfopen_write(const char *path, const char *mode, int compression)
 #ifdef HAVE_LIBZ
 		char	   *fname;
 
-		fname = psprintf("%s.gz", path);
+		if (strcmp(path, NULL_DEVICE) != 0)
+			fname = psprintf("%s.gz", path);
+		else
+			fname = strdup(path);
 		fp = cfopen(fname, mode, compression);
 		free_keep_errno(fname);
 #else
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index becfee6e81..ecdd9fc8d3 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -62,6 +62,13 @@ typedef struct _z_stream
 typedef z_stream *z_streamp;
 #endif
 
+/* Null device */
+#ifdef WIN32
+#define NULL_DEVICE "NUL"
+#else
+#define NULL_DEVICE "/dev/null"
+#endif
+
 /* Data block types */
 #define BLK_DATA 1
 #define BLK_BLOBS 3
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 4aabb40f59..cd418ac7ee 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -192,9 +192,14 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
 			}
 		}
 
-		if (!is_empty && mkdir(ctx->directory, 0700) < 0)
-			exit_horribly(modulename, "could not create directory \"%s\": %s\n",
-						  ctx->directory, strerror(errno));
+		/*
+		 * Create the output directory, unless it exists already and is empty.
+		 * If the output directory is the null device, we need not create it.
+		 */
+		if (!is_empty && strcmp(ctx->directory, NULL_DEVICE) != 0)
+			if (mkdir(ctx->directory, 0711) < 0)
+				exit_horribly(modulename, "could not create directory \"%s\": %s\n",
+							  ctx->directory, strerror(errno));
 	}
 	else
 	{							/* Read Mode */
@@ -602,8 +607,10 @@ _CloseArchive(ArchiveHandle *AH)
 		/*
 		 * In directory mode, there is no need to sync all the entries
 		 * individually. Just recurse once through all the files generated.
+		 * If the output directory is the null device, it does not need
+		 * syncing.
 		 */
-		if (AH->dosync)
+		if (AH->dosync && strcmp(ctx->directory, NULL_DEVICE) != 0)
 			fsync_dir_recurse(ctx->directory, progname);
 	}
 	AH->FH = NULL;
@@ -659,7 +666,10 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
 	lclContext *ctx = (lclContext *) AH->formatData;
 	char		fname[MAXPGPATH];
 
-	snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
+	if (strcmp(ctx->directory, NULL_DEVICE) != 0)
+		snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
+	else
+		snprintf(fname, MAXPGPATH, NULL_DEVICE);
 
 	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
 
@@ -721,9 +731,18 @@ setFilePath(ArchiveHandle *AH, char *buf, const char *relativeFilename)
 	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
 		exit_horribly(modulename, "file name too long: \"%s\"\n", dname);
 
-	strcpy(buf, dname);
-	strcat(buf, "/");
-	strcat(buf, relativeFilename);
+	/*
+	 * If the output directory is the null device, we cannot set a file path
+	 * and just set the buffer to the null device.
+	 */
+	if (strcmp(ctx->directory, NULL_DEVICE) == 0)
+	{
+		strcpy(buf, NULL_DEVICE);
+	} else {
+		strcpy(buf, dname);
+		strcat(buf, "/");
+		strcat(buf, relativeFilename);
+	}
 }
 
 /*
#2 Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Banck (#1)
Re: [PoC PATCH] Parallel dump to /dev/null

On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:

dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.

FWIW, I use this trick as well in some test deployments. Now, those
deployments happen in places where I want the checks to warn me, so the
time it takes is generally not really an issue. Do folks here think
that speedups would be worth it? Say, to make the operation shorter on
production systems. A lengthy pg_dump bloats data for a longer time, so
I can buy that shorter run times may help, though I think that hearing
from other folks is necessary as well.
--
Michael

#3 Michael Banck
michael.banck@credativ.de
In reply to: Michael Paquier (#2)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi,

On Friday, 2018-02-02 at 13:22 +0900, Michael Paquier wrote:

On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:

dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.

FWIW, I use this trick as well in some test deployments. Now, those
deployments happen in places where I want the checks to warn me, so the
time it takes is generally not really an issue. Do folks here think
that speedups would be worth it? Say, to make the operation shorter on
production systems. A lengthy pg_dump bloats data for a longer time, so
I can buy that shorter run times may help, though I think that hearing
from other folks is necessary as well.

Another use case would be automatic restores of basebackups, where you
might want to dump to /dev/null afterwards to check for issues, as just
starting a server on the basebackup wouldn't tell you. If you have lots
of instances and lots of basebackups to check, doing that in parallel
might be helpful.
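
Just for illustration, such a check could look roughly like this
(untested sketch; data directory and port are placeholders, and
dumping -Fd to /dev/null of course requires this patch):

pg_ctl -D /srv/restored -o '-p 65432' start
for db in $(psql -p 65432 -d postgres -XAt \
        -c 'SELECT datname FROM pg_database WHERE datallowconn'); do
    # a failing dump is taken as a hint at corruption
    pg_dump -p 65432 -Fd -j8 -Z0 -f /dev/null "$db" \
        || echo "possible corruption in $db"
done
pg_ctl -D /srv/restored stop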

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

#4 Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Michael Paquier (#2)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi!

Michael, thanks for the patch!

On 2 Feb 2018, at 9:22, Michael Paquier <michael.paquier@gmail.com> wrote:

On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:

dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.

Do folks here think that speedups would be worth it?

I've seen backup verification via COPY ... TO /dev/null in production. Those copies were parallelized on the side of the verification script.
Not sure whether this counts for the feature or against it.
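
For illustration, something of this shape (made-up table names; the
real script builds the table list itself and runs several sessions in
parallel, and note that server-side COPY ... TO a file requires
superuser):

# each COPY discards one table's data server-side; run a few at a time
psql -d TESTDB -c "COPY schema1.big_table TO '/dev/null'" &
psql -d TESTDB -c "COPY schema2.other_table TO '/dev/null'" &
wait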

Best regards, Andrey Borodin.

#5 Vladimir Borodin
root@simply.name
In reply to: Andrey Borodin (#4)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi.

On 2 Feb 2018, at 13:25, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Do folks here think that speedups would be worth it?

I've seen backup verification via COPY ... TO /dev/null in production. Those copies were parallelized on the side of the verification script.
Not sure whether this counts for the feature or against it.

This sounds like a point for the feature, because the use of COPY there is involuntary. Having this in pg_dump would make our scripts for checking backups simpler.

--
May the force be with you…
https://simply.name

#6 Michael Banck
michael.banck@credativ.de
In reply to: Michael Banck (#1)
1 attachment(s)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi,

On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:

dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.

I had a look at this, and it appears it would suffice to override the
few spots in pg_backup_directory.c that assemble filenames in the
target directory so that they fall back to '/dev/null' (or 'NUL' on
Windows).

The attached proof-of-concept patch does that, and it seems to work; I'm
getting some nice speedups on a 170 GB test database:

$ time pg_dump -Fc -Z0 -f /dev/null TESTDB
real 32m45.120s
[...]
$ time pg_dump -Fd -j8 -Z0 -f /dev/null TESTDB
real 9m28.357s

I have added this patch to the next commitfest:

https://commitfest.postgresql.org/17/1576/

I also tried to add a TAP test, but as this patch produces no dump
output, I had to exclude it from all restore tests and then got an
off-by-one between the number of tests expected and the number actually
run. I've attached it in case somebody wants to take a look, but I hope
that with Stephen's refactoring of the pg_dump TAP tests this will
become easier.

Michael

Attachments:

0001-WIP-pg_dump-TAP-testcase.patch (text/x-diff)
From ff96d9f2984bb473ec6fdad779a09070a227d550 Mon Sep 17 00:00:00 2001
From: Michael Banck <michael.banck@credativ.de>
Date: Fri, 2 Feb 2018 15:02:26 +0100
Subject: [PATCH] WIP pg_dump TAP testcase

---
 src/bin/pg_dump/t/002_pg_dump.pl | 89 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 84 insertions(+), 5 deletions(-)

diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 3e9b4d94dc..f8e64910b7 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -153,6 +153,12 @@ my %pgdump_runs = (
 			'--format=tar',
 			"--file=$tempdir/defaults_tar_format.sql",
 			"$tempdir/defaults_tar_format.tar", ], },
+
+	parallel_discard => {
+		dump_cmd => [
+			'pg_dump', '-Fd', '-j2', '--file=/dev/null',
+			'postgres', ], },
+
 	exclude_dump_test_schema => {
 		dump_cmd => [
 			'pg_dump', '--no-sync',
@@ -334,6 +340,7 @@ my %tests = (
 			exclude_dump_test_schema => 1,
 			no_privs                 => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -372,6 +379,7 @@ my %tests = (
 			no_privs                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -415,6 +423,7 @@ my %tests = (
 			no_privs                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -449,6 +458,7 @@ my %tests = (
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			role                     => 1,
 			schema_only              => 1,
 			section_pre_data         => 1,
@@ -639,6 +649,7 @@ my %tests = (
 			section_post_data        => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			role                     => 1, }, },
 
 	'ALTER OPERATOR CLASS dump_test.op_class OWNER TO' => {
@@ -693,6 +704,7 @@ my %tests = (
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -724,6 +736,7 @@ my %tests = (
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -833,6 +846,7 @@ my %tests = (
 			data_only                => 1,
 			exclude_test_table       => 1,
 			exclude_dump_test_schema => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -866,6 +880,7 @@ my %tests = (
 			data_only                => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			section_data             => 1,
@@ -1059,6 +1074,7 @@ my %tests = (
 			no_blobs                 => 1,
 			no_privs                 => 1,
 			no_owner                 => 1,
+			parallel_discard         => 1,
 			pg_dumpall_dbprivs       => 1,
 			role                     => 1,
 			schema_only              => 1,
@@ -1128,6 +1144,7 @@ my %tests = (
 			no_privs                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_dbprivs       => 1,
 			schema_only              => 1,
 			section_pre_data         => 1,
@@ -1426,6 +1443,7 @@ qr/^ALTER (?!EVENT TRIGGER|LARGE OBJECT|PUBLICATION|SUBSCRIPTION)(.*) OWNER TO .
 			column_inserts           => 1,
 			data_only                => 1,
 			no_owner                 => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			section_data             => 1,
@@ -1438,6 +1456,7 @@ qr/^ALTER (?!EVENT TRIGGER|LARGE OBJECT|PUBLICATION|SUBSCRIPTION)(.*) OWNER TO .
 		like   => {},    # use more-specific options above
 		unlike => {
 			column_inserts           => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			section_data             => 1, }, },
@@ -1469,6 +1488,7 @@ qr/^ALTER (?!EVENT TRIGGER|LARGE OBJECT|PUBLICATION|SUBSCRIPTION)(.*) OWNER TO .
 			no_blobs                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -1505,6 +1525,7 @@ qr/^ALTER (?!EVENT TRIGGER|LARGE OBJECT|PUBLICATION|SUBSCRIPTION)(.*) OWNER TO .
 			no_blobs                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -1843,6 +1864,7 @@ qr/^COMMENT ON CONVERSION test_conversion IS 'comment on test conversion';/m,
 			no_blobs                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -1878,6 +1900,7 @@ qr/^COMMENT ON CONVERSION test_conversion IS 'comment on test conversion';/m,
 			data_only                => 1,
 			only_dump_test_table     => 1,
 			only_dump_test_schema    => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -1913,6 +1936,7 @@ qr/^COMMENT ON CONVERSION test_conversion IS 'comment on test conversion';/m,
 			data_only                => 1,
 			only_dump_test_table     => 1,
 			only_dump_test_schema    => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -2189,6 +2213,7 @@ qr/^COMMENT ON TEXT SEARCH TEMPLATE alt_ts_temp1 IS 'comment on text search temp
 		regexp   => qr/^COMMENT ON /m,
 		like     => {},                  # use more-specific options above
 		unlike   => {
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			section_data             => 1, }, },
@@ -2461,6 +2486,7 @@ qr/^COMMENT ON TEXT SEARCH TEMPLATE alt_ts_temp1 IS 'comment on text search temp
 		unlike   => {
 			binary_upgrade           => 1,
 			column_inserts           => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			schema_only              => 1,
@@ -2541,6 +2567,7 @@ qr/^\QINSERT INTO test_table_identity (col1, col2) OVERRIDING SYSTEM VALUE VALUE
 			section_data             => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			role                     => 1,
 			schema_only              => 1,
 			section_pre_data         => 1,
@@ -2574,6 +2601,7 @@ qr/^\QINSERT INTO test_table_identity (col1, col2) OVERRIDING SYSTEM VALUE VALUE
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			role                     => 1,
 			schema_only              => 1,
 			section_pre_data         => 1,
@@ -2702,6 +2730,7 @@ qr/CREATE CAST \(timestamp with time zone AS interval\) WITH FUNCTION pg_catalog
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_dbprivs       => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
@@ -2737,6 +2766,7 @@ qr/CREATE CAST \(timestamp with time zone AS interval\) WITH FUNCTION pg_catalog
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -2770,6 +2800,7 @@ qr/CREATE CAST \(timestamp with time zone AS interval\) WITH FUNCTION pg_catalog
 			no_privs                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_dbprivs       => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
@@ -2822,6 +2853,7 @@ qr/CREATE CAST \(timestamp with time zone AS interval\) WITH FUNCTION pg_catalog
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -2857,6 +2889,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -2900,6 +2933,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -2941,6 +2975,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -2981,6 +3016,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3021,6 +3057,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3057,6 +3094,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3109,6 +3147,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3147,6 +3186,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3188,6 +3228,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			exclude_test_table       => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3227,6 +3268,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3258,6 +3300,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			no_privs                 => 1,
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
+			parallel_discard         => 1,
 			pg_dumpall_dbprivs       => 1,
 			schema_only              => 1,
 			section_pre_data         => 1,
@@ -3304,6 +3347,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3337,6 +3381,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3373,6 +3418,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3462,6 +3508,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3498,6 +3545,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3538,6 +3586,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3575,6 +3624,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3614,6 +3664,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3653,6 +3704,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3691,6 +3743,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3739,6 +3792,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3780,6 +3834,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3813,6 +3868,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3845,6 +3901,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3878,6 +3935,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3922,6 +3980,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3956,6 +4015,7 @@ qr/^\QCREATE DEFAULT CONVERSION test_conversion FOR 'LATIN1' TO 'UTF8' FROM iso8
 			data_only                => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -3991,6 +4051,7 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog
 			data_only                => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -4524,6 +4585,7 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog
 		unlike => {
 			column_inserts           => 1,
 			data_only                => 1,
+			parallel_discard         => 1,
 			role                     => 1,
 			section_data             => 1,
 			section_pre_data         => 1,
@@ -4559,6 +4621,7 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog
 			exclude_dump_test_schema => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			test_schema_plus_blobs   => 1, }, },
@@ -5379,9 +5442,10 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog
 		regexp   => qr/^CREATE /m,
 		like     => {},              # use more-specific options above
 		unlike   => {
-			column_inserts => 1,
-			data_only      => 1,
-			section_data   => 1, }, },
+			column_inserts   => 1,
+			data_only        => 1,
+			parallel_discard => 1,
+			section_data     => 1, }, },
 
 	'DROP SCHEMA public (for testing without public schema)' => {
 		all_runs     => 1,
@@ -5392,7 +5456,8 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog
 		like         => {},
 		unlike       => {
 			defaults_no_public       => 1,
-			defaults_no_public_clean => 1, } },
+			defaults_no_public_clean => 1,
+			parallel_discard         => 1, } },
 
 	'DROP SCHEMA public' => {
 		all_runs  => 1,
@@ -5434,6 +5499,7 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog
 		like   => { clean => 1, },
 		unlike => {
 			clean_if_exists          => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals_clean => 1, }, },
 
 	'DROP LANGUAGE pltestlang' => {
@@ -5618,6 +5684,7 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_dbprivs       => 1,
 			pg_dumpall_globals       => 1,
 			role                     => 1,
@@ -6293,6 +6360,7 @@ qr/^GRANT SELECT ON TABLE measurement_y2006m2 TO regress_dump_test_role;/m,
 			section_pre_data         => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
 			section_data             => 1,
@@ -6305,6 +6373,7 @@ qr/^GRANT SELECT ON TABLE measurement_y2006m2 TO regress_dump_test_role;/m,
 		like     => {},             # use more-specific options above
 		unlike   => {
 			no_privs                 => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals_clean => 1,
 			section_data             => 1,
 			section_post_data        => 1, }, },
@@ -6333,6 +6402,7 @@ qr/^GRANT SELECT ON TABLE measurement_y2006m2 TO regress_dump_test_role;/m,
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -6368,6 +6438,7 @@ qr/^GRANT SELECT ON TABLE measurement_y2006m2 TO regress_dump_test_role;/m,
 			data_only                => 1,
 			exclude_dump_test_schema => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
 			role                     => 1,
@@ -6397,6 +6468,7 @@ qr/^GRANT SELECT ON TABLE measurement_y2006m2 TO regress_dump_test_role;/m,
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_dbprivs       => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
@@ -6430,6 +6502,7 @@ qr/^GRANT SELECT ON TABLE measurement_y2006m2 TO regress_dump_test_role;/m,
 			no_owner                 => 1,
 			only_dump_test_schema    => 1,
 			only_dump_test_table     => 1,
+			parallel_discard         => 1,
 			pg_dumpall_dbprivs       => 1,
 			pg_dumpall_globals       => 1,
 			pg_dumpall_globals_clean => 1,
@@ -6609,6 +6682,7 @@ qr/^GRANT SELECT ON TABLE measurement_y2006m2 TO regress_dump_test_role;/m,
 			column_inserts     => 1,
 			data_only          => 1,
 			no_privs           => 1,
+			parallel_discard   => 1,
 			pg_dumpall_globals => 1, }, },);
 
 #########################################
@@ -6845,7 +6919,12 @@ foreach my $run (sort keys %pgdump_runs)
 		$test_key = $pgdump_runs{$run}->{test_key};
 	}
 
-	my $output_file = slurp_file("$tempdir/${run}.sql");
+	my $output_file;
+
+	if (-f "$tempdir/${run}.sql")
+	{
+		$output_file = slurp_file("$tempdir/${run}.sql");
+	}
 
 	#########################################
 	# Run all tests where this run is included
-- 
2.14.2

#7 Andres Freund
andres@anarazel.de
In reply to: Michael Banck (#6)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi,

On 2018-02-28 14:28:41 +0100, Michael Banck wrote:

I have added this patch to the next commitfest:

https://commitfest.postgresql.org/17/1576/

Given this is a new patch, submitted for the last commitfest, and not
completely trivial, I'd argue this is too late for v11.

Does anybody disagree?

- Andres

#8 Michael Banck
michael.banck@credativ.de
In reply to: Andres Freund (#7)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi,

On Thursday, 2018-03-01 at 01:28 -0800, Andres Freund wrote:

Hi,

On 2018-02-28 14:28:41 +0100, Michael Banck wrote:

I have added this patch to the next commitfest:

https://commitfest.postgresql.org/17/1576/

Given this is a new patch, submitted for the last commitfest, and not
completely trivial, I'd argue this is too late for v11.

I was under the impression that rule was rather about complicated and
invasive patches, not just any patch that isn't completely trivial.

I agree that the patch is not completely trivial, but it is rather
small (its diffstat is "3 files changed, 38 insertions(+), 9
deletions(-)"), and it certainly is not highly invasive: it doesn't
touch code all over the place, but really only a few places in
pg_backup_directory.c.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

#9 Andres Freund
andres@anarazel.de
In reply to: Michael Banck (#8)
Re: [PoC PATCH] Parallel dump to /dev/null

On 2018-03-01 14:21:24 +0100, Michael Banck wrote:

Hi,

On Thursday, 2018-03-01 at 01:28 -0800, Andres Freund wrote:

Hi,

On 2018-02-28 14:28:41 +0100, Michael Banck wrote:

I have added this patch to the next commitfest:

https://commitfest.postgresql.org/17/1576/

Given this is a new patch, submitted for the last commitfest, and not
completely trivial, I'd argue this is too late for v11.

I was under the impression that rule was rather about complicated and
invasive patches, not just any patch that isn't completely trivial.

I think you're right that this patch is a bit borderline for that. But we have
~220 pending CF entries, and fairly limited review and commit
capacity. Something's gotta give. A lot of those are patches that have
been waiting to be committed for a while. So things added first to the
last CF the day before it starts are prime candidates.

You probably can increase your chances by herding a few patches towards
being committable.

Greetings,

Andres Freund

#10 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Michael Banck (#1)
Re: [PoC PATCH] Parallel dump to /dev/null

strdup -> pg_strdup()

I wonder if, instead of doing strcmp() all over the place, we should
give this behavior a separate boolean flag in lclContext.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11 Michael Banck
michael.banck@credativ.de
In reply to: Alvaro Herrera (#10)
1 attachment(s)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi Alvaro,

On Thu, Mar 01, 2018 at 04:00:52PM -0300, Alvaro Herrera wrote:

strdup -> pg_strdup()

Fixed.

I wonder if, instead of doing strcmp() all over the place, we should
give this behavior a separate boolean flag in lclContext.

I mused a bit about what to name that flag and came up with `discard',
but maybe somebody has a better idea?

In any case, new patch attached which does this.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachments:

parallel-pgdump-dev-null-V2.patch (text/x-diff)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index a96da15dc1..8657ac30e6 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -512,7 +512,10 @@ cfopen_write(const char *path, const char *mode, int compression)
 #ifdef HAVE_LIBZ
 		char	   *fname;
 
-		fname = psprintf("%s.gz", path);
+		if (strcmp(path, NULL_DEVICE) != 0)
+			fname = psprintf("%s.gz", path);
+		else
+			fname = pg_strdup(path);
 		fp = cfopen(fname, mode, compression);
 		free_keep_errno(fname);
 #else
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 8dd1915998..ab17adbc62 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -62,6 +62,13 @@ typedef struct _z_stream
 typedef z_stream *z_streamp;
 #endif
 
+/* Null device */
+#ifdef WIN32
+#define NULL_DEVICE "NUL"
+#else
+#define NULL_DEVICE "/dev/null"
+#endif
+
 /* Data block types */
 #define BLK_DATA 1
 #define BLK_BLOBS 3
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 4aabb40f59..5936a799c7 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -54,6 +54,7 @@ typedef struct
 
 	cfp		   *blobsTocFH;		/* file handle for blobs.toc */
 	ParallelState *pstate;		/* for parallel backup / restore */
+	int			discard;		/* target is NULL_DEVICE */
 } lclContext;
 
 typedef struct
@@ -158,6 +159,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
 
 	ctx->directory = AH->fSpec;
 
+	if (strcmp(ctx->directory, NULL_DEVICE) == 0)
+		ctx->discard = 1;
+	else
+		ctx->discard = 0;
+
 	if (AH->mode == archModeWrite)
 	{
 		struct stat st;
@@ -192,9 +198,15 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
 			}
 		}
 
-		if (!is_empty && mkdir(ctx->directory, 0700) < 0)
-			exit_horribly(modulename, "could not create directory \"%s\": %s\n",
-						  ctx->directory, strerror(errno));
+		/*
+		 * Create the output directory, unless it exists already and is empty.
+		 * If we are discarding output to the null device, we must not
+		 * create it.
+		 */
+		if (!is_empty && !ctx->discard)
+			if (mkdir(ctx->directory, 0711) < 0)
+				exit_horribly(modulename, "could not create directory \"%s\": %s\n",
+							  ctx->directory, strerror(errno));
 	}
 	else
 	{							/* Read Mode */
@@ -602,8 +614,10 @@ _CloseArchive(ArchiveHandle *AH)
 		/*
 		 * In directory mode, there is no need to sync all the entries
 		 * individually. Just recurse once through all the files generated.
+		 * If we are discarding output to the null device, it does not
+		 * need syncing and would emit a warning if we did.
 		 */
-		if (AH->dosync)
+		if (AH->dosync && !ctx->discard)
 			fsync_dir_recurse(ctx->directory, progname);
 	}
 	AH->FH = NULL;
@@ -659,7 +673,14 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
 	lclContext *ctx = (lclContext *) AH->formatData;
 	char		fname[MAXPGPATH];
 
-	snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
+	/*
+	 * If we are discarding output to the null device, just use that as
+	 * fname.
+	 */
+	if (ctx->discard)
+		snprintf(fname, MAXPGPATH, NULL_DEVICE);
+	else
+		snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
 
 	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
 
@@ -721,9 +742,18 @@ setFilePath(ArchiveHandle *AH, char *buf, const char *relativeFilename)
 	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
 		exit_horribly(modulename, "file name too long: \"%s\"\n", dname);
 
-	strcpy(buf, dname);
-	strcat(buf, "/");
-	strcat(buf, relativeFilename);
+	/*
+	 * If we are discarding output to the null device, we cannot set a file
+	 * path and just set the buffer to the null device.
+	 */
+	if (ctx->discard)
+	{
+		strcpy(buf, NULL_DEVICE);
+	} else {
+		strcpy(buf, dname);
+		strcat(buf, "/");
+		strcat(buf, relativeFilename);
+	}
 }
 
 /*
#12 Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Michael Banck (#11)
Re: [PoC PATCH] Parallel dump to /dev/null

On Mon, Mar 5, 2018 at 1:49 PM, Michael Banck <michael.banck@credativ.de>
wrote:

In any case, new patch attached which does this.

The null device is already defined in port.h, as DEVNULL. No need to
redefine it as NULL_DEVICE.
Alternatively, paths.h defines _PATH_DEVNULL.

Regards,

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

#13 Michael Banck
michael.banck@credativ.de
In reply to: Andreas 'ads' Scherbaum (#12)
1 attachment(s)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi Andreas,

On Mon, Mar 05, 2018 at 03:15:19PM +0100, Andreas 'ads' Scherbaum wrote:

The null device is already defined in port.h, as DEVNULL. No need to
redefine it as NULL_DEVICE.

Thanks for pointing that out, a new patch removing NULL_DEVICE is
attached.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachments:

parallel-pgdump-dev-null-V3.patch (text/x-diff)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index a96da15dc1..0953290069 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -512,7 +512,10 @@ cfopen_write(const char *path, const char *mode, int compression)
 #ifdef HAVE_LIBZ
 		char	   *fname;
 
-		fname = psprintf("%s.gz", path);
+		if (strcmp(path, DEVNULL) != 0)
+			fname = psprintf("%s.gz", path);
+		else
+			fname = pg_strdup(path);
 		fp = cfopen(fname, mode, compression);
 		free_keep_errno(fname);
 #else
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 4aabb40f59..6aee04dad3 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -54,6 +54,7 @@ typedef struct
 
 	cfp		   *blobsTocFH;		/* file handle for blobs.toc */
 	ParallelState *pstate;		/* for parallel backup / restore */
+	int			discard;		/* target is DEVNULL */
 } lclContext;
 
 typedef struct
@@ -158,6 +159,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
 
 	ctx->directory = AH->fSpec;
 
+	if (strcmp(ctx->directory, DEVNULL) == 0)
+		ctx->discard = 1;
+	else
+		ctx->discard = 0;
+
 	if (AH->mode == archModeWrite)
 	{
 		struct stat st;
@@ -192,9 +198,15 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
 			}
 		}
 
-		if (!is_empty && mkdir(ctx->directory, 0700) < 0)
-			exit_horribly(modulename, "could not create directory \"%s\": %s\n",
-						  ctx->directory, strerror(errno));
+		/*
+		 * Create the output directory, unless it exists already and is empty.
+		 * If we are discarding output to the null device, we must not
+		 * create it.
+		 */
+		if (!is_empty && !ctx->discard)
+			if (mkdir(ctx->directory, 0711) < 0)
+				exit_horribly(modulename, "could not create directory \"%s\": %s\n",
+							  ctx->directory, strerror(errno));
 	}
 	else
 	{							/* Read Mode */
@@ -602,8 +614,10 @@ _CloseArchive(ArchiveHandle *AH)
 		/*
 		 * In directory mode, there is no need to sync all the entries
 		 * individually. Just recurse once through all the files generated.
+		 * If we are discarding output to the null device, it does not
+		 * need syncing and would emit a warning if we did.
 		 */
-		if (AH->dosync)
+		if (AH->dosync && !ctx->discard)
 			fsync_dir_recurse(ctx->directory, progname);
 	}
 	AH->FH = NULL;
@@ -659,7 +673,14 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
 	lclContext *ctx = (lclContext *) AH->formatData;
 	char		fname[MAXPGPATH];
 
-	snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
+	/*
+	 * If we are discarding output to the null device, just use that as
+	 * fname.
+	 */
+	if (ctx->discard)
+		snprintf(fname, MAXPGPATH, DEVNULL);
+	else
+		snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
 
 	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
 
@@ -721,9 +742,18 @@ setFilePath(ArchiveHandle *AH, char *buf, const char *relativeFilename)
 	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
 		exit_horribly(modulename, "file name too long: \"%s\"\n", dname);
 
-	strcpy(buf, dname);
-	strcat(buf, "/");
-	strcat(buf, relativeFilename);
+	/*
+	 * If we are discarding output to the null device, we cannot set a file
+	 * path and just set the buffer to the null device.
+	 */
+	if (ctx->discard)
+	{
+		strcpy(buf, DEVNULL);
+	} else {
+		strcpy(buf, dname);
+		strcat(buf, "/");
+		strcat(buf, relativeFilename);
+	}
 }
 
 /*
#14 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Michael Banck (#13)
Re: [PoC PATCH] Parallel dump to /dev/null

Please make ctx->discard a bool, not an int.

You can initialize it like this:
ctx->discard = strcmp(ctx->directory, DEVNULL) == 0;

Why do you change the mode used to create the directory from 0700 to 0711?

You have a cuddled "else" in the last hunk. Please uncuddle.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15 Michael Banck
michael.banck@credativ.de
In reply to: Alvaro Herrera (#14)
1 attachment(s)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi Alvaro,

On Monday, 2018-03-05 at 12:48 -0300, Alvaro Herrera wrote:

Please make ctx->discard a bool, not an int.

Ok, done so now. I forgot to mention that in the earlier resubmission,
but I had a look at the other pg_backup_*.c files and isSpecialScript
and hasSeek were ints as well, so I opted for that.

You can initialize it like this:
ctx->discard = strcmp(ctx->directory, DEVNULL) == 0;

Thanks, changed it thusly.

Why do you change the mode to create directory to 0711 from 0700?

Oh wow, that was either a mistype in vim or a very old test hack, thanks
for spotting it.

You have a cuddled "else" in the last hunk. Please uncuddle.

Done so now, hopefully.

Thanks again for the review, a new version is attached.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachments:

parallel-pgdump-dev-null-V4.patch (text/x-patch)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index a96da15dc1..0953290069 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -512,7 +512,10 @@ cfopen_write(const char *path, const char *mode, int compression)
 #ifdef HAVE_LIBZ
 		char	   *fname;
 
-		fname = psprintf("%s.gz", path);
+		if (strcmp(path, DEVNULL) != 0)
+			fname = psprintf("%s.gz", path);
+		else
+			fname = pg_strdup(path);
 		fp = cfopen(fname, mode, compression);
 		free_keep_errno(fname);
 #else
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 4aabb40f59..3b77023d77 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -54,6 +54,7 @@ typedef struct
 
 	cfp		   *blobsTocFH;		/* file handle for blobs.toc */
 	ParallelState *pstate;		/* for parallel backup / restore */
+	bool		discard;		/* target is DEVNULL */
 } lclContext;
 
 typedef struct
@@ -158,6 +159,8 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
 
 	ctx->directory = AH->fSpec;
 
+	ctx->discard = strcmp(ctx->directory, DEVNULL) == 0;
+
 	if (AH->mode == archModeWrite)
 	{
 		struct stat st;
@@ -192,9 +195,15 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
 			}
 		}
 
-		if (!is_empty && mkdir(ctx->directory, 0700) < 0)
-			exit_horribly(modulename, "could not create directory \"%s\": %s\n",
-						  ctx->directory, strerror(errno));
+		/*
+		 * Create the output directory, unless it exists already and is empty.
+		 * If we are discarding output to the null device, we must not
+		 * create it.
+		 */
+		if (!is_empty && !ctx->discard)
+			if (mkdir(ctx->directory, 0700) < 0)
+				exit_horribly(modulename, "could not create directory \"%s\": %s\n",
+							  ctx->directory, strerror(errno));
 	}
 	else
 	{							/* Read Mode */
@@ -602,8 +611,10 @@ _CloseArchive(ArchiveHandle *AH)
 		/*
 		 * In directory mode, there is no need to sync all the entries
 		 * individually. Just recurse once through all the files generated.
+		 * If we are discarding output to the null device, it does not
+		 * need syncing and would emit a warning if we did.
 		 */
-		if (AH->dosync)
+		if (AH->dosync && !ctx->discard)
 			fsync_dir_recurse(ctx->directory, progname);
 	}
 	AH->FH = NULL;
@@ -659,7 +670,14 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
 	lclContext *ctx = (lclContext *) AH->formatData;
 	char		fname[MAXPGPATH];
 
-	snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
+	/*
+	 * If we are discarding output to the null device, just use that as
+	 * fname.
+	 */
+	if (ctx->discard)
+		snprintf(fname, MAXPGPATH, DEVNULL);
+	else
+		snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
 
 	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
 
@@ -721,9 +739,18 @@ setFilePath(ArchiveHandle *AH, char *buf, const char *relativeFilename)
 	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
 		exit_horribly(modulename, "file name too long: \"%s\"\n", dname);
 
-	strcpy(buf, dname);
-	strcat(buf, "/");
-	strcat(buf, relativeFilename);
+	/*
+	 * If we are discarding output to the null device, we cannot set a file
+	 * path and just set the buffer to the null device.
+	 */
+	if (ctx->discard)
+		strcpy(buf, DEVNULL);
+	else
+	{
+		strcpy(buf, dname);
+		strcat(buf, "/");
+		strcat(buf, relativeFilename);
+	}
 }
 
 /*
#16 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Michael Banck (#15)
1 attachment(s)
Re: [PoC PATCH] Parallel dump to /dev/null

I made a few amendments (here's v5) and was ready to push when I
noticed that _StartBlobs() does not seem to be doing the right thing.
Did you test this with a few large objects?

The reason I noticed is that I wondered about the number of open()
calls (plus zlib function calls) we would save by keeping an FD open to
/dev/null rather than opening it over and over for each object, i.e.
maybe not touching setFilePath() at all, if possible. That looks
perhaps too invasive, so maybe not. But do audit other call sites that
may open files.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v5-0001-Allow-parallel-dump-to-go-to-dev-null.patch (text/plain)
From 5627c18f5a26ad8314f0590ce8ccf3b4093efb74 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Mon, 5 Mar 2018 13:42:29 -0300
Subject: [PATCH v5] Allow parallel dump to go to /dev/null

This is useful to quickly determine whether a dump would finish
without errors.

Author: Michael Banck
Discussion: https://postgr.es/m/20180201132446.GA13784@nighthawk.caipicrew.dd-dns.de
---
 src/bin/pg_dump/compress_io.c         |  8 +++++--
 src/bin/pg_dump/pg_backup_directory.c | 43 ++++++++++++++++++++++++++++-------
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index a96da15dc1..a83b1e2ef5 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -496,7 +496,8 @@ cfopen_read(const char *path, const char *mode)
  *
  * If 'compression' is non-zero, a gzip compressed stream is opened, and
  * 'compression' indicates the compression level used. The ".gz" suffix
- * is automatically added to 'path' in that case.
+ * is automatically added to 'path' in that case (unless it's the null
+ * device).
  *
  * On failure, return NULL with an error code in errno.
  */
@@ -512,7 +513,10 @@ cfopen_write(const char *path, const char *mode, int compression)
 #ifdef HAVE_LIBZ
 		char	   *fname;
 
-		fname = psprintf("%s.gz", path);
+		if (strcmp(path, DEVNULL) != 0)
+			fname = psprintf("%s.gz", path);
+		else
+			fname = pg_strdup(path);
 		fp = cfopen(fname, mode, compression);
 		free_keep_errno(fname);
 #else
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 4aabb40f59..3b77023d77 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -54,6 +54,7 @@ typedef struct
 
 	cfp		   *blobsTocFH;		/* file handle for blobs.toc */
 	ParallelState *pstate;		/* for parallel backup / restore */
+	bool		discard;		/* target is DEVNULL */
 } lclContext;
 
 typedef struct
@@ -158,6 +159,8 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
 
 	ctx->directory = AH->fSpec;
 
+	ctx->discard = strcmp(ctx->directory, DEVNULL) == 0;
+
 	if (AH->mode == archModeWrite)
 	{
 		struct stat st;
@@ -192,9 +195,15 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
 			}
 		}
 
-		if (!is_empty && mkdir(ctx->directory, 0700) < 0)
-			exit_horribly(modulename, "could not create directory \"%s\": %s\n",
-						  ctx->directory, strerror(errno));
+		/*
+		 * Create the output directory, unless it exists already and is empty.
+		 * If we are discarding output to the null device, we must not
+		 * create it.
+		 */
+		if (!is_empty && !ctx->discard)
+			if (mkdir(ctx->directory, 0700) < 0)
+				exit_horribly(modulename, "could not create directory \"%s\": %s\n",
+							  ctx->directory, strerror(errno));
 	}
 	else
 	{							/* Read Mode */
@@ -602,8 +611,10 @@ _CloseArchive(ArchiveHandle *AH)
 		/*
 		 * In directory mode, there is no need to sync all the entries
 		 * individually. Just recurse once through all the files generated.
+		 * If we are discarding output to the null device, it does not
+		 * need syncing and would emit a warning if we did.
 		 */
-		if (AH->dosync)
+		if (AH->dosync && !ctx->discard)
 			fsync_dir_recurse(ctx->directory, progname);
 	}
 	AH->FH = NULL;
@@ -659,7 +670,14 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
 	lclContext *ctx = (lclContext *) AH->formatData;
 	char		fname[MAXPGPATH];
 
-	snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
+	/*
+	 * If we are discarding output to the null device, just use that as
+	 * fname.
+	 */
+	if (ctx->discard)
+		snprintf(fname, MAXPGPATH, DEVNULL);
+	else
+		snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
 
 	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
 
@@ -721,9 +739,18 @@ setFilePath(ArchiveHandle *AH, char *buf, const char *relativeFilename)
 	if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
 		exit_horribly(modulename, "file name too long: \"%s\"\n", dname);
 
-	strcpy(buf, dname);
-	strcat(buf, "/");
-	strcat(buf, relativeFilename);
+	/*
+	 * If we are discarding output to the null device, there is no file
+	 * path to construct; just set the buffer to the null device.
+	 */
+	if (ctx->discard)
+		strcpy(buf, DEVNULL);
+	else
+	{
+		strcpy(buf, dname);
+		strcat(buf, "/");
+		strcat(buf, relativeFilename);
+	}
 }
 
 /*
-- 
2.11.0

#17Michael Banck
michael.banck@credativ.de
In reply to: Alvaro Herrera (#16)
1 attachment(s)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi Alvaro,

On Monday, 2018-03-05 at 13:56 -0300, Alvaro Herrera wrote:

I made a few amendments (here's v5) and was ready to push, when I
noticed that _StartBlobs() does not seem to be doing the right thing.
Did you test this with a few large objects?

I did test it on the regression database, which leaves a few large
objects behind:

mba@fock:~$ psql -h /tmp -p 65432 -d regression -c '\dl'
                Large objects
  ID   | Owner |         Description          
-------+-------+------------------------------
  3001 | mba   | testing comments
 36867 | mba   | I Wandered Lonely as a Cloud
 47229 | mba   | 
(3 rows)

What exactly is wrong with _StartBlobs()? It calls setFilePath() which
makes sure /dev/null is opened and not something else.

The reason I noticed is that I wondered about the number of open() calls
(plus zlib function calls) we would save by keeping an FD open to
/dev/null, rather than opening it over and over for each object -- i.e.,
maybe not touch setFilePath() at all, if possible. That looks perhaps
too invasive, so maybe not. But do audit other callsites that may open
files.

I see your point; I've hacked together the attached additional PoC
patch, which keeps the dataFH open. The parallel case was tricky: I had
to add an additional flag to lclContext so that DEVNULL is opened only
once for data files, because I could not find a place to set it up for
parallel workers, and it crashed otherwise.

This cuts down the number of /dev/null opens from 352 to 6 for a -j 2
dump of the regression database to /dev/null.

In my opinion, this is too involved and possibly too error-prone for
what is just an optimization, so I'd drop it for v11 for now and maybe
revisit it later.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachments:

parallel-pgdump-keep-dev-null-open.patch (text/x-patch; charset=UTF-8)
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 3b77023d77..4ffa6aaef2 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -55,6 +55,7 @@ typedef struct
 	cfp		   *blobsTocFH;		/* file handle for blobs.toc */
 	ParallelState *pstate;		/* for parallel backup / restore */
 	bool		discard;		/* target is DEVNULL */
+	bool		devnull_open;		/* whether DEVNULL has been opened as dataFH */
 } lclContext;
 
 typedef struct
@@ -340,10 +341,14 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
 
 	setFilePath(AH, fname, tctx->filename);
 
-	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
-	if (ctx->dataFH == NULL)
-		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
-					  fname, strerror(errno));
+	/* Open the file, unless we are discarding output to the null device */
+	if (!ctx->discard)
+	{
+		ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+		if (ctx->dataFH == NULL)
+			exit_horribly(modulename, "could not open output file \"%s\": %s\n",
+						  fname, strerror(errno));
+	}
 }
 
 /*
@@ -379,10 +384,13 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
 {
 	lclContext *ctx = (lclContext *) AH->formatData;
 
-	/* Close the file */
-	cfclose(ctx->dataFH);
+	/* Close the file, unless we are discarding output to the null device */
+	if (!ctx->discard)
+	{
+		cfclose(ctx->dataFH);
 
-	ctx->dataFH = NULL;
+		ctx->dataFH = NULL;
+	}
 }
 
 /*
@@ -604,6 +612,14 @@ _CloseArchive(ArchiveHandle *AH)
 		if (cfclose(tocFH) != 0)
 			exit_horribly(modulename, "could not close TOC file: %s\n",
 						  strerror(errno));
+
+		/*
+	 * Open the DEVNULL special file for writing; we only need to open
+	 * it once.
+		 */
+		if (ctx->discard)
+			ctx->dataFH = cfopen_write(DEVNULL, PG_BINARY_W, 0);
+
 		WriteDataChunks(AH, ctx->pstate);
 
 		ParallelBackupEnd(AH, ctx->pstate);
@@ -657,6 +673,20 @@ _StartBlobs(ArchiveHandle *AH, TocEntry *te)
 	if (ctx->blobsTocFH == NULL)
 		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
 					  fname, strerror(errno));
+
+	/*
+	 * If we are discarding output to the null device, open the dataFH here so
+	 * we do not have to open and close it for each blob.
+	 */
+	if (ctx->discard)
+	{
+		snprintf(fname, MAXPGPATH, DEVNULL);
+		ctx->dataFH = cfopen_write(fname, PG_BINARY_W, 0);
+
+		if (ctx->dataFH == NULL)
+			exit_horribly(modulename, "could not open output file \"%s\": %s\n",
+						  fname, strerror(errno));
+	}
 }
 
 /*
@@ -670,20 +700,16 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
 	lclContext *ctx = (lclContext *) AH->formatData;
 	char		fname[MAXPGPATH];
 
-	/*
-	 * If we are discarding output to the null device, just use that as
-	 * fname.
-	 */
-	if (ctx->discard)
-		snprintf(fname, MAXPGPATH, DEVNULL);
-	else
+	if (!ctx->discard)
+	{
 		snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
 
-	ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+		ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
 
-	if (ctx->dataFH == NULL)
-		exit_horribly(modulename, "could not open output file \"%s\": %s\n",
-					  fname, strerror(errno));
+		if (ctx->dataFH == NULL)
+			exit_horribly(modulename, "could not open output file \"%s\": %s\n",
+						  fname, strerror(errno));
+	}
 }
 
 /*
@@ -698,9 +724,12 @@ _EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
 	char		buf[50];
 	int			len;
 
-	/* Close the BLOB data file itself */
-	cfclose(ctx->dataFH);
-	ctx->dataFH = NULL;
+	if (!ctx->discard)
+	{
+		/* Close the BLOB data file itself */
+		cfclose(ctx->dataFH);
+		ctx->dataFH = NULL;
+	}
 
 	/* register the blob in blobs.toc */
 	len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
@@ -720,6 +749,12 @@ _EndBlobs(ArchiveHandle *AH, TocEntry *te)
 
 	cfclose(ctx->blobsTocFH);
 	ctx->blobsTocFH = NULL;
+
+	if (ctx->discard)
+	{
+		cfclose(ctx->dataFH);
+		ctx->dataFH = NULL;
+	}
 }
 
 /*
@@ -793,6 +828,14 @@ _DeClone(ArchiveHandle *AH)
 static int
 _WorkerJobDumpDirectory(ArchiveHandle *AH, TocEntry *te)
 {
+	lclContext *ctx = (lclContext *) AH->formatData;
+
+	if (ctx->discard && !ctx->devnull_open)
+	{
+		ctx->dataFH = cfopen_write(DEVNULL, PG_BINARY_W, 0);
+		ctx->devnull_open = true;
+	}
+
 	/*
 	 * This function returns void. We either fail and die horribly or
 	 * succeed... A failure will be detected by the parent when the child dies
#18Christoph Berg
christoph.berg@credativ.de
In reply to: Alvaro Herrera (#16)
Re: [PoC PATCH] Parallel dump to /dev/null

Re: Alvaro Herrera 2018-03-05 <20180305165609.kq5y7uzy64o45vsg@alvherre.pgsql>

The reason I noticed is that I wondered about the number of open() calls
(plus zlib function calls) we would save by keeping an FD open to
/dev/null, rather than opening it over and over for each object -- i.e.,
maybe not touch setFilePath() at all, if possible. That looks perhaps
too invasive, so maybe not. But do audit other callsites that may open
files.

In normal operation, the files are also opened individually, so
special-casing /dev/null seems overly specific to me (and I'd be very
surprised if it made any difference).

For the feature to be useful in practice, it needs documentation.
People won't expect -Fd -f /dev/null to work because /dev/null is not
a directory.

Otherwise, +1 from me.
Christoph

#19Michael Paquier
michael@paquier.xyz
In reply to: Christoph Berg (#18)
Re: [PoC PATCH] Parallel dump to /dev/null

On Tue, Mar 20, 2018 at 09:54:07AM +0100, Christoph Berg wrote:

Otherwise, +1 from me.

I have been thinking about this patch lately, and on the contrary I am
voting -1 on the concepts behind this patch. pg_dump is by nature a
tool aimed at fetching data from the database, putting it into a shape
wanted by the user, and optionally saving the data and making it
durable (the last part only recently).
It is not a corruption detection tool. Even if it were a tool to detect
corruption, it is doing it wrong in two ways:
1) It scans tables using only sequential scans, so it basically never
checks any AM other than heap.
2) It detects only one failure at a time and stops. Hence, in order to
detect all corruption, one needs to run pg_dump, repair or zero the
pages, and then rinse and repeat until a successful run is achieved.
This is a costly process, particularly on large relations, where a run
of pg_dump can take minutes; the more pages, the more time it takes to
do the whole cleanup before being able to save as much data as possible.

Now, why are people using pg_dump > /dev/null? Mainly the lack of
better tools, which would actually be able to detect corrupted pages
in one run, and not only heap pages. I honestly think that
amcheck is something we should focus on more and has more potential
on the matter, and that we are just complicating pg_dump to do something
it is not designed for, and would do it badly anyway.
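
For instance, a one-pass check of every btree index with amcheck could
look roughly like the sketch below (illustrative and untested; it would
need to run with sufficient privileges, and the catalog query is only a
starting point):

-- Rough sketch: verify all valid btree indexes with amcheck,
-- a contrib module shipped since PostgreSQL 10.
CREATE EXTENSION IF NOT EXISTS amcheck;

SELECT c.relname, bt_index_check(c.oid)
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am a ON a.oid = c.relam
WHERE a.amname = 'btree' AND i.indisvalid;
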
--
Michael

#20Christoph Berg
christoph.berg@credativ.de
In reply to: Michael Paquier (#19)
Re: [PoC PATCH] Parallel dump to /dev/null

Re: Michael Paquier 2018-03-20 <20180320135720.GE13368@paquier.xyz>

Now, why are people using pg_dump > /dev/null? Mainly the lack of
better tools, which would actually be able to detect corrupted pages
in one run, and not only heap pages. I honestly think that
amcheck is something we should focus on more and has more potential
on the matter, and that we are just complicating pg_dump to do something
it is not designed for, and would do it badly anyway.

Still, people are using it for that purpose now, and speeding it up
would just be a non-intrusive patch.

Also, if pg_dump is a backup tool, why does that mean it must not be
used for anything else?

Kind regards,
Christoph Berg
--
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christoph Berg (#20)
Re: [PoC PATCH] Parallel dump to /dev/null

Christoph Berg <christoph.berg@credativ.de> writes:

Re: Michael Paquier 2018-03-20 <20180320135720.GE13368@paquier.xyz>

Now, why are people using pg_dump > /dev/null? Mainly the lack of
better tools, which would actually be able to detect corrupted pages
in one run, and not only heap pages. I honestly think that
amcheck is something we should focus on more and has more potential
on the matter, and that we are just complicating pg_dump to do something
it is not designed for, and would do it badly anyway.

FWIW, I've been wondering the same thing about this patch. I hadn't
bothered to actually read the thread up to now, but the title was
enough to make me think "do we really need that?"

Still, people are using it for that purpose now, and speeding it up
would just be a non-intrusive patch.

If it were non-intrusive, that would be OK, but after a quick look
at the latest patch I can't say I find it so. It seems to add a bunch
of poorly-defined additional states to each pg_dump format module.

It might help if the patch were less enthusiastic about trying to
"optimize" by avoiding extra file opens/closes in the case of output
to /dev/null. That seems to account for a lot of the additional
complication, and I can't see a reason that it'd be worth it.

regards, tom lane

#22Christoph Berg
christoph.berg@credativ.de
In reply to: Tom Lane (#21)
Re: [PoC PATCH] Parallel dump to /dev/null

Re: Tom Lane 2018-03-20 <12960.1521557852@sss.pgh.pa.us>

It might help if the patch were less enthusiastic about trying to
"optimize" by avoiding extra file opens/closes in the case of output
to /dev/null. That seems to account for a lot of the additional
complication, and I can't see a reason that it'd be worth it.

Note that the last patch was just a PoC to check whether the extra
open/close could be avoided. The "real" patch is the second-to-last one.

Kind regards,
Christoph Berg
--
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE

#23Stephen Frost
sfrost@snowman.net
In reply to: Christoph Berg (#22)
Re: [PoC PATCH] Parallel dump to /dev/null

Greetings,

* Christoph Berg (christoph.berg@credativ.de) wrote:

Re: Tom Lane 2018-03-20 <12960.1521557852@sss.pgh.pa.us>

It might help if the patch were less enthusiastic about trying to
"optimize" by avoiding extra file opens/closes in the case of output
to /dev/null. That seems to account for a lot of the additional
complication, and I can't see a reason that it'd be worth it.

Note that the last patch was just a PoC to check whether the extra
open/close could be avoided. The "real" patch is the second-to-last one.

Even so, I'm really not a fan of this patch either. If we could do this
in a general way, supporting parallel mode with output to stdout or to a
file (which could then happen to be /dev/null), I'd be more interested,
because it's at least reasonable for someone to want that beyond using
pg_dump to (poorly) check for corruption.

As it is, this is an extremely special case which may even end up being
confusing for users (I can run a parallel pg_dump to /dev/null, but not
to a regular file?!).

Instead of trying to use pg_dump for this, we should provide a way to
actually check for corruption across everything (instead of just the
heap..), and have all detected corruption reported in one pass.

One of the things that I really like about PostgreSQL is that we
typically try to provide appropriate tools for the job and avoid just
hacking things to give users a half-solution, which is what this seems
like to me. Let's write a proper tool (perhaps as a background
worker..?) to scan the database (all of it...) which will find and
report corruption anywhere. That'll take another release to do, but
hopefully pushing back on this will encourage that to happen, whereas
allowing this in would actively discourage someone from writing a proper
tool and we would be much worse off for that.

Thanks!

Stephen

#24Michael Banck
michael.banck@credativ.de
In reply to: Stephen Frost (#23)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi,

On Tuesday, 2018-03-20 at 19:19 -0400, Stephen Frost wrote:

* Christoph Berg (christoph.berg@credativ.de) wrote:

Re: Tom Lane 2018-03-20 <12960.1521557852@sss.pgh.pa.us>

It might help if the patch were less enthusiastic about trying to
"optimize" by avoiding extra file opens/closes in the case of output
to /dev/null. That seems to account for a lot of the additional
complication, and I can't see a reason that it'd be worth it.

Note that the last patch was just a PoC to check whether the extra
open/close could be avoided. The "real" patch is the second-to-last one.

Even so, I'm really not a fan of this patch either. If we could do this
in a general way, supporting parallel mode with output to stdout or to a
file (which could then happen to be /dev/null), I'd be more interested,
because it's at least reasonable for someone to want that beyond using
pg_dump to (poorly) check for corruption.

What you are saying is that you want parallel dump for the custom format,
not just the directory format. Sure, I want that too, but that is not what
this patch is about; it does not change the custom format in any way.

But I agree that this would be a useful feature.

As it is, this is an extremely special case which may even end up being
confusing for users (I can run a parallel pg_dump to /dev/null, but not
to a regular file?!).

This patch allows treating the null device as a directory (which it
normally isn't) in the directory format. So it addresses the "I can
'pg_dump -Fc -f /dev/null', but not 'pg_dump -Fd -f /dev/null'?!"
question.

That it allows for a parallel dump to /dev/null is merely a side-effect
of this, so you could just as well name the patch "Support dumping to
/dev/null in directory format as well".

Before I wrote that patch, I honestly didn't know whether 'pg_dump -Fd -f
/dev/null' might not just work, and the error message you get ('could not
create directory "/dev/null": File exists') is a bit meh (but logical
from pg_dump's point of view, of course).

Instead of trying to use pg_dump for this, we should provide a way to
actually check for corruption across everything (instead of just the
heap..), and have all detected corruption reported in one pass.

Well, I agree that we should provide a more general tool as well.

That'll take another release to do, but hopefully pushing back on this
will encourage that to happen, whereas allowing this in would actively
discourage someone from writing a proper tool and we would be much
worse off for that.

To be honest, I don't buy that argument. Having log shipping and
warm standbys did not stop us from eventually implementing proper
streaming replication. I would of course welcome the above tool, but I
probably won't work on that for v12.

I think what will happen is that everybody interested will just continue
to dump to /dev/null sequentially, or use external hacks with RAM disks,
nullfs FUSE drivers, or scripts like
https://github.com/cybertec-postgresql/scripts/blob/master/quick_verify.py .

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

#25Peter Geoghegan
In reply to: Michael Paquier (#19)
Re: [PoC PATCH] Parallel dump to /dev/null

On Tue, Mar 20, 2018 at 6:57 AM, Michael Paquier <michael@paquier.xyz> wrote:

Now, why are people using pg_dump > /dev/null? Mainly the lack of
better tools, which would actually be able to detect corrupted pages
in one run, and not only heap pages. I honestly think that
amcheck is something we should focus on more and has more potential
on the matter, and that we are just complicating pg_dump to do something
it is not designed for, and would do it badly anyway.

+1.

There are a wide variety of things that pg_dump will not check when
used as a smoke-test for heap corruption. Things that could be checked
without doing any more I/O will not be checked. For example, we won't
test for sane xmin/xmax fields across update chains, even though the
additional cost of doing that is pretty modest. Also, we won't make
any attempt to validate NOT NULL constraints. I've noticed that if you
corrupt an ItemId entry in a heap page, that will tend to result in
the row seeming to consist of all-NULLs to expression evaluation.
Typically, the pg_dump trick only detects abject corruption, such as a
totally corrupt page header.
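
As a crude illustration (table and column names are made up), such
damage can surface as impossible NULLs, which a targeted scan can catch
even though a dump would happily write them out:

-- Hypothetical names; any row coming back means a column declared
-- NOT NULL reads as NULL, which only corruption should make possible.
SELECT ctid FROM some_table WHERE some_not_null_column IS NULL;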

amcheck isn't there yet, of course. The heapallindexed enhancement
that's in the current CF will help, though. And, it won't be very hard
to adapt heapallindexed verification to take place in parallel, now
that IndexBuildHeapScan() can handle parallel heap scans.
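
Assuming the pending patch keeps its proposed signature, using it would
presumably be as simple as this (index name made up):

-- Hypothetical: the second argument is the proposed heapallindexed
-- flag, which additionally probes every heap tuple against the index.
SELECT bt_index_check('some_table_pkey', true);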

I do still think that we need an amcheck function that specifically
targets a heap relation (not an index), and performs verification
fairly quickly, so my heapallindexed patch isn't enough. That wouldn't
share much with the existing amcheck verification functions. I hope
that someone else can pick that up soon.

--
Peter Geoghegan

#26Michael Paquier
michael@paquier.xyz
In reply to: Michael Banck (#24)
Re: [PoC PATCH] Parallel dump to /dev/null

On Wed, Mar 21, 2018 at 09:07:41AM +0100, Michael Banck wrote:

On Tuesday, 2018-03-20 at 19:19 -0400, Stephen Frost wrote:

Instead of trying to use pg_dump for this, we should provide a way to
actually check for corruption across everything (instead of just the
heap..), and have all detected corruption reported in one pass.

Well, I agree that we should provide a more general tool as well.

A background worker is one piece of the puzzle, and while automatic scans
and checks would be nice, I don't think they are mandatory, as one could
just as well have a script running as a cron job. My point is: you are
going to need a generic tool able to do the sanity checks and the
scanning; the way to invoke or run the tool is just more sugar on top of
the cake.

Note that there is still a patch pending to add support for heap checks
to amcheck, which is based on a bloom filter method. It is impossible to
say whether this is going to make it into upstream for v11. For the time
being, Peter G maintains a more advanced tool, which basically
incorporates the submitted patch, here:
https://github.com/petergeoghegan/amcheck

The module has been renamed to amcheck_next so as not to conflict with
what upstream has shipped since v10.

Hence, is there any objection to marking this patch as returned with
feedback? Corruption tools are definitely wanted, but not in this shape,
per the feedback gathered on this thread.
--
Michael

#27Michael Banck
michael.banck@credativ.de
In reply to: Michael Paquier (#26)
Re: [PoC PATCH] Parallel dump to /dev/null

Hi,

On Thursday, 2018-03-22 at 10:33 +0900, Michael Paquier wrote:

Hence, is there any objection to marking this patch as returned with
feedback?

I've done so now.

Cheers,

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

#28Michael Paquier
michael@paquier.xyz
In reply to: Michael Banck (#27)
Re: [PoC PATCH] Parallel dump to /dev/null

On Thu, Mar 22, 2018 at 08:42:05AM +0100, Michael Banck wrote:

On Thursday, 2018-03-22 at 10:33 +0900, Michael Paquier wrote:

Hence, is there any objection to marking this patch as returned with
feedback?

I've done so now.

Thanks Michael.
--
Michael