pg_basebackup's --gzip switch misbehaves

Started by Tom Laneover 3 years ago24 messages
#1Tom Lane
tgl@sss.pgh.pa.us

I've been trying to figure out why my new buildfarm animal mamba
occasionally fails the pg_basebackup tests [1]https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2022-09-01%2018%3A38%3A27[2]https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2022-08-31%2011%3A46%3A09. I've not run
that to ground yet, but one thing I've found that's consistently
reproducible everywhere is that pg_basebackup's --gzip switch
misbehaves. The manual says, and the principle of least astonishment
agrees, that that should invoke gzip with the default compression
level. However, the three test cases beginning at about line 810 of
010_pg_basebackup.pl produce these output file sizes on my x86_64
Linux machine:

backup_gzip ("--compress 1"):
total 3672
-rw-r-----. 1 postgres postgres 137756 Sep 2 23:38 backup_manifest
-rw-r-----. 1 postgres postgres 3538992 Sep 2 23:38 base.tar.gz
-rw-r-----. 1 postgres postgres 73991 Sep 2 23:38 pg_wal.tar.gz

backup_gzip2 ("--gzip"):
total 19544
-rw-r-----. 1 postgres postgres 137756 Sep 2 23:38 backup_manifest
-rw-r-----. 1 postgres postgres 3086972 Sep 2 23:38 base.tar.gz
-rw-r-----. 1 postgres postgres 16781399 Sep 2 23:38 pg_wal.tar.gz

backup_gzip3 ("--compress gzip:1"):
total 3672
-rw-r-----. 1 postgres postgres 137756 Sep 2 23:38 backup_manifest
-rw-r-----. 1 postgres postgres 3539006 Sep 2 23:38 base.tar.gz
-rw-r-----. 1 postgres postgres 73989 Sep 2 23:38 pg_wal.tar.gz

It makes sense that base.tar.gz is compressed a little better with
--gzip than with level-1 compression, but why is pg_wal.tar.gz not
compressed at all? It looks like the problem probably boils down to
which of "-1" and "0" means "default behavior" vs "no compression",
with different code layers interpreting that differently. I can't
find exactly where that's happening, but I did manage to stop the
failures with this crude hack:

diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
index e90aa0ba37..edddd9b578 100644
--- a/src/bin/pg_basebackup/walmethods.c
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -1358,7 +1358,7 @@ CreateWalTarMethod(const char *tarbase,
 	sprintf(tar_data->tarfilename, "%s%s", tarbase, suffix);
 	tar_data->fd = -1;
 	tar_data->compression_algorithm = compression_algorithm;
-	tar_data->compression_level = compression_level;
+	tar_data->compression_level = compression_level > 0 ? compression_level : Z_DEFAULT_COMPRESSION;
 	tar_data->sync = sync;
 #ifdef HAVE_LIBZ
 	if (compression_algorithm == PG_COMPRESSION_GZIP)

That's not right as a real fix, because it would have the effect
that "--compress gzip:0" would also invoke default compression,
whereas what it should do is produce the uncompressed output
we're actually getting. Both cases have compression_level == 0
by the time we reach here, though.

I suspect that there are related bugs in other code paths in this
rat's nest of undocumented functions and dubious API abstractions;
but since it's all undocumented, who can say which places are wrong
and which are not?

I might not ding this code quite this hard, if I hadn't had
equally-unpleasant encounters with it previously (eg 248c3a937).
It's a mess, and I do not find it to be up to project standards.

A vaguely-related matter is that the deflateParams calls all pass "0"
as the third parameter:

if (deflateParams(tar_data->zp, tar_data->compression_level, 0) != Z_OK)

Aside from being unreadable, that's entirely unwarranted familiarity
with the innards of libz. zlib.h says you should be writing a named
constant, probably Z_DEFAULT_STRATEGY.

BTW, I'm fairly astonished that anyone would have thought that three
complete pg_basebackup cycles testing essentially-identical options
were a good use of developer time and buildfarm cycles from here to
eternity. Even if digging into it did expose a bug, the test case
deserves little credit for that, because it entirely failed to call
attention to the problem. I had to whack the script pretty hard
just to get it to not delete the evidence.

regards, tom lane

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2022-09-01%2018%3A38%3A27
[2]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2022-08-31%2011%3A46%3A09

#2Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#1)
Re: pg_basebackup's --gzip switch misbehaves

On Sat, Sep 03, 2022 at 11:11:29AM -0400, Tom Lane wrote:

It makes sense that base.tar.gz is compressed a little better with
--gzip than with level-1 compression, but why is pg_wal.tar.gz not
compressed at all? It looks like the problem probably boils down to
which of "-1" and "0" means "default behavior" vs "no compression",
with different code layers interpreting that differently. I can't
find exactly where that's happening, but I did manage to stop the
failures with this crude hack:

There is a distinction coming in pg_basebackup.c from the way we
deparse the compression specification and the default compression
level that should be assigned if there is no level directly specified
by the user. It seems to me that the error comes from this code in
BaseBackup() when we are under STREAM_WAL (default):

if (client_compress->algorithm == PG_COMPRESSION_GZIP)
{
wal_compress_algorithm = PG_COMPRESSION_GZIP;
wal_compress_level =
(client_compress->options & PG_COMPRESSION_OPTION_LEVEL)
!= 0 ? client_compress->level : 0;

ffd5365 has missed that wal_compress_level should be set to
Z_DEFAULT_COMPRESSION if there is nothing set in the compression
spec for a zlib build. pg_receivewal.c enforces that already.

That's not right as a real fix, because it would have the effect
that "--compress gzip:0" would also invoke default compression,
whereas what it should do is produce the uncompressed output
we're actually getting. Both cases have compression_level == 0
by the time we reach here, though.

Nope, that would not be right.

BTW, I'm fairly astonished that anyone would have thought that three
complete pg_basebackup cycles testing essentially-identical options
were a good use of developer time and buildfarm cycles from here to
eternity. Even if digging into it did expose a bug, the test case
deserves little credit for that, because it entirely failed to call
attention to the problem. I had to whack the script pretty hard
just to get it to not delete the evidence.

The introduction of the compression specification has introduced a lot
of patterns where we expect or not expect compression to happen, and
on top of that this needs to be careful about backward-compatibility.
--
Michael

#3Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#2)
1 attachment(s)
Re: pg_basebackup's --gzip switch misbehaves

On Sun, Sep 04, 2022 at 02:20:52PM +0900, Michael Paquier wrote:

ffd5365 has missed that wal_compress_level should be set to
Z_DEFAULT_COMPRESSION if there is nothing set in the compression
spec for a zlib build. pg_receivewal.c enforces that already.

So, I have looked at this one. And it seems to me that the confusion
comes down to the existence of PG_COMPRESSION_OPTION_LEVEL. I have
considered a couple of approaches here, like introducing an extra
routine in compression.c to assign a default compression level, but
my conclusion is at the end simpler: we always finish by setting up a
level even if the caller wants nothing, in which can we can just use
each library's default. And lz4, zstd and zlib are able to handle the
case where a default is given down to their internal routines just
fine.

Attached is the patch I am finishing with, consisting of:
- the removal of PG_COMPRESSION_OPTION_LEVEL.
- assigning a default compression level when nothing is specified in
the spec.
- a couple of complifications in pg_receivewal, pg_basebackup and the
backend code as there is no need to worry about the compression
level.

A nice effect of this approach is that we can centralize the checks on
lz4, zstd and zlib when a build does not support any of these
options, as well as centralize the place where the default compression
levels are set. This passes all the regression tests, and it fixes
the issue reported. (Note that I have yet to run tests with all the
libraries disabled in ./configure.)

Thoughts?
--
Michael

Attachments:

0001-Simplify-compression-level-handling-in-compression-s.patchtext/x-diff; charset=us-asciiDownload
From 8752cb283d65e40fc4096f84d99bcb93d97c59bc Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 13 Sep 2022 16:06:29 +0900
Subject: [PATCH] Simplify compression level handling in compression
 specifications

---
 src/include/common/compression.h             |   3 +-
 src/backend/backup/basebackup_gzip.c         |  10 +-
 src/backend/backup/basebackup_lz4.c          |   9 +-
 src/backend/backup/basebackup_zstd.c         |  11 +-
 src/common/compression.c                     | 105 +++++++++++++++----
 src/bin/pg_basebackup/bbstreamer_gzip.c      |   4 +-
 src/bin/pg_basebackup/bbstreamer_lz4.c       |   3 +-
 src/bin/pg_basebackup/bbstreamer_zstd.c      |  13 +--
 src/bin/pg_basebackup/pg_basebackup.c        |  19 +---
 src/bin/pg_basebackup/pg_receivewal.c        |  35 +------
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |   3 -
 11 files changed, 107 insertions(+), 108 deletions(-)

diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 436b48104e..5d680058ed 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -22,8 +22,7 @@ typedef enum pg_compress_algorithm
 	PG_COMPRESSION_ZSTD
 } pg_compress_algorithm;
 
-#define PG_COMPRESSION_OPTION_LEVEL			(1 << 0)
-#define PG_COMPRESSION_OPTION_WORKERS		(1 << 1)
+#define PG_COMPRESSION_OPTION_WORKERS		(1 << 0)
 
 typedef struct pg_compress_specification
 {
diff --git a/src/backend/backup/basebackup_gzip.c b/src/backend/backup/basebackup_gzip.c
index a965866ff2..1ec42791f1 100644
--- a/src/backend/backup/basebackup_gzip.c
+++ b/src/backend/backup/basebackup_gzip.c
@@ -72,13 +72,9 @@ bbsink_gzip_new(bbsink *next, pg_compress_specification *compress)
 
 	Assert(next != NULL);
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) == 0)
-		compresslevel = Z_DEFAULT_COMPRESSION;
-	else
-	{
-		compresslevel = compress->level;
-		Assert(compresslevel >= 1 && compresslevel <= 9);
-	}
+	compresslevel = compress->level;
+	Assert((compresslevel >= 1 && compresslevel <= 9) ||
+		   compresslevel == Z_DEFAULT_COMPRESSION);
 
 	sink = palloc0(sizeof(bbsink_gzip));
 	*((const bbsink_ops **) &sink->base.bbs_ops) = &bbsink_gzip_ops;
diff --git a/src/backend/backup/basebackup_lz4.c b/src/backend/backup/basebackup_lz4.c
index d919e3dec7..986272ab9a 100644
--- a/src/backend/backup/basebackup_lz4.c
+++ b/src/backend/backup/basebackup_lz4.c
@@ -72,13 +72,8 @@ bbsink_lz4_new(bbsink *next, pg_compress_specification *compress)
 
 	Assert(next != NULL);
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) == 0)
-		compresslevel = 0;
-	else
-	{
-		compresslevel = compress->level;
-		Assert(compresslevel >= 1 && compresslevel <= 12);
-	}
+	compresslevel = compress->level;
+	Assert(compresslevel >= 0 && compresslevel <= 12);
 
 	sink = palloc0(sizeof(bbsink_lz4));
 	*((const bbsink_ops **) &sink->base.bbs_ops) = &bbsink_lz4_ops;
diff --git a/src/backend/backup/basebackup_zstd.c b/src/backend/backup/basebackup_zstd.c
index 865067f8dc..409bf88101 100644
--- a/src/backend/backup/basebackup_zstd.c
+++ b/src/backend/backup/basebackup_zstd.c
@@ -96,14 +96,11 @@ bbsink_zstd_begin_backup(bbsink *sink)
 	if (!mysink->cctx)
 		elog(ERROR, "could not create zstd compression context");
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) != 0)
-	{
-		ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
+	ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
 									 compress->level);
-		if (ZSTD_isError(ret))
-			elog(ERROR, "could not set zstd compression level to %d: %s",
-				 compress->level, ZSTD_getErrorName(ret));
-	}
+	if (ZSTD_isError(ret))
+		elog(ERROR, "could not set zstd compression level to %d: %s",
+			 compress->level, ZSTD_getErrorName(ret));
 
 	if ((compress->options & PG_COMPRESSION_OPTION_WORKERS) != 0)
 	{
diff --git a/src/common/compression.c b/src/common/compression.c
index da3c291c0f..ac26287d54 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -27,6 +27,13 @@
 #include "postgres_fe.h"
 #endif
 
+#ifdef USE_ZSTD
+#include <zstd.h>
+#endif
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
 #include "common/compression.h"
 
 static int	expect_integer_value(char *keyword, char *value,
@@ -88,6 +95,9 @@ get_compress_algorithm_name(pg_compress_algorithm algorithm)
  * Note, however, even if there's no parse error, the string might not make
  * sense: e.g. for gzip, level=12 is not sensible, but it does parse OK.
  *
+ * The compression level is assigned by default if not directly specified
+ * by the specification.
+ *
  * Use validate_compress_specification() to find out whether a compression
  * specification is semantically sensible.
  */
@@ -101,9 +111,46 @@ parse_compress_specification(pg_compress_algorithm algorithm, char *specificatio
 	/* Initial setup of result object. */
 	result->algorithm = algorithm;
 	result->options = 0;
-	result->level = -1;
 	result->parse_error = NULL;
 
+	/*
+	 * Assign a default level depending on the compression method.  This may
+	 * be enforced later.
+	 */
+	switch (result->algorithm)
+	{
+		case PG_COMPRESSION_NONE:
+			result->level = 0;
+			break;
+		case PG_COMPRESSION_LZ4:
+#ifdef USE_LZ4
+			result->level = 0;	/* fast compression mode */
+#else
+			result->parse_error =
+				psprintf(_("this build does not support compression with %s"),
+						 "LZ4");
+#endif
+			break;
+		case PG_COMPRESSION_ZSTD:
+#ifdef USE_ZSTD
+			result->level = ZSTD_CLEVEL_DEFAULT;
+#else
+			result->parse_error =
+				psprintf(_("this build does not support compression with %s"),
+						 "ZSTD");
+#endif
+			break;
+		case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+			result->level = Z_DEFAULT_COMPRESSION;
+#else
+			result->parse_error =
+				psprintf(_("this build does not support compression with %s"),
+						 "gzip");
+#endif
+			break;
+	}
+
 	/* If there is no specification, we're done already. */
 	if (specification == NULL)
 		return;
@@ -113,7 +160,6 @@ parse_compress_specification(pg_compress_algorithm algorithm, char *specificatio
 	if (specification != bare_level_endp && *bare_level_endp == '\0')
 	{
 		result->level = bare_level;
-		result->options |= PG_COMPRESSION_OPTION_LEVEL;
 		return;
 	}
 
@@ -175,7 +221,11 @@ parse_compress_specification(pg_compress_algorithm algorithm, char *specificatio
 		if (strcmp(keyword, "level") == 0)
 		{
 			result->level = expect_integer_value(keyword, value, result);
-			result->options |= PG_COMPRESSION_OPTION_LEVEL;
+
+			/*
+			 * No need to set a flag in "options", there is a default level
+			 * set at least thanks to the logic above.
+			 */
 		}
 		else if (strcmp(keyword, "workers") == 0)
 		{
@@ -249,36 +299,49 @@ expect_integer_value(char *keyword, char *value, pg_compress_specification *resu
 char *
 validate_compress_specification(pg_compress_specification *spec)
 {
+	int			min_level = 1;
+	int			max_level = 1;
+	int			default_level = 0;
+
 	/* If it didn't even parse OK, it's definitely no good. */
 	if (spec->parse_error != NULL)
 		return spec->parse_error;
 
 	/*
-	 * If a compression level was specified, check that the algorithm expects
-	 * a compression level and that the level is within the legal range for
-	 * the algorithm.
+	 * Check that the algorithm expects a compression level and it is
+	 * is within the legal range for the algorithm.
 	 */
-	if ((spec->options & PG_COMPRESSION_OPTION_LEVEL) != 0)
+	switch (spec->algorithm)
 	{
-		int			min_level = 1;
-		int			max_level;
-
-		if (spec->algorithm == PG_COMPRESSION_GZIP)
+		case PG_COMPRESSION_GZIP:
 			max_level = 9;
-		else if (spec->algorithm == PG_COMPRESSION_LZ4)
+#ifdef HAVE_LIBZ
+			default_level = Z_DEFAULT_COMPRESSION;
+#endif
+			break;
+		case PG_COMPRESSION_LZ4:
 			max_level = 12;
-		else if (spec->algorithm == PG_COMPRESSION_ZSTD)
+			default_level = 0;	/* fast mode */
+			break;
+		case PG_COMPRESSION_ZSTD:
 			max_level = 22;
-		else
-			return psprintf(_("compression algorithm \"%s\" does not accept a compression level"),
-							get_compress_algorithm_name(spec->algorithm));
-
-		if (spec->level < min_level || spec->level > max_level)
-			return psprintf(_("compression algorithm \"%s\" expects a compression level between %d and %d"),
-							get_compress_algorithm_name(spec->algorithm),
-							min_level, max_level);
+#ifdef USE_ZSTD
+			default_level = ZSTD_CLEVEL_DEFAULT;
+#endif
+			break;
+		case PG_COMPRESSION_NONE:
+			if (spec->level != 0)
+				return psprintf(_("compression algorithm \"%s\" does not accept a compression level"),
+								get_compress_algorithm_name(spec->algorithm));
+			break;
 	}
 
+	if ((spec->level < min_level || spec->level > max_level) &&
+		spec->level != default_level)
+		return psprintf(_("compression algorithm \"%s\" expects a compression level between %d and %d"),
+						get_compress_algorithm_name(spec->algorithm),
+						min_level, max_level);
+
 	/*
 	 * Of the compression algorithms that we currently support, only zstd
 	 * allows parallel workers.
diff --git a/src/bin/pg_basebackup/bbstreamer_gzip.c b/src/bin/pg_basebackup/bbstreamer_gzip.c
index e7261910d8..c3455ffbdd 100644
--- a/src/bin/pg_basebackup/bbstreamer_gzip.c
+++ b/src/bin/pg_basebackup/bbstreamer_gzip.c
@@ -107,9 +107,7 @@ bbstreamer_gzip_writer_new(char *pathname, FILE *file,
 			pg_fatal("could not open output file: %m");
 	}
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) != 0 &&
-		gzsetparams(streamer->gzfile, compress->level,
-					Z_DEFAULT_STRATEGY) != Z_OK)
+	if (gzsetparams(streamer->gzfile, compress->level, Z_DEFAULT_STRATEGY) != Z_OK)
 		pg_fatal("could not set compression level %d: %s",
 				 compress->level, get_gz_error(streamer->gzfile));
 
diff --git a/src/bin/pg_basebackup/bbstreamer_lz4.c b/src/bin/pg_basebackup/bbstreamer_lz4.c
index b9752354c9..ed2aa01305 100644
--- a/src/bin/pg_basebackup/bbstreamer_lz4.c
+++ b/src/bin/pg_basebackup/bbstreamer_lz4.c
@@ -88,8 +88,7 @@ bbstreamer_lz4_compressor_new(bbstreamer *next, pg_compress_specification *compr
 	prefs = &streamer->prefs;
 	memset(prefs, 0, sizeof(LZ4F_preferences_t));
 	prefs->frameInfo.blockSizeID = LZ4F_max256KB;
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) != 0)
-		prefs->compressionLevel = compress->level;
+	prefs->compressionLevel = compress->level;
 
 	ctxError = LZ4F_createCompressionContext(&streamer->cctx, LZ4F_VERSION);
 	if (LZ4F_isError(ctxError))
diff --git a/src/bin/pg_basebackup/bbstreamer_zstd.c b/src/bin/pg_basebackup/bbstreamer_zstd.c
index 8a839145a6..32e1fb4b04 100644
--- a/src/bin/pg_basebackup/bbstreamer_zstd.c
+++ b/src/bin/pg_basebackup/bbstreamer_zstd.c
@@ -84,15 +84,12 @@ bbstreamer_zstd_compressor_new(bbstreamer *next, pg_compress_specification *comp
 	if (!streamer->cctx)
 		pg_fatal("could not create zstd compression context");
 
-	/* Set compression level, if specified */
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) != 0)
-	{
-		ret = ZSTD_CCtx_setParameter(streamer->cctx, ZSTD_c_compressionLevel,
+	/* Set compression level */
+	ret = ZSTD_CCtx_setParameter(streamer->cctx, ZSTD_c_compressionLevel,
 									 compress->level);
-		if (ZSTD_isError(ret))
-			pg_fatal("could not set zstd compression level to %d: %s",
-					 compress->level, ZSTD_getErrorName(ret));
-	}
+	if (ZSTD_isError(ret))
+		pg_fatal("could not set zstd compression level to %d: %s",
+				 compress->level, ZSTD_getErrorName(ret));
 
 	/* Set # of workers, if specified */
 	if ((compress->options & PG_COMPRESSION_OPTION_WORKERS) != 0)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 9ce30d43a4..20d1baf675 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -2003,27 +2003,12 @@ BaseBackup(char *compression_algorithm, char *compression_detail,
 	 */
 	if (includewal == STREAM_WAL)
 	{
-		pg_compress_algorithm wal_compress_algorithm;
-		int			wal_compress_level;
-
 		if (verbose)
 			pg_log_info("starting background WAL receiver");
 
-		if (client_compress->algorithm == PG_COMPRESSION_GZIP)
-		{
-			wal_compress_algorithm = PG_COMPRESSION_GZIP;
-			wal_compress_level =
-				(client_compress->options & PG_COMPRESSION_OPTION_LEVEL)
-				!= 0 ? client_compress->level : 0;
-		}
-		else
-		{
-			wal_compress_algorithm = PG_COMPRESSION_NONE;
-			wal_compress_level = 0;
-		}
-
 		StartLogStreamer(xlogstart, starttli, sysidentifier,
-						 wal_compress_algorithm, wal_compress_level);
+						 client_compress->algorithm,
+						 client_compress->level);
 	}
 
 	if (serverMajor >= 1500)
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index f064cff4ab..a6e3387a6d 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -875,38 +875,11 @@ main(int argc, char **argv)
 		pg_fatal("invalid compression specification: %s",
 				 error_detail);
 
-	/* Extract the compression level, if found in the specification */
-	if ((compression_spec.options & PG_COMPRESSION_OPTION_LEVEL) != 0)
-		compresslevel = compression_spec.level;
-
-	switch (compression_algorithm)
-	{
-		case PG_COMPRESSION_NONE:
-			/* nothing to do */
-			break;
-		case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
-			if ((compression_spec.options & PG_COMPRESSION_OPTION_LEVEL) == 0)
-			{
-				pg_log_info("no value specified for --compress, switching to default");
-				compresslevel = Z_DEFAULT_COMPRESSION;
-			}
-#else
-			pg_fatal("this build does not support compression with %s",
-					 "gzip");
-#endif
-			break;
-		case PG_COMPRESSION_LZ4:
-#ifndef USE_LZ4
-			pg_fatal("this build does not support compression with %s",
-					 "LZ4");
-#endif
-			break;
-		case PG_COMPRESSION_ZSTD:
-			pg_fatal("compression with %s is not yet supported", "ZSTD");
-			break;
-	}
+	/* Extract the compression level */
+	compresslevel = compression_spec.level;
 
+	if (compression_algorithm == PG_COMPRESSION_ZSTD)
+		pg_fatal("compression with %s is not yet supported", "ZSTD");
 
 	/*
 	 * Check existence of destination folder.
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 3d1a4ddd5c..40f1d3f7e2 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -860,9 +860,6 @@ SKIP:
 	my $gzip_is_valid =
 	  system_log($gzip, '--test', @zlib_files, @zlib_files2, @zlib_files3);
 	is($gzip_is_valid, 0, "gzip verified the integrity of compressed data");
-	rmtree("$tempdir/backup_gzip");
-	rmtree("$tempdir/backup_gzip2");
-	rmtree("$tempdir/backup_gzip3");
 }
 
 # Test background stream process terminating before the basebackup has
-- 
2.37.2

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#3)
Re: pg_basebackup's --gzip switch misbehaves

Michael Paquier <michael@paquier.xyz> writes:

Attached is the patch I am finishing with, consisting of:
- the removal of PG_COMPRESSION_OPTION_LEVEL.
- assigning a default compression level when nothing is specified in
the spec.
- a couple of complifications in pg_receivewal, pg_basebackup and the
backend code as there is no need to worry about the compression
level.

This looks good to me. It seems simpler, and I concur that it
fixes the described problem. I now see

tmp_check/backup_gzip:
total 3668
-rw-r-----. 1 postgres postgres 137756 Sep 13 17:29 backup_manifest
-rw-r-----. 1 postgres postgres 3537499 Sep 13 17:29 base.tar.gz
-rw-r-----. 1 postgres postgres 73989 Sep 13 17:29 pg_wal.tar.gz

tmp_check/backup_gzip2:
total 3168
-rw-r-----. 1 postgres postgres 137756 Sep 13 17:29 backup_manifest
-rw-r-----. 1 postgres postgres 3083516 Sep 13 17:29 base.tar.gz
-rw-r-----. 1 postgres postgres 17069 Sep 13 17:29 pg_wal.tar.gz

tmp_check/backup_gzip3:
total 3668
-rw-r-----. 1 postgres postgres 137756 Sep 13 17:29 backup_manifest
-rw-r-----. 1 postgres postgres 3537517 Sep 13 17:29 base.tar.gz
-rw-r-----. 1 postgres postgres 73988 Sep 13 17:29 pg_wal.tar.gz

which looks sane: the gzip2 case should, and does, have better
compression than the other two.

BTW, this bit:

diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 3d1a4ddd5c..40f1d3f7e2 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -860,9 +860,6 @@ SKIP:
 	my $gzip_is_valid =
 	  system_log($gzip, '--test', @zlib_files, @zlib_files2, @zlib_files3);
 	is($gzip_is_valid, 0, "gzip verified the integrity of compressed data");
-	rmtree("$tempdir/backup_gzip");
-	rmtree("$tempdir/backup_gzip2");
-	rmtree("$tempdir/backup_gzip3");
 }

# Test background stream process terminating before the basebackup has

is something I tried along the way to diagnosing the problem, and
it turns out to have exactly zero effect. The $tempdir is some
temporary subdirectory of tmp_check that will get nuked at the end
of the TAP test no matter what. So these rmtrees are merely making
the evidence disappear a bit faster; it will anyway.

What I did to diagnose the problem was this:

@@ -860,9 +860,9 @@ SKIP:
    my $gzip_is_valid =
      system_log($gzip, '--test', @zlib_files, @zlib_files2, @zlib_files3);
    is($gzip_is_valid, 0, "gzip verified the integrity of compressed data");
-   rmtree("$tempdir/backup_gzip");
-   rmtree("$tempdir/backup_gzip2");
-   rmtree("$tempdir/backup_gzip3");
+   system_log('mv', "$tempdir/backup_gzip", "tmp_check");
+   system_log('mv', "$tempdir/backup_gzip2", "tmp_check");
+   system_log('mv', "$tempdir/backup_gzip3", "tmp_check");
 }

# Test background stream process terminating before the basebackup has

which is not real clean, since then the files get left behind even
on success, which I doubt we want either.

Anyway, I have no objection to dropping the rmtrees, since they're
pretty useless as the code stands. Just wanted to mention this
issue for the archives.

regards, tom lane

#5Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#4)
Re: pg_basebackup's --gzip switch misbehaves

On Tue, Sep 13, 2022 at 05:38:47PM -0400, Tom Lane wrote:

is something I tried along the way to diagnosing the problem, and
it turns out to have exactly zero effect. The $tempdir is some
temporary subdirectory of tmp_check that will get nuked at the end
of the TAP test no matter what. So these rmtrees are merely making
the evidence disappear a bit faster; it will anyway.

FWIW, I just stick a die() in the middle of the code path when I want
to look at specific results. Similar method, same result.

# Test background stream process terminating before the basebackup has

which is not real clean, since then the files get left behind even
on success, which I doubt we want either.

Another thing that could be done here is to use the same base location
as the cluster nodes aka $PostgreSQL::Test::Utils::tmp_check. That
would mean storing in a repo more data associated to the base backups
after a fresh run, though. I am not sure that small machine would
like this accumulation in a single run even if disk space is cheap
these days.

Anyway, I have no objection to dropping the rmtrees, since they're
pretty useless as the code stands. Just wanted to mention this
issue for the archives.

I see more ways to change the existing behavior, so for now I have
left that untouched.

And so, I have spent a couple of hours torturing the patch, applying
it after a few tweaks and CI runs:
- --without-zlib was causing a failure in the pg_basebackup tests as
we have a few tests that parse and validate a set of invalid specs for
the client-side and server-side compression. With zlib around, the
tests and their expected results are unchanged, that's just a
consequence of moving the assignment of a default level much earlier.
- pg_basebackup was triggering an assertion when using client-lz4 or
client-zstd as we use the directory method of walmethods.c. In this
case, we just support zlib as compression and enforce no compression
when we are under lz4 or zstd. This came from an over-simplification
of the code. There is a gap in the testing of pg_basebackup,
actually, because we have zero tests for LZ4 and zstd there.
- The documentation of the replication protocol needed some
adjustments for the default comporession levels.

The buildfarm is green so I think that we are good. I have closed the
open item.

You have mentioned upthread an extra thing about the fact that we pass
down 0 to deflateParams(). This is indeed wrong and we are lucky that
Z_DEFAULT_STRATEGY maps to 0. Better to fix and backpatch this one
down to where gzip compression has been added to walmethods.c.. I'll
just do that in a bit after double-checking the area and the other
routines.
--
Michael

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#5)
Re: pg_basebackup's --gzip switch misbehaves

Michael Paquier <michael@paquier.xyz> writes:

And so, I have spent a couple of hours torturing the patch, applying
it after a few tweaks and CI runs:
...
The buildfarm is green so I think that we are good. I have closed the
open item.

+1, thanks for taking care of that.

As far as my original complaint about mamba goes, I've not quite
been able to run it to ground. However, I found that NetBSD
seems to be shipping unmodified zlib 1.2.11, which contains a
number of known bugs in deflate_stored() --- that is, the code
path implementing compression level 0. Red Hat for one is
carrying several back-patched fixes in that part of zlib.
So for the moment I'm willing to write it off as "not our bug".
We aren't intentionally testing compression level 0, and hardly
anybody would intentionally use it in the field, so it's not
really a thing worth worrying about IMO. But if mamba continues
to show failures in that test then it will be worth looking closer.

regards, tom lane

#7Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#4)
Re: pg_basebackup's --gzip switch misbehaves

On 13 Sep 2022, at 23:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The $tempdir is some temporary subdirectory of tmp_check that will get nuked at
the end of the TAP test no matter what. So these rmtrees are merely making the
evidence disappear a bit faster; it will anyway.

Maybe the creation of $tempdir should take PG_TEST_NOCLEAN into account and not
register CLEANUP if set?

--
Daniel Gustafsson https://vmware.com/

#8Michael Paquier
michael@paquier.xyz
In reply to: Daniel Gustafsson (#7)
Re: pg_basebackup's --gzip switch misbehaves

On Wed, Sep 14, 2022 at 10:26:42AM +0200, Daniel Gustafsson wrote:

Maybe the creation of $tempdir should take PG_TEST_NOCLEAN into account and not
register CLEANUP if set?

Agreed. It sounds like a good idea to me to extend that to temporary
paths, and then check those rmtree() calls where the tests would not
retain too much data for small-ish machines.

By the way, should we document PG_TEST_TIMEOUT_DEFAULT and
PG_TEST_NOCLEAN not only in src/test/perl/README but also doc/?. We
provide something in the docs about PROVE_FLAGS and PROVE_TESTS, for
example.
--
Michael

#9Justin Pryzby
pryzby@telsasoft.com
In reply to: Michael Paquier (#3)
Re: pg_basebackup's --gzip switch misbehaves

On Tue, Sep 13, 2022 at 04:13:20PM +0900, Michael Paquier wrote:

diff --git a/src/common/compression.c b/src/common/compression.c
index da3c291c0f..ac26287d54 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -249,36 +299,49 @@ expect_integer_value(char *keyword, char *value, pg_compress_specification *resu
char *
validate_compress_specification(pg_compress_specification *spec)
{
+	int			min_level = 1;
+	int			max_level = 1;
+	int			default_level = 0;
+
/* If it didn't even parse OK, it's definitely no good. */
if (spec->parse_error != NULL)
return spec->parse_error;
/*
-	 * If a compression level was specified, check that the algorithm expects
-	 * a compression level and that the level is within the legal range for
-	 * the algorithm.
+	 * Check that the algorithm expects a compression level and it is
+	 * is within the legal range for the algorithm.
*/
-	if ((spec->options & PG_COMPRESSION_OPTION_LEVEL) != 0)
+	switch (spec->algorithm)
{
-		int			min_level = 1;
-		int			max_level;
-
-		if (spec->algorithm == PG_COMPRESSION_GZIP)
+		case PG_COMPRESSION_GZIP:
max_level = 9;
-		else if (spec->algorithm == PG_COMPRESSION_LZ4)
+#ifdef HAVE_LIBZ
+			default_level = Z_DEFAULT_COMPRESSION;
+#endif
+			break;
+		case PG_COMPRESSION_LZ4:
max_level = 12;
-		else if (spec->algorithm == PG_COMPRESSION_ZSTD)
+			default_level = 0;	/* fast mode */
+			break;
+		case PG_COMPRESSION_ZSTD:
max_level = 22;

I should've suggested to add:

min_level = -7;

which has been supported since zstd 1.3.4 (and postgres requires 1.4.0).

I think at some point (maybe before releasing 1.3.4) the range was
increased to very large(small), negative levels. It's possible to query
the library about the lowest supported compression level, but then
there's a complication regarding the client-side library version vs the
server-side version. So it seems better to just use -7.

--
Justin

#10Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#9)
1 attachment(s)
Re: pg_basebackup's --gzip switch misbehaves

On Wed, Sep 21, 2022 at 07:31:48PM -0500, Justin Pryzby wrote:

I think at some point (maybe before releasing 1.3.4) the range was
increased to very large(small), negative levels. It's possible to query
the library about the lowest supported compression level, but then
there's a complication regarding the client-side library version vs the
server-side version. So it seems better to just use -7.

Indeed. Contrary to the default level, there are no variables for the
minimum and maximum levels. As you are pointing out, a lookup at
zstd_compress.c shows that we have ZSTD_minCLevel() and
ZSTD_maxCLevel() that assign the bounds. Both are available since
1.4.0. We still need a backend-side check as the level passed with a
BASE_BACKUP command would be only validated there. It seems to me
that this is going to be less of a headache in the long-term if we
just use those routines at runtime, as zstd wants to keep some freedom
with the min and max bounds for the compression level, at least that's
the flexibility that this gives the library. So I would tweak things
as the attached.
--
Michael

Attachments:

zstd-minmax-level.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/common/compression.c b/src/common/compression.c
index e40ce98ef3..0c6bb9177b 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -324,8 +324,9 @@ validate_compress_specification(pg_compress_specification *spec)
 			default_level = 0;	/* fast mode */
 			break;
 		case PG_COMPRESSION_ZSTD:
-			max_level = 22;
 #ifdef USE_ZSTD
+			max_level = ZSTD_maxCLevel();
+			min_level = ZSTD_minCLevel();
 			default_level = ZSTD_CLEVEL_DEFAULT;
 #endif
 			break;
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index f63c912e97..5cff3d3c07 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2757,8 +2757,10 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
            <literal>-1</literal>), for <literal>lz4</literal> an integer
            between 1 and 12 (default <literal>0</literal> for fast compression
            mode), and for <literal>zstd</literal> an integer between
-           <literal>1</literal> and <literal>22</literal> (default
-           <literal>ZSTD_CLEVEL_DEFAULT</literal> or <literal>3</literal>).
+           <literal>ZSTD_minCLevel()</literal> (usually <literal>-131072</literal>)
+           and <literal>ZSTD_maxCLevel()</literal> (usually <literal>22</literal>)
+           (default <literal>ZSTD_CLEVEL_DEFAULT</literal> or
+           <literal>3</literal>).
           </para>
 
           <para>
#11Justin Pryzby
pryzby@telsasoft.com
In reply to: Michael Paquier (#10)
Re: pg_basebackup's --gzip switch misbehaves

On Thu, Sep 22, 2022 at 10:25:11AM +0900, Michael Paquier wrote:

On Wed, Sep 21, 2022 at 07:31:48PM -0500, Justin Pryzby wrote:

I think at some point (maybe before releasing 1.3.4) the range was
increased to very large(small), negative levels. It's possible to query
the library about the lowest supported compression level, but then
there's a complication regarding the client-side library version vs the
server-side version. So it seems better to just use -7.

Indeed. Contrary to the default level, there are no variables for the
minimum and maximum levels. As you are pointing out, a lookup at
zstd_compress.c shows that we have ZSTD_minCLevel() and
ZSTD_maxCLevel() that assign the bounds. Both are available since
1.4.0. We still need a backend-side check as the level passed with a
BASE_BACKUP command would be only validated there. It seems to me
that this is going to be less of a headache in the long-term if we
just use those routines at runtime, as zstd wants to keep some freedom
with the min and max bounds for the compression level, at least that's
the flexibility that this gives the library. So I would tweak things
as the attached.

Okay. Will that complicate tests at all? It looks like it's not an
issue for the tests currently proposed in the CF APP.
https://commitfest.postgresql.org/39/3835/

However the patch ends up, +0.75 to backpatch it to v15 rather than
calling it a new feature in v16.

--
Justin

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#11)
Re: pg_basebackup's --gzip switch misbehaves

Justin Pryzby <pryzby@telsasoft.com> writes:

However the patch ends up, +0.75 to backpatch it to v15 rather than
calling it a new feature in v16.

I don't have any opinion on the concrete merits of this change,
but I want to note that 15rc1 wraps on Monday, and we don't like
people pushing noncritical changes shortly before a wrap. There
is not a lot of time for fooling around here.

regards, tom lane

#13Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#12)
Re: pg_basebackup's --gzip switch misbehaves

On Wed, Sep 21, 2022 at 11:43:56PM -0400, Tom Lane wrote:

I don't have any opinion on the concrete merits of this change,
but I want to note that 15rc1 wraps on Monday, and we don't like
people pushing noncritical changes shortly before a wrap. There
is not a lot of time for fooling around here.

If I were to do it in the next couple of hours, we'd still have quite
a couple of days of coverage, which is plenty as far as I understand?

Saying that, it is not a critical change. Just to give some numbers,
for a fresh initdb's instance base.tar.zst is at:
- 3.6MB at level 0.
- 3.8MB at level 1.
- 3.6MB at level 2.
- 4.3MB at level -1.
- 4.6MB at level -2.
- 6.1MB at level -7.

I am not sure if there would be a huge demand for this much control
over the current [1,22], but the library wants to control dynamically
the bounds and has the APIs to allow that.
--
Michael

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#13)
Re: pg_basebackup's --gzip switch misbehaves

Michael Paquier <michael@paquier.xyz> writes:

On Wed, Sep 21, 2022 at 11:43:56PM -0400, Tom Lane wrote:

I don't have any opinion on the concrete merits of this change,
but I want to note that 15rc1 wraps on Monday, and we don't like
people pushing noncritical changes shortly before a wrap. There
is not a lot of time for fooling around here.

If I were to do it in the next couple of hours, we'd still have quite
a couple of days of coverage, which is plenty as far as I understand?

Sure. I'd say we have 48 hours to choose whether to put this in v15.
But not more than that.

regards, tom lane

#15Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#14)
Re: pg_basebackup's --gzip switch misbehaves

On Thu, Sep 22, 2022 at 12:47:34AM -0400, Tom Lane wrote:

Sure. I'd say we have 48 hours to choose whether to put this in v15.
But not more than that.

I have a window to be able to look at the buildfarm today, tomorrow
being harder, so I have adjusted that now on both HEAD and
REL_15_STABLE for consistency.
--
Michael

#16Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Paquier (#8)
1 attachment(s)
Re: pg_basebackup's --gzip switch misbehaves

On 16 Sep 2022, at 04:22, Michael Paquier <michael@paquier.xyz> wrote:

By the way, should we document PG_TEST_TIMEOUT_DEFAULT and
PG_TEST_NOCLEAN not only in src/test/perl/README but also doc/?. We
provide something in the docs about PROVE_FLAGS and PROVE_TESTS, for
example.

I think that's a good idea, not everyone running tests will read the internals
documentation (or even know abou it even). How about the attached?

--
Daniel Gustafsson https://vmware.com/

Attachments:

regress_tap.diffapplication/octet-stream; name=regress_tap.diff; x-unix-mode=0644Download
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 8f032c4e7a..58e144dde7 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -763,6 +763,25 @@ make check PROVE_TESTS='t/001_test1.pl t/003_test3.pl'
 </programlisting>
    </para>
 
+   <para>
+    Data directories are named according to the test filename, and will be
+    retained if a test fails.  If the environment variable
+    <varname>PG_TEST_NOCLEAN</varname> is set, data directories will be
+    retained regardless of test status.  For example, retaining the data
+    directory regardless of test results when running the
+    <application>pg_dump</application> tests:
+<programlisting>
+PG_TEST_NOCLEAN make -C src/bin/pg_dump check
+</programlisting>
+   </para>
+
+   <para>
+    Many operations in the test suites use a 180 second timeout, which on slow
+    hosts may lead to load-induced timeouts.  Setting the environment variable
+    <varname>PG_TEST_TIMEOUT_DEFAULT</varname> to a higher number will change
+    the default to avoid this.
+   </para>
+
    <para>
     The TAP tests require the Perl module <literal>IPC::Run</literal>.
     This module is available from CPAN or an operating system package.
#17Michael Paquier
michael@paquier.xyz
In reply to: Daniel Gustafsson (#16)
Re: pg_basebackup's --gzip switch misbehaves

On Wed, Nov 02, 2022 at 09:42:12PM +0100, Daniel Gustafsson wrote:

I think that's a good idea, not everyone running tests will read the internals
documentation (or even know abou it even). How about the attached?

Thanks for the patch. Perhaps this should be mentioned additionally
in install-windows.sgml? I have not tested, but as long as these
variables are configured with a "set" command in a command prompt,
they would be passed down to the processes triggered by vcregress.pl
(see for example TESTLOGDIR and TESTDATADIR).
--
Michael

#18Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Paquier (#17)
1 attachment(s)
Re: pg_basebackup's --gzip switch misbehaves

On 3 Nov 2022, at 12:49, Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Nov 02, 2022 at 09:42:12PM +0100, Daniel Gustafsson wrote:

I think that's a good idea, not everyone running tests will read the internals
documentation (or even know abou it even). How about the attached?

Thanks for the patch. Perhaps this should be mentioned additionally
in install-windows.sgml? I have not tested, but as long as these
variables are configured with a "set" command in a command prompt,
they would be passed down to the processes triggered by vcregress.pl
(see for example TESTLOGDIR and TESTDATADIR).

That's probably a good idea, I've amended the patch with that and also made the
CPAN mention of IPC::Run into a ulink like how it is in the Windows section in
passing. To avoid duplicating the info in the docs I made it into a sect2
which can be linked to. How about this version?

--
Daniel Gustafsson https://vmware.com/

Attachments:

regress_tap-v2.diffapplication/octet-stream; name=regress_tap-v2.diff; x-unix-mode=0644Download
diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml
index a1013d1280..747d7fe429 100644
--- a/doc/src/sgml/install-windows.sgml
+++ b/doc/src/sgml/install-windows.sgml
@@ -518,6 +518,11 @@ $ENV{PROVE_TESTS}='t/020*.pl t/010*.pl'
 </programlisting>
   </para>
 
+  <para>
+    Additionally, the behavior of TAP tests can be controlled by a set of
+    environment variables, see <xref linkend="regress-tap-vars" />.
+  </para>
+
   <para>
    Some of the TAP tests depend on a set of external commands that would
    optionally trigger tests related to them. Each one of those variables
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 8f032c4e7a..389d0e8631 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -765,7 +765,9 @@ make check PROVE_TESTS='t/001_test1.pl t/003_test3.pl'
 
    <para>
     The TAP tests require the Perl module <literal>IPC::Run</literal>.
-    This module is available from CPAN or an operating system package.
+    This module is available from
+    <ulink url="https://metacpan.org/release/IPC-Run">CPAN</ulink>
+    or an operating system package.
     They also require <productname>PostgreSQL</productname> to be
     configured with the option <option>--enable-tap-tests</option>.
    </para>
@@ -789,6 +791,30 @@ make check PROVE_TESTS='t/001_test1.pl t/003_test3.pl'
     meaning that <literal>make installcheck</literal> will produce a mix of
     results from temporary servers and the already-running test server.
    </para>
+
+  <sect2 id="regress-tap-vars">
+   <title>Environment variables</title>
+
+   <para>
+    Data directories are named according to the test filename, and will be
+    retained if a test fails.  If the environment variable
+    <varname>PG_TEST_NOCLEAN</varname> is set, data directories will be
+    retained regardless of test status.  For example, retaining the data
+    directory regardless of test results when running the
+    <application>pg_dump</application> tests:
+<programlisting>
+PG_TEST_NOCLEAN make -C src/bin/pg_dump check
+</programlisting>
+   </para>
+
+   <para>
+    Many operations in the test suites use a 180 second timeout, which on slow
+    hosts may lead to load-induced timeouts.  Setting the environment variable
+    <varname>PG_TEST_TIMEOUT_DEFAULT</varname> to a higher number will change
+    the default to avoid this.
+   </para>
+  </sect2>
+
   </sect1>
 
   <sect1 id="regress-coverage">
#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Gustafsson (#18)
Re: pg_basebackup's --gzip switch misbehaves

Daniel Gustafsson <daniel@yesql.se> writes:

How about this version?

This isn't correct shell syntax is it?

+PG_TEST_NOCLEAN make -C src/bin/pg_dump check

I think you meant

+PG_TEST_NOCLEAN=1 make -C src/bin/pg_dump check

or the like.

regards, tom lane

#20Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#19)
Re: pg_basebackup's --gzip switch misbehaves

On 14 Nov 2022, at 15:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Daniel Gustafsson <daniel@yesql.se> writes:

How about this version?

This isn't correct shell syntax is it?

+PG_TEST_NOCLEAN make -C src/bin/pg_dump check

I think you meant

+PG_TEST_NOCLEAN=1 make -C src/bin/pg_dump check

or the like.

Ugh, yes, that's what it should say.

--
Daniel Gustafsson https://vmware.com/

#21Michael Paquier
michael@paquier.xyz
In reply to: Daniel Gustafsson (#20)
Re: pg_basebackup's --gzip switch misbehaves

On Mon, Nov 14, 2022 at 03:27:14PM +0100, Daniel Gustafsson wrote:

Ugh, yes, that's what it should say.

A split sounds fine by me. On top of what Tom has mentioned, I have
spotted two small-ish things.

-    This module is available from CPAN or an operating system package.
+    This module is available from
+    <ulink url="https://metacpan.org/release/IPC-Run">CPAN</ulink>
+    or an operating system package.

It looks like there is a second one in install-windows.sgml.

+    Many operations in the test suites use a 180 second timeout, which on slow
Nit: s/180 second/180-second/?
--
Michael
#22Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Paquier (#21)
2 attachment(s)
Re: pg_basebackup's --gzip switch misbehaves

On 15 Nov 2022, at 00:58, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Nov 14, 2022 at 03:27:14PM +0100, Daniel Gustafsson wrote:

Ugh, yes, that's what it should say.

A split sounds fine by me. On top of what Tom has mentioned, I have
spotted two small-ish things.

-    This module is available from CPAN or an operating system package.
+    This module is available from
+    <ulink url="https://metacpan.org/release/IPC-Run">CPAN</ulink>
+    or an operating system package.

It looks like there is a second one in install-windows.sgml.

Not sure I follow. IPC::Run is already linked to with a ulink from that page
(albeit with an empty tag rendering the URL instead).

A related nitpick I found though is that metacpan has changed their URL
structure and these links now 301 redirect. The attached 0001 fixes that first
before applying the other part.

+ Many operations in the test suites use a 180 second timeout, which on slow
Nit: s/180 second/180-second/?

Ok.

--
Daniel Gustafsson https://vmware.com/

Attachments:

v3-0001-doc-update-metacpan.org-links-to-avoid-redirects.patchapplication/octet-stream; name=v3-0001-doc-update-metacpan.org-links-to-avoid-redirects.patch; x-unix-mode=0644Download
From 0ae445eb0f870678cd5cf33e4b6149ec7fb59a23 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Tue, 15 Nov 2022 10:34:17 +0100
Subject: [PATCH v3 1/2] doc: update metacpan.org links to avoid redirects

The /release/ links are redirected to /dist/ and /pod/release/ to
/release/../view/, so update our links accordingly to avoid 301
redirects.
---
 doc/src/sgml/external-projects.sgml      | 2 +-
 doc/src/sgml/install-windows.sgml        | 2 +-
 src/test/perl/PostgreSQL/Test/Cluster.pm | 2 +-
 src/tools/perlcheck/perlcriticrc         | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/external-projects.sgml b/doc/src/sgml/external-projects.sgml
index bf590aba5d..2d0fd723b2 100644
--- a/doc/src/sgml/external-projects.sgml
+++ b/doc/src/sgml/external-projects.sgml
@@ -65,7 +65,7 @@
       <entry>DBD::Pg</entry>
       <entry>Perl</entry>
       <entry>Perl DBI driver</entry>
-      <entry><ulink url="https://metacpan.org/release/DBD-Pg"></ulink></entry>
+      <entry><ulink url="https://metacpan.org/dist/DBD-Pg"></ulink></entry>
      </row>
 
      <row>
diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml
index a1013d1280..342178bcc2 100644
--- a/doc/src/sgml/install-windows.sgml
+++ b/doc/src/sgml/install-windows.sgml
@@ -488,7 +488,7 @@ $ENV{CONFIG}="Debug";
       ActiveState Perl installation, nor in the ActiveState Perl Package
       Manager (PPM) library. To install, download the
       <filename>IPC-Run-&lt;version&gt;.tar.gz</filename> source archive from CPAN,
-      at <ulink url="https://metacpan.org/release/IPC-Run"></ulink>, and
+      at <ulink url="https://metacpan.org/dist/IPC-Run"></ulink>, and
       uncompress. Edit the <filename>buildenv.pl</filename> file, and add a PERL5LIB
       variable to point to the <filename>lib</filename> subdirectory from the
       extracted archive. For example:
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index c7531c5efb..b5a24976e7 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -1826,7 +1826,7 @@ sub psql
 	# and set the flag.  Otherwise, and for any other exception, rethrow.
 	#
 	# For background, see
-	# https://metacpan.org/pod/release/ETHER/Try-Tiny-0.24/lib/Try/Tiny.pm
+	# https://metacpan.org/release/ETHER/Try-Tiny-0.24/view/lib/Try/Tiny.pm
 	do
 	{
 		local $@;
diff --git a/src/tools/perlcheck/perlcriticrc b/src/tools/perlcheck/perlcriticrc
index 9267fb43b2..49ac9ee52b 100644
--- a/src/tools/perlcheck/perlcriticrc
+++ b/src/tools/perlcheck/perlcriticrc
@@ -13,7 +13,7 @@ theme = core
 # print the policy name as well as the normal output
 verbose = %f: %m at line %l, column %c.  %e.  ([%p] Severity: %s)\n
 
-# Note: for policy descriptions see https://metacpan.org/release/Perl-Critic
+# Note: for policy descriptions see https://metacpan.org/dist/Perl-Critic
 
 
 # allow octal constants with leading zeros
-- 
2.32.1 (Apple Git-133)

v3-0002-doc-document-the-TAP-test-environment-variables.patchapplication/octet-stream; name=v3-0002-doc-document-the-TAP-test-environment-variables.patch; x-unix-mode=0644Download
From b9554217f1b20405bb22c41d2db80e3cb015f7de Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Tue, 15 Nov 2022 10:44:26 +0100
Subject: [PATCH v3 2/2] doc: document the TAP test environment variables

The TAP tests can, to some degree, be controlled by a set of environment
variables. These were however only documented in a README and not in the
main documentation.  This adds documentation of these variables, as well
as changes one CPAN reference to a ulink for consistency.

Discussion: https://postgr.es/m/YyPd9unV14SX2bLF@paquier.xyz
---
 doc/src/sgml/install-windows.sgml |  5 +++++
 doc/src/sgml/regress.sgml         | 28 +++++++++++++++++++++++++++-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml
index 342178bcc2..9d6ae0f16e 100644
--- a/doc/src/sgml/install-windows.sgml
+++ b/doc/src/sgml/install-windows.sgml
@@ -518,6 +518,11 @@ $ENV{PROVE_TESTS}='t/020*.pl t/010*.pl'
 </programlisting>
   </para>
 
+  <para>
+    Additionally, the behavior of TAP tests can be controlled by a set of
+    environment variables, see <xref linkend="regress-tap-vars" />.
+  </para>
+
   <para>
    Some of the TAP tests depend on a set of external commands that would
    optionally trigger tests related to them. Each one of those variables
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 8f032c4e7a..23ea93a387 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -765,7 +765,9 @@ make check PROVE_TESTS='t/001_test1.pl t/003_test3.pl'
 
    <para>
     The TAP tests require the Perl module <literal>IPC::Run</literal>.
-    This module is available from CPAN or an operating system package.
+    This module is available from
+    <ulink url="https://metacpan.org/dist/IPC-Run">CPAN</ulink>
+    or an operating system package.
     They also require <productname>PostgreSQL</productname> to be
     configured with the option <option>--enable-tap-tests</option>.
    </para>
@@ -789,6 +791,30 @@ make check PROVE_TESTS='t/001_test1.pl t/003_test3.pl'
     meaning that <literal>make installcheck</literal> will produce a mix of
     results from temporary servers and the already-running test server.
    </para>
+
+  <sect2 id="regress-tap-vars">
+   <title>Environment variables</title>
+
+   <para>
+    Data directories are named according to the test filename, and will be
+    retained if a test fails.  If the environment variable
+    <varname>PG_TEST_NOCLEAN</varname> is set, data directories will be
+    retained regardless of test status.  For example, retaining the data
+    directory regardless of test results when running the
+    <application>pg_dump</application> tests:
+<programlisting>
+PG_TEST_NOCLEAN=1 make -C src/bin/pg_dump check
+</programlisting>
+   </para>
+
+   <para>
+    Many operations in the test suites use a 180-second timeout, which on slow
+    hosts may lead to load-induced timeouts.  Setting the environment variable
+    <varname>PG_TEST_TIMEOUT_DEFAULT</varname> to a higher number will change
+    the default to avoid this.
+   </para>
+  </sect2>
+
   </sect1>
 
   <sect1 id="regress-coverage">
-- 
2.32.1 (Apple Git-133)

#23Michael Paquier
michael@paquier.xyz
In reply to: Daniel Gustafsson (#22)
Re: pg_basebackup's --gzip switch misbehaves

On Tue, Nov 15, 2022 at 11:09:54AM +0100, Daniel Gustafsson wrote:

On 15 Nov 2022, at 00:58, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Nov 14, 2022 at 03:27:14PM +0100, Daniel Gustafsson wrote:

Ugh, yes, that's what it should say.

A split sounds fine by me. On top of what Tom has mentioned, I have
spotted two small-ish things.

-    This module is available from CPAN or an operating system package.
+    This module is available from
+    <ulink url="https://metacpan.org/release/IPC-Run">CPAN</ulink>
+    or an operating system package.

It looks like there is a second one in install-windows.sgml.

Not sure I follow. IPC::Run is already linked to with a ulink from that page
(albeit with an empty tag rendering the URL instead).

Ah, I did not notice that there was already a link to that with
IPC::Run. Anyway, shouldn't CPAN be marked at least as an <acronym>
if we are not going to use a link on it? acronyms.sgml lists it, just
saying.

A related nitpick I found though is that metacpan has changed their URL
structure and these links now 301 redirect. The attached 0001 fixes that first
before applying the other part.

WFM.
--
Michael

#24Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Paquier (#23)
Re: pg_basebackup's --gzip switch misbehaves

On 16 Nov 2022, at 02:02, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Nov 15, 2022 at 11:09:54AM +0100, Daniel Gustafsson wrote:

On 15 Nov 2022, at 00:58, Michael Paquier <michael@paquier.xyz> wrote:

It looks like there is a second one in install-windows.sgml.

Not sure I follow. IPC::Run is already linked to with a ulink from that page
(albeit with an empty tag rendering the URL instead).

Ah, I did not notice that there was already a link to that with
IPC::Run. Anyway, shouldn't CPAN be marked at least as an <acronym>
if we are not going to use a link on it? acronyms.sgml lists it, just
saying.

Fair enough. I fixed that and applied this to HEAD.

--
Daniel Gustafsson https://vmware.com/