Re: Add LZ4 compression in pg_dump

Started by Justin Pryzbyalmost 3 years ago10 messages
#1Justin Pryzby
pryzby@telsasoft.com

On Mon, Feb 27, 2023 at 02:33:04PM +0000, gkokolatos@pm.me wrote:

- Finally, the "Nothing to do in the default case" comment comes from
Michael's commit 5e73a6048:

+ /*
+ * Custom and directory formats are compressed by default with gzip when
+ * available, not the others.
+ /
+ if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+ !user_compression_defined)
{
#ifdef HAVE_LIBZ
- if (archiveFormat == archCustom || archiveFormat == archDirectory)
- compressLevel = Z_DEFAULT_COMPRESSION;
- else
+ parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+ &compression_spec);
+#else
+ / Nothing to do in the default case */
#endif
- compressLevel = 0;
}

As the comment says: for -Fc and -Fd, the compression is set to zlib, if
enabled, and when not otherwise specified by the user.

Before 5e73a6048, this set compressLevel=0 for -Fp and -Ft, and when
zlib was unavailable.

But I'm not sure why there's now an empty "#else". I also don't know
what "the default case" refers to.

Maybe the best thing here is to move the preprocessor #if, since it's no
longer in the middle of a runtime conditional:

#ifdef HAVE_LIBZ
+ if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+ !user_compression_defined)
+ parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+ &compression_spec);
#endif

...but that elicits a warning about "variable set but not used"...

Not sure, I need to think about this a bit.

/* Nothing to do for the default case when LIBZ is not available */
is easier to understand.

Maybe I would write it as: "if zlib is unavailable, default to no
compression". But I think that's best done in the leading comment, and
not inside an empty preprocessor #else.

I was hoping Michael would comment on this.
The placement and phrasing of the comment makes no sense to me.

--
Justin

#2Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#1)
1 attachment(s)

On Tue, Apr 11, 2023 at 07:41:11PM -0500, Justin Pryzby wrote:

Maybe I would write it as: "if zlib is unavailable, default to no
compression". But I think that's best done in the leading comment, and
not inside an empty preprocessor #else.

I was hoping Michael would comment on this.

(Sorry for the late reply, somewhat missed that.)

The placement and phrasing of the comment makes no sense to me.

Yes, this comment gives no value as it stands. I would be tempted to
follow the suggestion to group the whole code block in a single ifdef,
including the check, and remove this comment. Like the attached
perhaps?
--
Michael

Attachments:

pgdump-comment.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 967ced4eed..b7dda5ab27 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -753,16 +753,14 @@ main(int argc, char **argv)
 	 * Custom and directory formats are compressed by default with gzip when
 	 * available, not the others.
 	 */
+#ifdef HAVE_LIBZ
 	if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
 		!user_compression_defined)
 	{
-#ifdef HAVE_LIBZ
 		parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
 									 &compression_spec);
-#else
-		/* Nothing to do in the default case */
-#endif
 	}
+#endif
 
 	/*
 	 * If emitting an archive format, we always want to emit a DATABASE item,
#3Justin Pryzby
pryzby@telsasoft.com
In reply to: Michael Paquier (#2)

On Wed, Apr 12, 2023 at 10:07:08AM +0900, Michael Paquier wrote:

On Tue, Apr 11, 2023 at 07:41:11PM -0500, Justin Pryzby wrote:

Maybe I would write it as: "if zlib is unavailable, default to no
compression". But I think that's best done in the leading comment, and
not inside an empty preprocessor #else.

I was hoping Michael would comment on this.

(Sorry for the late reply, somewhat missed that.)

The placement and phrasing of the comment makes no sense to me.

Yes, this comment gives no value as it stands. I would be tempted to
follow the suggestion to group the whole code block in a single ifdef,
including the check, and remove this comment. Like the attached
perhaps?

+1

#4Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#3)
1 attachment(s)

On Tue, Apr 11, 2023 at 08:19:59PM -0500, Justin Pryzby wrote:

On Wed, Apr 12, 2023 at 10:07:08AM +0900, Michael Paquier wrote:

Yes, this comment gives no value as it stands. I would be tempted to
follow the suggestion to group the whole code block in a single ifdef,
including the check, and remove this comment. Like the attached
perhaps?

+1

Let me try this one again, as the previous patch would cause a warning
under --without:-zlib as user_compression_defined would be unused. We
could do something like the attached instead. It means doing twice
parse_compress_specification() for the non-zlib path, still we are
already doing so for the zlib path.

If there are other ideas, feel free.
--
Michael

Attachments:

pgdump-comment-2.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 967ced4eed..596f64ed2c 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -751,17 +751,21 @@ main(int argc, char **argv)
 
 	/*
 	 * Custom and directory formats are compressed by default with gzip when
-	 * available, not the others.
+	 * available if no compression specification has been specified, not the
+	 * others.
 	 */
-	if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
-		!user_compression_defined)
+	if (!user_compression_defined)
 	{
+		if (archiveFormat == archCustom || archiveFormat == archDirectory)
+		{
 #ifdef HAVE_LIBZ
-		parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
-									 &compression_spec);
+			parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+										 &compression_spec);
 #else
-		/* Nothing to do in the default case */
+			parse_compress_specification(PG_COMPRESSION_NONE, NULL,
+										 &compression_spec);
 #endif
+		}
 	}
 
 	/*
#5Justin Pryzby
pryzby@telsasoft.com
In reply to: Michael Paquier (#4)

On Thu, Apr 13, 2023 at 07:23:48AM +0900, Michael Paquier wrote:

On Tue, Apr 11, 2023 at 08:19:59PM -0500, Justin Pryzby wrote:

On Wed, Apr 12, 2023 at 10:07:08AM +0900, Michael Paquier wrote:

Yes, this comment gives no value as it stands. I would be tempted to
follow the suggestion to group the whole code block in a single ifdef,
including the check, and remove this comment. Like the attached
perhaps?

+1

Let me try this one again, as the previous patch would cause a warning
under --without:-zlib as user_compression_defined would be unused. We
could do something like the attached instead. It means doing twice
parse_compress_specification() for the non-zlib path, still we are
already doing so for the zlib path.

If there are other ideas, feel free.

I don't think you need to call parse_compress_specification(NONE).
As you wrote it, if zlib is unavailable, there's no parse(NONE) call,
even for directory and custom formats. And there's no parse(NONE) call
for plan format when zlib is available.

The old way had preprocessor #if around both the "if" and "else" - is
that what you meant ?

If you don't insist on calling parse(NONE), the only change is to remove
the empty #else, which was my original patch.

"if no compression specification has been specified" is redundant with
"by default", and causes "not the others" to dangle.

If I were to rewrite the comment, it'd say:

+        * When gzip is available, custom and directory formats are compressed by
+        * default
#6Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#5)

On Wed, Apr 12, 2023 at 05:52:40PM -0500, Justin Pryzby wrote:

I don't think you need to call parse_compress_specification(NONE).
As you wrote it, if zlib is unavailable, there's no parse(NONE) call,
even for directory and custom formats. And there's no parse(NONE) call
for plan format when zlib is available.

Yeah, that's not necessary, but I was wondering if it made the code a
bit cleaner, or else the non-zlib path would rely on the default
compression method string.

The old way had preprocessor #if around both the "if" and "else" - is
that what you meant?

If you don't insist on calling parse(NONE), the only change is to remove
the empty #else, which was my original patch.

Removing the empty else has as problem to create an empty if block,
which could be itself a cause of warnings?

If I were to rewrite the comment, it'd say:

+        * When gzip is available, custom and directory formats are compressed by
+        * default

Okay.
--
Michael

#7Justin Pryzby
pryzby@telsasoft.com
In reply to: Michael Paquier (#6)

On Thu, Apr 13, 2023 at 09:37:06AM +0900, Michael Paquier wrote:

If you don't insist on calling parse(NONE), the only change is to remove
the empty #else, which was my original patch.

Removing the empty else has as problem to create an empty if block,
which could be itself a cause of warnings?

I doubt it - in the !HAVE_LIBZ case, it's currently an "if" statement
with nothing but a comment, which isn't a problem.

I think the only issue with an empty "if" is when you have no braces,
like:

if (...)
#if ...
something;
#endif

// problem here //

--
Justin

#8Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#7)
1 attachment(s)

On Wed, Apr 12, 2023 at 07:53:53PM -0500, Justin Pryzby wrote:

I doubt it - in the !HAVE_LIBZ case, it's currently an "if" statement
with nothing but a comment, which isn't a problem.

I think the only issue with an empty "if" is when you have no braces,
like:

if (...)
#if ...
something;
#endif

// problem here //

(My apologies for the late reply.)

Still it could be easily messed up, and that's not a style that
really exists in the tree, either, because there are always #else
blocks set up in such cases. Another part that makes me a bit
uncomfortable is that we would still call twice
parse_compress_specification(), something that should not happen but
we are doing so on HEAD because the default compression_algorithm_str
is "none" and we want to enforce "gzip" for custom and directory
formats when building with zlib.

What about just moving this block a bit up, just before the
compression spec parsing, then? If we set compression_algorithm_str,
the specification is compiled with the expected default, once instead
of twice.
--
Michael

Attachments:

pgdump-comment-3.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 058244cd17..e4f32d170f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -721,6 +721,21 @@ main(int argc, char **argv)
 	if (archiveFormat == archNull)
 		plainText = 1;
 
+	/*
+	 * Custom and directory formats are compressed by default with gzip when
+	 * available, not the others.  If gzip is not available, no compression
+	 * is done by default.
+	 */
+	if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+		!user_compression_defined)
+	{
+#ifdef HAVE_LIBZ
+		compression_algorithm_str = "gzip";
+#else
+		compression_algorithm_str = "none";
+#endif
+	}
+
 	/*
 	 * Compression options
 	 */
@@ -749,21 +764,6 @@ main(int argc, char **argv)
 		pg_log_warning("compression option \"%s\" is not currently supported by pg_dump",
 					   "workers");
 
-	/*
-	 * Custom and directory formats are compressed by default with gzip when
-	 * available, not the others.
-	 */
-	if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
-		!user_compression_defined)
-	{
-#ifdef HAVE_LIBZ
-		parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
-									 &compression_spec);
-#else
-		/* Nothing to do in the default case */
-#endif
-	}
-
 	/*
 	 * If emitting an archive format, we always want to emit a DATABASE item,
 	 * in case --create is specified at pg_restore time.
#9Noname
gkokolatos@pm.me
In reply to: Michael Paquier (#8)

------- Original Message -------
On Tuesday, April 25th, 2023 at 8:02 AM, Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Apr 12, 2023 at 07:53:53PM -0500, Justin Pryzby wrote:

I doubt it - in the !HAVE_LIBZ case, it's currently an "if" statement
with nothing but a comment, which isn't a problem.

I think the only issue with an empty "if" is when you have no braces,
like:

if (...)
#if ...
something;
#endif

// problem here //

(My apologies for the late reply.)

Still it could be easily messed up, and that's not a style that
really exists in the tree, either, because there are always #else
blocks set up in such cases. Another part that makes me a bit
uncomfortable is that we would still call twice
parse_compress_specification(), something that should not happen but
we are doing so on HEAD because the default compression_algorithm_str
is "none" and we want to enforce "gzip" for custom and directory
formats when building with zlib.

What about just moving this block a bit up, just before the
compression spec parsing, then? If we set compression_algorithm_str,
the specification is compiled with the expected default, once instead
of twice.

For what is worth, I think this would be the best approach. +1

Cheers,
//Georgios

Show quoted text

--
Michael

#10Michael Paquier
michael@paquier.xyz
In reply to: Noname (#9)

On Wed, Apr 26, 2023 at 08:50:46AM +0000, gkokolatos@pm.me wrote:

For what is worth, I think this would be the best approach. +1

Thanks. I have gone with that, then!
--
Michael