Introduce pg_receivewal gzip compression tests
Hi,
As suggested on a different thread [1]/messages/by-id/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E=@pm.me, pg_receivewal can increase it's test
coverage. There exists a non trivial amount of code that handles gzip
compression. The current patch introduces tests that cover creation of gzip
compressed WAL files and the handling of gzip partial segments. Finally the
integrity of the compressed files is verified.
I hope you find this useful.
Cheers,
//Georgios
Attachments:
v1-0001-Introduce-pg_receivewal-gzip-compression-tests.patchapplication/octet-stream; name=v1-0001-Introduce-pg_receivewal-gzip-compression-tests.patchDownload
From f1ebb48ad964d2883f92c9ec50ac7e81f3ba8809 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 9 Jul 2021 11:10:12 +0000
Subject: Introduce pg_receivewal gzip compression tests
There exists a non trivial amount of code that handles gzip compression. The
current patch introduces tests that cover creation of gzip compressed WAL files
and the handling of the partial segments. Also the integrity of the compressed
files is verified.
---
src/bin/pg_basebackup/t/020_pg_receivewal.pl | 58 +++++++++++++++++++-
1 file changed, 57 insertions(+), 1 deletion(-)
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index a547c97ef1..90c97b104d 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -5,7 +5,7 @@ use strict;
use warnings;
use TestLib;
use PostgresNode;
-use Test::More tests => 19;
+use Test::More tests => 25;
program_help_ok('pg_receivewal');
program_version_ok('pg_receivewal');
@@ -66,6 +66,62 @@ $primary->command_ok(
],
'streaming some WAL with --synchronous');
+# Check gzip compression if available
+SKIP:
+{
+ skip "postgres was not build with gzip support", 6
+ if (!check_pg_config("#define HAVE_LIBZ 1"));
+
+ # Generate some WAL
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+ $nextlsn = $primary->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn();');
+ chomp($nextlsn);
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(100,200));');
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '-Z', '5'
+ ],
+ "streaming some WAL using level 5 gzip compression");
+
+ # Verify that the stored file is compressed
+ my @gzip_wals = glob "$stream_dir/*.gz";
+ is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
+
+ # Verify compressed file's integrity
+ my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
+ is($gzip_is_valid, 0, "program gzip verified file's integrity");
+
+ # There should be one .gz partial file
+ my @gzip_partial_wals = glob "$stream_dir/*.gz.partial";
+ is (scalar(@gzip_partial_wals), 1,
+ "one partial gzip compressed WAL was created");
+
+ # Generate some more WAL
+ $nextlsn = $primary->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn();');
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(200,300));');
+ chomp($nextlsn);
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '--compress', '1'
+ ],
+ "streaming some WAL using level 1 gzip compression");
+
+ # The .gz.partial file should now be complete
+ $gzip_partial_wals[0] =~ s/\.partial$//;
+ ok(-e $gzip_partial_wals[0],
+ "check that previously partial gzip compressed WAL is now complete");
+}
+
# Permissions on WAL files should be default
SKIP:
{
--
2.25.1
On Fri, Jul 09, 2021 at 11:26:58AM +0000, Georgios wrote:
As suggested on a different thread [1], pg_receivewal can increase it's test
coverage. There exists a non trivial amount of code that handles gzip
compression. The current patch introduces tests that cover creation of gzip
compressed WAL files and the handling of gzip partial segments. Finally the
integrity of the compressed files is verified.
+ # Verify compressed file's integrity
+ my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
+ is($gzip_is_valid, 0, "program gzip verified file's integrity");
libz and gzip are usually split across different packages, hence there
is no guarantee that this command is always available (same comment as
for LZ4 from a couple of days ago).
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '-Z', '5'
+ ],
I would keep the compression level to a minimum here, to limit CPU
usage but still compress something faster.
+ # Verify compressed file's integrity
+ my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
+ is($gzip_is_valid, 0, "program gzip verified file's integrity");
Shouldn't this be coded as a loop going through @gzip_wals?
--
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 08:42, Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Jul 09, 2021 at 11:26:58AM +0000, Georgios wrote:
As suggested on a different thread [1], pg_receivewal can increase it's test
coverage. There exists a non trivial amount of code that handles gzip
compression. The current patch introduces tests that cover creation of gzip
compressed WAL files and the handling of gzip partial segments. Finally the
integrity of the compressed files is verified.
- # Verify compressed file's integrity
- my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
- is($gzip_is_valid, 0, "program gzip verified file's integrity");
libz and gzip are usually split across different packages, hence there
is no guarantee that this command is always available (same comment as
for LZ4 from a couple of days ago).
Of course. Though while going for it, I did find in Makefile.global.in:
TAR = @TAR@
XGETTEXT = @XGETTEXT@
GZIP = gzip
BZIP2 = bzip2
DOWNLOAD = wget -O $@ --no-use-server-timestamps
Which is also used by GNUmakefile.in
distcheck: dist
rm -rf $(dummy)
mkdir $(dummy)
$(GZIP) -d -c $(distdir).tar.gz | $(TAR) xf -
install_prefix=`cd $(dummy) && pwd`; \
This to my understanding means that gzip is expected to exist.
If this is correct, then simply checking for the headers should
suffice, since that is the only dependency for the files to be
created.
If this is wrong, then I will add the discovery code as in the
other patch.
- [
- 'pg_receivewal', '-D', $stream_dir, '--verbose',
- '--endpos', $nextlsn, '-Z', '5'
- ],
I would keep the compression level to a minimum here, to limit CPU
usage but still compress something faster.
- # Verify compressed file's integrity
- my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
- is($gzip_is_valid, 0, "program gzip verified file's integrity");
Shouldn't this be coded as a loop going through @gzip_wals?
I would hope that there is only one gz file created. There is a line
further up that tests exactly that.
+ is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
Then there should also be a partial gz file which is tested further ahead.
Cheers,
//Georgios
Show quoted text
-----------------------------------------------------------
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 11:42, <gkokolatos@pm.me> wrote:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 08:42, Michael Paquier michael@paquier.xyz wrote:
On Fri, Jul 09, 2021 at 11:26:58AM +0000, Georgios wrote:
As suggested on a different thread [1], pg_receivewal can increase it's test
coverage. There exists a non trivial amount of code that handles gzip
compression. The current patch introduces tests that cover creation of gzip
compressed WAL files and the handling of gzip partial segments. Finally the
integrity of the compressed files is verified.
- # Verify compressed file's integrity
- my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
- is($gzip_is_valid, 0, "program gzip verified file's integrity");
libz and gzip are usually split across different packages, hence there
is no guarantee that this command is always available (same comment as
for LZ4 from a couple of days ago).
Of course. Though while going for it, I did find in Makefile.global.in:
TAR = @TAR@
XGETTEXT = @XGETTEXT@
GZIP = gzip
BZIP2 = bzip2
DOWNLOAD = wget -O $@ --no-use-server-timestamps
Which is also used by GNUmakefile.in
distcheck: dist
rm -rf $(dummy)
mkdir $(dummy)
$(GZIP) -d -c $(distdir).tar.gz | $(TAR) xf -
install_prefix=`cd $(dummy) && pwd`; \
This to my understanding means that gzip is expected to exist.
If this is correct, then simply checking for the headers should
suffice, since that is the only dependency for the files to be
created.
If this is wrong, then I will add the discovery code as in the
other patch.
- [
- 'pg_receivewal', '-D', $stream_dir, '--verbose',
- '--endpos', $nextlsn, '-Z', '5'
- ],
I would keep the compression level to a minimum here, to limit CPU
usage but still compress something faster.
- # Verify compressed file's integrity
- my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
- is($gzip_is_valid, 0, "program gzip verified file's integrity");
Shouldn't this be coded as a loop going through @gzip_wals?
I would hope that there is only one gz file created. There is a line
further up that tests exactly that.
- is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
Let me amend that. The line should be instead:
is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
To properly test that there is one entry.
Let me provide with v2 to fix this.
Cheers,
//Georgios
Show quoted text
Then there should also be a partial gz file which is tested further ahead.
Cheers,
//Georgios
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 11:56, <gkokolatos@pm.me> wrote:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 11:42, gkokolatos@pm.me wrote:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 08:42, Michael Paquier michael@paquier.xyz wrote:
On Fri, Jul 09, 2021 at 11:26:58AM +0000, Georgios wrote:
As suggested on a different thread [1], pg_receivewal can increase it's test
coverage. There exists a non trivial amount of code that handles gzip
compression. The current patch introduces tests that cover creation of gzip
compressed WAL files and the handling of gzip partial segments. Finally the
integrity of the compressed files is verified.
- # Verify compressed file's integrity
- my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
- is($gzip_is_valid, 0, "program gzip verified file's integrity");
libz and gzip are usually split across different packages, hence there
is no guarantee that this command is always available (same comment as
for LZ4 from a couple of days ago).
Of course. Though while going for it, I did find in Makefile.global.in:
TAR = @TAR@
XGETTEXT = @XGETTEXT@
GZIP = gzip
BZIP2 = bzip2
DOWNLOAD = wget -O $@ --no-use-server-timestamps
Which is also used by GNUmakefile.in
distcheck: dist
rm -rf $(dummy)
mkdir $(dummy)
$(GZIP) -d -c $(distdir).tar.gz | $(TAR) xf -
install_prefix=`cd $(dummy) && pwd`; \
This to my understanding means that gzip is expected to exist.
If this is correct, then simply checking for the headers should
suffice, since that is the only dependency for the files to be
created.
If this is wrong, then I will add the discovery code as in the
other patch.
- [
- 'pg_receivewal', '-D', $stream_dir, '--verbose',
- '--endpos', $nextlsn, '-Z', '5'
- ],
I would keep the compression level to a minimum here, to limit CPU
usage but still compress something faster.
- # Verify compressed file's integrity
- my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
- is($gzip_is_valid, 0, "program gzip verified file's integrity");
Shouldn't this be coded as a loop going through @gzip_wals?
I would hope that there is only one gz file created. There is a line
further up that tests exactly that.
- is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
Let me amend that. The line should be instead:
is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
To properly test that there is one entry.
Let me provide with v2 to fix this.
Please find v2 attached with the above.
Cheers,
//Georgios
Show quoted text
Cheers,
//Georgios
Then there should also be a partial gz file which is tested further ahead.
Cheers,
//Georgios
Michael
Attachments:
v2-0001-Introduce-pg_receivewal-gzip-compression-tests.patchapplication/octet-stream; name=v2-0001-Introduce-pg_receivewal-gzip-compression-tests.patchDownload
From 227b554bcba87714383fdeeba43044e466373aeb Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 12 Jul 2021 10:14:08 +0000
Subject: [PATCH v2] Introduce pg_receivewal gzip compression tests
There exists a non trivial amount of code that handles gzip compression. The
current patch introduces tests that cover creation of gzip compressed WAL files
and the handling of the partial segments. Also the integrity of the compressed
files is verified.
---
src/bin/pg_basebackup/t/020_pg_receivewal.pl | 58 +++++++++++++++++++-
1 file changed, 57 insertions(+), 1 deletion(-)
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index a547c97ef1..29e5dbdf8e 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -5,7 +5,7 @@ use strict;
use warnings;
use TestLib;
use PostgresNode;
-use Test::More tests => 19;
+use Test::More tests => 25;
program_help_ok('pg_receivewal');
program_version_ok('pg_receivewal');
@@ -66,6 +66,62 @@ $primary->command_ok(
],
'streaming some WAL with --synchronous');
+# Check gzip compression if available
+SKIP:
+{
+ skip "postgres was not build with gzip support", 6
+ if (!check_pg_config("#define HAVE_LIBZ 1"));
+
+ # Generate some WAL
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+ $nextlsn = $primary->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn();');
+ chomp($nextlsn);
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(100,200));');
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '-Z', '5'
+ ],
+ "streaming some WAL using level 5 gzip compression");
+
+ # Verify that the stored file is compressed
+ my @gzip_wals = glob "$stream_dir/*.gz";
+ is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
+
+ # Verify compressed file's integrity
+ my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
+ is($gzip_is_valid, 0, "program gzip verified file's integrity");
+
+ # There should be one .gz partial file
+ my @gzip_partial_wals = glob "$stream_dir/*.gz.partial";
+ is (scalar(keys @gzip_partial_wals), 1,
+ "one partial gzip compressed WAL was created");
+
+ # Generate some more WAL
+ $nextlsn = $primary->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn();');
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(200,300));');
+ chomp($nextlsn);
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '--compress', '1'
+ ],
+ "streaming some WAL using level 1 gzip compression");
+
+ # The .gz.partial file should now be complete
+ $gzip_partial_wals[0] =~ s/\.partial$//;
+ ok(-e $gzip_partial_wals[0],
+ "check that previously partial gzip compressed WAL is now complete");
+}
+
# Permissions on WAL files should be default
SKIP:
{
--
2.25.1
Le 12/07/2021 à 12:27, gkokolatos@pm.me a écrit :
Shouldn't this be coded as a loop going through @gzip_wals?
I would hope that there is only one gz file created. There is a line
further up that tests exactly that.
- is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
Let me amend that. The line should be instead:
is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
To properly test that there is one entry.
Let me provide with v2 to fix this.
The following tests are not correct in Perl even if Perl returns the
right value.
+ is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
+ is (scalar(keys @gzip_partial_wals), 1,
+ "one partial gzip compressed WAL was created");
Function keys or values are used only with hashes but here you are using
arrays. To obtain the length of the array you can just use the scalar
function as Perl returns the length of the array when it is called in a
scalar context. Please use the following instead:
+ is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
+ is (scalar(@gzip_partial_wals), 1,
+ "one partial gzip compressed WAL was created");
--
Gilles Darold
http://www.darold.net/
On Mon, Jul 12, 2021 at 09:42:32AM +0000, gkokolatos@pm.me wrote:
This to my understanding means that gzip is expected to exist.
If this is correct, then simply checking for the headers should
suffice, since that is the only dependency for the files to be
created.
You cannot expect this to work on Windows when it comes to MSVC for
example, as gzip may not be in the environment PATH so the test would
fail hard. Let's just rely on $ENV{GZIP} instead, and skip the test
if it is not defined.
--
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 13:00, Gilles Darold <gilles@darold.net> wrote:
Le 12/07/2021 à 12:27, gkokolatos@pm.me a écrit :
Shouldn't this be coded as a loop going through @gzip_wals?
I would hope that there is only one gz file created. There is a line
further up that tests exactly that.
- is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
Let me amend that. The line should be instead:
is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
To properly test that there is one entry.
Let me provide with v2 to fix this.
The following tests are not correct in Perl even if Perl returns the
right value.
+ is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
+ is (scalar(keys @gzip_partial_wals), 1,
+ "one partial gzip compressed WAL was created");
Function keys or values are used only with hashes but here you are using
arrays. To obtain the length of the array you can just use the scalar
function as Perl returns the length of the array when it is called in a
scalar context. Please use the following instead:
+ is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
+ is (scalar(@gzip_partial_wals), 1,
+ "one partial gzip compressed WAL was created");
You are absolutely correct. I had used that in v1, yet since it got called out
I doubted myself, assumed I was wrong and the rest is history. I shall ament the
amendment for v3 of the patch.
Cheers,
//Georgios
Show quoted text
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Gilles Darold
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 13:04, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Jul 12, 2021 at 09:42:32AM +0000, gkokolatos@pm.me wrote:
This to my understanding means that gzip is expected to exist.
If this is correct, then simply checking for the headers should
suffice, since that is the only dependency for the files to be
created.
You cannot expect this to work on Windows when it comes to MSVC for
example, as gzip may not be in the environment PATH so the test would
fail hard. Let's just rely on $ENV{GZIP} instead, and skip the test
if it is not defined.
I am admittedly not so well versed on Windows systems. Thank you for
informing me.
Please find attached v3 of the patch where $ENV{GZIP_PROGRAM} is used
instead. To the best of my knowledge one should avoid using $ENV{GZIP}
because that would translate to the obsolete, yet used environment
variable GZIP which holds a set of default options for gzip. In essence
it would be equivalent to executing:
GZIP=gzip gzip --test <files>
which can result to errors similar to:
gzip: gzip: non-option in GZIP environment variable
Cheers,
//Georgios
Show quoted text
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Michael
Attachments:
v3-0001-Introduce-pg_receivewal-gzip-compression-tests.patchapplication/octet-stream; name=v3-0001-Introduce-pg_receivewal-gzip-compression-tests.patchDownload
From 37ca54e7534eaf7a3c8b5ed811e6efae901f97b4 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 12 Jul 2021 14:55:11 +0000
Subject: [PATCH v3] Introduce pg_receivewal gzip compression tests
There exists a non trivial amount of code that handles gzip compression. The
current patch introduces tests that cover creation of gzip compressed WAL files
and the handling of the partial segments. Also the integrity of the compressed
files is verified.
---
src/bin/pg_basebackup/Makefile | 3 +-
src/bin/pg_basebackup/t/020_pg_receivewal.pl | 64 +++++++++++++++++++-
2 files changed, 65 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index 66e0070f1a..56fc12efa3 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -18,8 +18,9 @@ subdir = src/bin/pg_basebackup
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-# make this available to TAP test scripts
+# make these available to TAP test scripts
export TAR
+export GZIP_PROGRAM=$(GZIP)
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index a547c97ef1..0f7acbd7f4 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -5,7 +5,7 @@ use strict;
use warnings;
use TestLib;
use PostgresNode;
-use Test::More tests => 19;
+use Test::More tests => 25;
program_help_ok('pg_receivewal');
program_version_ok('pg_receivewal');
@@ -66,6 +66,68 @@ $primary->command_ok(
],
'streaming some WAL with --synchronous');
+# Check gzip compression if available
+SKIP:
+{
+ skip "postgres was not build with gzip support", 6
+ if (!check_pg_config("#define HAVE_LIBZ 1"));
+
+ # Generate some WAL
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+ $nextlsn = $primary->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn();');
+ chomp($nextlsn);
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(100,200));');
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '-Z', '5'
+ ],
+ "streaming some WAL using level 5 gzip compression");
+
+ # Verify that the stored file is compressed
+ my @gzip_wals = glob "$stream_dir/*.gz";
+ is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
+
+ # There should be one .gz partial file
+ my @gzip_partial_wals = glob "$stream_dir/*.gz.partial";
+ is (scalar(@gzip_partial_wals), 1,
+ "one partial gzip compressed WAL was created");
+
+ # Generate some more WAL
+ $nextlsn = $primary->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn();');
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(200,300));');
+ chomp($nextlsn);
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '--compress', '1'
+ ],
+ "streaming some WAL using level 1 gzip compression");
+
+ # The .gz.partial file should now be complete
+ $gzip_partial_wals[0] =~ s/\.partial$//;
+ ok(-e $gzip_partial_wals[0],
+ "check that previously partial gzip compressed WAL is now complete");
+
+ # Finally verify compressed files's integrity
+ my $gzip_program = $ENV{GZIP_PROGRAM};
+ skip "program gzip is not found in your system", 1
+ if (!defined $gzip_program || $gzip_program eq '');
+
+ # There are two gzip compressed files, test both
+ push(@gzip_wals, @gzip_partial_wals);
+ my $gzip_is_valid = system_log($gzip_program, '--test', @gzip_wals);
+ is($gzip_is_valid, 0, "program gzip verified file's integrity");
+}
+
# Permissions on WAL files should be default
SKIP:
{
--
2.25.1
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 17:07, <gkokolatos@pm.me> wrote:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, July 12th, 2021 at 13:04, Michael Paquier michael@paquier.xyz wrote:
On Mon, Jul 12, 2021 at 09:42:32AM +0000, gkokolatos@pm.me wrote:
This to my understanding means that gzip is expected to exist.
If this is correct, then simply checking for the headers should
suffice, since that is the only dependency for the files to be
created.
You cannot expect this to work on Windows when it comes to MSVC for
example, as gzip may not be in the environment PATH so the test would
fail hard. Let's just rely on $ENV{GZIP} instead, and skip the test
if it is not defined.
I am admittedly not so well versed on Windows systems. Thank you for
informing me.
Please find attached v3 of the patch where $ENV{GZIP_PROGRAM} is used
instead. To the best of my knowledge one should avoid using $ENV{GZIP}
because that would translate to the obsolete, yet used environment
variable GZIP which holds a set of default options for gzip. In essence
it would be equivalent to executing:
GZIP=gzip gzip --test <files>
which can result to errors similar to:
gzip: gzip: non-option in GZIP environment variable
After a bit more thinking, I went ahead and added on top of v3 a test
verifying that the gzip program can actually be called.
Please find v4 attached.
Cheers,
//Georgios
Show quoted text
Michael
Attachments:
v4-0001-Introduce-pg_receivewal-gzip-compression-tests.patchapplication/octet-stream; name=v4-0001-Introduce-pg_receivewal-gzip-compression-tests.patchDownload
From ba7c47727d5280d223ca5627fe890596eb38710a Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 12 Jul 2021 16:34:26 +0000
Subject: [PATCH v4] Introduce pg_receivewal gzip compression tests
There exists a non trivial amount of code that handles gzip compression. The
current patch introduces tests that cover creation of gzip compressed WAL files
and the handling of the partial segments. Also the integrity of the compressed
files is verified.
---
src/bin/pg_basebackup/Makefile | 3 +-
src/bin/pg_basebackup/t/020_pg_receivewal.pl | 65 +++++++++++++++++++-
2 files changed, 66 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index 66e0070f1a..56fc12efa3 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -18,8 +18,9 @@ subdir = src/bin/pg_basebackup
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-# make this available to TAP test scripts
+# make these available to TAP test scripts
export TAR
+export GZIP_PROGRAM=$(GZIP)
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index a547c97ef1..4523274288 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -5,7 +5,7 @@ use strict;
use warnings;
use TestLib;
use PostgresNode;
-use Test::More tests => 19;
+use Test::More tests => 25;
program_help_ok('pg_receivewal');
program_version_ok('pg_receivewal');
@@ -66,6 +66,69 @@ $primary->command_ok(
],
'streaming some WAL with --synchronous');
+# Check gzip compression if available
+SKIP:
+{
+ skip "postgres was not build with gzip support", 6
+ if (!check_pg_config("#define HAVE_LIBZ 1"));
+
+ # Generate some WAL
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+ $nextlsn = $primary->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn();');
+ chomp($nextlsn);
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(100,200));');
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '-Z', '5'
+ ],
+ "streaming some WAL using level 5 gzip compression");
+
+ # Verify that the stored file is compressed
+ my @gzip_wals = glob "$stream_dir/*.gz";
+ is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
+
+ # There should be one .gz partial file
+ my @gzip_partial_wals = glob "$stream_dir/*.gz.partial";
+ is (scalar(@gzip_partial_wals), 1,
+ "one partial gzip compressed WAL was created");
+
+ # Generate some more WAL
+ $nextlsn = $primary->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn();');
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(200,300));');
+ chomp($nextlsn);
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '--compress', '1'
+ ],
+ "streaming some WAL using level 1 gzip compression");
+
+ # The .gz.partial file should now be complete
+ $gzip_partial_wals[0] =~ s/\.partial$//;
+ ok(-e $gzip_partial_wals[0],
+ "check that previously partial gzip compressed WAL is now complete");
+
+ # Finally verify compressed files's integrity
+ my $gzip_program = $ENV{GZIP_PROGRAM};
+ skip "program gzip is not found in your system", 1
+ if (!defined $gzip_program || $gzip_program eq '' ||
+ system_log($gzip_program, '--version') != 0);
+
+ # There are two gzip compressed files, test both
+ push(@gzip_wals, @gzip_partial_wals);
+ my $gzip_is_valid = system_log($gzip_program, '--test', @gzip_wals);
+ is($gzip_is_valid, 0, "program gzip verified file's integrity");
+}
+
# Permissions on WAL files should be default
SKIP:
{
--
2.25.1
On Mon, Jul 12, 2021 at 04:46:29PM +0000, gkokolatos@pm.me wrote:
On Monday, July 12th, 2021 at 17:07, <gkokolatos@pm.me> wrote:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
Are you using outlook? The format of your messages gets blurry on the
PG website, so does it for me.
I am admittedly not so well versed on Windows systems. Thank you for
informing me.
Please find attached v3 of the patch where $ENV{GZIP_PROGRAM} is used
instead. To the best of my knowledge one should avoid using $ENV{GZIP}
because that would translate to the obsolete, yet used environment
variable GZIP which holds a set of default options for gzip. In essence
it would be equivalent to executing:
GZIP=gzip gzip --test <files>
which can result to errors similar to:
gzip: gzip: non-option in GZIP environment variable
-# make this available to TAP test scripts
+# make these available to TAP test scripts
export TAR
+export GZIP_PROGRAM=$(GZIP)
Wow. So this comes from the fact that the command gzip can feed on
the environment variable from the same name. I was not aware of
that, and a comment would be in place here. That means complicating a
bit the test flow for people on Windows, but I am fine to live with
that as long as this does not fail hard. One extra thing we could do
is drop this part of the test, but I agree that this is useful to have
around as a validity check.
After a bit more thinking, I went ahead and added on top of v3 a test
verifying that the gzip program can actually be called.
+ system_log($gzip, '--version') != 0);
Checking after that does not hurt, indeed. I am wondering if we
should do that for TAR.
Another thing I find unnecessary is the number of the tests. This
does two rounds of pg_receivewal just to test the long and short
options of -Z/-compress, which brings only coverage to make sure that
both option names are handled. That's a high cost for a low amound of
extra coverage, so let's cut the runtime in half and just use the
round with --compress.
There was also a bit of confusion with ZLIB and gzip in the variable
names and the comments, the latter being only the command while the
compression happens with zlib. With a round of indentation on top of
all that, I ge tthe attached.
What do you think?
--
Michael
Attachments:
v5-0001-Add-some-tests-for-pg_receivewal-compress.patchtext/x-diff; charset=us-asciiDownload
From 44d971f8d4187fa34dc7426b629eafeeb0af53c2 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 13 Jul 2021 10:51:20 +0900
Subject: [PATCH v5] Add some tests for pg_receivewal --compress
---
src/bin/pg_basebackup/Makefile | 6 ++-
src/bin/pg_basebackup/t/020_pg_receivewal.pl | 46 +++++++++++++++++++-
2 files changed, 50 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index 66e0070f1a..459d514183 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -18,8 +18,12 @@ subdir = src/bin/pg_basebackup
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-# make this available to TAP test scripts
+# make these available to TAP test scripts
export TAR
+# Note that GZIP cannot be used directly as this environment variable is
+# used by the command "gzip" to pass down options, so stick with a different
+# name.
+export GZIP_PROGRAM=$(GZIP)
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index a547c97ef1..9fa1a3378b 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -5,7 +5,7 @@ use strict;
use warnings;
use TestLib;
use PostgresNode;
-use Test::More tests => 19;
+use Test::More tests => 23;
program_help_ok('pg_receivewal');
program_version_ok('pg_receivewal');
@@ -66,6 +66,50 @@ $primary->command_ok(
],
'streaming some WAL with --synchronous');
+# Check ZLIB compression if available.
+SKIP:
+{
+ skip "postgres was not build with ZLIB support", 4
+ if (!check_pg_config("#define HAVE_LIBZ 1"));
+
+ # Generate more WAL.
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+ $nextlsn =
+ $primary->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn();');
+ chomp($nextlsn);
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(100,200));');
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '--compress', '1'
+ ],
+ "streaming some WAL using ZLIB compression");
+
+ # Verify that the stored files are generated with the expected names.
+ my @zlib_wals = glob "$stream_dir/*.gz";
+ is(scalar(@zlib_wals), 1,
+ "one WAL segment compressed with ZLIB was created");
+ my @zlib_partial_wals = glob "$stream_dir/*.gz.partial";
+ is(scalar(@zlib_partial_wals),
+ 1, "one partial WAL segment compressed with ZLIB was created");
+
+ # There are two files compressed with ZLIB, check the integrity of
+ # all of them.
+ my $gzip = $ENV{GZIP_PROGRAM};
+ skip "program gzip is not found in your system", 1
+ if ( !defined $gzip
+ || $gzip eq ''
+ || system_log($gzip, '--version') != 0);
+
+ push(@zlib_wals, @zlib_partial_wals);
+ my $gzip_is_valid = system_log($gzip, '--test', @zlib_wals);
+ is($gzip_is_valid, 0,
+ "gzip verified the integrity of compressed WAL segments");
+}
+
# Permissions on WAL files should be default
SKIP:
{
--
2.32.0
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, July 13th, 2021 at 03:53, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Jul 12, 2021 at 04:46:29PM +0000, gkokolatos@pm.me wrote:
On Monday, July 12th, 2021 at 17:07, gkokolatos@pm.me wrote:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
Are you using outlook? The format of your messages gets blurry on the
PG website, so does it for me.
I am using protonmail's web page. I was not aware of the issue. Thank you
for bringing it up to my attention. I shall try to address it.
I am admittedly not so well versed on Windows systems. Thank you for
informing me.
Please find attached v3 of the patch where $ENV{GZIP_PROGRAM} is used
instead. To the best of my knowledge one should avoid using $ENV{GZIP}
because that would translate to the obsolete, yet used environment
variable GZIP which holds a set of default options for gzip. In essence
it would be equivalent to executing:
GZIP=gzip gzip --test <files>
which can result to errors similar to:
gzip: gzip: non-option in GZIP environment variable
-# make this available to TAP test scripts
+# make these available to TAP test scripts
export TAR
+export GZIP_PROGRAM=$(GZIP)
Wow. So this comes from the fact that the command gzip can feed on
the environment variable from the same name. I was not aware of
that, and a comment would be in place here. That means complicating a
bit the test flow for people on Windows, but I am fine to live with
that as long as this does not fail hard. One extra thing we could do
is drop this part of the test, but I agree that this is useful to have
around as a validity check.
Great.
After a bit more thinking, I went ahead and added on top of v3 a test
verifying that the gzip program can actually be called.
- system_log($gzip, '--version') != 0);
Checking after that does not hurt, indeed. I am wondering if we
should do that for TAR.
I do not think that this will be a necessity for TAR. TAR after all
is discovered by autoconf, which gzip is not.
Another thing I find unnecessary is the number of the tests. This
does two rounds of pg_receivewal just to test the long and short
options of -Z/-compress, which brings only coverage to make sure that
both option names are handled. That's a high cost for a low amound of
extra coverage, so let's cut the runtime in half and just use the
round with --compress.
I am sorry this was not so clear. It is indeed running twice the binary
with different flags. However the goal is not to check the flags, but
to make certain that the partial file has now been completed. That is
why there was code asserting that the previous FILENAME.gz.partial file
after the second invocation is converted to FILENAME.gz.
Additionally the second invocation of pg_receivewal is extending the
coverage of FindStreamingStart().
The different flags was an added bonus.
There was also a bit of confusion with ZLIB and gzip in the variable
names and the comments, the latter being only the command while the
compression happens with zlib. With a round of indentation on top of
all that, I ge tthe attached.
What do you think?
Thank you very much for the patch. I would prefer to keep the parts that
tests that .gz.partial are completed on a subsequent run if you agree.
Cheers,
//Georgios
Show quoted text
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Michael
On Tue, Jul 13, 2021 at 06:36:59AM +0000, gkokolatos@pm.me wrote:
I am sorry this was not so clear. It is indeed running twice the binary
with different flags. However the goal is not to check the flags, but
to make certain that the partial file has now been completed. That is
why there was code asserting that the previous FILENAME.gz.partial file
after the second invocation is converted to FILENAME.gz.
The first run you are adding checks the same thing thanks to
pg_switch_wal(), where pg_receivewal completes the generation of
000000010000000000000002.gz and finishes with
000000010000000000000003.gz.partial.
Additionally the second invocation of pg_receivewal is extending the
coverage of FindStreamingStart().
Hmm. It looks like a waste in runtime once we mix LZ4 in that as that
would mean 5 runs of pg_receivewal, but we really need only three of
them with --endpos:
- One with ZLIB compression.
- One with LZ4 compression.
- One without compression.
Do you think that we could take advantage of what is now the only run
of pg_receivewal --endpos for that? We could make the ZLIB checks run
first, conditionally, and then let the last command with --endpos
perform a full scan of the contents in $stream_dir with the .gz files
already in place. The addition of LZ4 would be an extra conditional
block similar to what's introduced in ZLIB, running before the last
command without compression.
--
Michael
On Tue, Jul 13, 2021 at 04:37:53PM +0900, Michael Paquier wrote:
Hmm. It looks like a waste in runtime once we mix LZ4 in that as that
would mean 5 runs of pg_receivewal, but we really need only three of
them with --endpos:
- One with ZLIB compression.
- One with LZ4 compression.
- One without compression.Do you think that we could take advantage of what is now the only run
of pg_receivewal --endpos for that? We could make the ZLIB checks run
first, conditionally, and then let the last command with --endpos
perform a full scan of the contents in $stream_dir with the .gz files
already in place. The addition of LZ4 would be an extra conditional
block similar to what's introduced in ZLIB, running before the last
command without compression.
Poking at this problem, I partially take this statement back as this
requires an initial run of pg_receivewal --endpos to ensure the
creation of one .gz and one .gz.partial. So I guess that this should
be structured as:
1) Keep the existing pg_receivewal --endpos.
2) Add the ZLIB block, with one pg_receivewal --endpos.
3) Add at the end one extra pg_receivewal --endpos, outside of the ZLIB
block, which should check the creation of a .partial, non-compressed
segment. This should mention that we place this command at the end of
the test for the start streaming point computation.
LZ4 tests would be then between 2) and 3), or 1) and 2).
--
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, July 13th, 2021 at 09:37, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jul 13, 2021 at 06:36:59AM +0000, gkokolatos@pm.me wrote:
I am sorry this was not so clear. It is indeed running twice the binary
with different flags. However the goal is not to check the flags, but
to make certain that the partial file has now been completed. That is
why there was code asserting that the previous FILENAME.gz.partial file
after the second invocation is converted to FILENAME.gz.The first run you are adding checks the same thing thanks to
pg_switch_wal(), where pg_receivewal completes the generation of
000000010000000000000002.gz and finishes with
000000010000000000000003.gz.partial.
This is correct. It is the 000000010000000000000003 awl that the rest
of the tests are targeting.
Additionally the second invocation of pg_receivewal is extending the
coverage of FindStreamingStart().Hmm. It looks like a waste in runtime once we mix LZ4 in that as that
would mean 5 runs of pg_receivewal, but we really need only three of
them with --endpos:
- One with ZLIB compression.
- One with LZ4 compression.
- One without compression.Do you think that we could take advantage of what is now the only run
of pg_receivewal --endpos for that? We could make the ZLIB checks run
first, conditionally, and then let the last command with --endpos
perform a full scan of the contents in $stream_dir with the .gz files
already in place. The addition of LZ4 would be an extra conditional
block similar to what's introduced in ZLIB, running before the last
command without compression.
I will admit that for the current patch I am not taking lz4 into account as
at the moment I have little idea as to how the lz4 patch will advance with the
review rounds. I simply accepted that it will be rebased on top of the patch
in the current thread and probably need to modify the current then.
But I digress. I would like have some combination of .gz and .gz.parial but
I will not take too strong of a stance. I am happy to go with your suggestion.
Cheers,
//Georgios
Show quoted text
--
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, July 13th, 2021 at 10:14, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jul 13, 2021 at 04:37:53PM +0900, Michael Paquier wrote:
Poking at this problem, I partially take this statement back as this
requires an initial run of pg_receivewal --endpos to ensure the
creation of one .gz and one .gz.partial. So I guess that this should
be structured as:1. Keep the existing pg_receivewal --endpos.
2. Add the ZLIB block, with one pg_receivewal --endpos.
3. Add at the end one extra pg_receivewal --endpos, outside of the ZLIB
block, which should check the creation of a .partial, non-compressed
segment. This should mention that we place this command at the end of
the test for the start streaming point computation.
LZ4 tests would be then between 2) and 3), or 1) and 2).
Sounds great. Let me cook up v6 for this.
Show quoted text
--
Michael
On Tue, Jul 13, 2021 at 08:28:44AM +0000, gkokolatos@pm.me wrote:
Sounds great. Let me cook up v6 for this.
Thanks. Could you use v5 I posted upthread as a base? There were
some improvements in the variable names, the comments and the test
descriptions.
--
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, July 13th, 2021 at 12:26, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jul 13, 2021 at 08:28:44AM +0000, gkokolatos@pm.me wrote:
Sounds great. Let me cook up v6 for this.
Thanks. Could you use v5 I posted upthread as a base? There were
some improvements in the variable names, the comments and the test
descriptions.
Agreed. For the record that is why I said v6 :)
Cheers,
//Georgios
Show quoted text
-----------------------------------------------------------------------------------------------------------------------------------------------------
Michael
On Tue, Jul 13, 2021 at 11:16:06AM +0000, gkokolatos@pm.me wrote:
Agreed. For the record that is why I said v6 :)
Okay, thanks.
--
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, July 14th, 2021 at 04:17, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jul 13, 2021 at 11:16:06AM +0000, gkokolatos@pm.me wrote:
Agreed. For the record that is why I said v6 :)
Okay, thanks.
Please find v6 attached.
Cheers,
//Georgios
Show quoted text
---------------
Michael
Attachments:
v6-0001-Introduce-pg_receivewal-gzip-compression-tests.patchapplication/octet-stream; name=v6-0001-Introduce-pg_receivewal-gzip-compression-tests.patchDownload
From fd48b26a175370dd64264c48ac2d61aef1b9494b Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 13 Jul 2021 15:28:57 +0000
Subject: [PATCH v6] Introduce pg_receivewal gzip compression tests
There exists a non trivial amount of code that handles gzip compression. The
current patch introduces tests that cover creation of gzip compressed WAL files
and the handling of the partial segments. The integrity of the compressed
files is verified as is the ability to start streaming from the correct
location regardless of compression.
---
src/bin/pg_basebackup/Makefile | 6 +-
src/bin/pg_basebackup/t/020_pg_receivewal.pl | 79 +++++++++++++++++++-
2 files changed, 83 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index 66e0070f1a..459d514183 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -18,8 +18,12 @@ subdir = src/bin/pg_basebackup
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-# make this available to TAP test scripts
+# make these available to TAP test scripts
export TAR
+# Note that GZIP cannot be used directly as this environment variable is
+# used by the command "gzip" to pass down options, so stick with a different
+# name.
+export GZIP_PROGRAM=$(GZIP)
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index a547c97ef1..26caa64246 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -5,7 +5,7 @@ use strict;
use warnings;
use TestLib;
use PostgresNode;
-use Test::More tests => 19;
+use Test::More tests => 27;
program_help_ok('pg_receivewal');
program_version_ok('pg_receivewal');
@@ -66,6 +66,83 @@ $primary->command_ok(
],
'streaming some WAL with --synchronous');
+# Verify that one partial file was generated and keep track of it
+my @partial_wals = glob "$stream_dir/*\.partial";
+is(scalar(@partial_wals), 1,
+ "one partial WAL segment was created");
+
+# Check ZLIB compression if available.
+SKIP:
+{
+ skip "postgres was not build with ZLIB support", 5
+ if (!check_pg_config("#define HAVE_LIBZ 1"));
+
+ # Generate more WAL.
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+ $nextlsn =
+ $primary->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn();');
+ chomp($nextlsn);
+ $primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(100,200));');
+ $primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+ $primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn, '--compress', '1'
+ ],
+ "streaming some WAL using ZLIB compression");
+
+ # Verify that the stored files are generated with the expected names.
+ my @zlib_wals = glob "$stream_dir/*.gz";
+ is(scalar(@zlib_wals), 1,
+ "one WAL segment compressed with ZLIB was created");
+ my @zlib_partial_wals = glob "$stream_dir/*.gz.partial";
+ is(scalar(@zlib_partial_wals),
+ 1, "one partial WAL segment compressed with ZLIB was created");
+
+ # Verify that the start streaming position is computed correctly by
+ # comparing with the previous partial file
+ $partial_wals[0] =~ s/\.partial$/.gz/;
+ is($zlib_wals[0] =~ m/$partial_wals[0]/, 1,
+ "one partial WAL segment is now completed");
+
+ # Update the partial wals with the current values
+ @partial_wals = @zlib_partial_wals;
+
+ # There is one complete and one partial file compressed with ZLIB. Check
+ # the integrity of both.
+ my $gzip = $ENV{GZIP_PROGRAM};
+ skip "program gzip is not found in your system", 1
+ if ( !defined $gzip
+ || $gzip eq ''
+ || system_log($gzip, '--version') != 0);
+
+ push(@zlib_wals, @zlib_partial_wals);
+ my $gzip_is_valid = system_log($gzip, '--test', @zlib_wals);
+ is($gzip_is_valid, 0,
+ "gzip verified the integrity of compressed WAL segments");
+}
+
+# Verify that the start streaming position is computed and that the value is
+# correct regardless of whether ZLIB is available.
+$primary->psql('postgres', 'SELECT pg_switch_wal();');
+$nextlsn = $primary->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn();');
+chomp($nextlsn);
+$primary->psql('postgres',
+ 'INSERT INTO test_table VALUES (generate_series(200,300));');
+$primary->psql('postgres', 'SELECT pg_switch_wal();');
+$primary->command_ok(
+ [
+ 'pg_receivewal', '-D', $stream_dir, '--verbose',
+ '--endpos', $nextlsn
+ ],
+ "streaming some WAL");
+$primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+$partial_wals[0] =~ s/(\.gz)?.partial//;
+ok(-e $partial_wals[0], "check that previously partial WAL is now complete");
+
# Permissions on WAL files should be default
SKIP:
{
--
2.25.1
On Wed, Jul 14, 2021 at 02:11:09PM +0000, gkokolatos@pm.me wrote:
Please find v6 attached.
Thanks. I have spent some time checking this stuff in details, and
I did some tests on Windows while on it. A run of pgperltidy was
missing. A second thing is that you added one useless WAL segment
switch in the ZLIB block, and two at the end, causing the first two in
the set of three (one in the ZLIB block and one in the final command)
to be no-ops as they followed a previous WAL switch. The final one
was not needed as no WAL is generated after that.
And applied. Let's see if the buildfarm has anything to say. Perhaps
this will even catch some bugs that pre-existed.
--
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, July 15th, 2021 at 09:00, Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Jul 14, 2021 at 02:11:09PM +0000, gkokolatos@pm.me wrote:
Please find v6 attached.
Thanks. I have spent some time checking this stuff in details, and
I did some tests on Windows while on it. A run of pgperltidy was
missing. A second thing is that you added one useless WAL segment
switch in the ZLIB block, and two at the end, causing the first two in
the set of three (one in the ZLIB block and one in the final command)
to be no-ops as they followed a previous WAL switch. The final one
was not needed as no WAL is generated after that.
Thank you for the work and comments.
And applied. Let's see if the buildfarm has anything to say. Perhaps
this will even catch some bugs that pre-existed.
Let us hope that it will prevent some bugs from happening.
Cheers,
//Georgios
Show quoted text
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Michael
On Thu, Jul 15, 2021 at 07:48:08AM +0000, gkokolatos@pm.me wrote:
Let us hope that it will prevent some bugs from happening.
The buildfarm has two reports.
1) bowerbird on Windows/MSVC:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2021-07-15%2010%3A30%3A36
pg_receivewal: fatal: could not fsync existing write-ahead log file
"000000010000000000000002.partial": Permission denied
not ok 20 - streaming some WAL using ZLIB compression
I don't think the existing code can be blamed for that as this means a
failure with gzflush(). Likely a concurrency issue as that's an
EACCES. If that's repeatable, that could point to an actual issue
with pg_receivewal --compress.
2) curculio:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2021-07-15%2010%3A30%3A15
# Running: gzip --test
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/t_020_pg_receivewal_primary_data/archive_wal/000000010000000000000002.gz
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/t_020_pg_receivewal_primary_data/archive_wal/000000010000000000000003.gz.partial
gzip:
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/t_020_pg_receivewal_primary_data/archive_wal/000000010000000000000003.gz.partial:
unknown suffix: ignored
not ok 24 - gzip verified the integrity of compressed WAL segments
Looking at the OpenBSD code (usr.bin/compress/main.c), long options
are supported, where --version does exit(0) without printing
anything, and --test is supported even if that's not on the man pages.
set_outfile() is doing a discard of the file suffixes it does not
recognize, and I think that their implementation bumps on .gz.partial
and generates an exit code of 512 to map with WARNING. I still wish
to keep this test, and I'd like to think that the contents of
@zlib_wals are enough in terms of coverage. What do you think?
--
Michael
On Thu, Jul 15, 2021 at 08:35:27PM +0900, Michael Paquier wrote:
1) bowerbird on Windows/MSVC:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2021-07-15%2010%3A30%3A36
pg_receivewal: fatal: could not fsync existing write-ahead log file
"000000010000000000000002.partial": Permission denied
not ok 20 - streaming some WAL using ZLIB compression
I don't think the existing code can be blamed for that as this means a
failure with gzflush(). Likely a concurrency issue as that's an
EACCES. If that's repeatable, that could point to an actual issue
with pg_receivewal --compress.
For this one, I'll try to test harder on my own host. I am curious to
see if the other Windows members running the TAP tests have anything
to say. Looking at the code of zlib, this would come from gz_zero()
in gzflush(), which could blow up on a write() in gz_comp().
2) curculio:
Looking at the OpenBSD code (usr.bin/compress/main.c), long options
are supported, where --version does exit(0) without printing
anything, and --test is supported even if that's not on the man pages.
set_outfile() is doing a discard of the file suffixes it does not
recognize, and I think that their implementation bumps on .gz.partial
and generates an exit code of 512 to map with WARNING. I still wish
to keep this test, and I'd like to think that the contents of
@zlib_wals are enough in terms of coverage. What do you think?
After thinking more about this one, I have taken the course to just
remove the .gz.partial segment from the check, a full segment should
be enough in terms of coverage. I prefer this simplification over a
rename of the .partial segment or a tweak of the error code to map
with WARNING.
--
Michael
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, July 15th, 2021 at 14:35, Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Jul 15, 2021 at 08:35:27PM +0900, Michael Paquier wrote:
2. curculio:
Looking at the OpenBSD code (usr.bin/compress/main.c), long options
are supported, where --version does exit(0) without printing
set_outfile() is doing a discard of the file suffixes it does not
recognize, and I think that their implementation bumps on .gz.partial
and generates an exit code of 512 to map with WARNING. I still wish
to keep this test, and I'd like to think that the contents of
@zlib_wals are enough in terms of coverage. What do you think?After thinking more about this one, I have taken the course to just
remove the .gz.partial segment from the check, a full segment should
be enough in terms of coverage. I prefer this simplification over a
rename of the .partial segment or a tweak of the error code to map
with WARNING.
Fair enough.
Cheers,
//Georgios
Show quoted text
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Michael
On Thu, Jul 15, 2021 at 09:35:52PM +0900, Michael Paquier wrote:
For this one, I'll try to test harder on my own host. I am curious to
see if the other Windows members running the TAP tests have anything
to say. Looking at the code of zlib, this would come from gz_zero()
in gzflush(), which could blow up on a write() in gz_comp().
bowerbird has just failed for the second time in a row on EACCESS, so
there is more here than meets the eye. Looking at the code, I think I
have spotted what it is and the buildfarm logs give a very good hint:
# Running: pg_receivewal -D
:/prog/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/t_020_pg_receivewal_primary_data/archive_wal
--verbose --endpos 0/3000028 --compress 1
pg_receivewal: starting log streaming at 0/2000000 (timeline 1)
pg_receivewal: fatal: could not fsync existing write-ahead log file
"000000010000000000000002.partial": Permission denied
not ok 20 - streaming some WAL using ZLIB compression
--compress is used and the sync fails for a non-compressed segment.
Looking at the code it is pretty obvious that open_walfile() is
getting confused with the handling of an existing .partial segment
while walmethods.c uses dir_data->compression in all the places that
matter. So that's a legit bug, that happens only when mixing
pg_receivewal runs for where successive runs use the compression or
non-compression modes.
I am amazed that the other buildfarm members are not complaining, to
be honest. jacana runs this TAP test with MinGW and ZLIB, and does
not complain.
--
Michael
On Fri, Jul 16, 2021 at 08:59:11AM +0900, Michael Paquier wrote:
--compress is used and the sync fails for a non-compressed segment.
Looking at the code it is pretty obvious that open_walfile() is
getting confused with the handling of an existing .partial segment
while walmethods.c uses dir_data->compression in all the places that
matter. So that's a legit bug, that happens only when mixing
pg_receivewal runs for where successive runs use the compression or
non-compression modes.
Ditto. After reading the code more carefully, the code is actually
able to work even if it could be cleaner:
1) dir_existsfile() would check for the existence of a
non-compressed, partial segment all the time.
2) If this non-compressed file was padded, the code would use
open_for_write() that would open a compressed, partial segment.
3) The compressed, partial segment would be the one flushed.
This behavior is rather debatable, and it would be more instinctive to
me to just skip any business related to the pre-padding if compression
is enabled, at the cost of one extra callback in WalWriteMethod to
grab the compression level (dir_open_for_write() skips that for
compression) to allow receivelog.c to handle that. But at the same
time few users are going to care about that as pg_receivewal has most
likely always the same set of options, so complicating this code is
not really appealing either.
I am amazed that the other buildfarm members are not complaining, to
be honest. jacana runs this TAP test with MinGW and ZLIB, and does
not complain.
I have spent more time on that with my own environment, and while
testing I have bumped on a different issue with zlib, which was
really weird. In the same scenario as above, gzdopen() has been
failing for me at step 2), causing the test to loop forever. We
document to use DLLs for ZLIB coming from zlib.net, but the ones
available there are really outdated as far as I can see (found some
called zlib.lib/dll myself, breaking Solution.pm). For now I have
disabled those tests on Windows to bring back bowerbird to green, but
there is something else going on here. We don't do much tests with
ZLIB on Windows for pg_basebackup and pg_dump, so there may be some
more issues?
@Andrew: which version of ZLIB are you using on bowerbird? That's the
one in c:\prog\3p64. That's a zdll.lib, right?
--
Michael
On Fri, Jul 16, 2021 at 02:08:57PM +0900, Michael Paquier wrote:
This behavior is rather debatable, and it would be more instinctive to
me to just skip any business related to the pre-padding if compression
is enabled, at the cost of one extra callback in WalWriteMethod to
grab the compression level (dir_open_for_write() skips that for
compression) to allow receivelog.c to handle that. But at the same
time few users are going to care about that as pg_receivewal has most
likely always the same set of options, so complicating this code is
not really appealing either.
I have chewed on that over the weekend, and skipping the padding logic
if we are in compression mode in open_walfile() makes sense, so
attached is a patch that I'd like to backpatch.
Another advantage of this patch is the handling of ".gz" is reduced to
one code path instead of four. That makes a bit easier the
introduction of new compression methods.
A second thing that was really confusing is that the name of the WAL
segment generated in this code path completely ignored the type of
compression. This led to one confusing error message if failing to
open a segment for write where we'd mention a .partial file rather
than a .gz.partial file. The versions of zlib I used on Windows
looked buggy so I cannot conclude there, but I am sure that this
should allow bowerbird to handle the test correctly.
--
Michael
Attachments:
receivewal-open.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 3952a3f943..7af9009320 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -88,26 +88,29 @@ static bool
open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
{
Walfile *f;
- char fn[MAXPGPATH];
+ char *fn;
ssize_t size;
XLogSegNo segno;
XLByteToSeg(startpoint, segno, WalSegSz);
XLogFileName(current_walfile_name, stream->timeline, segno, WalSegSz);
- snprintf(fn, sizeof(fn), "%s%s", current_walfile_name,
- stream->partial_suffix ? stream->partial_suffix : "");
+ /* Note that this considers the compression used if necessary */
+ fn = stream->walmethod->get_file_name(current_walfile_name,
+ stream->partial_suffix);
/*
* When streaming to files, if an existing file exists we verify that it's
* either empty (just created), or a complete WalSegSz segment (in which
* case it has been created and padded). Anything else indicates a corrupt
- * file.
+ * file. Compressed files have no need for padding, so just ignore this
+ * case.
*
* When streaming to tar, no file with this name will exist before, so we
* never have to verify a size.
*/
- if (stream->walmethod->existsfile(fn))
+ if (stream->walmethod->compression() == 0 &&
+ stream->walmethod->existsfile(fn))
{
size = stream->walmethod->get_file_size(fn);
if (size < 0)
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index 158f7d176f..17fd71a450 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -72,13 +72,11 @@ $primary->command_ok(
my @partial_wals = glob "$stream_dir/*\.partial";
is(scalar(@partial_wals), 1, "one partial WAL segment was created");
-# Check ZLIB compression if available. On Windows, some old versions
-# of zlib can cause some instabilities with this test, so disable it
-# for now.
+# Check ZLIB compression if available.
SKIP:
{
- skip "postgres was not built with ZLIB support, or Windows is involved", 5
- if (!check_pg_config("#define HAVE_LIBZ 1") || $windows_os);
+ skip "postgres was not built with ZLIB support", 5
+ if (!check_pg_config("#define HAVE_LIBZ 1"));
# Generate more WAL worth one completed, compressed, segment.
$primary->psql('postgres', 'SELECT pg_switch_wal();');
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
index a15bbb20e7..3c0da70a04 100644
--- a/src/bin/pg_basebackup/walmethods.c
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -68,20 +68,32 @@ dir_getlasterror(void)
return strerror(errno);
}
+static char *
+dir_get_file_name(const char *pathname, const char *temp_suffix)
+{
+ char *tmppath = pg_malloc0(MAXPGPATH * sizeof(char));
+
+ snprintf(tmppath, MAXPGPATH, "%s%s%s",
+ pathname, dir_data->compression > 0 ? ".gz" : "",
+ temp_suffix ? temp_suffix : "");
+
+ return tmppath;
+}
+
static Walfile
dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
{
static char tmppath[MAXPGPATH];
+ char *filename;
int fd;
DirectoryMethodFile *f;
#ifdef HAVE_LIBZ
gzFile gzfp = NULL;
#endif
- snprintf(tmppath, sizeof(tmppath), "%s/%s%s%s",
- dir_data->basedir, pathname,
- dir_data->compression > 0 ? ".gz" : "",
- temp_suffix ? temp_suffix : "");
+ filename = dir_get_file_name(pathname, temp_suffix);
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, filename);
/*
* Open a file for non-compressed as well as compressed files. Tracking
@@ -232,26 +244,30 @@ dir_close(Walfile f, WalCloseMethod method)
/* Build path to the current version of the file */
if (method == CLOSE_NORMAL && df->temp_suffix)
{
+ char *filename;
+ char *filename2;
+
/*
* If we have a temp prefix, normal operation is to rename the
* file.
*/
- snprintf(tmppath, sizeof(tmppath), "%s/%s%s%s",
- dir_data->basedir, df->pathname,
- dir_data->compression > 0 ? ".gz" : "",
- df->temp_suffix);
- snprintf(tmppath2, sizeof(tmppath2), "%s/%s%s",
- dir_data->basedir, df->pathname,
- dir_data->compression > 0 ? ".gz" : "");
+ filename = dir_get_file_name(df->pathname, df->temp_suffix);
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, filename);
+
+ filename2 = dir_get_file_name(df->pathname, NULL);
+ snprintf(tmppath2, sizeof(tmppath2), "%s/%s",
+ dir_data->basedir, filename2);
r = durable_rename(tmppath, tmppath2);
}
else if (method == CLOSE_UNLINK)
{
+ char *filename;
+
/* Unlink the file once it's closed */
- snprintf(tmppath, sizeof(tmppath), "%s/%s%s%s",
- dir_data->basedir, df->pathname,
- dir_data->compression > 0 ? ".gz" : "",
- df->temp_suffix ? df->temp_suffix : "");
+ filename = dir_get_file_name(df->pathname, df->temp_suffix);
+ snprintf(tmppath, sizeof(tmppath), "%s/%s",
+ dir_data->basedir, filename);
r = unlink(tmppath);
}
else
@@ -313,6 +329,12 @@ dir_get_file_size(const char *pathname)
return statbuf.st_size;
}
+static int
+dir_compression(void)
+{
+ return dir_data->compression;
+}
+
static bool
dir_existsfile(const char *pathname)
{
@@ -355,6 +377,8 @@ CreateWalDirectoryMethod(const char *basedir, int compression, bool sync)
method->write = dir_write;
method->get_current_pos = dir_get_current_pos;
method->get_file_size = dir_get_file_size;
+ method->get_file_name = dir_get_file_name;
+ method->compression = dir_compression;
method->close = dir_close;
method->sync = dir_sync;
method->existsfile = dir_existsfile;
@@ -527,11 +551,22 @@ tar_write_padding_data(TarMethodFile *f, size_t bytes)
return true;
}
+static char *
+tar_get_file_name(const char *pathname, const char *temp_suffix)
+{
+ char *tmppath = pg_malloc0(MAXPGPATH * sizeof(char));
+
+ snprintf(tmppath, sizeof(tmppath), "%s%s",
+ pathname, temp_suffix ? temp_suffix : "");
+
+ return tmppath;
+}
+
static Walfile
tar_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
{
int save_errno;
- static char tmppath[MAXPGPATH];
+ char *tmppath;
tar_clear_error();
@@ -583,8 +618,7 @@ tar_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_
tar_data->currentfile = pg_malloc0(sizeof(TarMethodFile));
- snprintf(tmppath, sizeof(tmppath), "%s%s",
- pathname, temp_suffix ? temp_suffix : "");
+ tmppath = tar_get_file_name(pathname, temp_suffix);
/* Create a header with size set to 0 - we will fill out the size on close */
if (tarCreateHeader(tar_data->currentfile->header, tmppath, NULL, 0, S_IRUSR | S_IWUSR, 0, 0, time(NULL)) != TAR_OK)
@@ -689,6 +723,12 @@ tar_get_file_size(const char *pathname)
return -1;
}
+static int
+tar_compression(void)
+{
+ return tar_data->compression;
+}
+
static off_t
tar_get_current_pos(Walfile f)
{
@@ -994,6 +1034,8 @@ CreateWalTarMethod(const char *tarbase, int compression, bool sync)
method->write = tar_write;
method->get_current_pos = tar_get_current_pos;
method->get_file_size = tar_get_file_size;
+ method->get_file_name = tar_get_file_name;
+ method->compression = tar_compression;
method->close = tar_close;
method->sync = tar_sync;
method->existsfile = tar_existsfile;
diff --git a/src/bin/pg_basebackup/walmethods.h b/src/bin/pg_basebackup/walmethods.h
index fc4bb52cb7..3f08f4256d 100644
--- a/src/bin/pg_basebackup/walmethods.h
+++ b/src/bin/pg_basebackup/walmethods.h
@@ -52,6 +52,15 @@ struct WalWriteMethod
/* Return the size of a file, or -1 on failure. */
ssize_t (*get_file_size) (const char *pathname);
+ /*
+ * Return the name of the current file working on, without the base
+ * directory. This is useful for logging.
+ */
+ char *(*get_file_name) (const char *pathname, const char *temp_suffix);
+
+ /* Return the level of compression */
+ int (*compression) (void);
+
/*
* Write count number of bytes to the file, and return the number of bytes
* actually written or -1 for error.
On Mon, Jul 19, 2021 at 04:03:33PM +0900, Michael Paquier wrote:
Another advantage of this patch is the handling of ".gz" is reduced to
one code path instead of four. That makes a bit easier the
introduction of new compression methods.A second thing that was really confusing is that the name of the WAL
segment generated in this code path completely ignored the type of
compression. This led to one confusing error message if failing to
open a segment for write where we'd mention a .partial file rather
than a .gz.partial file. The versions of zlib I used on Windows
looked buggy so I cannot conclude there, but I am sure that this
should allow bowerbird to handle the test correctly.
After more testing and more review, I have applied and backpatched
this stuff. Another thing I did on HEAD was to enable again the ZLIB
portion of the pg_receivewal tests on Windows. bowerdird should stay
green (I hope), and it is better to have as much more coverage as
possible for all that.
--
Michael