pgsql: Add suport for server-side LZ4 base backup compression.
Add suport for server-side LZ4 base backup compression.
LZ4 compression can be a lot faster than gzip compression, so users
may prefer it even if the compression ratio is not as good. We will
want pg_basebackup to support LZ4 compression and decompression on the
client side as well, and there is a pending patch for that, but it's
by a different author, so I am committing this part separately for
that reason.
Jeevan Ladhe, reviewed by Tushar Ahuja and by me.
Discussion: /messages/by-id/CANm22Cg9cArXEaYgHVZhCnzPLfqXCZLAzjwTq7Fc0quXRPfbxA@mail.gmail.com
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/dab298471ff2f91f33bc25bfb73e435d3ab02148
Modified Files
--------------
doc/src/sgml/protocol.sgml | 7 +-
doc/src/sgml/ref/pg_basebackup.sgml | 24 ++-
src/backend/replication/Makefile | 1 +
src/backend/replication/basebackup.c | 7 +-
src/backend/replication/basebackup_lz4.c | 298 ++++++++++++++++++++++++++++++
src/bin/pg_basebackup/pg_basebackup.c | 18 +-
src/bin/pg_verifybackup/Makefile | 1 +
src/bin/pg_verifybackup/t/008_untar.pl | 10 +-
src/include/replication/basebackup_sink.h | 1 +
9 files changed, 349 insertions(+), 18 deletions(-)
On Fri, Feb 11, 2022 at 01:54:36PM +0000, Robert Haas wrote:
Add suport for server-side LZ4 base backup compression.
LZ4 compression can be a lot faster than gzip compression, so users
may prefer it even if the compression ratio is not as good. We will
want pg_basebackup to support LZ4 compression and decompression on the
client side as well, and there is a pending patch for that, but it's
by a different author, so I am committing this part separately for
that reason.Jeevan Ladhe, reviewed by Tushar Ahuja and by me.
copperhead seems to be unhappy here:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=copperhead&dt=2022-02-11%2021%3A52%3A48
# Running: lz4 -d -m
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup/base.tar.lz4
Can't exec "lz4": No such file or directory at
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/../../../src/test/perl/PostgreSQL/Test/Utils.pm
line 388.
Bail out! failed to execute command "lz4 -d -m
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup/base.tar.lz4":
No such file or directory
### Stopping node "primary" using mode immediate
Perhaps you'd better check that 'decompress_program' can be executed
and skip things if the command is not around? Note that
020_pg_receivewal.pl does an extra lz4 --version as a safety measure,
but 008_untar.pl does not.
--
Michael
On Fri, Feb 11, 2022 at 11:39 PM Michael Paquier <michael@paquier.xyz> wrote:
Perhaps you'd better check that 'decompress_program' can be executed
and skip things if the command is not around? Note that
020_pg_receivewal.pl does an extra lz4 --version as a safety measure,
but 008_untar.pl does not.
You may be right, but I'm not entirely convinced. $ENV{'LZ4'} here is
being set by make, and make is setting it to whatever configure found.
If configure found a version of the lz4 program that doesn't actually
work, isn't that configure's fault, or the system administrator's
fault, rather than this test's fault?
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
On 2022-02-12 13:38:45 +0900, Michael Paquier wrote:
Perhaps you'd better check that 'decompress_program' can be executed
and skip things if the command is not around? Note that
020_pg_receivewal.pl does an extra lz4 --version as a safety measure,
but 008_untar.pl does not.
Actually executing the command doesn't seem useful? It doesn't seem useful to
ignore an executable that's not actually executable or such. Testing whether
the file exists makes sense though.
Greetings,
Andres Freund
On 2022-02-12 07:24:46 -0500, Robert Haas wrote:
You may be right, but I'm not entirely convinced. $ENV{'LZ4'} here is
being set by make, and make is setting it to whatever configure found.
If configure found a version of the lz4 program that doesn't actually
work, isn't that configure's fault, or the system administrator's
fault, rather than this test's fault?
I don't think there's an actual configure check for the lz4 binary? Looks like
a static assignment in src/Makefile.global.in to me.
On Sat, Feb 12, 2022 at 5:09 PM Andres Freund <andres@anarazel.de> wrote:
On 2022-02-12 07:24:46 -0500, Robert Haas wrote:
You may be right, but I'm not entirely convinced. $ENV{'LZ4'} here is
being set by make, and make is setting it to whatever configure found.
If configure found a version of the lz4 program that doesn't actually
work, isn't that configure's fault, or the system administrator's
fault, rather than this test's fault?I don't think there's an actual configure check for the lz4 binary? Looks like
a static assignment in src/Makefile.global.in to me.
Oh. That seems kind of dumb.
It does mean that trying to run it is all that we can do, though,
because we don't have a full path.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Sat, Feb 12, 2022 at 5:09 PM Andres Freund <andres@anarazel.de> wrote:
I don't think there's an actual configure check for the lz4 binary? Looks like
a static assignment in src/Makefile.global.in to me.
Oh. That seems kind of dumb.
It looks to me like somebody figured it didn't need any more support
than gzip/bzip2, which is wrong on a couple of grounds:
* hardly any modern platforms lack those, unlike lz4
* we don't invoke either one of them during testing, only when
you explicitly ask to make a compressed tarball
I think adding an explicit PGAC_PATH_PROGS would be worth the cycles.
regards, tom lane
On Sat, Feb 12, 2022 at 05:16:03PM -0500, Tom Lane wrote:
It looks to me like somebody figured it didn't need any more support
than gzip/bzip2, which is wrong on a couple of grounds:
* hardly any modern platforms lack those, unlike lz4
* we don't invoke either one of them during testing, only when
you explicitly ask to make a compressed tarballI think adding an explicit PGAC_PATH_PROGS would be worth the cycles.
Well, this somebody is I, and the buildfarm did not blow up on any of
that so that looked rather fine. Adding a few cycles for this check
is fine by me. What do you think of the attached?
--
Michael
Attachments:
lz4-configure.patchtext/x-diff; charset=us-asciiDownload+59-5
Michael Paquier <michael@paquier.xyz> writes:
On Sat, Feb 12, 2022 at 05:16:03PM -0500, Tom Lane wrote:
I think adding an explicit PGAC_PATH_PROGS would be worth the cycles.
Well, this somebody is I, and the buildfarm did not blow up on any of
that so that looked rather fine.
Eh? copperhead for one is failing for exactly this reason:
Bailout called. Further testing stopped: failed to execute command "lz4 -d -m /home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup/base.tar.lz4": No such file or directory
Adding a few cycles for this check
is fine by me. What do you think of the attached?
Looks OK as far as it goes ... but do we need anything on the
MSVC side?
Also, I can't help wondering why we have both GZIP_PROGRAM and GZIP
variables. I suppose that's a separate issue though.
regards, tom lane
On Sat, Feb 12, 2022 at 11:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Eh? copperhead for one is failing for exactly this reason:
Bailout called. Further testing stopped: failed to execute command "lz4 -d -m /home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup/base.tar.lz4": No such file or directory
Well, that's because I didn't realize that LZ4 might be set to
something that doesn't work at all. So Michael's thing worked, but
because it (in my view) fixed the problem in a more surprising place
than I would have preferred, I made a commit later that turned out to
break the buildfarm. So you can blame either one of us that you like.
Thanks, Michael, for preparing a patch.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Sat, Feb 12, 2022 at 11:12:50PM -0500, Tom Lane wrote:
Michael Paquier <michael@paquier.xyz> writes:
Well, this somebody is I, and the buildfarm did not blow up on any of
that so that looked rather fine.Eh? copperhead for one is failing for exactly this reason:
Bailout called. Further testing stopped: failed to execute command
"lz4 -d -m
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup/base.tar.lz4":
No such file or directory
Well, I have put in place a check in the TAP tests, that Robert has
managed to break. It looks like it was wrong of me to assume that
it would be fine as-is. Sorry about that.
Adding a few cycles for this check
is fine by me. What do you think of the attached?Looks OK as far as it goes ... but do we need anything on the
MSVC side?
It is possible to tweak the environment variable so one can unset it.
If we make things consistent with the configure check, we could tweak
the default to not be "lz4" if we don't have --with-lz4 in the MSVC
configuration? I don't recall seeing in the wild an installer for LZ4
where we would have the command but not the libraries, so perhaps
that's not worth the trouble, anyway, and we don't have any code paths
in the tests where we use LZ4 without --with-lz4.
Also, I can't help wondering why we have both GZIP_PROGRAM and GZIP
variables. I suppose that's a separate issue though.
From gzip's man page:
"The obsolescent environment variable GZIP can hold a set of default
options for gzip."
This means that saving the command into a variable with the same name
interacts with the command launch. This was discussed here:
/messages/by-id/nr63G0R6veCrLY30QMwxYA2aCawclWJopZiKEDN8HtKwIoIqZ79fVmSE2GxTyPD1Mk9FzRdfLCrc-BSGO6vTlDT_E2-aqtWeEiNn-qsDDzk=@pm.me
--
Michael
On Sun, Feb 13, 2022 at 11:13:51AM -0500, Robert Haas wrote:
Well, that's because I didn't realize that LZ4 might be set to
something that doesn't work at all. So Michael's thing worked, but
because it (in my view) fixed the problem in a more surprising place
than I would have preferred, I made a commit later that turned out to
break the buildfarm. So you can blame either one of us that you like.Thanks, Michael, for preparing a patch.
Patch that was slightly wrong, as the tests of pg_verifybackup still
failed once I moved away the lz4 command in my environment because LZ4
gets set to an empty value. I have checked this case locally, and
applied the patch to add the ./configure check, so copperhead should
get back to green.
A last thing, that has been mentioned by Andres upthread, is that we
should be able to remove the extra commands run with --version in the
tests of pg_basebackup, as of the attached. I have not done that yet,
as it seems better to wait for copperhead first, and the tests of
pg_basebackup run before pg_verifybackup so I don't want to mask one
error for another in case the buildfarm says something.
--
Michael
Attachments:
basebackup-tap.patchtext/x-diff; charset=us-asciiDownload+3-6
On Mon, Feb 14, 2022 at 10:53:04AM +0900, Michael Paquier wrote:
A last thing, that has been mentioned by Andres upthread, is that we
should be able to remove the extra commands run with --version in the
tests of pg_basebackup, as of the attached. I have not done that yet,
as it seems better to wait for copperhead first, and the tests of
pg_basebackup run before pg_verifybackup so I don't want to mask one
error for another in case the buildfarm says something.
copperhead has reported back a couple of hours ago, and it has
switched back to green. Hence, I have moved on with the remaining
piece and removed all those --version checks from the tests of
pg_basebackup and pg_receivewal.
--
Michael