Re: Add LZ4 compression in pg_dump
On Fri, Aug 05, 2022 at 02:23:45PM +0000, Georgios Kokolatos wrote:
Thank you for your work during the commitfest.
The patch is still in development. Given vacation status, expect the next patches to be ready for the November commitfest.
For now it has moved to the September one. Further action will be taken then as needed.
On Sun, Nov 06, 2022 at 02:53:12PM +0000, gkokolatos@pm.me wrote:
On Wed, Nov 2, 2022 at 14:28, Justin Pryzby <pryzby@telsasoft.com> wrote:
Checking if you'll be able to submit new patches soon?
Thank you for checking up. Expect new versions within this commitfest cycle.
Hi,
I think this patch record should be closed for now. You can re-open the
existing patch record once a patch is ready to be reviewed.
The commitfest is a time for committing/reviewing patches that were
previously submitted, but there's no new patch since July. Making a
patch available for review at the start of the commitfest seems like a
requirement for current patch records (same as for new patch records).
I wrote essentially the same patch as your early patches 2 years ago
(before postgres was ready to consider new compression algorithms), so
I'm happy to review a new patch when it's available, regardless of its
status in the cfapp.
BTW, some of my own review comments from March weren't addressed.
Please check. Also, in February, I asked if you knew how to use
cirrus-ci to run checks, but the patches still had compilation errors
and warnings on various OSes.
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/39/3571
--
Justin
On Sun, Nov 20, 2022 at 11:26:11AM -0600, Justin Pryzby wrote:
I think this patch record should be closed for now. You can re-open the
existing patch record once a patch is ready to be reviewed.
Indeed. As things are, this is just a dead entry in the CF, which
would be confusing. I have marked it as RwF.
--
Michael
------- Original Message -------
On Monday, November 21st, 2022 at 12:13 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Sun, Nov 20, 2022 at 11:26:11AM -0600, Justin Pryzby wrote:
I think this patch record should be closed for now. You can re-open the
existing patch record once a patch is ready to be reviewed.
Indeed. As things are, this is just a dead entry in the CF, which
would be confusing. I have marked it as RwF.
Thank you for closing it.
For the record, I am currently working on it; I was simply unsure whether I
should submit WIP patches and add noise to the list, or wait until it is in a
state where I feel the comments have been addressed.
A new version that I feel is in a decent enough state for review should be
ready within this week. I am happy to drop the patch if you think I should
not work on it though.
Cheers,
//Georgios
--
Michael
On Tue, Nov 22, 2022 at 10:00:47AM +0000, gkokolatos@pm.me wrote:
A new version that I feel is in a decent enough state for review should be
ready within this week. I am happy to drop the patch if you think I should
not work on it though.
If you can post a new version of the patch, that's fine, of course.
I'll be happy to look over it more.
--
Michael
On Tue, Nov 22, 2022 at 10:00:47AM +0000, gkokolatos@pm.me wrote:
For the record, I am currently working on it; I was simply unsure whether I
should submit WIP patches and add noise to the list, or wait until it is in a
state where I feel the comments have been addressed.
A new version that I feel is in a decent enough state for review should be
ready within this week. I am happy to drop the patch if you think I should
not work on it though.
I hope you'll want to continue work on it. The patch record is like a
request for review, so it's closed if there's nothing ready to review.
I think you should re-send patches (and update the CF app) as often as
they're ready for more review. Your 0001 commit (which is almost the
same as what I wrote 2 years ago) still needs to account for some review
comments, and the whole patch set ought to pass the cirrus-ci tests. At
that point, you'll be ready for another round of review, even if there
are known TODO/FIXME items in later patches.
BTW I saw that you updated your branch on github. You'll need to make
the corresponding changes to ./meson.build that you made to ./Makefile.
https://wiki.postgresql.org/wiki/Meson_for_patch_authors
https://wiki.postgresql.org/wiki/Meson
--
Justin
------- Original Message -------
On Tuesday, November 22nd, 2022 at 11:49 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Nov 22, 2022 at 10:00:47AM +0000, gkokolatos@pm.me wrote:
A new version that I feel is in a decent enough state for review should be
ready within this week. I am happy to drop the patch if you think I should
not work on it though.
If you can post a new version of the patch, that's fine, of course.
I'll be happy to look over it more.
Thank you Michael (and Justin). Allow me to present v8.
The focus of this version of the series is 0001 and 0002.
Admittedly, 0001 could be presented in a separate thread; given its size and
its proximity to the topic, though, I present it here.
In an earlier review you spotted the similarity between pg_dump's and pg_receivewal's
parsing of compression options. However, there is a substantial difference in the
behaviour of the two programs: one treats the lack of support for the requested
algorithm as a fatal error, whereas the other does not. The existing functions in
common/compression.c do not account for the latter. 0002 proposes an implementation
for this; its usefulness is shown in 0003.
Please consider 0003-0005 as works in progress. They carry the differences from
v7, and they may still contain unaddressed review comments.
Feedback on the splitting and/or ordering of 0003-0005 would be welcome. I think
they now split into coherent units and are presented in a logical order; let me
know if you disagree and where the breakpoints should be.
Cheers,
//Georgios
--
Michael
Attachments:
v8-0003-Prepare-pg_dump-for-additional-compression-method.patchtext/x-patch; name=v8-0003-Prepare-pg_dump-for-additional-compression-method.patchDownload
From 337f19a52f164a22fbf974b8a749f3b895a339b4 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 28 Nov 2022 15:33:49 +0000
Subject: [PATCH v8 3/5] Prepare pg_dump for additional compression methods
This commit does some of the heavy lifting required for additional compression
methods.
First, it teaches pg_dump.c about the definitions and interfaces found in
common/compression.h, and then propagates those throughout the code.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about cfp and uses it
throughout.
---
doc/src/sgml/ref/pg_dump.sgml | 30 +-
src/bin/pg_dump/compress_io.c | 431 ++++++++++++++++----------
src/bin/pg_dump/compress_io.h | 20 +-
src/bin/pg_dump/pg_backup.h | 7 +-
src/bin/pg_dump/pg_backup_archiver.c | 192 ++++++------
src/bin/pg_dump/pg_backup_archiver.h | 37 +--
src/bin/pg_dump/pg_backup_custom.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 13 +-
src/bin/pg_dump/pg_backup_tar.c | 12 +-
src/bin/pg_dump/pg_dump.c | 98 ++++--
src/bin/pg_dump/t/001_basic.pl | 26 +-
src/bin/pg_dump/t/002_pg_dump.pl | 2 +-
12 files changed, 512 insertions(+), 362 deletions(-)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 8b9d9f4cad..3fb8fdce81 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -644,17 +644,31 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>-Z <replaceable class="parameter">0..9</replaceable></option></term>
- <term><option>--compress=<replaceable class="parameter">0..9</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">level</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
+ <term><option>--compress=<replaceable class="parameter">level</replaceable></option></term>
+ <term><option>--compress=<replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
<listitem>
<para>
- Specify the compression level to use. Zero means no compression.
+ Specify the compression method and/or the compression level to use.
+ The compression method can be set to <literal>gzip</literal> or
+ <literal>none</literal> for no compression. A compression level can
+ be optionally specified, by appending the level number after a colon
+ (<literal>:</literal>). If no level is specified, the default compression
+ level will be used for the specified method. If only a level is
+ specified without mentioning a method, <literal>gzip</literal> compression
+ will be used.
+ </para>
+
+ <para>
For the custom and directory archive formats, this specifies compression of
- individual table-data segments, and the default is to compress
- at a moderate level.
- For plain text output, setting a nonzero compression level causes
- the entire output file to be compressed, as though it had been
- fed through <application>gzip</application>; but the default is not to compress.
+ individual table-data segments, and the default is to compress using
+ <literal>gzip</literal> at a moderate level. For plain text output,
+ setting a nonzero compression level causes the entire output file to be compressed,
+ as though it had been fed through <application>gzip</application>; but the default
+ is not to compress.
+ </para>
+ <para>
The tar archive format currently does not support compression at all.
</para>
</listitem>
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 62f940ff7a..4a8fc1e306 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -64,7 +68,7 @@
/* typedef appears in compress_io.h */
struct CompressorState
{
- CompressionAlgorithm comprAlg;
+ pg_compress_algorithm compress_algorithm;
WriteFunc writeF;
#ifdef HAVE_LIBZ
@@ -74,9 +78,6 @@ struct CompressorState
#endif
};
-static void ParseCompressionOption(int compression, CompressionAlgorithm *alg,
- int *level);
-
/* Routines that support zlib compressed data I/O */
#ifdef HAVE_LIBZ
static void InitCompressorZlib(CompressorState *cs, int level);
@@ -93,57 +94,30 @@ static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
const char *data, size_t dLen);
-/*
- * Interprets a numeric 'compression' value. The algorithm implied by the
- * value (zlib or none at the moment), is returned in *alg, and the
- * zlib compression level in *level.
- */
-static void
-ParseCompressionOption(int compression, CompressionAlgorithm *alg, int *level)
-{
- if (compression == Z_DEFAULT_COMPRESSION ||
- (compression > 0 && compression <= 9))
- *alg = COMPR_ALG_LIBZ;
- else if (compression == 0)
- *alg = COMPR_ALG_NONE;
- else
- {
- pg_fatal("invalid compression code: %d", compression);
- *alg = COMPR_ALG_NONE; /* keep compiler quiet */
- }
-
- /* The level is just the passed-in value. */
- if (level)
- *level = compression;
-}
-
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
-AllocateCompressor(int compression, WriteFunc writeF)
+AllocateCompressor(const pg_compress_specification compress_spec,
+ WriteFunc writeF)
{
CompressorState *cs;
- CompressionAlgorithm alg;
- int level;
-
- ParseCompressionOption(compression, &alg, &level);
#ifndef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
+ if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
pg_fatal("not built with zlib support");
#endif
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
cs->writeF = writeF;
- cs->comprAlg = alg;
+ cs->compress_algorithm = compress_spec.algorithm;
/*
* Perform compression algorithm specific initialization.
*/
#ifdef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
- InitCompressorZlib(cs, level);
+ if (cs->compress_algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorZlib(cs, compress_spec.level);
#endif
return cs;
@@ -154,21 +128,24 @@ AllocateCompressor(int compression, WriteFunc writeF)
* out with ahwrite().
*/
void
-ReadDataFromArchive(ArchiveHandle *AH, int compression, ReadFunc readF)
+ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compress_spec,
+ ReadFunc readF)
{
- CompressionAlgorithm alg;
-
- ParseCompressionOption(compression, &alg, NULL);
-
- if (alg == COMPR_ALG_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (alg == COMPR_ALG_LIBZ)
+ switch (compress_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("not built with zlib support");
#endif
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -179,18 +156,21 @@ void
WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
- switch (cs->comprAlg)
+ switch (cs->compress_algorithm)
{
- case COMPR_ALG_LIBZ:
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
#else
pg_fatal("not built with zlib support");
#endif
break;
- case COMPR_ALG_NONE:
+ case PG_COMPRESSION_NONE:
WriteDataToArchiveNone(AH, cs, data, dLen);
break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -200,11 +180,23 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compress_algorithm)
+ {
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->comprAlg == COMPR_ALG_LIBZ)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("not built with zlib support");
#endif
- free(cs);
+ break;
+ case PG_COMPRESSION_NONE:
+ free(cs);
+ break;
+
+ default:
+ pg_fatal("invalid compression method");
+ break;
+ }
}
/* Private routines, specific to each compression method. */
@@ -418,10 +410,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_algorithm compress_algorithm;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -452,21 +442,25 @@ cfp *
cfopen_read(const char *path, const char *mode)
{
cfp *fp;
+ pg_compress_specification compress_spec = {0};
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
#ifdef HAVE_LIBZ
if (hasSuffix(path, ".gz"))
- fp = cfopen(path, mode, 1);
+ fp = cfopen(path, mode, compress_spec);
else
#endif
{
- fp = cfopen(path, mode, 0);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ fp = cfopen(path, mode, compress_spec);
#ifdef HAVE_LIBZ
if (fp == NULL)
{
char *fname;
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, 1);
+ fp = cfopen(fname, mode, compress_spec);
free_keep_errno(fname);
}
#endif
@@ -479,26 +473,27 @@ cfopen_read(const char *path, const char *mode)
* be a filemode as accepted by fopen() and gzopen() that indicates writing
* ("w", "wb", "a", or "ab").
*
- * If 'compression' is non-zero, a gzip compressed stream is opened, and
- * 'compression' indicates the compression level used. The ".gz" suffix
- * is automatically added to 'path' in that case.
+ * If 'compress_spec.algorithm' is GZIP, a gzip compressed stream is opened,
+ * and 'compress_spec.level' used. The ".gz" suffix is automatically added to
+ * 'path' in that case.
*
* On failure, return NULL with an error code in errno.
*/
cfp *
-cfopen_write(const char *path, const char *mode, int compression)
+cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compress_spec)
{
cfp *fp;
- if (compression == 0)
- fp = cfopen(path, mode, 0);
+ if (compress_spec.algorithm == PG_COMPRESSION_NONE)
+ fp = cfopen(path, mode, compress_spec);
else
{
#ifdef HAVE_LIBZ
char *fname;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression);
+ fp = cfopen(fname, mode, compress_spec);
free_keep_errno(fname);
#else
pg_fatal("not built with zlib support");
@@ -509,60 +504,96 @@ cfopen_write(const char *path, const char *mode, int compression)
}
/*
- * Opens file 'path' in 'mode'. If 'compression' is non-zero, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode, int compression)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_algorithm compress_algorithm, int compressionLevel)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression != 0)
+ fp->compress_algorithm = compress_algorithm;
+
+ switch (compress_algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compressionLevel != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compressionLevel);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("not built with zlib support");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("not built with zlib support");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compress_spec)
+{
+ return cfopen_internal(path, -1, mode,
+ compress_spec.algorithm,
+ compress_spec.level);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compress_spec)
+{
+ return cfopen_internal(NULL, fd, mode,
+ compress_spec.algorithm,
+ compress_spec.level);
+}
int
cfread(void *ptr, int size, cfp *fp)
@@ -572,38 +603,61 @@ cfread(void *ptr, int size, cfp *fp)
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compress_algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, fp->fp);
+ if (ret != size && !feof(fp->fp))
+ READ_ERROR_EXIT(fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread(fp->fp, ptr, size);
+ if (ret != size && !gzeof(fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compress_algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite(fp->fp, ptr, size);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
@@ -611,24 +665,31 @@ cfgetc(cfp *fp)
{
int ret;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compress_algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc(fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
return ret;
@@ -637,65 +698,107 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret;
+
+ switch (fp->compress_algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets(fp->fp, buf, len);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compress_algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose(fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose(fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret;
+
+ switch (fp->compress_algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof(fp->fp);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compress_algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror(fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("not built with zlib support");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index f635787692..d6335fff02 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,12 +21,6 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-typedef enum
-{
- COMPR_ALG_NONE,
- COMPR_ALG_LIBZ
-} CompressionAlgorithm;
-
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -46,8 +40,10 @@ typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
-extern CompressorState *AllocateCompressor(int compression, WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH, int compression,
+extern CompressorState *AllocateCompressor(const pg_compress_specification compress_spec,
+ WriteFunc writeF);
+extern void ReadDataFromArchive(ArchiveHandle *AH,
+ const pg_compress_specification compress_spec,
ReadFunc readF);
extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen);
@@ -56,9 +52,13 @@ extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
typedef struct cfp cfp;
-extern cfp *cfopen(const char *path, const char *mode, int compression);
+extern cfp *cfopen(const char *path, const char *mode,
+ const pg_compress_specification compress_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ pg_compress_specification compress_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode, int compression);
+extern cfp *cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compress_spec);
extern int cfread(void *ptr, int size, cfp *fp);
extern int cfwrite(const void *ptr, int size, cfp *fp);
extern int cfgetc(cfp *fp);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index e8b7898297..61c412c8cb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -23,6 +23,7 @@
#ifndef PG_BACKUP_H
#define PG_BACKUP_H
+#include "common/compression.h"
#include "fe_utils/simple_list.h"
#include "libpq-fe.h"
@@ -143,7 +144,8 @@ typedef struct _restoreOptions
int noDataForFailedTables;
int exit_on_error;
- int compression;
+ pg_compress_specification compress_spec; /* Specification for
+ * compression */
int suppressDumpWarnings; /* Suppress output of WARNING entries
* to stderr */
bool single_txn;
@@ -303,7 +305,8 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
/* Create a new archive */
extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker);
/* The --list option */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index f39c0fa36f..304cc072ca 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -70,7 +64,8 @@ typedef struct _parallelReadyList
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr);
static void _getObjectDescription(PQExpBuffer buf, const TocEntry *te);
static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData);
@@ -98,9 +93,10 @@ static int _discoverArchiveFormat(ArchiveHandle *AH);
static int RestoringToDB(ArchiveHandle *AH);
static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
-static void SetOutput(ArchiveHandle *AH, const char *filename, int compression);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static void SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compress_spec);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -239,12 +235,13 @@ setupRestoreWorker(Archive *AHX)
/* Public */
Archive *
CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, dosync,
- mode, setupDumpWorker);
+ ArchiveHandle *AH = _allocAH(FileSpec, fmt, compress_spec,
+ dosync, mode, setupDumpWorker);
return (Archive *) AH;
}
@@ -254,7 +251,12 @@ CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
Archive *
OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, true, archModeRead, setupRestoreWorker);
+ ArchiveHandle *AH;
+ pg_compress_specification compress_spec = {0};
+
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH = _allocAH(FileSpec, fmt, compress_spec, true,
+ archModeRead, setupRestoreWorker);
return (Archive *) AH;
}
@@ -269,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -354,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -383,16 +383,23 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression != 0 && AH->PrintTocDataPtr != NULL)
+ supports_compression = true;
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -459,8 +466,8 @@ RestoreArchive(Archive *AHX)
* Setup the output file if necessary.
*/
sav = SaveOutput(AH);
- if (ropt->filename || ropt->compression)
- SetOutput(AH, ropt->filename, ropt->compression);
+ if (ropt->filename || ropt->compress_spec.algorithm != PG_COMPRESSION_NONE)
+ SetOutput(AH, ropt->filename, ropt->compress_spec);
ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
@@ -739,7 +746,7 @@ RestoreArchive(Archive *AHX)
*/
AH->stage = STAGE_FINALIZING;
- if (ropt->filename || ropt->compression)
+ if (ropt->filename || ropt->compress_spec.algorithm != PG_COMPRESSION_NONE)
RestoreOutput(AH, sav);
if (ropt->useDB)
@@ -969,6 +976,8 @@ NewRestoreOptions(void)
opts->format = archUnknown;
opts->cparams.promptPassword = TRI_DEFAULT;
opts->dumpSections = DUMP_UNSECTIONED;
+ opts->compress_spec.algorithm = PG_COMPRESSION_NONE;
+ opts->compress_spec.level = INT_MIN;
return opts;
}
@@ -1115,23 +1124,28 @@ PrintTOCSummary(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
TocEntry *te;
+ pg_compress_specification out_compress_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
+ /* TOC is always uncompressed */
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+
sav = SaveOutput(AH);
if (ropt->filename)
- SetOutput(AH, ropt->filename, 0 /* no compression */ );
+ SetOutput(AH, ropt->filename, out_compress_spec);
if (strftime(stamp_str, sizeof(stamp_str), PGDUMP_STRFTIME_FMT,
localtime(&AH->createDate)) == 0)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compress_spec.algorithm));
switch (AH->format)
{
@@ -1485,60 +1499,35 @@ archprintf(Archive *AH, const char *fmt,...)
*******************************/
static void
-SetOutput(ArchiveHandle *AH, const char *filename, int compression)
+SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compress_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression != 0)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compress_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compress_spec);
if (!AH->OF)
{
@@ -1549,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename, int compression)
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1699,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2198,10 +2173,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
static ArchiveHandle *
_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2249,14 +2226,14 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
AH->toc->prev = AH->toc;
AH->mode = mode;
- AH->compression = compression;
+ AH->compress_spec = compress_spec;
AH->dosync = dosync;
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -2264,7 +2241,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
* Force stdin/stdout into binary mode if that is what we are using.
*/
#ifdef WIN32
- if ((fmt != archNull || compression != 0) &&
+ if ((fmt != archNull || compress_spec.algorithm != PG_COMPRESSION_NONE) &&
(AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0))
{
if (mode == archModeWrite)
@@ -3669,7 +3646,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- WriteInt(AH, AH->compression);
+ WriteInt(AH, AH->compress_spec.level);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3740,21 +3717,26 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
+ AH->compress_spec.algorithm = PG_COMPRESSION_NONE;
if (AH->version >= K_VERS_1_2)
{
if (AH->version < K_VERS_1_4)
- AH->compression = AH->ReadBytePtr(AH);
+ AH->compress_spec.level = AH->ReadBytePtr(AH);
else
- AH->compression = ReadInt(AH);
+ AH->compress_spec.level = ReadInt(AH);
+
+ if (AH->compress_spec.level != 0)
+ AH->compress_spec.algorithm = PG_COMPRESSION_GZIP;
}
else
- AH->compression = Z_DEFAULT_COMPRESSION;
+ AH->compress_spec.algorithm = PG_COMPRESSION_GZIP;
#ifndef HAVE_LIBZ
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
#endif
+
if (AH->version >= K_VERS_1_4)
{
struct tm crtm;
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 42687c4ec8..d2930949ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
@@ -331,14 +306,8 @@ struct _archiveHandle
DumpId *tableDataId; /* TABLE DATA ids, indexed by table dumpId */
struct _tocEntry *currToc; /* Used when dumping data */
- int compression; /*---------
- * Compression requested on open().
- * Possible values for compression:
- * -1 Z_DEFAULT_COMPRESSION
- * 0 COMPRESSION_NONE
- * 1-9 levels for gzip compression
- *---------
- */
+ pg_compress_specification compress_spec; /* Requested specification for
+ * compression */
bool dosync; /* data requested to be synced on sight */
ArchiveMode mode; /* File mode - r or w */
void *formatData; /* Header data specific to file format */
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index a0a55a1edd..6a2112c45f 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
}
/*
@@ -377,7 +377,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
}
/*
@@ -566,7 +566,7 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression, _CustomReadFunc);
+ ReadDataFromArchive(AH, AH->compress_spec, _CustomReadFunc);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 798182b6f7..7d2cddbb2c 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -327,7 +327,8 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
+ AH->compress_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -573,6 +574,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
cfp *tocFH;
+ pg_compress_specification compress_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "toc.dat");
@@ -581,7 +583,8 @@ _CloseArchive(ArchiveHandle *AH)
ctx->pstate = ParallelBackupStart(AH);
/* The TOC is always created uncompressed */
- tocFH = cfopen_write(fname, PG_BINARY_W, 0);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ tocFH = cfopen_write(fname, PG_BINARY_W, compress_spec);
if (tocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -639,12 +642,14 @@ static void
_StartBlobs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ pg_compress_specification compress_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "blobs.toc");
/* The blob TOC file is never compressed */
- ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ ctx->blobsTocFH = cfopen_write(fname, "ab", compress_spec);
if (ctx->blobsTocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -662,7 +667,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compress_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 402b93c610..d773c291c8 100644
--- a/src/bin/pg_dump/pg_backup_tar.c
+++ b/src/bin/pg_dump/pg_backup_tar.c
@@ -35,6 +35,7 @@
#include <unistd.h>
#include "common/file_utils.h"
+#include "compress_io.h"
#include "fe_utils/string_utils.h"
#include "pg_backup_archiver.h"
#include "pg_backup_tar.h"
@@ -194,7 +195,7 @@ InitArchiveFmt_Tar(ArchiveHandle *AH)
* possible since gzdopen uses buffered IO which totally screws file
* positioning.
*/
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
}
else
@@ -328,7 +329,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
}
}
- if (AH->compression == 0)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = ctx->tarFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -383,7 +384,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
umask(old_umask);
- if (AH->compression == 0)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = tm->tmpFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -401,7 +402,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
static void
tarClose(ArchiveHandle *AH, TAR_MEMBER *th)
{
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
if (th->mode == 'w')
@@ -800,7 +801,6 @@ _CloseArchive(ArchiveHandle *AH)
memcpy(ropt, AH->public.ropt, sizeof(RestoreOptions));
ropt->filename = NULL;
ropt->dropSchema = 1;
- ropt->compression = 0;
ropt->superuser = NULL;
ropt->suppressDumpWarnings = true;
@@ -888,7 +888,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
if (oid == 0)
pg_fatal("invalid OID for large object (%u)", oid);
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
sprintf(fname, "blob_%u.dat", oid);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da427f4d4a..a97a0f3a84 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -54,8 +54,10 @@
#include "catalog/pg_subscription.h"
#include "catalog/pg_trigger_d.h"
#include "catalog/pg_type_d.h"
+#include "common/compression.h"
#include "common/connect.h"
#include "common/relpath.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/option_utils.h"
#include "fe_utils/string_utils.h"
@@ -164,6 +166,8 @@ static void setup_connection(Archive *AH,
const char *dumpencoding, const char *dumpsnapshot,
char *use_role);
static ArchiveFormat parseArchiveFormat(const char *format, ArchiveMode *mode);
+static bool parse_compression(const char *opt,
+ pg_compress_specification *compress_spec);
static void expand_schema_name_patterns(Archive *fout,
SimpleStringList *patterns,
SimpleOidList *oids,
@@ -340,8 +344,9 @@ main(int argc, char **argv)
const char *dumpsnapshot = NULL;
char *use_role = NULL;
int numWorkers = 1;
- int compressLevel = -1;
int plainText = 0;
+ pg_compress_specification compress_spec = {0};
+ bool user_compression_defined = false;
ArchiveFormat archiveFormat = archUnknown;
ArchiveMode archiveMode;
@@ -561,10 +566,10 @@ main(int argc, char **argv)
dopt.aclsSkip = true;
break;
- case 'Z': /* Compression Level */
- if (!option_parse_int(optarg, "-Z/--compress", 0, 9,
- &compressLevel))
+ case 'Z': /* Compression */
+ if (!parse_compression(optarg, &compress_spec))
exit_nicely(1);
+ user_compression_defined = true;
break;
case 0:
@@ -687,23 +692,20 @@ main(int argc, char **argv)
if (archiveFormat == archNull)
plainText = 1;
- /* Custom and directory formats are compressed by default, others not */
- if (compressLevel == -1)
+ /*
+ * Custom and directory formats are compressed by default (zlib), others
+ * not
+ */
+ if (user_compression_defined == false)
{
+ parse_compress_specification(PG_COMPRESSION_NONE, NULL, &compress_spec);
#ifdef HAVE_LIBZ
if (archiveFormat == archCustom || archiveFormat == archDirectory)
- compressLevel = Z_DEFAULT_COMPRESSION;
- else
+ parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+ &compress_spec);
#endif
- compressLevel = 0;
}
-#ifndef HAVE_LIBZ
- if (compressLevel != 0)
- pg_log_warning("requested compression not available in this installation -- archive will be uncompressed");
- compressLevel = 0;
-#endif
-
/*
* If emitting an archive format, we always want to emit a DATABASE item,
* in case --create is specified at pg_restore time.
@@ -716,8 +718,8 @@ main(int argc, char **argv)
pg_fatal("parallel backup only supported by the directory format");
/* Open the output file */
- fout = CreateArchive(filename, archiveFormat, compressLevel, dosync,
- archiveMode, setupDumpWorker);
+ fout = CreateArchive(filename, archiveFormat, compress_spec,
+ dosync, archiveMode, setupDumpWorker);
/* Make dump options accessible right away */
SetArchiveOptions(fout, &dopt, NULL);
@@ -948,10 +950,7 @@ main(int argc, char **argv)
ropt->sequence_data = dopt.sequence_data;
ropt->binary_upgrade = dopt.binary_upgrade;
- if (compressLevel == -1)
- ropt->compression = 0;
- else
- ropt->compression = compressLevel;
+ ropt->compress_spec = compress_spec;
ropt->suppressDumpWarnings = true; /* We've already shown them */
@@ -998,7 +997,8 @@ help(const char *progname)
printf(_(" -j, --jobs=NUM use this many parallel jobs to dump\n"));
printf(_(" -v, --verbose verbose mode\n"));
printf(_(" -V, --version output version information, then exit\n"));
- printf(_(" -Z, --compress=0-9 compression level for compressed formats\n"));
+ printf(_(" -Z, --compress=METHOD[:LEVEL]\n"
+ " compress as specified\n"));
printf(_(" --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n"));
printf(_(" --no-sync do not wait for changes to be written safely to disk\n"));
printf(_(" -?, --help show this help, then exit\n"));
@@ -1258,6 +1258,60 @@ get_synchronized_snapshot(Archive *fout)
return result;
}
+/*
+ * Interprets and validates a compression option using the common compression
+ * parsing functions.  If the requested compression is not available, the
+ * archive will be written uncompressed.
+ */
+static bool
+parse_compression(const char *opt, pg_compress_specification *compress_spec)
+{
+ char *algorithm_str = NULL;
+ char *level_str = NULL;
+ char *validation_error = NULL;
+ bool supports_compression = true;
+
+ compress_spec->algorithm = PG_COMPRESSION_NONE;
+ compress_spec->level = 0;
+
+ parse_compress_user_options(opt, &algorithm_str, &level_str);
+ if (!parse_compress_algorithm(algorithm_str, &(compress_spec->algorithm)))
+ {
+ pg_log_error("invalid compression method: \"%s\" (gzip, none)",
+ algorithm_str);
+ return false;
+ }
+
+ parse_compress_specification(compress_spec->algorithm, level_str,
+ compress_spec);
+ validation_error = validate_compress_specification(compress_spec);
+ if (validation_error)
+ {
+ pg_log_error("invalid compression specification: %s", validation_error);
+ return false;
+ }
+
+ /* Switch off unsupported compressions that made it through parsing */
+ if (test_compress_support(compress_spec))
+ supports_compression = false;
+
+ /* Also switch off unimplemented compressions */
+ if (compress_spec->algorithm != PG_COMPRESSION_NONE &&
+ compress_spec->algorithm != PG_COMPRESSION_GZIP)
+ supports_compression = false;
+
+ if (!supports_compression)
+ {
+ pg_log_warning("requested compression not available in this installation -- archive will be uncompressed");
+ parse_compress_specification(PG_COMPRESSION_NONE, NULL, compress_spec);
+ }
+
+ pg_free(algorithm_str);
+ pg_free(level_str);
+
+ return true;
+}
+
static ArchiveFormat
parseArchiveFormat(const char *format, ArchiveMode *mode)
{
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a583c8a6d2..fffb9e075b 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -121,12 +121,27 @@ command_fails_like(
'pg_restore: cannot specify both --single-transaction and multiple jobs');
command_fails_like(
- [ 'pg_dump', '-Z', '-1' ],
- qr/\Qpg_dump: error: -Z\/--compress must be in range 0..9\E/,
- 'pg_dump: -Z/--compress must be in range');
+ [ 'pg_dump', '--compress', 'garbage' ],
+ qr/\Qpg_dump: error: invalid compression method: "garbage" (gzip, none)\E/,
+ 'pg_dump: invalid --compress');
+
+command_fails_like(
+ [ 'pg_dump', '--compress', 'none:1' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "none" does not accept a compression level\E/,
+ 'pg_dump: invalid compression specification: compression algorithm "none" does not accept a compression level');
+
+command_fails_like(
+ [ 'pg_dump', '-Z', 'gzip:nonInt' ],
+ qr/\Qpg_dump: error: invalid compression specification: unrecognized compression option: "nonInt"\E/,
+ 'pg_dump: invalid compression specification: must be an integer');
if (check_pg_config("#define HAVE_LIBZ 1"))
{
+ command_fails_like(
+ [ 'pg_dump', '-Z', '15' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)\E/,
+ 'pg_dump: invalid compression specification: must be in range');
+
command_fails_like(
[ 'pg_dump', '--compress', '1', '--format', 'tar' ],
qr/\Qpg_dump: error: compression is not supported by tar archive format\E/,
@@ -134,6 +149,11 @@ if (check_pg_config("#define HAVE_LIBZ 1"))
}
else
{
+ command_fails_like(
+ [ 'pg_dump', '-Z', '-1' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at 0)\E/,
+ 'pg_dump: invalid compression specification: must be in range');
+
# --jobs > 1 forces an error with tar format.
command_fails_like(
[ 'pg_dump', '--compress', '1', '--format', 'tar', '-j3' ],
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 8dc1f0eccb..e97d086956 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -87,7 +87,7 @@ my %pgdump_runs = (
compile_option => 'gzip',
dump_cmd => [
'pg_dump', '--jobs=2',
- '--format=directory', '--compress=1',
+ '--format=directory', '--compress=gzip:1',
"--file=$tempdir/compression_gzip_dir", 'postgres',
],
# Give coverage for manually compressed blob.toc files during
--
2.34.1
Attachment: v8-0001-Export-gzip-program-to-pg_dump-tap-tests.patch (text/x-patch)
From f5125b4d4ca5bfc7ef7f68b7dbe4a1c7777de6eb Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 28 Nov 2022 15:09:22 +0000
Subject: [PATCH v8 1/5] Export gzip program to pg_dump tap tests
---
src/bin/pg_dump/meson.build | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index e66f632b54..6cff2a6c3d 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -88,6 +88,9 @@ tests += {
't/003_pg_dump_with_server.pl',
't/010_dump_connstr.pl',
],
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ },
},
}
--
2.34.1
Attachment: v8-0002-Make-the-pg_receivewal-compression-parsing-functi.patch (text/x-patch)
From a4ce723e6719ed80c36fd0c4fd85c962b6b25a45 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 28 Nov 2022 15:18:27 +0000
Subject: [PATCH v8 2/5] Make the pg_receivewal compression parsing function
common
Also relax the parsing errors in the helper functions and re-introduce them as
an independent function.
As shown in the rest of the patch series, there is a lot of duplication
between pg_dump's parsing of compression options and pg_receivewal's. Now the
core work is done in common code. However, pg_dump should not error out if the
requested compression algorithm is not supported by the build, whereas other
callers should. It also seems odd for only one of the compression parsing
functions to error out on missing support, especially when that function is
not the one responsible for identifying the compression algorithm.
A new function is added to test whether the algorithm is supported, allowing
the caller to tune the behaviour.
---
src/backend/backup/basebackup.c | 1 +
src/bin/pg_basebackup/pg_basebackup.c | 1 +
src/bin/pg_basebackup/pg_receivewal.c | 65 +-------------
src/common/compression.c | 119 ++++++++++++++++++++++----
src/include/common/compression.h | 3 +
5 files changed, 111 insertions(+), 78 deletions(-)
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 74fb529380..70cd720823 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -942,6 +942,7 @@ parse_basebackup_options(List *options, basebackup_options *opt)
parse_compress_specification(opt->compression, compression_detail_str,
&opt->compression_specification);
+ (void) test_compress_support(&opt->compression_specification);
error_detail =
validate_compress_specification(&opt->compression_specification);
if (error_detail != NULL)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 22836ca01a..cdd32c9763 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -2556,6 +2556,7 @@ main(int argc, char **argv)
compression_algorithm);
parse_compress_specification(alg, compression_detail, &client_compress);
+ (void) test_compress_support(&client_compress);
error_detail = validate_compress_specification(&client_compress);
if (error_detail != NULL)
pg_fatal("invalid compression specification: %s",
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 63207ca025..d4a3a4213d 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -57,8 +57,6 @@ static XLogRecPtr endpos = InvalidXLogRecPtr;
static void usage(void);
-static void parse_compress_options(char *option, char **algorithm,
- char **detail);
static DIR *get_destination_dir(char *dest_folder);
static void close_destination_dir(DIR *dest_dir, char *dest_folder);
static XLogRecPtr FindStreamingStart(uint32 *tli);
@@ -109,65 +107,6 @@ usage(void)
printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL);
}
-/*
- * Basic parsing of a value specified for -Z/--compress
- *
- * The parsing consists of a METHOD:DETAIL string fed later on to a more
- * advanced routine in charge of proper validation checks. This only extracts
- * METHOD and DETAIL. If only an integer is found, the method is implied by
- * the value specified.
- */
-static void
-parse_compress_options(char *option, char **algorithm, char **detail)
-{
- char *sep;
- char *endp;
- long result;
-
- /*
- * Check whether the compression specification consists of a bare integer.
- *
- * For backward-compatibility, assume "none" if the integer found is zero
- * and "gzip" otherwise.
- */
- result = strtol(option, &endp, 10);
- if (*endp == '\0')
- {
- if (result == 0)
- {
- *algorithm = pstrdup("none");
- *detail = NULL;
- }
- else
- {
- *algorithm = pstrdup("gzip");
- *detail = pstrdup(option);
- }
- return;
- }
-
- /*
- * Check whether there is a compression detail following the algorithm
- * name.
- */
- sep = strchr(option, ':');
- if (sep == NULL)
- {
- *algorithm = pstrdup(option);
- *detail = NULL;
- }
- else
- {
- char *alg;
-
- alg = palloc((sep - option) + 1);
- memcpy(alg, option, sep - option);
- alg[sep - option] = '\0';
-
- *algorithm = alg;
- *detail = pstrdup(sep + 1);
- }
-}
/*
* Check if the filename looks like a WAL file, letting caller know if this
@@ -786,8 +725,8 @@ main(int argc, char **argv)
verbose++;
break;
case 'Z':
- parse_compress_options(optarg, &compression_algorithm_str,
- &compression_detail);
+ parse_compress_user_options(optarg, &compression_algorithm_str,
+ &compression_detail);
break;
/* action */
case 1:
diff --git a/src/common/compression.c b/src/common/compression.c
index df5b627834..57c23221a2 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -39,6 +39,66 @@
static int expect_integer_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Basic parsing of a value specified for -Z/--compress
+ *
+ * The parsing consists of a METHOD:DETAIL string fed later on to a more
+ * advanced routine in charge of proper validation checks. This only extracts
+ * METHOD and DETAIL. If only an integer is found, the method is implied by
+ * the value specified.
+ */
+void
+parse_compress_user_options(const char *option, char **algorithm, char **detail)
+{
+ char *sep;
+ char *endp;
+ long result;
+
+ /*
+ * Check whether the compression specification consists of a bare integer.
+ *
+ * For backward-compatibility, assume "none" if the integer found is zero
+ * and "gzip" otherwise.
+ */
+ result = strtol(option, &endp, 10);
+ if (*endp == '\0')
+ {
+ if (result == 0)
+ {
+ *algorithm = pstrdup("none");
+ *detail = NULL;
+ }
+ else
+ {
+ *algorithm = pstrdup("gzip");
+ *detail = pstrdup(option);
+ }
+ return;
+ }
+
+ /*
+ * Check whether there is a compression detail following the algorithm
+ * name.
+ */
+ sep = strchr(option, ':');
+ if (sep == NULL)
+ {
+ *algorithm = pstrdup(option);
+ *detail = NULL;
+ }
+ else
+ {
+ char *alg;
+
+ alg = palloc((sep - option) + 1);
+ memcpy(alg, option, sep - option);
+ alg[sep - option] = '\0';
+
+ *algorithm = alg;
+ *detail = pstrdup(sep + 1);
+ }
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
@@ -100,6 +160,9 @@ get_compress_algorithm_name(pg_compress_algorithm algorithm)
*
* Use validate_compress_specification() to find out whether a compression
* specification is semantically sensible.
+ *
+ * Does not test whether this build of PostgreSQL supports the requested
+ * compression method.
*/
void
parse_compress_specification(pg_compress_algorithm algorithm, char *specification,
@@ -123,30 +186,16 @@ parse_compress_specification(pg_compress_algorithm algorithm, char *specificatio
result->level = 0;
break;
case PG_COMPRESSION_LZ4:
-#ifdef USE_LZ4
result->level = 0; /* fast compression mode */
-#else
- result->parse_error =
- psprintf(_("this build does not support compression with %s"),
- "LZ4");
-#endif
break;
case PG_COMPRESSION_ZSTD:
#ifdef USE_ZSTD
result->level = ZSTD_CLEVEL_DEFAULT;
-#else
- result->parse_error =
- psprintf(_("this build does not support compression with %s"),
- "ZSTD");
#endif
break;
case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
result->level = Z_DEFAULT_COMPRESSION;
-#else
- result->parse_error =
- psprintf(_("this build does not support compression with %s"),
- "gzip");
#endif
break;
}
@@ -265,7 +314,8 @@ parse_compress_specification(pg_compress_algorithm algorithm, char *specificatio
* and return -1.
*/
static int
-expect_integer_value(char *keyword, char *value, pg_compress_specification *result)
+expect_integer_value(char *keyword, char *value,
+ pg_compress_specification *result)
{
int ivalue;
char *ivalue_endp;
@@ -356,3 +406,42 @@ validate_compress_specification(pg_compress_specification *spec)
return NULL;
}
+
+/*
+ * Returns NULL if the compression algorithm is supported by this build.
+ * Otherwise, returns an error message. In the latter case, the error is attached
+ * to the pg_compress_specification unless another error precedes it.
+ */
+char *
+test_compress_support(pg_compress_specification *spec)
+{
+ char *err = NULL;
+ switch (spec->algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifndef HAVE_LIBZ
+ err = psprintf(_("this build does not support compression with %s"),
+ "gzip");
+#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+#ifndef USE_LZ4
+ err = psprintf(_("this build does not support compression with %s"),
+ "LZ4");
+#endif
+ break;
+ case PG_COMPRESSION_ZSTD:
+#ifndef USE_ZSTD
+ err = psprintf(_("this build does not support compression with %s"),
+ "ZSTD");
+#endif
+ break;
+ }
+
+ if (err && !spec->parse_error)
+ spec->parse_error = err;
+
+ return err;
+}
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 5d680058ed..eb30a7b547 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -33,6 +33,8 @@ typedef struct pg_compress_specification
char *parse_error; /* NULL if parsing was OK, else message */
} pg_compress_specification;
+extern void parse_compress_user_options(const char *option, char **algorithm,
+ char **detail);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
@@ -40,6 +42,7 @@ extern void parse_compress_specification(pg_compress_algorithm algorithm,
char *specification,
pg_compress_specification *result);
+extern char *test_compress_support(pg_compress_specification *);
extern char *validate_compress_specification(pg_compress_specification *);
#endif
--
2.34.1
Attachment: v8-0004-Introduce-Compressor-API-in-pg_dump.patch (text/x-patch)
From b1fd402c6108ac88f705c7bebc75333983625221 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 28 Nov 2022 15:35:19 +0000
Subject: [PATCH v8 4/5] Introduce Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementor of a
new compression method now only needs to supply those definitions.

Custom compressed archives now need to store the compression method in their
header. This requires a bump in the archive version number. The compression
level is no longer stored in the dump, as it is irrelevant.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 390 ++++++++++++
src/bin/pg_dump/compress_gzip.h | 9 +
src/bin/pg_dump/compress_io.c | 817 ++++++--------------------
src/bin/pg_dump/compress_io.h | 69 ++-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 93 +--
src/bin/pg_dump/pg_backup_archiver.h | 4 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 85 +--
10 files changed, 765 insertions(+), 727 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 9dc5a784dd..29eab02d37 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..bc6d1abc77
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,390 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_gzip.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ int compressionLevel;
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ if (deflateInit(zp, gzipcs->compressionLevel) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ gzipcs->compressionLevel = compressionLevel;
+
+ cs->private = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+typedef struct GzipData
+{
+ gzFile fp;
+ int compressionLevel;
+} GzipData;
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ size_t ret;
+
+ ret = gzread(gd->fp, ptr, size);
+ if (ret != size && !gzeof(gd->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gd->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzwrite(gd->fp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gd->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gd->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzgets(gd->fp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int save_errno;
+ int ret;
+
+ CFH->private = NULL;
+
+ ret = gzclose(gd->fp);
+
+ save_errno = errno;
+ free(gd);
+ errno = save_errno;
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzeof(gd->fp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gd->fp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ char mode_compression[32];
+
+ if (gd->compressionLevel != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, gd->compressionLevel);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gd->fp = gzdopen(dup(fd), mode_compression);
+ else
+ gd->fp = gzopen(path, mode_compression);
+
+ if (gd->fp == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ GzipData *gd;
+
+ CFH->open = Gzip_open;
+ CFH->open_write = Gzip_open_write;
+ CFH->read = Gzip_read;
+ CFH->write = Gzip_write;
+ CFH->gets = Gzip_gets;
+ CFH->getc = Gzip_getc;
+ CFH->close = Gzip_close;
+ CFH->eof = Gzip_eof;
+ CFH->get_error = Gzip_get_error;
+
+ gd = pg_malloc0(sizeof(GzipData));
+ gd->compressionLevel = compressionLevel;
+
+ CFH->private = gd;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..ab0362c1f3
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, int compressionLevel);
+extern void InitCompressGzip(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 4a8fc1e306..3065bd76fa 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -51,9 +51,12 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
#include "postgres_fe.h"
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,113 +68,73 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+/* Private routines that support uncompressed data I/O */
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_algorithm compress_algorithm;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+ free(buf);
+}
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification compress_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("not built with zlib support");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compress_algorithm = compress_spec.algorithm;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compress_algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, compress_spec.level);
-#endif
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compress_spec,
- ReadFunc readF)
-{
switch (compress_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("not built with zlib support");
-#endif
+ InitCompressorGzip(cs, compress_spec.level);
break;
default:
pg_fatal("invalid compression method");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compress_algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -180,243 +143,28 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compress_algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- free(cs);
- break;
-
- default:
- pg_fatal("invalid compression method");
- break;
- }
-}
-
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- pg_compress_algorithm compress_algorithm;
- void *fp;
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -429,392 +177,219 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Compression None implementation
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
+static size_t
+_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
- cfp *fp;
- pg_compress_specification compress_spec = {0};
+ FILE *fp = (FILE *) CFH->private;
+ size_t ret;
- compress_spec.algorithm = PG_COMPRESSION_GZIP;
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- fp = cfopen(path, mode, compress_spec);
- else
-#endif
- {
- compress_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compress_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- compress_spec.algorithm = PG_COMPRESSION_GZIP;
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compress_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
+ if (size == 0)
+ return 0;
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compress_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compress_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compress_spec)
-{
- cfp *fp;
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
- if (compress_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compress_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compress_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("not built with zlib support");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ return ret;
}
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_algorithm compress_algorithm, int compressionLevel)
+static size_t
+_write(const void *ptr, size_t size, CompressFileHandle * CFH)
{
- cfp *fp = pg_malloc(sizeof(cfp));
-
- fp->compress_algorithm = compress_algorithm;
-
- switch (compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compressionLevel != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compressionLevel);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
-
- return fp;
+ return fwrite(ptr, 1, size, (FILE *) CFH->private);
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compress_spec)
+static const char *
+_get_error(CompressFileHandle * CFH)
{
- return cfopen_internal(path, -1, mode,
- compress_spec.algorithm,
- compress_spec.level);
+ return strerror(errno);
}
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compress_spec)
+static char *
+_gets(char *ptr, int size, CompressFileHandle * CFH)
{
- return cfopen_internal(NULL, fd, mode,
- compress_spec.algorithm,
- compress_spec.level);
+ return fgets(ptr, size, (FILE *) CFH->private);
}
-int
-cfread(void *ptr, int size, cfp *fp)
+static int
+_getc(CompressFileHandle * CFH)
{
+ FILE *fp = (FILE *) CFH->private;
int ret;
- if (size == 0)
- return 0;
-
- switch (fp->compress_algorithm)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, fp->fp);
- if (ret != size && !feof(fp->fp))
- READ_ERROR_EXIT(fp->fp);
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzread(fp->fp, ptr, size);
- if (ret != size && !gzeof(fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
-
- default:
- pg_fatal("invalid compression method");
- break;
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
return ret;
}
-int
-cfwrite(const void *ptr, int size, cfp *fp)
+static int
+_close(CompressFileHandle * CFH)
{
+ FILE *fp = (FILE *) CFH->private;
int ret = 0;
- switch (fp->compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite(fp->fp, ptr, size);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ CFH->private = NULL;
+
+ if (fp)
+ ret = fclose(fp);
return ret;
}
-int
-cfgetc(cfp *fp)
+static int
+_eof(CompressFileHandle * CFH)
{
- int ret;
+ return feof((FILE *) CFH->private);
+}
- switch (fp->compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc(fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->fp);
+static int
+_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
+{
+ Assert(CFH->private == NULL);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof(fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ if (fd >= 0)
+ CFH->private = fdopen(dup(fd), mode);
+ else
+ CFH->private = fopen(path, mode);
- return ret;
+ if (CFH->private == NULL)
+ return 1;
+
+ return 0;
}
-char *
-cfgets(cfp *fp, char *buf, int len)
+static int
+_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
{
- char *ret;
+ Assert(CFH->private == NULL);
- switch (fp->compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, fp->fp);
+ CFH->private = fopen(path, mode);
+ if (CFH->private == NULL)
+ return 1;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets(fp->fp, buf, len);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ return 0;
+}
- return ret;
+static void
+InitCompressNone(CompressFileHandle * CFH)
+{
+ CFH->open = _open;
+ CFH->open_write = _open_write;
+ CFH->read = _read;
+ CFH->write = _write;
+ CFH->gets = _gets;
+ CFH->getc = _getc;
+ CFH->close = _close;
+ CFH->eof = _eof;
+ CFH->get_error = _get_error;
+
+ CFH->private = NULL;
}
-int
-cfclose(cfp *fp)
+/*
+ * Public interface
+ */
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compress_spec)
{
- int ret;
+ CompressFileHandle *CFH;
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- switch (fp->compress_algorithm)
+ switch (compress_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ret = fclose(fp->fp);
- fp->fp = NULL;
-
+ InitCompressNone(CFH);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose(fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("not built with zlib support");
-#endif
+ InitCompressGzip(CFH, compress_spec.level);
break;
default:
pg_fatal("invalid compression method");
break;
}
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
-int
-cfeof(cfp *fp)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * 'path' doesn't already have it) and try again. So if you pass "foo" as
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ *
+ * On failure, return NULL with an error code in errno.
+ *
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- int ret;
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compress_spec = {0};
- switch (fp->compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof(fp->fp);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof(fp->fp);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- return ret;
-}
+	fname = pg_strdup(path);
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compress_algorithm == PG_COMPRESSION_GZIP)
+ if (hasSuffix(fname, ".gz"))
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
+ bool exists;
+
+ exists = (stat(path, &st) == 0);
+		/* avoid unused-variable warning when not built with compression */
+ if (exists)
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("not built with zlib support");
+ if (exists)
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
}
- return strerror(errno);
+ CFH = InitCompressFileHandle(compress_spec);
+ if (CFH->open(fname, -1, mode, CFH))
+ {
+ free_keep_errno(CFH);
+ CFH = NULL;
+ }
+ free_keep_errno(fname);
+
+ return CFH;
}
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
+int
+DestroyCompressFileHandle(CompressFileHandle * CFH)
{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
+ int ret = 0;
- if (filenamelen < suffixlen)
- return 0;
+ if (CFH->private)
+ ret = CFH->close(CFH);
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
+ free_keep_errno(CFH);
-#endif
+ return ret;
+}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index d6335fff02..a986f5e6ee 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,61 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ void *private;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compress_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compress_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open) (const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH);
+ int (*open_write) (const char *path, const char *mode,
+ CompressFileHandle * cxt);
+ size_t (*read) (void *ptr, size_t size, CompressFileHandle * CFH);
+ size_t (*write) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets) (char *s, int size, CompressFileHandle * CFH);
+ int (*getc) (CompressFileHandle * CFH);
+ int (*eof) (CompressFileHandle * CFH);
+ int (*close) (CompressFileHandle * CFH);
+ const char *(*get_error) (CompressFileHandle * CFH);
+
+ void *private;
+};
+
-typedef struct cfp cfp;
+extern CompressFileHandle * InitCompressFileHandle(const pg_compress_specification compress_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compress_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- pg_compress_specification compress_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compress_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle * InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle * CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 6cff2a6c3d..041e08cd5e 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,5 +1,6 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 304cc072ca..09e20fb97b 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compress_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle * SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1126,7 +1126,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compress_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1502,6 +1502,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compress_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1525,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compress_spec);
- else
- AH->OF = cfopen(filename, mode, compress_spec);
+ CFH = InitCompressFileHandle(compress_spec);
- if (!AH->OF)
+ if (CFH->open(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1689,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2035,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2077,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2180,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2236,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,7 +3652,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- WriteInt(AH, AH->compress_spec.level);
+ AH->WriteBytePtr(AH, AH->compress_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3718,7 +3724,9 @@ ReadHead(ArchiveHandle *AH)
AH->format, fmt);
AH->compress_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compress_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
if (AH->version < K_VERS_1_4)
AH->compress_spec.level = AH->ReadBytePtr(AH);
@@ -3731,11 +3739,20 @@ ReadHead(ArchiveHandle *AH)
else
AH->compress_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
-
+ if (unsupported)
+ {
+ pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+ AH->compress_spec.algorithm = PG_COMPRESSION_NONE;
+ }
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index d2930949ab..bb7fad2af1 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,12 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 6a2112c45f..49ec0e3816 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compress_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compress_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 7d2cddbb2c..e1ce2f393b 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,9 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
+ CompressFileHandle *dataFH; /* currently open data file */
- cfp *blobsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *blobsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +198,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +218,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +327,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compress_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compress_spec);
+
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +346,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -370,7 +371,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +386,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +435,7 @@ _LoadBlobs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +443,14 @@ _LoadBlobs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->blobsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->blobsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->blobsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the blobs TOC file line-by-line, and process each blob */
- while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets(line, MAXPGPATH, CFH)) != NULL)
{
char blobfname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +465,11 @@ _LoadBlobs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreBlob(AH, oid);
}
- if (!cfeof(ctx->blobsTocFH))
+ if (!CFH->eof(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->blobsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->blobsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +489,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
return 1;
@@ -512,8 +514,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc(CFH);
}
/*
@@ -524,15 +527,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -545,12 +549,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +578,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compress_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +589,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compress_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compress_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compress_spec);
+ if (tocFH->open_write(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +603,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +654,8 @@ _StartBlobs(ArchiveHandle *AH, TocEntry *te)
/* The blob TOC file is never compressed */
compress_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->blobsTocFH = cfopen_write(fname, "ab", compress_spec);
- if (ctx->blobsTocFH == NULL)
+ ctx->blobsTocFH = InitCompressFileHandle(compress_spec);
+ if (ctx->blobsTocFH->open_write(fname, "ab", ctx->blobsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +672,8 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compress_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compress_spec);
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,17 +686,18 @@ static void
_EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->blobsTocFH;
char buf[50];
int len;
/* Close the BLOB data file itself */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the blob in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->blobsTocFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
pg_fatal("could not write to blobs TOC file");
}
@@ -706,7 +711,7 @@ _EndBlobs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->blobsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->blobsTocFH) != 0)
pg_fatal("could not close blobs TOC file: %m");
ctx->blobsTocFH = NULL;
}
--
2.34.1
Attachment: v8-0005-Add-LZ4-compression-in-pg_-dump-restore.patch (text/x-patch)
From efd5acd0dc4dedab8fe823a9223404a43ba278d0 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 23 Nov 2022 16:59:02 +0000
Subject: [PATCH v8 5/5] Add LZ4 compression in pg_{dump|restore}
Within compress_lz4.{c,h}, both a streaming API and a file API for compression
are implemented. The first is aimed at inlined use cases, so the simple lz4.h
calls can be used directly. The second generates output, or parses input, that
can be read or generated with the lz4 utility.
Wherever the LZ4F API does not provide functionality corresponding to fread(),
fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been implemented
locally.
---
doc/src/sgml/ref/pg_dump.sgml | 23 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 41 +-
src/bin/pg_dump/compress_lz4.c | 593 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 9 +
src/bin/pg_dump/meson.build | 4 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/001_basic.pl | 2 +-
src/bin/pg_dump/t/002_pg_dump.pl | 69 +++-
10 files changed, 732 insertions(+), 30 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 3fb8fdce81..84d3778c99 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -328,9 +328,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tools.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -652,12 +653,12 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression. A compression level can
- be optionally specified, by appending the level number after a colon
- (<literal>:</literal>). If no level is specified, the default compression
- level will be used for the specified method. If only a level is
- specified without mentioning a method, <literal>gzip</literal> compression
- will be used.
+ <literal>lz4</literal> or <literal>none</literal> for no compression. A
+ compression level can be optionally specified, by appending the level
+ number after a colon (<literal>:</literal>). If no level is specified,
+ the default compression level will be used for the specified method. If
+ only a level is specified without mentioning a method,
+ <literal>gzip</literal> compression will be used.
</para>
<para>
@@ -665,8 +666,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 29eab02d37..28c1fc27cc 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 3065bd76fa..29e2352c31 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -38,13 +38,15 @@
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs and custom LZ4 calls which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with the
+ * given name, and if that fails, it tries the same file with the .gz suffix;
+ * failing that, it tries the same file with the .lz4 suffix.
+ * cfopen_write() opens a file for writing, an extra argument specifies the
+ * method to use should the file be compressed, and adds the appropriate
+ * suffix, .gz or .lz4, to the filename if so. This allows you to easily handle
+ * both compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -57,6 +59,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -129,6 +132,9 @@ AllocateCompressor(const pg_compress_specification compress_spec,
case PG_COMPRESSION_GZIP:
InitCompressorGzip(cs, compress_spec.level);
break;
+ case PG_COMPRESSION_LZ4:
+ InitCompressorLZ4(cs, compress_spec.level);
+ break;
default:
pg_fatal("invalid compression method");
break;
@@ -179,6 +185,7 @@ free_keep_errno(void *p)
/*
* Compression None implementation
*/
+
static size_t
_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
@@ -314,6 +321,9 @@ InitCompressFileHandle(const pg_compress_specification compress_spec)
case PG_COMPRESSION_GZIP:
InitCompressGzip(CFH, compress_spec.level);
break;
+ case PG_COMPRESSION_LZ4:
+ InitCompressLZ4(CFH, compress_spec.level);
+ break;
default:
pg_fatal("invalid compression method");
break;
@@ -326,12 +336,12 @@ InitCompressFileHandle(const pg_compress_specification compress_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
+ * order.
*
* On failure, return NULL with an error code in errno.
- *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
@@ -367,6 +377,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compress_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compress_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..6f4680c344
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,593 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private;
+ int compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ /* Will be lazy init'd */
+ cs->private = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). End of file is
+ * reached when there is no decompressed output left in the
+ * overflow buffer and the underlying file has reached EOF.
+ */
+static int
+LZ4File_eof(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * first newline char when the eol_flag is set. It is possible that the
+ * decompressed output generated by reading compressed input via the LZ4F
+ * API exceeds 'ptrsize'. Any exceeding decompressed content is stored in
+ * an overflow buffer within LZ4File. When the function is called, it will
+ * first consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ LZ4File *lz4fp;
+
+ CFH->open = LZ4File_open;
+ CFH->open_write = LZ4File_open_write;
+ CFH->read = LZ4File_read;
+ CFH->write = LZ4File_write;
+ CFH->gets = LZ4File_gets;
+ CFH->getc = LZ4File_getc;
+ CFH->eof = LZ4File_eof;
+ CFH->close = LZ4File_close;
+ CFH->get_error = LZ4File_get_error;
+
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (compressionLevel >= 0)
+ lz4fp->prefs.compressionLevel = compressionLevel;
+
+ CFH->private = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..fbec9a508d
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, int compressionLevel);
+extern void InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 041e08cd5e..f9065a3e84 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,6 +1,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -15,7 +16,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -91,6 +92,7 @@ tests += {
],
'env': {
'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
},
},
}
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 09e20fb97b..c9b053d572 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -394,6 +394,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2073,7 +2077,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2083,6 +2087,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3746,6 +3754,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a97a0f3a84..93c69834f0 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1277,7 +1277,7 @@ parse_compression(const char *opt, pg_compress_specification *compress_spec)
parse_compress_user_options(opt, &algorithm_str, &level_str);
if (!parse_compress_algorithm(algorithm_str, &(compress_spec->algorithm)))
{
- pg_log_error("invalid compression method: \"%s\" (gzip, none)",
+ pg_log_error("invalid compression method: \"%s\" (gzip, lz4, none)",
algorithm_str);
return false;
}
@@ -1297,7 +1297,8 @@ parse_compression(const char *opt, pg_compress_specification *compress_spec)
/* Also switch off unimplemented compressions */
if (compress_spec->algorithm != PG_COMPRESSION_NONE &&
- compress_spec->algorithm != PG_COMPRESSION_GZIP)
+ compress_spec->algorithm != PG_COMPRESSION_GZIP &&
+ compress_spec->algorithm != PG_COMPRESSION_LZ4)
supports_compression = false;
if (!supports_compression)
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index fffb9e075b..da266c3013 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -122,7 +122,7 @@ command_fails_like(
command_fails_like(
[ 'pg_dump', '--compress', 'garbage' ],
- qr/\Qpg_dump: error: invalid compression method: "garbage" (gzip, none)\E/,
+ qr/\Qpg_dump: error: invalid compression method: "garbage" (gzip, lz4, none)\E/,
'pg_dump: invalid --compress');
command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index e97d086956..88f0e83b43 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -116,6 +116,67 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=1', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4127,11 +4188,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
--
2.34.1
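The trickiest part of the patch above is the overflow buffer that LZ4File_read_internal() uses: one LZ4F_decompress() call can emit more bytes than the caller asked for, so the surplus must be kept and served first on the next read. A minimal standalone sketch of that buffering technique (not the patch's actual code; the fixed capacity and names are illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Minimal sketch of the overflow-buffer technique used by the patch's
 * LZ4File_read_overflow(): decompressed bytes that did not fit in the
 * caller's buffer are kept and served first on the next read.
 */
#define OVERFLOW_CAP 64

typedef struct OverflowBuf
{
	char		buf[OVERFLOW_CAP];
	size_t		len;
} OverflowBuf;

/* Stash surplus decompressed output (caller ensures it fits). */
static void
stash_overflow(OverflowBuf *ob, const char *data, size_t n)
{
	memcpy(ob->buf + ob->len, data, n);
	ob->len += n;
}

/*
 * Serve up to 'size' bytes from the overflow buffer and move any
 * remainder to the front, mirroring the patch's logic.
 */
static size_t
read_overflow(OverflowBuf *ob, char *ptr, size_t size)
{
	size_t		readlen = (ob->len < size) ? ob->len : size;

	memcpy(ptr, ob->buf, readlen);
	ob->len -= readlen;
	if (ob->len > 0)
		memmove(ob->buf, ob->buf + readlen, ob->len);
	return readlen;
}
```

A caller asking for 4 bytes twice after 6 bytes were stashed receives "abcd" and then "ef" — the contract LZ4File_read_internal() relies on before touching the compressed stream again.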
On Mon, Nov 28, 2022 at 04:32:43PM +0000, gkokolatos@pm.me wrote:
The focus of this version of this series is 0001 and 0002.
Admittedly 0001 could be presented in a separate thread though given its size and
proximity to the topic, I present it here.
I don't mind. This was a hole in meson.build, so nice catch! I have
noticed a second defect with pg_verifybackup for all the commands, and
applied both at the same time.
In an earlier review you spotted the similarity between pg_dump's and pg_receivewal's
parsing of compression options. However there exists a substantial difference in the
behaviour of the two programs; one treats the lack of support for the requested
algorithm as a fatal error, whereas the other does not. The existing functions in
common/compression.c do not account for the latter. 0002 proposes an implementation
for this. Its usefulness is shown in 0003.
In what does it matter? The logic in compression.c provides an error
when looking at a spec or validating it, but the caller is free to
consume it as it wants because this is shared between the frontend and
the backend, and that includes consuming it as a warning rather than a
hard failure. If we don't want to issue an error and force
non-compression if attempting to use a compression method not
supported in pg_dump, that's fine by me as a historical behavior, but
I don't see why these routines have any need to be split more as
proposed in 0002.
Saying that, I do agree that it would be nice to remove the
duplication between the option parsing of pg_basebackup and
pg_receivewal. Your patch is very close to that, actually, and it
occured to me that if we move the check on "server-" and "client-" in
pg_basebackup to be just before the integer-only check then we can
consolidate the whole thing.
Attached is an alternative that does not sacrifice the pluggability of
the existing routines while allowing 0003~ to still use them (I don't
really want to move around the checks on the supported build options
now in parse_compress_specification(), that was hard enough to settle
on this location). On top of that, pg_basebackup is able to cope with
the case of --compress=0 already, enforcing "none" (BaseBackup could
be simplified a bit more before StartLogStreamer). This refactoring
shaves a little bit of code.
Please consider 0003-0005 as work in progress. They are differences from v7 yet they
may contain unaddressed comments for now.
Okay.
--
Michael
Attachments:
v9-0001-Make-the-pg_receivewal-compression-parsing-functi.patch (text/x-diff; charset=us-ascii)
From 6fb2aa609348ad7df6f9c12da60c07aa96243965 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 29 Nov 2022 15:17:27 +0900
Subject: [PATCH v9] Make the pg_receivewal compression parsing function common
Also, relax parsing errors in the helper functions and re-introduce those as
an independent function.
As it is shown in the rest of the patch series, there is a lot of duplication
between pg_dump's parsing of compression options and pg_receivewal's. Now the
core work is done in common. However pg_dump would not error out if the
requested compression algorithm is not supported by the build, whereas other
callers will error out. Also it seems a bit weird for only one of the parsing
functions for compression to error out on missing support and that one to not
be the one responsible for identifying the compression algorithm.
A new function is added to test the support of the algorithm allowing the user
to tune the behaviour.
---
src/include/common/compression.h | 2 +
src/common/compression.c | 63 +++++++++++++++++++++++++++
src/bin/pg_basebackup/pg_basebackup.c | 49 ++++-----------------
src/bin/pg_basebackup/pg_receivewal.c | 61 --------------------------
4 files changed, 73 insertions(+), 102 deletions(-)
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 5d680058ed..46855b1a3b 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -33,6 +33,8 @@ typedef struct pg_compress_specification
char *parse_error; /* NULL if parsing was OK, else message */
} pg_compress_specification;
+extern void parse_compress_options(const char *option, char **algorithm,
+ char **detail);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
diff --git a/src/common/compression.c b/src/common/compression.c
index df5b627834..5274ba5ba8 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -356,3 +356,66 @@ validate_compress_specification(pg_compress_specification *spec)
return NULL;
}
+
+#ifdef FRONTEND
+
+/*
+ * Basic parsing of a value specified through a command-line option, commonly
+ * -Z/--compress.
+ *
+ * The parsing consists of a METHOD:DETAIL string fed later to
+ * parse_compress_specification(). This only extracts METHOD and DETAIL.
+ * If only an integer is found, the method is implied by the value specified.
+ */
+void
+parse_compress_options(const char *option, char **algorithm, char **detail)
+{
+ char *sep;
+ char *endp;
+ long result;
+
+ /*
+ * Check whether the compression specification consists of a bare integer.
+ *
+ * For backward-compatibility, assume "none" if the integer found is zero
+ * and "gzip" otherwise.
+ */
+ result = strtol(option, &endp, 10);
+ if (*endp == '\0')
+ {
+ if (result == 0)
+ {
+ *algorithm = pstrdup("none");
+ *detail = NULL;
+ }
+ else
+ {
+ *algorithm = pstrdup("gzip");
+ *detail = pstrdup(option);
+ }
+ return;
+ }
+
+ /*
+ * Check whether there is a compression detail following the algorithm
+ * name.
+ */
+ sep = strchr(option, ':');
+ if (sep == NULL)
+ {
+ *algorithm = pstrdup(option);
+ *detail = NULL;
+ }
+ else
+ {
+ char *alg;
+
+ alg = palloc((sep - option) + 1);
+ memcpy(alg, option, sep - option);
+ alg[sep - option] = '\0';
+
+ *algorithm = alg;
+ *detail = pstrdup(sep + 1);
+ }
+}
+#endif /* FRONTEND */
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 22836ca01a..4f56c9f464 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -956,27 +956,13 @@ parse_max_rate(char *src)
* at a later stage.
*/
static void
-parse_compress_options(char *option, char **algorithm, char **detail,
- CompressionLocation *locationres)
+backup_parse_compress_options(char *option, char **algorithm, char **detail,
+ CompressionLocation *locationres)
{
- char *sep;
- char *endp;
-
/*
- * Check whether the compression specification consists of a bare integer.
- *
- * If so, for backward compatibility, assume gzip.
+ * Strip off any "client-" or "server-" prefix, calculating the
+ * location.
*/
- (void) strtol(option, &endp, 10);
- if (*endp == '\0')
- {
- *locationres = COMPRESS_LOCATION_UNSPECIFIED;
- *algorithm = pstrdup("gzip");
- *detail = pstrdup(option);
- return;
- }
-
- /* Strip off any "client-" or "server-" prefix. */
if (strncmp(option, "server-", 7) == 0)
{
*locationres = COMPRESS_LOCATION_SERVER;
@@ -990,27 +976,8 @@ parse_compress_options(char *option, char **algorithm, char **detail,
else
*locationres = COMPRESS_LOCATION_UNSPECIFIED;
- /*
- * Check whether there is a compression detail following the algorithm
- * name.
- */
- sep = strchr(option, ':');
- if (sep == NULL)
- {
- *algorithm = pstrdup(option);
- *detail = NULL;
- }
- else
- {
- char *alg;
-
- alg = palloc((sep - option) + 1);
- memcpy(alg, option, sep - option);
- alg[sep - option] = '\0';
-
- *algorithm = alg;
- *detail = pstrdup(sep + 1);
- }
+ /* fallback to the common parsing for the algorithm and detail */
+ parse_compress_options(option, algorithm, detail);
}
/*
@@ -2411,8 +2378,8 @@ main(int argc, char **argv)
compressloc = COMPRESS_LOCATION_UNSPECIFIED;
break;
case 'Z':
- parse_compress_options(optarg, &compression_algorithm,
- &compression_detail, &compressloc);
+ backup_parse_compress_options(optarg, &compression_algorithm,
+ &compression_detail, &compressloc);
break;
case 'c':
if (pg_strcasecmp(optarg, "fast") == 0)
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 63207ca025..c7a46b8a2a 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -57,8 +57,6 @@ static XLogRecPtr endpos = InvalidXLogRecPtr;
static void usage(void);
-static void parse_compress_options(char *option, char **algorithm,
- char **detail);
static DIR *get_destination_dir(char *dest_folder);
static void close_destination_dir(DIR *dest_dir, char *dest_folder);
static XLogRecPtr FindStreamingStart(uint32 *tli);
@@ -109,65 +107,6 @@ usage(void)
printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL);
}
-/*
- * Basic parsing of a value specified for -Z/--compress
- *
- * The parsing consists of a METHOD:DETAIL string fed later on to a more
- * advanced routine in charge of proper validation checks. This only extracts
- * METHOD and DETAIL. If only an integer is found, the method is implied by
- * the value specified.
- */
-static void
-parse_compress_options(char *option, char **algorithm, char **detail)
-{
- char *sep;
- char *endp;
- long result;
-
- /*
- * Check whether the compression specification consists of a bare integer.
- *
- * For backward-compatibility, assume "none" if the integer found is zero
- * and "gzip" otherwise.
- */
- result = strtol(option, &endp, 10);
- if (*endp == '\0')
- {
- if (result == 0)
- {
- *algorithm = pstrdup("none");
- *detail = NULL;
- }
- else
- {
- *algorithm = pstrdup("gzip");
- *detail = pstrdup(option);
- }
- return;
- }
-
- /*
- * Check whether there is a compression detail following the algorithm
- * name.
- */
- sep = strchr(option, ':');
- if (sep == NULL)
- {
- *algorithm = pstrdup(option);
- *detail = NULL;
- }
- else
- {
- char *alg;
-
- alg = palloc((sep - option) + 1);
- memcpy(alg, option, sep - option);
- alg[sep - option] = '\0';
-
- *algorithm = alg;
- *detail = pstrdup(sep + 1);
- }
-}
/*
* Check if the filename looks like a WAL file, letting caller know if this
--
2.38.1
On Tue, Nov 29, 2022 at 03:19:17PM +0900, Michael Paquier wrote:
Attached is an alternative that does not sacrifice the pluggability of
the existing routines while allowing 0003~ to still use them (I don't
really want to move around the checks on the supported build options
now in parse_compress_specification(), that was hard enough to settle
on this location). On top of that, pg_basebackup is able to cope with
the case of --compress=0 already, enforcing "none" (BaseBackup could
be simplified a bit more before StartLogStreamer). This refactoring
shaves a little bit of code.
One thing that I forgot to mention is that this refactoring would
treat things like server-N, client-N as valid grammars (in this case N
takes precedence over an optional detail string), implying that N = 0
is "none" and N > 0 is gzip, so that makes for an extra grammar flavor
without impacting the existing ones. I am not sure that it is worth
documenting, still worth mentioning.
--
Michael
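For readers following along, the consolidated grammar described above — a bare integer mapping 0 to "none" and N > 0 to "gzip", otherwise a METHOD:DETAIL split at the colon — can be sketched as a small standalone function. This is illustrative only; the real implementation is parse_compress_options() in common/compression.c, and the "server-"/"client-" prefixes are stripped by pg_basebackup before this logic runs:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative sketch of the common -Z/--compress grammar: a bare
 * integer keeps backward compatibility (0 -> "none", N > 0 -> "gzip");
 * anything else is split at ':' into METHOD and DETAIL.
 */
static void
split_compress_option(const char *option, char algorithm[32], char detail[32])
{
	char	   *endp;
	long		result;
	const char *sep;

	detail[0] = '\0';

	/* Check whether the specification is a bare integer. */
	result = strtol(option, &endp, 10);
	if (*endp == '\0')
	{
		if (result == 0)
			snprintf(algorithm, 32, "none");
		else
		{
			snprintf(algorithm, 32, "gzip");
			snprintf(detail, 32, "%s", option);
		}
		return;
	}

	/* Otherwise split METHOD[:DETAIL] at the first colon. */
	sep = strchr(option, ':');
	if (sep == NULL)
		snprintf(algorithm, 32, "%s", option);
	else
	{
		snprintf(algorithm, 32, "%.*s", (int) (sep - option), option);
		snprintf(detail, 32, "%s", sep + 1);
	}
}
```

With the prefix stripped first, "server-5" reduces to the bare integer "5" here, which is the extra grammar flavor noted above.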
------- Original Message -------
On Tuesday, November 29th, 2022 at 7:19 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Nov 28, 2022 at 04:32:43PM +0000, gkokolatos@pm.me wrote:
The focus of this version of this series is 0001 and 0002.
Admittedly 0001 could be presented in a separate thread though given its size and
proximity to the topic, I present it here.
I don't mind. This was a hole in meson.build, so nice catch! I have
noticed a second defect with pg_verifybackup for all the commands, and
applied both at the same time.
Thank you.
In an earlier review you spotted the similarity between pg_dump's and pg_receivewal's
parsing of compression options. However there exists a substantial difference in the
behaviour of the two programs; one treats the lack of support for the requested
algorithm as a fatal error, whereas the other does not. The existing functions in
common/compression.c do not account for the latter. 0002 proposes an implementation
for this. Its usefulness is shown in 0003.
In what does it matter? The logic in compression.c provides an error
when looking at a spec or validating it, but the caller is free to
consume it as it wants because this is shared between the frontend and
the backend, and that includes consuming it as a warning rather than a
ahrd failure. If we don't want to issue an error and force
non-compression if attempting to use a compression method not
supported in pg_dump, that's fine by me as a historical behavior, but
I don't see why these routines have any need to be split more as
proposed in 0002.
I understand. The reason for the change in the routines was that it was
impossible to distinguish a genuine parse error from a missing library in
parse_compress_specification(). If the zlib library is missing, then both
'--compress=gzip:garbage' and '--compress=gzip:7' would populate the
parse_error member of the struct and subsequent calls to
validate_compress_specification() would error out, although only one of
the two options is truly an error. Historically the code would fail on
invalid input regardless of whether the library was present or not.
Saying that, I do agree that it would be nice to remove the
duplication between the option parsing of pg_basebackup and
pg_receivewal. Your patch is very close to that, actually, and it
occured to me that if we move the check on "server-" and "client-" in
pg_basebackup to be just before the integer-only check then we can
consolidate the whole thing.
Great. I did notice the possible benefit but chose to not tread too far
off the necessary in my patch.
Attached is an alternative that does not sacrifice the pluggability of
the existing routines while allowing 0003~ to still use them (I don't
really want to move around the checks on the supported build options
now in parse_compress_specification(), that was hard enough to settle
on this location).
Yeah, I thought that it would be a hard sell, hence an "earlier"
version.
The attached version 10 contains your proposed v9 verbatim as 0001.
Then 0002 switches the parsing order a bit in pg_dump so that it will not
fail as described above on missing libraries. Now, it will first parse
the algorithm, discard it when unsupported, and only parse the rest of
the option if the algorithm is supported. Granted it is a bit 'uglier'
with the preprocessing blocks, yet it maintains most of the historic
behaviour without altering the common compression interfaces. Now, as
shown in 001_basic.pl, an invalid detail will fail only if the algorithm
is supported.
On top of that, pg_basebackup is able to cope with
the case of --compress=0 already, enforcing "none" (BaseBackup could
be simplified a bit more before StartLogStreamer). This refactoring
shaves a little bit of code.
Please consider 0003-0005 as work in progress. They are diffs from v7 and
may still contain unaddressed comments for now.
Okay.
Thank you. Please advise whether it is preferable to split 0002 in two parts.
I think not but I will happily do so if you think otherwise.
Cheers,
//Georgios
--
Michael
Attachments:
v10-0001-Make-the-pg_receivewal-compression-parsing-funct.patch
From f9738acf3f6673447e11999ab02de703e2d01951 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 29 Nov 2022 11:22:38 +0000
Subject: [PATCH v10 1/4] Make the pg_receivewal compression parsing function
common
Also, relax parsing errors in the helper functions and re-introduce those as
an independent function.
As shown in the rest of the patch series, there is a lot of duplication
between pg_dump's parsing of compression options and pg_receivewal's. Now the
core work is done in common. However, pg_dump would not error out if the
requested compression algorithm is not supported by the build, whereas other
callers will error out. Also, it seems a bit odd that only one of the
compression parsing functions errors out on missing support, and that it
is not the one responsible for identifying the compression algorithm.
A new function is added to test the support of the algorithm allowing the user
to tune the behaviour.
Authored by Michael Paquier
---
src/bin/pg_basebackup/pg_basebackup.c | 49 ++++-----------------
src/bin/pg_basebackup/pg_receivewal.c | 61 --------------------------
src/common/compression.c | 63 +++++++++++++++++++++++++++
src/include/common/compression.h | 2 +
4 files changed, 73 insertions(+), 102 deletions(-)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 22836ca01a..4f56c9f464 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -956,27 +956,13 @@ parse_max_rate(char *src)
* at a later stage.
*/
static void
-parse_compress_options(char *option, char **algorithm, char **detail,
- CompressionLocation *locationres)
+backup_parse_compress_options(char *option, char **algorithm, char **detail,
+ CompressionLocation *locationres)
{
- char *sep;
- char *endp;
-
/*
- * Check whether the compression specification consists of a bare integer.
- *
- * If so, for backward compatibility, assume gzip.
+ * Strip off any "client-" or "server-" prefix, calculating the
+ * location.
*/
- (void) strtol(option, &endp, 10);
- if (*endp == '\0')
- {
- *locationres = COMPRESS_LOCATION_UNSPECIFIED;
- *algorithm = pstrdup("gzip");
- *detail = pstrdup(option);
- return;
- }
-
- /* Strip off any "client-" or "server-" prefix. */
if (strncmp(option, "server-", 7) == 0)
{
*locationres = COMPRESS_LOCATION_SERVER;
@@ -990,27 +976,8 @@ parse_compress_options(char *option, char **algorithm, char **detail,
else
*locationres = COMPRESS_LOCATION_UNSPECIFIED;
- /*
- * Check whether there is a compression detail following the algorithm
- * name.
- */
- sep = strchr(option, ':');
- if (sep == NULL)
- {
- *algorithm = pstrdup(option);
- *detail = NULL;
- }
- else
- {
- char *alg;
-
- alg = palloc((sep - option) + 1);
- memcpy(alg, option, sep - option);
- alg[sep - option] = '\0';
-
- *algorithm = alg;
- *detail = pstrdup(sep + 1);
- }
+ /* Fall back to the common parsing for the algorithm and detail */
+ parse_compress_options(option, algorithm, detail);
}
/*
@@ -2411,8 +2378,8 @@ main(int argc, char **argv)
compressloc = COMPRESS_LOCATION_UNSPECIFIED;
break;
case 'Z':
- parse_compress_options(optarg, &compression_algorithm,
- &compression_detail, &compressloc);
+ backup_parse_compress_options(optarg, &compression_algorithm,
+ &compression_detail, &compressloc);
break;
case 'c':
if (pg_strcasecmp(optarg, "fast") == 0)
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 63207ca025..c7a46b8a2a 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -57,8 +57,6 @@ static XLogRecPtr endpos = InvalidXLogRecPtr;
static void usage(void);
-static void parse_compress_options(char *option, char **algorithm,
- char **detail);
static DIR *get_destination_dir(char *dest_folder);
static void close_destination_dir(DIR *dest_dir, char *dest_folder);
static XLogRecPtr FindStreamingStart(uint32 *tli);
@@ -109,65 +107,6 @@ usage(void)
printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL);
}
-/*
- * Basic parsing of a value specified for -Z/--compress
- *
- * The parsing consists of a METHOD:DETAIL string fed later on to a more
- * advanced routine in charge of proper validation checks. This only extracts
- * METHOD and DETAIL. If only an integer is found, the method is implied by
- * the value specified.
- */
-static void
-parse_compress_options(char *option, char **algorithm, char **detail)
-{
- char *sep;
- char *endp;
- long result;
-
- /*
- * Check whether the compression specification consists of a bare integer.
- *
- * For backward-compatibility, assume "none" if the integer found is zero
- * and "gzip" otherwise.
- */
- result = strtol(option, &endp, 10);
- if (*endp == '\0')
- {
- if (result == 0)
- {
- *algorithm = pstrdup("none");
- *detail = NULL;
- }
- else
- {
- *algorithm = pstrdup("gzip");
- *detail = pstrdup(option);
- }
- return;
- }
-
- /*
- * Check whether there is a compression detail following the algorithm
- * name.
- */
- sep = strchr(option, ':');
- if (sep == NULL)
- {
- *algorithm = pstrdup(option);
- *detail = NULL;
- }
- else
- {
- char *alg;
-
- alg = palloc((sep - option) + 1);
- memcpy(alg, option, sep - option);
- alg[sep - option] = '\0';
-
- *algorithm = alg;
- *detail = pstrdup(sep + 1);
- }
-}
/*
* Check if the filename looks like a WAL file, letting caller know if this
diff --git a/src/common/compression.c b/src/common/compression.c
index df5b627834..5274ba5ba8 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -356,3 +356,66 @@ validate_compress_specification(pg_compress_specification *spec)
return NULL;
}
+
+#ifdef FRONTEND
+
+/*
+ * Basic parsing of a value specified through a command-line option, commonly
+ * -Z/--compress.
+ *
+ * The parsing consists of a METHOD:DETAIL string fed later to
+ * parse_compress_specification(). This only extracts METHOD and DETAIL.
+ * If only an integer is found, the method is implied by the value specified.
+ */
+void
+parse_compress_options(const char *option, char **algorithm, char **detail)
+{
+ char *sep;
+ char *endp;
+ long result;
+
+ /*
+ * Check whether the compression specification consists of a bare integer.
+ *
+ * For backward-compatibility, assume "none" if the integer found is zero
+ * and "gzip" otherwise.
+ */
+ result = strtol(option, &endp, 10);
+ if (*endp == '\0')
+ {
+ if (result == 0)
+ {
+ *algorithm = pstrdup("none");
+ *detail = NULL;
+ }
+ else
+ {
+ *algorithm = pstrdup("gzip");
+ *detail = pstrdup(option);
+ }
+ return;
+ }
+
+ /*
+ * Check whether there is a compression detail following the algorithm
+ * name.
+ */
+ sep = strchr(option, ':');
+ if (sep == NULL)
+ {
+ *algorithm = pstrdup(option);
+ *detail = NULL;
+ }
+ else
+ {
+ char *alg;
+
+ alg = palloc((sep - option) + 1);
+ memcpy(alg, option, sep - option);
+ alg[sep - option] = '\0';
+
+ *algorithm = alg;
+ *detail = pstrdup(sep + 1);
+ }
+}
+#endif /* FRONTEND */
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 5d680058ed..46855b1a3b 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -33,6 +33,8 @@ typedef struct pg_compress_specification
char *parse_error; /* NULL if parsing was OK, else message */
} pg_compress_specification;
+extern void parse_compress_options(const char *option, char **algorithm,
+ char **detail);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.34.1
v10-0004-Add-LZ4-compression-in-pg_-dump-restore.patch
From 8437ce812c45cc8477445e5c178e6c69fd45666d Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 29 Nov 2022 11:23:00 +0000
Subject: [PATCH v10 4/4] Add LZ4 compression in pg_{dump|restore}
Within compress_lz4.{c,h}, both a streaming API and a file API for compression
are implemented. The first is aimed at inlined use cases, so simple lz4.h
calls can be used directly. The second generates output, or parses input,
which can be read/generated via the lz4 utility.
Wherever the LZ4F API does not implement all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 23 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 41 +-
src/bin/pg_dump/compress_lz4.c | 593 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 9 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 12 +-
src/bin/pg_dump/t/001_basic.pl | 2 +-
src/bin/pg_dump/t/002_pg_dump.pl | 69 +++-
10 files changed, 742 insertions(+), 31 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 3fb8fdce81..84d3778c99 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -328,9 +328,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -652,12 +653,12 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression. A compression level can
- be optionally specified, by appending the level number after a colon
- (<literal>:</literal>). If no level is specified, the default compression
- level will be used for the specified method. If only a level is
- specified without mentioning a method, <literal>gzip</literal> compression
- will be used.
+ <literal>lz4</literal> or <literal>none</literal> for no compression. A
+ compression level can be optionally specified, by appending the level
+ number after a colon (<literal>:</literal>). If no level is specified,
+ the default compression level will be used for the specified method. If
+ only a level is specified without mentioning a method,
+ <literal>gzip</literal> compression will be used.
</para>
<para>
@@ -665,8 +666,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 29eab02d37..28c1fc27cc 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 3065bd76fa..29e2352c31 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -38,13 +38,15 @@
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs and custom LZ4 calls which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with given
+ * name, and if it fails, it tries to open the same file with the .gz suffix,
+ * failing that it tries to open the same file with the .lz4 suffix.
+ * cfopen_write() opens a file for writing, an extra argument specifies the
+ * method to use should the file be compressed, and adds the appropriate
+ * suffix, .gz or .lz4, to the filename if so. This allows you to easily handle
+ * both compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -57,6 +59,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -129,6 +132,9 @@ AllocateCompressor(const pg_compress_specification compress_spec,
case PG_COMPRESSION_GZIP:
InitCompressorGzip(cs, compress_spec.level);
break;
+ case PG_COMPRESSION_LZ4:
+ InitCompressorLZ4(cs, compress_spec.level);
+ break;
default:
pg_fatal("invalid compression method");
break;
@@ -179,6 +185,7 @@ free_keep_errno(void *p)
/*
* Compression None implementation
*/
+
static size_t
_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
@@ -314,6 +321,9 @@ InitCompressFileHandle(const pg_compress_specification compress_spec)
case PG_COMPRESSION_GZIP:
InitCompressGzip(CFH, compress_spec.level);
break;
+ case PG_COMPRESSION_LZ4:
+ InitCompressLZ4(CFH, compress_spec.level);
+ break;
default:
pg_fatal("invalid compression method");
break;
@@ -326,12 +336,12 @@ InitCompressFileHandle(const pg_compress_specification compress_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
+ * order.
*
* On failure, return NULL with an error code in errno.
- *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
@@ -367,6 +377,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compress_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compress_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..6f4680c344
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,593 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ /* Will be lazy init'd */
+ cs->private = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the new
+ * line char if found first when the eol_flag is set. It is possible that the
+ * decompressed output generated by reading any compressed input via the LZ4F
+ * API exceeds 'ptrsize'. Any exceeding decompressed content is stored in an
+ * overflow buffer within LZ4File. Of course, when the function is called, it will first
+ * try to consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * Fill in whatever space is available in ptr. If the eol flag
+ * is set, skip if a newline was already found, or fill only up
+ * to the newline if one is present in the outbuf.
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ LZ4File *lz4fp;
+
+ CFH->open = LZ4File_open;
+ CFH->open_write = LZ4File_open_write;
+ CFH->read = LZ4File_read;
+ CFH->write = LZ4File_write;
+ CFH->gets = LZ4File_gets;
+ CFH->getc = LZ4File_getc;
+ CFH->eof = LZ4File_eof;
+ CFH->close = LZ4File_close;
+ CFH->get_error = LZ4File_get_error;
+
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (compressionLevel >= 0)
+ lz4fp->prefs.compressionLevel = compressionLevel;
+
+ CFH->private = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..fbec9a508d
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, int compressionLevel);
+extern void InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 0c73a4707e..b27e92ffd0 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,6 +1,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -15,7 +16,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -83,7 +84,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 09e20fb97b..c9b053d572 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -394,6 +394,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2073,7 +2077,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2083,6 +2087,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3746,6 +3754,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 62c09e6ed4..d3feb062cd 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1274,20 +1274,25 @@ parse_compression(const char *opt, pg_compress_specification *compress_spec)
parse_compress_options(opt, &algorithm_str, &level_str);
if (!parse_compress_algorithm(algorithm_str, &(compress_spec->algorithm)))
{
- pg_log_error("invalid compression method: \"%s\" (gzip, none)",
+ pg_log_error("invalid compression method: \"%s\" (gzip, lz4, none)",
algorithm_str);
return false;
}
/* Switch off unimplemented or unavailable compressions. */
if (compress_spec->algorithm != PG_COMPRESSION_NONE &&
- compress_spec->algorithm != PG_COMPRESSION_GZIP)
+ compress_spec->algorithm != PG_COMPRESSION_GZIP &&
+ compress_spec->algorithm != PG_COMPRESSION_LZ4)
supports_compression = false;
#ifndef HAVE_LIBZ
if (compress_spec->algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
#endif
+#ifndef USE_LZ4
+ if (compress_spec->algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
+#endif
if (!supports_compression)
{
@@ -1310,6 +1315,9 @@ parse_compression(const char *opt, pg_compress_specification *compress_spec)
return false;
}
+ pg_free(algorithm_str);
+ pg_free(level_str);
+
return true;
}
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index f8d0b2fce5..2f7f389681 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -122,7 +122,7 @@ command_fails_like(
command_fails_like(
[ 'pg_dump', '--compress', 'garbage' ],
- qr/\Qpg_dump: error: invalid compression method: "garbage" (gzip, none)\E/,
+ qr/\Qpg_dump: error: invalid compression method: "garbage" (gzip, lz4, none)\E/,
'pg_dump: invalid --compress');
command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d604558f03..0bec824836 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -116,6 +116,67 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=1', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4127,11 +4188,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
--
2.34.1
Attachment: v10-0002-Prepare-pg_dump-for-additional-compression-metho.patch (text/x-patch)
From 516ea1bb70a4243965e231092ee460bb4ade311d Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 29 Nov 2022 11:22:44 +0000
Subject: [PATCH v10 2/4] Prepare pg_dump for additional compression methods
This commit does some of the heavy lifting required for additional compression
methods.
First, it teaches pg_dump.c about the definitions and interfaces found in
common/compression.h and propagates those throughout the code.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about cfp and uses it throughout.
---
doc/src/sgml/ref/pg_dump.sgml | 30 +-
src/bin/pg_dump/compress_io.c | 431 ++++++++++++++++----------
src/bin/pg_dump/compress_io.h | 20 +-
src/bin/pg_dump/pg_backup.h | 7 +-
src/bin/pg_dump/pg_backup_archiver.c | 192 ++++++------
src/bin/pg_dump/pg_backup_archiver.h | 37 +--
src/bin/pg_dump/pg_backup_custom.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 13 +-
src/bin/pg_dump/pg_backup_tar.c | 12 +-
src/bin/pg_dump/pg_dump.c | 99 ++++--
src/bin/pg_dump/t/001_basic.pl | 27 +-
src/bin/pg_dump/t/002_pg_dump.pl | 2 +-
12 files changed, 514 insertions(+), 362 deletions(-)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 8b9d9f4cad..3fb8fdce81 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -644,17 +644,31 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>-Z <replaceable class="parameter">0..9</replaceable></option></term>
- <term><option>--compress=<replaceable class="parameter">0..9</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">level</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
+ <term><option>--compress=<replaceable class="parameter">level</replaceable></option></term>
+ <term><option>--compress=<replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
<listitem>
<para>
- Specify the compression level to use. Zero means no compression.
+ Specify the compression method and/or the compression level to use.
+ The compression method can be set to <literal>gzip</literal> or
+ <literal>none</literal> for no compression. A compression level can
+ be optionally specified, by appending the level number after a colon
+ (<literal>:</literal>). If no level is specified, the default compression
+ level will be used for the specified method. If only a level is
+ specified without mentioning a method, <literal>gzip</literal> compression
+ will be used.
+ </para>
+
+ <para>
For the custom and directory archive formats, this specifies compression of
- individual table-data segments, and the default is to compress
- at a moderate level.
- For plain text output, setting a nonzero compression level causes
- the entire output file to be compressed, as though it had been
- fed through <application>gzip</application>; but the default is not to compress.
+ individual table-data segments, and the default is to compress using
+ <literal>gzip</literal> at a moderate level. For plain text output,
+ setting a nonzero compression level causes the entire output file to be compressed,
+ as though it had been fed through <application>gzip</application>; but the default
+ is not to compress.
+ </para>
+ <para>
The tar archive format currently does not support compression at all.
</para>
</listitem>
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 62f940ff7a..4a8fc1e306 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -64,7 +68,7 @@
/* typedef appears in compress_io.h */
struct CompressorState
{
- CompressionAlgorithm comprAlg;
+ pg_compress_algorithm compress_algorithm;
WriteFunc writeF;
#ifdef HAVE_LIBZ
@@ -74,9 +78,6 @@ struct CompressorState
#endif
};
-static void ParseCompressionOption(int compression, CompressionAlgorithm *alg,
- int *level);
-
/* Routines that support zlib compressed data I/O */
#ifdef HAVE_LIBZ
static void InitCompressorZlib(CompressorState *cs, int level);
@@ -93,57 +94,30 @@ static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
const char *data, size_t dLen);
-/*
- * Interprets a numeric 'compression' value. The algorithm implied by the
- * value (zlib or none at the moment), is returned in *alg, and the
- * zlib compression level in *level.
- */
-static void
-ParseCompressionOption(int compression, CompressionAlgorithm *alg, int *level)
-{
- if (compression == Z_DEFAULT_COMPRESSION ||
- (compression > 0 && compression <= 9))
- *alg = COMPR_ALG_LIBZ;
- else if (compression == 0)
- *alg = COMPR_ALG_NONE;
- else
- {
- pg_fatal("invalid compression code: %d", compression);
- *alg = COMPR_ALG_NONE; /* keep compiler quiet */
- }
-
- /* The level is just the passed-in value. */
- if (level)
- *level = compression;
-}
-
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
-AllocateCompressor(int compression, WriteFunc writeF)
+AllocateCompressor(const pg_compress_specification compress_spec,
+ WriteFunc writeF)
{
CompressorState *cs;
- CompressionAlgorithm alg;
- int level;
-
- ParseCompressionOption(compression, &alg, &level);
#ifndef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
+ if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
pg_fatal("not built with zlib support");
#endif
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
cs->writeF = writeF;
- cs->comprAlg = alg;
+ cs->compress_algorithm = compress_spec.algorithm;
/*
* Perform compression algorithm specific initialization.
*/
#ifdef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
- InitCompressorZlib(cs, level);
+ if (cs->compress_algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorZlib(cs, compress_spec.level);
#endif
return cs;
@@ -154,21 +128,24 @@ AllocateCompressor(int compression, WriteFunc writeF)
* out with ahwrite().
*/
void
-ReadDataFromArchive(ArchiveHandle *AH, int compression, ReadFunc readF)
+ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compress_spec,
+ ReadFunc readF)
{
- CompressionAlgorithm alg;
-
- ParseCompressionOption(compression, &alg, NULL);
-
- if (alg == COMPR_ALG_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (alg == COMPR_ALG_LIBZ)
+ switch (compress_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("not built with zlib support");
#endif
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -179,18 +156,21 @@ void
WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
- switch (cs->comprAlg)
+ switch (cs->compress_algorithm)
{
- case COMPR_ALG_LIBZ:
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
#else
pg_fatal("not built with zlib support");
#endif
break;
- case COMPR_ALG_NONE:
+ case PG_COMPRESSION_NONE:
WriteDataToArchiveNone(AH, cs, data, dLen);
break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -200,11 +180,23 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compress_algorithm)
+ {
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->comprAlg == COMPR_ALG_LIBZ)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("not built with zlib support");
#endif
- free(cs);
+ break;
+ case PG_COMPRESSION_NONE:
+ free(cs);
+ break;
+
+ default:
+ pg_fatal("invalid compression method");
+ break;
+ }
}
/* Private routines, specific to each compression method. */
@@ -418,10 +410,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_algorithm compress_algorithm;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -452,21 +442,25 @@ cfp *
cfopen_read(const char *path, const char *mode)
{
cfp *fp;
+ pg_compress_specification compress_spec = {0};
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
#ifdef HAVE_LIBZ
if (hasSuffix(path, ".gz"))
- fp = cfopen(path, mode, 1);
+ fp = cfopen(path, mode, compress_spec);
else
#endif
{
- fp = cfopen(path, mode, 0);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ fp = cfopen(path, mode, compress_spec);
#ifdef HAVE_LIBZ
if (fp == NULL)
{
char *fname;
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, 1);
+ fp = cfopen(fname, mode, compress_spec);
free_keep_errno(fname);
}
#endif
@@ -479,26 +473,27 @@ cfopen_read(const char *path, const char *mode)
* be a filemode as accepted by fopen() and gzopen() that indicates writing
* ("w", "wb", "a", or "ab").
*
- * If 'compression' is non-zero, a gzip compressed stream is opened, and
- * 'compression' indicates the compression level used. The ".gz" suffix
- * is automatically added to 'path' in that case.
+ * If 'compress_spec.algorithm' is GZIP, a gzip compressed stream is opened,
+ * and 'compress_spec.level' used. The ".gz" suffix is automatically added to
+ * 'path' in that case.
*
* On failure, return NULL with an error code in errno.
*/
cfp *
-cfopen_write(const char *path, const char *mode, int compression)
+cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compress_spec)
{
cfp *fp;
- if (compression == 0)
- fp = cfopen(path, mode, 0);
+ if (compress_spec.algorithm == PG_COMPRESSION_NONE)
+ fp = cfopen(path, mode, compress_spec);
else
{
#ifdef HAVE_LIBZ
char *fname;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression);
+ fp = cfopen(fname, mode, compress_spec);
free_keep_errno(fname);
#else
pg_fatal("not built with zlib support");
@@ -509,60 +504,96 @@ cfopen_write(const char *path, const char *mode, int compression)
}
/*
- * Opens file 'path' in 'mode'. If 'compression' is non-zero, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode, int compression)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_algorithm compress_algorithm, int compressionLevel)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression != 0)
+ fp->compress_algorithm = compress_algorithm;
+
+ switch (compress_algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compressionLevel != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compressionLevel);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("not built with zlib support");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("not built with zlib support");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compress_spec)
+{
+ return cfopen_internal(path, -1, mode,
+ compress_spec.algorithm,
+ compress_spec.level);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compress_spec)
+{
+ return cfopen_internal(NULL, fd, mode,
+ compress_spec.algorithm,
+ compress_spec.level);
+}
int
cfread(void *ptr, int size, cfp *fp)
@@ -572,38 +603,61 @@ cfread(void *ptr, int size, cfp *fp)
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compress_algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, fp->fp);
+ if (ret != size && !feof(fp->fp))
+ READ_ERROR_EXIT(fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread(fp->fp, ptr, size);
+ if (ret != size && !gzeof(fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compress_algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite(fp->fp, ptr, size);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
@@ -611,24 +665,31 @@ cfgetc(cfp *fp)
{
int ret;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compress_algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc(fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
return ret;
@@ -637,65 +698,107 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret;
+
+ switch (fp->compress_algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets(fp->fp, buf, len);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compress_algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose(fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose(fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret;
+
+ switch (fp->compress_algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof(fp->fp);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compress_algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror(fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("not built with zlib support");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index f635787692..d6335fff02 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,12 +21,6 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-typedef enum
-{
- COMPR_ALG_NONE,
- COMPR_ALG_LIBZ
-} CompressionAlgorithm;
-
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -46,8 +40,10 @@ typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
-extern CompressorState *AllocateCompressor(int compression, WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH, int compression,
+extern CompressorState *AllocateCompressor(const pg_compress_specification compress_spec,
+ WriteFunc writeF);
+extern void ReadDataFromArchive(ArchiveHandle *AH,
+ const pg_compress_specification compress_spec,
ReadFunc readF);
extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen);
@@ -56,9 +52,13 @@ extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
typedef struct cfp cfp;
-extern cfp *cfopen(const char *path, const char *mode, int compression);
+extern cfp *cfopen(const char *path, const char *mode,
+ const pg_compress_specification compress_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ pg_compress_specification compress_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode, int compression);
+extern cfp *cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compress_spec);
extern int cfread(void *ptr, int size, cfp *fp);
extern int cfwrite(const void *ptr, int size, cfp *fp);
extern int cfgetc(cfp *fp);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index e8b7898297..61c412c8cb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -23,6 +23,7 @@
#ifndef PG_BACKUP_H
#define PG_BACKUP_H
+#include "common/compression.h"
#include "fe_utils/simple_list.h"
#include "libpq-fe.h"
@@ -143,7 +144,8 @@ typedef struct _restoreOptions
int noDataForFailedTables;
int exit_on_error;
- int compression;
+ pg_compress_specification compress_spec; /* Specification for
+ * compression */
int suppressDumpWarnings; /* Suppress output of WARNING entries
* to stderr */
bool single_txn;
@@ -303,7 +305,8 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
/* Create a new archive */
extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker);
/* The --list option */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index f39c0fa36f..304cc072ca 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -70,7 +64,8 @@ typedef struct _parallelReadyList
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr);
static void _getObjectDescription(PQExpBuffer buf, const TocEntry *te);
static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData);
@@ -98,9 +93,10 @@ static int _discoverArchiveFormat(ArchiveHandle *AH);
static int RestoringToDB(ArchiveHandle *AH);
static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
-static void SetOutput(ArchiveHandle *AH, const char *filename, int compression);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static void SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compress_spec);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -239,12 +235,13 @@ setupRestoreWorker(Archive *AHX)
/* Public */
Archive *
CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, dosync,
- mode, setupDumpWorker);
+ ArchiveHandle *AH = _allocAH(FileSpec, fmt, compress_spec,
+ dosync, mode, setupDumpWorker);
return (Archive *) AH;
}
@@ -254,7 +251,12 @@ CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
Archive *
OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, true, archModeRead, setupRestoreWorker);
+ ArchiveHandle *AH;
+ pg_compress_specification compress_spec = {0};
+
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH = _allocAH(FileSpec, fmt, compress_spec, true,
+ archModeRead, setupRestoreWorker);
return (Archive *) AH;
}
@@ -269,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -354,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -383,16 +383,23 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression != 0 && AH->PrintTocDataPtr != NULL)
+ supports_compression = true;
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -459,8 +466,8 @@ RestoreArchive(Archive *AHX)
* Setup the output file if necessary.
*/
sav = SaveOutput(AH);
- if (ropt->filename || ropt->compression)
- SetOutput(AH, ropt->filename, ropt->compression);
+ if (ropt->filename || ropt->compress_spec.algorithm != PG_COMPRESSION_NONE)
+ SetOutput(AH, ropt->filename, ropt->compress_spec);
ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
@@ -739,7 +746,7 @@ RestoreArchive(Archive *AHX)
*/
AH->stage = STAGE_FINALIZING;
- if (ropt->filename || ropt->compression)
+ if (ropt->filename || ropt->compress_spec.algorithm != PG_COMPRESSION_NONE)
RestoreOutput(AH, sav);
if (ropt->useDB)
@@ -969,6 +976,8 @@ NewRestoreOptions(void)
opts->format = archUnknown;
opts->cparams.promptPassword = TRI_DEFAULT;
opts->dumpSections = DUMP_UNSECTIONED;
+ opts->compress_spec.algorithm = PG_COMPRESSION_NONE;
+ opts->compress_spec.level = INT_MIN;
return opts;
}
@@ -1115,23 +1124,28 @@ PrintTOCSummary(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
TocEntry *te;
+ pg_compress_specification out_compress_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
+ /* TOC is always uncompressed */
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+
sav = SaveOutput(AH);
if (ropt->filename)
- SetOutput(AH, ropt->filename, 0 /* no compression */ );
+ SetOutput(AH, ropt->filename, out_compress_spec);
if (strftime(stamp_str, sizeof(stamp_str), PGDUMP_STRFTIME_FMT,
localtime(&AH->createDate)) == 0)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compress_spec.algorithm));
switch (AH->format)
{
@@ -1485,60 +1499,35 @@ archprintf(Archive *AH, const char *fmt,...)
*******************************/
static void
-SetOutput(ArchiveHandle *AH, const char *filename, int compression)
+SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compress_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression != 0)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compress_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compress_spec);
if (!AH->OF)
{
@@ -1549,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename, int compression)
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1699,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2198,10 +2173,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
static ArchiveHandle *
_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2249,14 +2226,14 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
AH->toc->prev = AH->toc;
AH->mode = mode;
- AH->compression = compression;
+ AH->compress_spec = compress_spec;
AH->dosync = dosync;
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -2264,7 +2241,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
* Force stdin/stdout into binary mode if that is what we are using.
*/
#ifdef WIN32
- if ((fmt != archNull || compression != 0) &&
+ if ((fmt != archNull || compress_spec.algorithm != PG_COMPRESSION_NONE) &&
(AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0))
{
if (mode == archModeWrite)
@@ -3669,7 +3646,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- WriteInt(AH, AH->compression);
+ WriteInt(AH, AH->compress_spec.level);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3740,21 +3717,26 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
+ AH->compress_spec.algorithm = PG_COMPRESSION_NONE;
if (AH->version >= K_VERS_1_2)
{
if (AH->version < K_VERS_1_4)
- AH->compression = AH->ReadBytePtr(AH);
+ AH->compress_spec.level = AH->ReadBytePtr(AH);
else
- AH->compression = ReadInt(AH);
+ AH->compress_spec.level = ReadInt(AH);
+
+ if (AH->compress_spec.level != 0)
+ AH->compress_spec.algorithm = PG_COMPRESSION_GZIP;
}
else
- AH->compression = Z_DEFAULT_COMPRESSION;
+ AH->compress_spec.algorithm = PG_COMPRESSION_GZIP;
#ifndef HAVE_LIBZ
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
#endif
+
if (AH->version >= K_VERS_1_4)
{
struct tm crtm;
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 42687c4ec8..d2930949ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
@@ -331,14 +306,8 @@ struct _archiveHandle
DumpId *tableDataId; /* TABLE DATA ids, indexed by table dumpId */
struct _tocEntry *currToc; /* Used when dumping data */
- int compression; /*---------
- * Compression requested on open().
- * Possible values for compression:
- * -1 Z_DEFAULT_COMPRESSION
- * 0 COMPRESSION_NONE
- * 1-9 levels for gzip compression
- *---------
- */
+ pg_compress_specification compress_spec; /* Requested specification for
+ * compression */
bool dosync; /* data requested to be synced on sight */
ArchiveMode mode; /* File mode - r or w */
void *formatData; /* Header data specific to file format */
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index a0a55a1edd..6a2112c45f 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
}
/*
@@ -377,7 +377,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
}
/*
@@ -566,7 +566,7 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression, _CustomReadFunc);
+ ReadDataFromArchive(AH, AH->compress_spec, _CustomReadFunc);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 798182b6f7..7d2cddbb2c 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -327,7 +327,8 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
+ AH->compress_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -573,6 +574,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
cfp *tocFH;
+ pg_compress_specification compress_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "toc.dat");
@@ -581,7 +583,8 @@ _CloseArchive(ArchiveHandle *AH)
ctx->pstate = ParallelBackupStart(AH);
/* The TOC is always created uncompressed */
- tocFH = cfopen_write(fname, PG_BINARY_W, 0);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ tocFH = cfopen_write(fname, PG_BINARY_W, compress_spec);
if (tocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -639,12 +642,14 @@ static void
_StartBlobs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ pg_compress_specification compress_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "blobs.toc");
/* The blob TOC file is never compressed */
- ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ ctx->blobsTocFH = cfopen_write(fname, "ab", compress_spec);
if (ctx->blobsTocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -662,7 +667,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compress_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 402b93c610..d773c291c8 100644
--- a/src/bin/pg_dump/pg_backup_tar.c
+++ b/src/bin/pg_dump/pg_backup_tar.c
@@ -35,6 +35,7 @@
#include <unistd.h>
#include "common/file_utils.h"
+#include "compress_io.h"
#include "fe_utils/string_utils.h"
#include "pg_backup_archiver.h"
#include "pg_backup_tar.h"
@@ -194,7 +195,7 @@ InitArchiveFmt_Tar(ArchiveHandle *AH)
* possible since gzdopen uses buffered IO which totally screws file
* positioning.
*/
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
}
else
@@ -328,7 +329,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
}
}
- if (AH->compression == 0)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = ctx->tarFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -383,7 +384,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
umask(old_umask);
- if (AH->compression == 0)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = tm->tmpFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -401,7 +402,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
static void
tarClose(ArchiveHandle *AH, TAR_MEMBER *th)
{
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
if (th->mode == 'w')
@@ -800,7 +801,6 @@ _CloseArchive(ArchiveHandle *AH)
memcpy(ropt, AH->public.ropt, sizeof(RestoreOptions));
ropt->filename = NULL;
ropt->dropSchema = 1;
- ropt->compression = 0;
ropt->superuser = NULL;
ropt->suppressDumpWarnings = true;
@@ -888,7 +888,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
if (oid == 0)
pg_fatal("invalid OID for large object (%u)", oid);
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
sprintf(fname, "blob_%u.dat", oid);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da427f4d4a..62c09e6ed4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -54,8 +54,10 @@
#include "catalog/pg_subscription.h"
#include "catalog/pg_trigger_d.h"
#include "catalog/pg_type_d.h"
+#include "common/compression.h"
#include "common/connect.h"
#include "common/relpath.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/option_utils.h"
#include "fe_utils/string_utils.h"
@@ -164,6 +166,8 @@ static void setup_connection(Archive *AH,
const char *dumpencoding, const char *dumpsnapshot,
char *use_role);
static ArchiveFormat parseArchiveFormat(const char *format, ArchiveMode *mode);
+static bool parse_compression(const char *opt,
+ pg_compress_specification *compress_spec);
static void expand_schema_name_patterns(Archive *fout,
SimpleStringList *patterns,
SimpleOidList *oids,
@@ -340,8 +344,9 @@ main(int argc, char **argv)
const char *dumpsnapshot = NULL;
char *use_role = NULL;
int numWorkers = 1;
- int compressLevel = -1;
int plainText = 0;
+ pg_compress_specification compress_spec = {0};
+ bool user_compression_defined = false;
ArchiveFormat archiveFormat = archUnknown;
ArchiveMode archiveMode;
@@ -561,10 +566,10 @@ main(int argc, char **argv)
dopt.aclsSkip = true;
break;
- case 'Z': /* Compression Level */
- if (!option_parse_int(optarg, "-Z/--compress", 0, 9,
- &compressLevel))
+ case 'Z': /* Compression */
+ if (!parse_compression(optarg, &compress_spec))
exit_nicely(1);
+ user_compression_defined = true;
break;
case 0:
@@ -687,23 +692,20 @@ main(int argc, char **argv)
if (archiveFormat == archNull)
plainText = 1;
- /* Custom and directory formats are compressed by default, others not */
- if (compressLevel == -1)
+ /*
+ * Custom and directory formats are compressed by default (zlib), others
+ * not
+ */
+ if (user_compression_defined == false)
{
+ parse_compress_specification(PG_COMPRESSION_NONE, NULL, &compress_spec);
#ifdef HAVE_LIBZ
if (archiveFormat == archCustom || archiveFormat == archDirectory)
- compressLevel = Z_DEFAULT_COMPRESSION;
- else
+ parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+ &compress_spec);
#endif
- compressLevel = 0;
}
-#ifndef HAVE_LIBZ
- if (compressLevel != 0)
- pg_log_warning("requested compression not available in this installation -- archive will be uncompressed");
- compressLevel = 0;
-#endif
-
/*
* If emitting an archive format, we always want to emit a DATABASE item,
* in case --create is specified at pg_restore time.
@@ -716,8 +718,8 @@ main(int argc, char **argv)
pg_fatal("parallel backup only supported by the directory format");
/* Open the output file */
- fout = CreateArchive(filename, archiveFormat, compressLevel, dosync,
- archiveMode, setupDumpWorker);
+ fout = CreateArchive(filename, archiveFormat, compress_spec,
+ dosync, archiveMode, setupDumpWorker);
/* Make dump options accessible right away */
SetArchiveOptions(fout, &dopt, NULL);
@@ -948,10 +950,7 @@ main(int argc, char **argv)
ropt->sequence_data = dopt.sequence_data;
ropt->binary_upgrade = dopt.binary_upgrade;
- if (compressLevel == -1)
- ropt->compression = 0;
- else
- ropt->compression = compressLevel;
+ ropt->compress_spec = compress_spec;
ropt->suppressDumpWarnings = true; /* We've already shown them */
@@ -998,7 +997,8 @@ help(const char *progname)
printf(_(" -j, --jobs=NUM use this many parallel jobs to dump\n"));
printf(_(" -v, --verbose verbose mode\n"));
printf(_(" -V, --version output version information, then exit\n"));
- printf(_(" -Z, --compress=0-9 compression level for compressed formats\n"));
+ printf(_(" -Z, --compress=METHOD[:LEVEL]\n"
+ " compress as specified\n"));
printf(_(" --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n"));
printf(_(" --no-sync do not wait for changes to be written safely to disk\n"));
printf(_(" -?, --help show this help, then exit\n"));
@@ -1258,6 +1258,61 @@ get_synchronized_snapshot(Archive *fout)
return result;
}
+/*
+ * Interprets and validates a compression option using the common compression
+ * parsing functions. If the requested compression is not available then the
+ * archives are uncompressed.
+ */
+static bool
+parse_compression(const char *opt, pg_compress_specification *compress_spec)
+{
+ char *algorithm_str = NULL;
+ char *level_str = NULL;
+ char *validation_error = NULL;
+ bool supports_compression = true;
+
+ parse_compress_options(opt, &algorithm_str, &level_str);
+ if (!parse_compress_algorithm(algorithm_str, &(compress_spec->algorithm)))
+ {
+ pg_log_error("invalid compression method: \"%s\" (gzip, none)",
+ algorithm_str);
+ return false;
+ }
+
+ /* Switch off unimplemented or unavailable compressions. */
+ if (compress_spec->algorithm != PG_COMPRESSION_NONE &&
+ compress_spec->algorithm != PG_COMPRESSION_GZIP)
+ supports_compression = false;
+
+#ifndef HAVE_LIBZ
+ if (compress_spec->algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+
+ if (!supports_compression)
+ {
+ pg_log_warning("requested compression not available in this installation -- archive will be uncompressed");
+ parse_compress_specification(PG_COMPRESSION_NONE, NULL, compress_spec);
+
+ pg_free(algorithm_str);
+ pg_free(level_str);
+
+ return true;
+ }
+
+ /* Parse and validate the rest of the options */
+ parse_compress_specification(compress_spec->algorithm, level_str,
+ compress_spec);
+ validation_error = validate_compress_specification(compress_spec);
+ if (validation_error)
+ {
+ pg_log_error("invalid compression specification: %s", validation_error);
+ return false;
+ }
+
+ return true;
+}
+
static ArchiveFormat
parseArchiveFormat(const char *format, ArchiveMode *mode)
{
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a583c8a6d2..f8d0b2fce5 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -121,16 +121,32 @@ command_fails_like(
'pg_restore: cannot specify both --single-transaction and multiple jobs');
command_fails_like(
- [ 'pg_dump', '-Z', '-1' ],
- qr/\Qpg_dump: error: -Z\/--compress must be in range 0..9\E/,
- 'pg_dump: -Z/--compress must be in range');
+ [ 'pg_dump', '--compress', 'garbage' ],
+ qr/\Qpg_dump: error: invalid compression method: "garbage" (gzip, none)\E/,
+ 'pg_dump: invalid --compress');
+
+command_fails_like(
+ [ 'pg_dump', '--compress', 'none:1' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "none" does not accept a compression level\E/,
+ 'pg_dump: invalid compression specification: compression algorithm "none" does not accept a compression level');
+
if (check_pg_config("#define HAVE_LIBZ 1"))
{
+ command_fails_like(
+ [ 'pg_dump', '-Z', '15' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)\E/,
+ 'pg_dump: invalid compression specification: must be in range');
+
command_fails_like(
[ 'pg_dump', '--compress', '1', '--format', 'tar' ],
qr/\Qpg_dump: error: compression is not supported by tar archive format\E/,
'pg_dump: compression is not supported by tar archive format');
+
+ command_fails_like(
+ [ 'pg_dump', '-Z', 'gzip:nonInt' ],
+ qr/\Qpg_dump: error: invalid compression specification: unrecognized compression option: "nonInt"\E/,
+ 'pg_dump: invalid compression specification: must be an integer');
}
else
{
@@ -139,6 +155,11 @@ else
[ 'pg_dump', '--compress', '1', '--format', 'tar', '-j3' ],
qr/\Qpg_dump: warning: requested compression not available in this installation -- archive will be uncompressed\E/,
'pg_dump: warning: compression not available in this installation');
+
+ command_fails_like(
+ [ 'pg_dump', '-Z', 'gzip:nonInt', '--format', 'tar', '-j2' ],
+ qr/\Qpg_dump: warning: requested compression not available in this installation -- archive will be uncompressed\E/,
+ 'pg_dump: invalid compression specification: must be an integer');
}
command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index fe53ed0f89..d604558f03 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -87,7 +87,7 @@ my %pgdump_runs = (
compile_option => 'gzip',
dump_cmd => [
'pg_dump', '--jobs=2',
- '--format=directory', '--compress=1',
+ '--format=directory', '--compress=gzip:1',
"--file=$tempdir/compression_gzip_dir", 'postgres',
],
# Give coverage for manually compressed blob.toc files during
--
2.34.1
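For context on the new `-Z METHOD[:LEVEL]` syntax exercised by the tests above, here is a rough standalone sketch of the option parsing. The type and function names here are hypothetical stand-ins for illustration only; they are not the actual `common/compression.h` API the patch uses:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for pg_compress_algorithm / pg_compress_specification */
typedef enum { COMPR_NONE, COMPR_GZIP } compr_algorithm;

typedef struct
{
    compr_algorithm algorithm;
    int             level;      /* -1 means "use the method's default level" */
} compr_spec;

/*
 * Parse "METHOD[:LEVEL]". A bare integer is also accepted for backward
 * compatibility with the old -Z option: 0 maps to "none", 1..9 map to
 * gzip at that level. Returns 1 on success, 0 on a bad specification.
 */
static int
parse_compress_opt(const char *opt, compr_spec *spec)
{
    char       *sep = strchr(opt, ':');
    size_t      alen = sep ? (size_t) (sep - opt) : strlen(opt);

    spec->level = -1;

    /* Backward compatible: plain digits select gzip level (0 = none) */
    if (*opt != '\0' && strspn(opt, "0123456789") == strlen(opt))
    {
        int         lvl = atoi(opt);

        if (lvl == 0)
            spec->algorithm = COMPR_NONE;
        else if (lvl >= 1 && lvl <= 9)
        {
            spec->algorithm = COMPR_GZIP;
            spec->level = lvl;
        }
        else
            return 0;           /* e.g. -Z 15 is rejected */
        return 1;
    }

    if (alen == 4 && strncmp(opt, "none", 4) == 0)
        spec->algorithm = COMPR_NONE;
    else if (alen == 4 && strncmp(opt, "gzip", 4) == 0)
        spec->algorithm = COMPR_GZIP;
    else
        return 0;               /* unknown method name */

    if (sep)
    {
        int         lvl = atoi(sep + 1);

        /* "none" takes no level; gzip accepts 1..9 */
        if (spec->algorithm == COMPR_NONE)
            return 0;
        if (lvl < 1 || lvl > 9)
            return 0;
        spec->level = lvl;
    }
    return 1;
}
```

The sketch mirrors the error cases covered by the 001_basic.pl tests: `garbage`, `none:1`, `gzip:nonInt`, and out-of-range levels all fail, while a bare `1` keeps the historical meaning of "gzip level 1".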
Attachment: v10-0003-Introduce-Compressor-API-in-pg_dump.patch (text/x-patch)
From 8285412481539bb6649616b24df5191a9423f3c1 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 29 Nov 2022 11:22:48 +0000
Subject: [PATCH v10 3/4] Introduce Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementor of a
new compression method now only needs to supply those definitions.
Custom compressed archives now need to store the compression method in their
header, which requires a bump in the version number. The compression level
is no longer stored in the dump, as it is irrelevant there.
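The function-pointer table described above can be sketched in isolation as follows. This is a minimal illustration with only `write` and `close` and a plain-stdio "none" method; the field and function names are simplified stand-ins for the patch's `CompressFileHandle`, not its exact layout:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for the patch's CompressFileHandle struct */
typedef struct CompressFileHandle CompressFileHandle;
struct CompressFileHandle
{
    size_t      (*write) (const void *ptr, size_t size,
                          CompressFileHandle *CFH);
    int         (*close) (CompressFileHandle *CFH);
    void       *private_data;   /* per-method state, e.g. a FILE * or gzFile */
};

/* "none" method: plain stdio, no compression */
static size_t
none_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
    return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
}

static int
none_close(CompressFileHandle *CFH)
{
    return fclose((FILE *) CFH->private_data);
}

/*
 * Fill in the dispatch table for the "none" method; a gzip variant would
 * install gzwrite/gzclose wrappers here instead, as InitCompressGzip does
 * in the patch.
 */
static void
InitCompressNone(CompressFileHandle *CFH, FILE *fp)
{
    CFH->write = none_write;
    CFH->close = none_close;
    CFH->private_data = fp;
}
```

Callers then go through the handle (`CFH->write(...)`) without knowing which method is behind it, which is what lets a new algorithm be added by supplying one more set of definitions.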
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 390 ++++++++++++
src/bin/pg_dump/compress_gzip.h | 9 +
src/bin/pg_dump/compress_io.c | 817 ++++++--------------------
src/bin/pg_dump/compress_io.h | 69 ++-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 93 +--
src/bin/pg_dump/pg_backup_archiver.h | 4 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 85 +--
10 files changed, 765 insertions(+), 727 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 9dc5a784dd..29eab02d37 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..bc6d1abc77
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,390 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_gzip.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ int compressionLevel;
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ if (deflateInit(zp, gzipcs->compressionLevel) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ gzipcs->compressionLevel = compressionLevel;
+
+ cs->private = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+typedef struct GzipData
+{
+ gzFile fp;
+ int compressionLevel;
+} GzipData;
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ size_t ret;
+
+ ret = gzread(gd->fp, ptr, size);
+ if (ret != size && !gzeof(gd->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gd->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzwrite(gd->fp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gd->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gd->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzgets(gd->fp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int save_errno;
+ int ret;
+
+ CFH->private = NULL;
+
+ ret = gzclose(gd->fp);
+
+ save_errno = errno;
+ free(gd);
+ errno = save_errno;
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzeof(gd->fp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gd->fp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ char mode_compression[32];
+
+ if (gd->compressionLevel != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, gd->compressionLevel);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gd->fp = gzdopen(dup(fd), mode_compression);
+ else
+ gd->fp = gzopen(path, mode_compression);
+
+ if (gd->fp == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ GzipData *gd;
+
+ CFH->open = Gzip_open;
+ CFH->open_write = Gzip_open_write;
+ CFH->read = Gzip_read;
+ CFH->write = Gzip_write;
+ CFH->gets = Gzip_gets;
+ CFH->getc = Gzip_getc;
+ CFH->close = Gzip_close;
+ CFH->eof = Gzip_eof;
+ CFH->get_error = Gzip_get_error;
+
+ gd = pg_malloc0(sizeof(GzipData));
+ gd->compressionLevel = compressionLevel;
+
+ CFH->private = gd;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..ab0362c1f3
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, int compressionLevel);
+extern void InitCompressGzip(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 4a8fc1e306..3065bd76fa 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -51,9 +51,12 @@
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,113 +68,73 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+/* Private routines that support uncompressed data I/O */
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_algorithm compress_algorithm;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+ free(buf);
+}
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification compress_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("not built with zlib support");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compress_algorithm = compress_spec.algorithm;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compress_algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, compress_spec.level);
-#endif
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compress_spec,
- ReadFunc readF)
-{
switch (compress_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("not built with zlib support");
-#endif
+ InitCompressorGzip(cs, compress_spec.level);
break;
default:
pg_fatal("invalid compression method");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compress_algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -180,243 +143,28 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compress_algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- free(cs);
- break;
-
- default:
- pg_fatal("invalid compression method");
- break;
- }
-}
-
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- pg_compress_algorithm compress_algorithm;
- void *fp;
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -429,392 +177,219 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Compression None implementation
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
+static size_t
+_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
- cfp *fp;
- pg_compress_specification compress_spec = {0};
+ FILE *fp = (FILE *) CFH->private;
+ size_t ret;
- compress_spec.algorithm = PG_COMPRESSION_GZIP;
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- fp = cfopen(path, mode, compress_spec);
- else
-#endif
- {
- compress_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compress_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- compress_spec.algorithm = PG_COMPRESSION_GZIP;
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compress_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
+ if (size == 0)
+ return 0;
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compress_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compress_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compress_spec)
-{
- cfp *fp;
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
- if (compress_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compress_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compress_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("not built with zlib support");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ return ret;
}
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_algorithm compress_algorithm, int compressionLevel)
+static size_t
+_write(const void *ptr, size_t size, CompressFileHandle * CFH)
{
- cfp *fp = pg_malloc(sizeof(cfp));
-
- fp->compress_algorithm = compress_algorithm;
-
- switch (compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compressionLevel != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compressionLevel);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
-
- return fp;
+ return fwrite(ptr, 1, size, (FILE *) CFH->private);
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compress_spec)
+static const char *
+_get_error(CompressFileHandle * CFH)
{
- return cfopen_internal(path, -1, mode,
- compress_spec.algorithm,
- compress_spec.level);
+ return strerror(errno);
}
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compress_spec)
+static char *
+_gets(char *ptr, int size, CompressFileHandle * CFH)
{
- return cfopen_internal(NULL, fd, mode,
- compress_spec.algorithm,
- compress_spec.level);
+ return fgets(ptr, size, (FILE *) CFH->private);
}
-int
-cfread(void *ptr, int size, cfp *fp)
+static int
+_getc(CompressFileHandle * CFH)
{
+ FILE *fp = (FILE *) CFH->private;
int ret;
- if (size == 0)
- return 0;
-
- switch (fp->compress_algorithm)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, fp->fp);
- if (ret != size && !feof(fp->fp))
- READ_ERROR_EXIT(fp->fp);
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzread(fp->fp, ptr, size);
- if (ret != size && !gzeof(fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
-
- default:
- pg_fatal("invalid compression method");
- break;
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
return ret;
}
-int
-cfwrite(const void *ptr, int size, cfp *fp)
+static int
+_close(CompressFileHandle * CFH)
{
+ FILE *fp = (FILE *) CFH->private;
int ret = 0;
- switch (fp->compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite(fp->fp, ptr, size);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ CFH->private = NULL;
+
+ if (fp)
+ ret = fclose(fp);
return ret;
}
-int
-cfgetc(cfp *fp)
+static int
+_eof(CompressFileHandle * CFH)
{
- int ret;
+ return feof((FILE *) CFH->private);
+}
- switch (fp->compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc(fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->fp);
+static int
+_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
+{
+ Assert(CFH->private == NULL);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof(fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ if (fd >= 0)
+ CFH->private = fdopen(dup(fd), mode);
+ else
+ CFH->private = fopen(path, mode);
- return ret;
+ if (CFH->private == NULL)
+ return 1;
+
+ return 0;
}
-char *
-cfgets(cfp *fp, char *buf, int len)
+static int
+_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
{
- char *ret;
+ Assert(CFH->private == NULL);
- switch (fp->compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, fp->fp);
+ CFH->private = fopen(path, mode);
+ if (CFH->private == NULL)
+ return 1;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets(fp->fp, buf, len);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ return 0;
+}
- return ret;
+static void
+InitCompressNone(CompressFileHandle * CFH)
+{
+ CFH->open = _open;
+ CFH->open_write = _open_write;
+ CFH->read = _read;
+ CFH->write = _write;
+ CFH->gets = _gets;
+ CFH->getc = _getc;
+ CFH->close = _close;
+ CFH->eof = _eof;
+ CFH->get_error = _get_error;
+
+ CFH->private = NULL;
}
-int
-cfclose(cfp *fp)
+/*
+ * Public interface
+ */
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compress_spec)
{
- int ret;
+ CompressFileHandle *CFH;
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- switch (fp->compress_algorithm)
+ switch (compress_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ret = fclose(fp->fp);
- fp->fp = NULL;
-
+ InitCompressNone(CFH);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose(fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("not built with zlib support");
-#endif
+ InitCompressGzip(CFH, compress_spec.level);
break;
default:
pg_fatal("invalid compression method");
break;
}
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
-int
-cfeof(cfp *fp)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * 'path' doesn't already have it) and try again. So if you pass "foo" as
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ *
+ * On failure, return NULL with an error code in errno.
+ *
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- int ret;
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compress_spec = {0};
- switch (fp->compress_algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof(fp->fp);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof(fp->fp);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- default:
- pg_fatal("invalid compression method");
- break;
- }
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- return ret;
-}
+ fname = pg_strdup(path);
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compress_algorithm == PG_COMPRESSION_GZIP)
+ if (hasSuffix(fname, ".gz"))
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
+ bool exists;
+
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when not built with compression */
+ if (exists)
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("not built with zlib support");
+ if (exists)
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
}
- return strerror(errno);
+ CFH = InitCompressFileHandle(compress_spec);
+ if (CFH->open(fname, -1, mode, CFH))
+ {
+ free_keep_errno(CFH);
+ CFH = NULL;
+ }
+ free_keep_errno(fname);
+
+ return CFH;
}
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
+int
+DestroyCompressFileHandle(CompressFileHandle * CFH)
{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
+ int ret = 0;
- if (filenamelen < suffixlen)
- return 0;
+ if (CFH->private)
+ ret = CFH->close(CFH);
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
+ free_keep_errno(CFH);
-#endif
+ return ret;
+}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index d6335fff02..a986f5e6ee 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,61 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ void *private;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compress_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compress_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open) (const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH);
+ int (*open_write) (const char *path, const char *mode,
+ CompressFileHandle * cxt);
+ size_t (*read) (void *ptr, size_t size, CompressFileHandle * CFH);
+ size_t (*write) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets) (char *s, int size, CompressFileHandle * CFH);
+ int (*getc) (CompressFileHandle * CFH);
+ int (*eof) (CompressFileHandle * CFH);
+ int (*close) (CompressFileHandle * CFH);
+ const char *(*get_error) (CompressFileHandle * CFH);
+
+ void *private;
+};
+
-typedef struct cfp cfp;
+extern CompressFileHandle * InitCompressFileHandle(const pg_compress_specification compress_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compress_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- pg_compress_specification compress_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compress_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle * InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle * CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index d96e566846..0c73a4707e 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,5 +1,6 @@
pg_dump_common_sources = files(
+ 'compress_gzip.c',
'compress_io.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 304cc072ca..09e20fb97b 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compress_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle * SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1126,7 +1126,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compress_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1502,6 +1502,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compress_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1525,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compress_spec);
- else
- AH->OF = cfopen(filename, mode, compress_spec);
+ CFH = InitCompressFileHandle(compress_spec);
- if (!AH->OF)
+ if (CFH->open(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1689,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2035,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2077,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2180,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2236,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,7 +3652,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- WriteInt(AH, AH->compress_spec.level);
+ AH->WriteBytePtr(AH, AH->compress_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3718,7 +3724,9 @@ ReadHead(ArchiveHandle *AH)
AH->format, fmt);
AH->compress_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compress_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
if (AH->version < K_VERS_1_4)
AH->compress_spec.level = AH->ReadBytePtr(AH);
@@ -3731,11 +3739,20 @@ ReadHead(ArchiveHandle *AH)
else
AH->compress_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
-
+ if (unsupported)
+ {
+ pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+ AH->compress_spec.algorithm = PG_COMPRESSION_NONE;
+ }
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index d2930949ab..bb7fad2af1 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,12 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 6a2112c45f..49ec0e3816 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compress_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compress_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 7d2cddbb2c..e1ce2f393b 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,9 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
+ CompressFileHandle *dataFH; /* currently open data file */
- cfp *blobsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *blobsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +198,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +218,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +327,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compress_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compress_spec);
+
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +346,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -370,7 +371,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +386,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +435,7 @@ _LoadBlobs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +443,14 @@ _LoadBlobs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->blobsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->blobsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->blobsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the blobs TOC file line-by-line, and process each blob */
- while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets(line, MAXPGPATH, CFH)) != NULL)
{
char blobfname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +465,11 @@ _LoadBlobs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreBlob(AH, oid);
}
- if (!cfeof(ctx->blobsTocFH))
+ if (!CFH->eof(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->blobsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->blobsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +489,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
return 1;
@@ -512,8 +514,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc(CFH);
}
/*
@@ -524,15 +527,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -545,12 +549,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +578,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compress_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +589,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compress_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compress_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compress_spec);
+ if (tocFH->open_write(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +603,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +654,8 @@ _StartBlobs(ArchiveHandle *AH, TocEntry *te)
/* The blob TOC file is never compressed */
compress_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->blobsTocFH = cfopen_write(fname, "ab", compress_spec);
- if (ctx->blobsTocFH == NULL)
+ ctx->blobsTocFH = InitCompressFileHandle(compress_spec);
+ if (ctx->blobsTocFH->open_write(fname, "ab", ctx->blobsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +672,8 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compress_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compress_spec);
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,17 +686,18 @@ static void
_EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->blobsTocFH;
char buf[50];
int len;
/* Close the BLOB data file itself */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the blob in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->blobsTocFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
pg_fatal("could not write to blobs TOC file");
}
@@ -706,7 +711,7 @@ _EndBlobs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->blobsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->blobsTocFH) != 0)
pg_fatal("could not close blobs TOC file: %m");
ctx->blobsTocFH = NULL;
}
--
2.34.1
On Tue, Nov 29, 2022 at 12:10:46PM +0000, gkokolatos@pm.me wrote:
Thank you. Please advise if it is preferable to split 0002 into two parts.
I think not but I will happily do so if you think otherwise.
This one makes me curious. What kind of split are you talking about?
If it makes the code review and the git history cleaner and easier, I
am usually a lot in favor of such incremental changes. As far as I
can see, there is the switch from the compression integer to
compression specification as one thing. The second thing is the
refactoring of cfclose() and these routines, paving the way for 0003.
Hmm, it may be cleaner to move the switch to the compression spec in
one patch, and move the logic around cfclose() to its own, paving the
way to 0003.
By the way, I think that this 0002 should drop all the default clauses
in the switches for the compression method so as we'd catch any
missing code paths with compiler warnings if a new compression method
is added in the future.
Anyway, I have applied 0001, adding you as a primary author because
you did most of it with only tweaks from me for pg_basebackup. The
docs of pg_basebackup have been amended to mention the slight change
in grammar, affecting the case where we do not have a detail string.
--
Michael
------- Original Message -------
On Wednesday, November 30th, 2022 at 1:50 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Nov 29, 2022 at 12:10:46PM +0000, gkokolatos@pm.me wrote:
Thank you. Please advise if it is preferable to split 0002 into two parts.
I think not but I will happily do so if you think otherwise.
This one makes me curious. What kind of split are you talking about?
If it makes the code review and the git history cleaner and easier, I
am usually a lot in favor of such incremental changes. As far as I
can see, there is the switch from the compression integer to
compression specification as one thing. The second thing is the
refactoring of cfclose() and these routines, paving the way for 0003.
Hmm, it may be cleaner to move the switch to the compression spec in
one patch, and move the logic around cfclose() to its own, paving the
way to 0003.
Fair enough. The attached v11 does that. 0001 introduces the compression
specification and uses it throughout. 0002 paves the way to the
new interface by homogenizing the use of cfp. 0003 introduces the new
API and stores the compression algorithm in the custom format header
instead of the compression level integer. Finally 0004 adds support for
LZ4.
Besides the version bump in 0003 which can possibly be split out and
as an independent and earlier step, I think that the patchset consists
of coherent units.
By the way, I think that this 0002 should drop all the default clauses
in the switches for the compression method so as we'd catch any
missing code paths with compiler warnings if a new compression method
is added in the future.
Sure.
Anyway, I have applied 0001, adding you as a primary author because
you did most of it with only tweaks from me for pg_basebackup. The
docs of pg_basebackup have been amended to mention the slight change
in grammar, affecting the case where we do not have a detail string.
Very kind of you, thank you.
Cheers,
//Georgios
--
Michael
Attachments:
v11-0002-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 18c8ba5e115c8f3386ac5421d1cc595ef8ed2a62 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 30 Nov 2022 14:07:39 +0000
Subject: [PATCH v11 2/4] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression related code and allow for the introduction of additional
archive formats. However, pg_backup_archiver.c was not using that API. This
commit teaches pg_backup_archiver.c about it and uses it throughout.
---
src/bin/pg_dump/compress_io.c | 363 ++++++++++++++++++---------
src/bin/pg_dump/pg_backup_archiver.c | 130 ++++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
3 files changed, 297 insertions(+), 223 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 2c9d730fce..83b478bc63 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -127,15 +131,23 @@ void
ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compress_spec,
ReadFunc readF)
{
- if (compress_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ switch (compress_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("not built with zlib support");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -172,11 +184,24 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compress_spec.algorithm)
+ {
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compress_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("not built with zlib support");
#endif
- free(cs);
+ break;
+ case PG_COMPRESSION_NONE:
+ free(cs);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
}
/* Private routines, specific to each compression method. */
@@ -390,10 +415,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_specification compress_spec;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -486,127 +509,195 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If 'compression' is non-zero, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compress_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compress_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ fp->compress_spec = compress_spec;
+
+ switch (compress_spec.algorithm)
{
-#ifdef HAVE_LIBZ
- if (compress_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compress_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compress_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compress_spec.level);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("not built with zlib support");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("not built with zlib support");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compress_spec)
+{
+ return cfopen_internal(path, -1, mode, compress_spec);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compress_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compress_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compress_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, fp->fp);
+ if (ret != size && !feof(fp->fp))
+ READ_ERROR_EXIT(fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread(fp->fp, ptr, size);
+ if (ret != size && !gzeof(fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compress_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite(fp->fp, ptr, size);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compress_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc(fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
return ret;
@@ -615,65 +706,113 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compress_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets(fp->fp, buf, len);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compress_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose(fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose(fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compress_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof(fp->fp);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compress_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror(fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("not built with zlib support");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 79347d387b..92a160b67e 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compress_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compress_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -979,7 +978,7 @@ NewRestoreOptions(void)
opts->cparams.promptPassword = TRI_DEFAULT;
opts->dumpSections = DUMP_UNSECTIONED;
opts->compress_spec.algorithm = PG_COMPRESSION_NONE;
- opts->compress_spec.level = 0;
+ opts->compress_spec.level = INT_MIN;
return opts;
}
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compress_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1504,58 +1503,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compress_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compress_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compress_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compress_spec);
if (!AH->OF)
{
@@ -1566,33 +1539,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1716,22 +1680,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2220,6 +2179,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2273,8 +2233,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index d58b96b2dc..d2930949ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
Attachment: v11-0003-Introduce-Compressor-API-in-pg_dump.patch (text/x-patch)
From 0c8e498070fa30f3f05657cd769df3053a39e1d8 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 29 Nov 2022 11:22:48 +0000
Subject: [PATCH v11 3/4] Introduce Compressor API in pg_dump
The purpose of this API is to allow easier addition of new compression methods.
CompressFileHandle replaces the cfp* family of functions with a struct of
function pointers for opening, writing, etc. The implementor of a new
compression method now only needs to supply those definitions.
Custom compressed archives now need to store the compression algorithm in their
header. This requires a bump in the version number. The level of compression
is no longer stored in the dump as it is irrelevant.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 390 ++++++++++++
src/bin/pg_dump/compress_gzip.h | 9 +
src/bin/pg_dump/compress_io.c | 829 ++++++--------------------
src/bin/pg_dump/compress_io.h | 69 ++-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 93 +--
src/bin/pg_dump/pg_backup_archiver.h | 4 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 85 +--
10 files changed, 766 insertions(+), 738 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 9dc5a784dd..29eab02d37 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..bc6d1abc77
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,390 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_gzip.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ int compressionLevel;
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ if (deflateInit(zp, gzipcs->compressionLevel) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ gzipcs->compressionLevel = compressionLevel;
+
+ cs->private = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+typedef struct GzipData
+{
+ gzFile fp;
+ int compressionLevel;
+} GzipData;
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ size_t ret;
+
+ ret = gzread(gd->fp, ptr, size);
+ if (ret != size && !gzeof(gd->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gd->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzwrite(gd->fp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gd->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gd->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzgets(gd->fp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int save_errno;
+ int ret;
+
+ CFH->private = NULL;
+
+ ret = gzclose(gd->fp);
+
+ save_errno = errno;
+ free(gd);
+ errno = save_errno;
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzeof(gd->fp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gd->fp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ char mode_compression[32];
+
+ if (gd->compressionLevel != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, gd->compressionLevel);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gd->fp = gzdopen(dup(fd), mode_compression);
+ else
+ gd->fp = gzopen(path, mode_compression);
+
+ if (gd->fp == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ GzipData *gd;
+
+ CFH->open = Gzip_open;
+ CFH->open_write = Gzip_open_write;
+ CFH->read = Gzip_read;
+ CFH->write = Gzip_write;
+ CFH->gets = Gzip_gets;
+ CFH->getc = Gzip_getc;
+ CFH->close = Gzip_close;
+ CFH->eof = Gzip_eof;
+ CFH->get_error = Gzip_get_error;
+
+ gd = pg_malloc0(sizeof(GzipData));
+ gd->compressionLevel = compressionLevel;
+
+ CFH->private = gd;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..ab0362c1f3
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, int compressionLevel);
+extern void InitCompressGzip(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 83b478bc63..56a07e309b 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -51,9 +51,12 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
#include "postgres_fe.h"
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,83 +68,66 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+/* Private routines that support uncompressed data I/O */
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compress_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification compress_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("not built with zlib support");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compress_spec = compress_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compress_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compress_spec.level);
-#endif
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compress_spec,
- ReadFunc readF)
-{
switch (compress_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("not built with zlib support");
-#endif
+ InitCompressorGzip(cs, compress_spec.level);
break;
case PG_COMPRESSION_LZ4:
/* fallthrough */
@@ -149,33 +135,8 @@ ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compress_spec,
pg_fatal("invalid compression method");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compress_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -184,244 +145,28 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compress_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- free(cs);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-}
-
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
+ cs->end(AH, cs);
+ pg_free(cs);
}
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- pg_compress_specification compress_spec;
- void *fp;
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -434,328 +179,142 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Compression None implementation
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
+static size_t
+_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
- cfp *fp;
- pg_compress_specification compress_spec = {0};
+ FILE *fp = (FILE *) CFH->private;
+ size_t ret;
- compress_spec.algorithm = PG_COMPRESSION_GZIP;
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- fp = cfopen(path, mode, compress_spec);
- else
-#endif
- {
- compress_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compress_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
+ if (size == 0)
+ return 0;
- compress_spec.algorithm = PG_COMPRESSION_GZIP;
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compress_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compress_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compress_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compress_spec)
-{
- cfp *fp;
-
- if (compress_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compress_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compress_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("not built with zlib support");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ return ret;
}
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compress_spec)
+static size_t
+_write(const void *ptr, size_t size, CompressFileHandle * CFH)
{
- cfp *fp = pg_malloc(sizeof(cfp));
-
- fp->compress_spec = compress_spec;
-
- switch (compress_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compress_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compress_spec.level);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-
- return fp;
+ return fwrite(ptr, 1, size, (FILE *) CFH->private);
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compress_spec)
+static const char *
+_get_error(CompressFileHandle * CFH)
{
- return cfopen_internal(path, -1, mode, compress_spec);
+ return strerror(errno);
}
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compress_spec)
+static char *
+_gets(char *ptr, int size, CompressFileHandle * CFH)
{
- return cfopen_internal(NULL, fd, mode, compress_spec);
+ return fgets(ptr, size, (FILE *) CFH->private);
}
-int
-cfread(void *ptr, int size, cfp *fp)
+static int
+_getc(CompressFileHandle * CFH)
{
- int ret = 0;
-
- if (size == 0)
- return 0;
+ FILE *fp = (FILE *) CFH->private;
+ int ret;
- switch (fp->compress_spec.algorithm)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, fp->fp);
- if (ret != size && !feof(fp->fp))
- READ_ERROR_EXIT(fp->fp);
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzread(fp->fp, ptr, size);
- if (ret != size && !gzeof(fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
return ret;
}
-int
-cfwrite(const void *ptr, int size, cfp *fp)
+static int
+_close(CompressFileHandle * CFH)
{
+ FILE *fp = (FILE *) CFH->private;
int ret = 0;
- switch (fp->compress_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite(fp->fp, ptr, size);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ CFH->private = NULL;
+
+ if (fp)
+ ret = fclose(fp);
return ret;
}
-int
-cfgetc(cfp *fp)
+static int
+_eof(CompressFileHandle * CFH)
{
- int ret = 0;
+ return feof((FILE *) CFH->private);
+}
- switch (fp->compress_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc(fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->fp);
+static int
+_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
+{
+ Assert(CFH->private == NULL);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof(fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ if (fd >= 0)
+ CFH->private = fdopen(dup(fd), mode);
+ else
+ CFH->private = fopen(path, mode);
- return ret;
+ if (CFH->private == NULL)
+ return 1;
+
+ return 0;
}
-char *
-cfgets(cfp *fp, char *buf, int len)
+static int
+_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
{
- char *ret = NULL;
+ Assert(CFH->private == NULL);
- switch (fp->compress_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, fp->fp);
+ CFH->private = fopen(path, mode);
+ if (CFH->private == NULL)
+ return 1;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets(fp->fp, buf, len);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return 0;
+}
- return ret;
+static void
+InitCompressNone(CompressFileHandle * CFH)
+{
+ CFH->open = _open;
+ CFH->open_write = _open_write;
+ CFH->read = _read;
+ CFH->write = _write;
+ CFH->gets = _gets;
+ CFH->getc = _getc;
+ CFH->close = _close;
+ CFH->eof = _eof;
+ CFH->get_error = _get_error;
+
+ CFH->private = NULL;
}
-int
-cfclose(cfp *fp)
+/*
+ * Public interface
+ */
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compress_spec)
{
- int ret = 0;
+ CompressFileHandle *CFH;
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- switch (fp->compress_spec.algorithm)
+ switch (compress_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ret = fclose(fp->fp);
- fp->fp = NULL;
-
+ InitCompressNone(CFH);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose(fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("not built with zlib support");
-#endif
+ InitCompressGzip(CFH, compress_spec.level);
break;
case PG_COMPRESSION_LZ4:
/* fallthrough */
@@ -764,71 +323,77 @@ cfclose(cfp *fp)
break;
}
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
-int
-cfeof(cfp *fp)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * 'path' doesn't already have it) and try again. So if you pass "foo" as
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ *
+ * On failure, return NULL with an error code in errno.
+ *
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- int ret = 0;
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compress_spec = {0};
- switch (fp->compress_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof(fp->fp);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof(fp->fp);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- return ret;
-}
+ fname = strdup(path);
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (hasSuffix(fname, ".gz"))
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
+ bool exists;
+
+ exists = (stat(path, &st) == 0);
+ /* avoid unused-variable warning if not built with compression */
+ if (exists)
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("not built with zlib support");
+ if (exists)
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
}
- return strerror(errno);
+ CFH = InitCompressFileHandle(compress_spec);
+ if (CFH->open(fname, -1, mode, CFH))
+ {
+ free_keep_errno(CFH);
+ CFH = NULL;
+ }
+ free_keep_errno(fname);
+
+ return CFH;
}
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
+int
+DestroyCompressFileHandle(CompressFileHandle * CFH)
{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
+ int ret = 0;
- if (filenamelen < suffixlen)
- return 0;
+ if (CFH->private)
+ ret = CFH->close(CFH);
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
+ free_keep_errno(CFH);
-#endif
+ return ret;
+}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index d6335fff02..a986f5e6ee 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,61 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ void *private;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compress_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compress_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open) (const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH);
+ int (*open_write) (const char *path, const char *mode,
+ CompressFileHandle * cxt);
+ size_t (*read) (void *ptr, size_t size, CompressFileHandle * CFH);
+ size_t (*write) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets) (char *s, int size, CompressFileHandle * CFH);
+ int (*getc) (CompressFileHandle * CFH);
+ int (*eof) (CompressFileHandle * CFH);
+ int (*close) (CompressFileHandle * CFH);
+ const char *(*get_error) (CompressFileHandle * CFH);
+
+ void *private;
+};
+
-typedef struct cfp cfp;
+extern CompressFileHandle * InitCompressFileHandle(const pg_compress_specification compress_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compress_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- pg_compress_specification compress_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compress_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle * InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle * CFH);
#endif
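For context on the CompressFileHandle struct above: it turns the old cfopen()/cfread() family into a table of function pointers, so callers dispatch through the handle instead of switching on the compression algorithm at every call site. A minimal stdio-backed sketch of the same pattern follows; the type and function names here are illustrative, not the patch's:

```c
#include <stdio.h>
#include <string.h>

/* Illustrative mirror of the patch's CompressFileHandle: a struct of
 * function pointers plus per-implementation private state. */
typedef struct FileHandle FileHandle;
struct FileHandle
{
	size_t		(*read) (void *ptr, size_t size, FileHandle *fh);
	size_t		(*write) (const void *ptr, size_t size, FileHandle *fh);
	int			(*close) (FileHandle *fh);
	void	   *private_data;	/* a FILE * for the "none" implementation */
};

/* "None" (uncompressed) implementation backed by plain stdio. */
static size_t
none_read(void *ptr, size_t size, FileHandle *fh)
{
	return fread(ptr, 1, size, (FILE *) fh->private_data);
}

static size_t
none_write(const void *ptr, size_t size, FileHandle *fh)
{
	return fwrite(ptr, 1, size, (FILE *) fh->private_data);
}

static int
none_close(FileHandle *fh)
{
	return fclose((FILE *) fh->private_data);
}

/* Round-trip a string through a temporary file via the vtable;
 * returns 0 on success, -1 on any failure. */
static int
roundtrip(const char *msg, char *buf, size_t buflen)
{
	FileHandle	fh = {none_read, none_write, none_close, NULL};
	FILE	   *fp = tmpfile();

	if (fp == NULL)
		return -1;
	fh.private_data = fp;

	if (fh.write(msg, strlen(msg), &fh) != strlen(msg))
		return -1;
	rewind(fp);
	memset(buf, 0, buflen);
	if (fh.read(buf, buflen - 1, &fh) != strlen(msg))
		return -1;
	return fh.close(&fh);
}
```

A gzip or LZ4 implementation then only needs to fill the same slots with its own read/write/close routines, which is what InitCompressGzip() and its siblings do.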
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index d96e566846..0c73a4707e 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,5 +1,6 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 92a160b67e..248646143d 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compress_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle * SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compress_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1503,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compress_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1525,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compress_spec);
- else
- AH->OF = cfopen(filename, mode, compress_spec);
+ CFH = InitCompressFileHandle(compress_spec);
- if (!AH->OF)
+ if (CFH->open(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1690,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2032,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2062,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2179,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2234,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3647,7 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- WriteInt(AH, AH->compress_spec.level);
+ AH->WriteBytePtr(AH, AH->compress_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3719,7 +3725,9 @@ ReadHead(ArchiveHandle *AH)
AH->format, fmt);
AH->compress_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compress_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
if (AH->version < K_VERS_1_4)
AH->compress_spec.level = AH->ReadBytePtr(AH);
@@ -3732,11 +3740,20 @@ ReadHead(ArchiveHandle *AH)
else
AH->compress_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
-
+ if (unsupported)
+ {
+ pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+ AH->compress_spec.algorithm = PG_COMPRESSION_NONE;
+ }
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index d2930949ab..bb7fad2af1 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,12 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
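A simplified sketch of the version gate this header bump enables: 1.15+ archives carry an explicit compression-algorithm byte, while older archives only store a gzip level from which the algorithm is inferred. The enum values, and the collapsing of the pre-1.4 and 1.4+ level encodings into a single byte, are illustrative simplifications of ReadHead():

```c
/* Archive version arithmetic, as in pg_backup_archiver.h. */
#define MAKE_ARCHIVE_VERSION(major, minor, rev) \
	(((major) * 256 + (minor)) * 256 + (rev))
#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0)

typedef enum
{
	ALG_NONE = 0,
	ALG_GZIP = 1
} Algorithm;

/* 'byte' is the header field after the format byte: an algorithm id
 * for >= 1.15 archives, a compression level for older ones. */
static Algorithm
read_compression(int version, int byte)
{
	if (version >= K_VERS_1_15)
		return (Algorithm) byte;

	/* pre-1.15: a nonzero gzip level implies gzip compression */
	return byte != 0 ? ALG_GZIP : ALG_NONE;
}
```

This keeps old archives readable while letting new ones name any supported algorithm directly.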
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 6a2112c45f..49ec0e3816 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compress_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compress_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 7d2cddbb2c..e1ce2f393b 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,9 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
+ CompressFileHandle *dataFH; /* currently open data file */
- cfp *blobsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *blobsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +198,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +218,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +327,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compress_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compress_spec);
+
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +346,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -370,7 +371,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +386,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +435,7 @@ _LoadBlobs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +443,14 @@ _LoadBlobs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->blobsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->blobsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->blobsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the blobs TOC file line-by-line, and process each blob */
- while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets(line, MAXPGPATH, CFH)) != NULL)
{
char blobfname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +465,11 @@ _LoadBlobs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreBlob(AH, oid);
}
- if (!cfeof(ctx->blobsTocFH))
+ if (!CFH->eof(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->blobsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->blobsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +489,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
return 1;
@@ -512,8 +514,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc(CFH);
}
/*
@@ -524,15 +527,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -545,12 +549,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +578,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compress_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +589,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compress_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compress_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compress_spec);
+ if (tocFH->open_write(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +603,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +654,8 @@ _StartBlobs(ArchiveHandle *AH, TocEntry *te)
/* The blob TOC file is never compressed */
compress_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->blobsTocFH = cfopen_write(fname, "ab", compress_spec);
- if (ctx->blobsTocFH == NULL)
+ ctx->blobsTocFH = InitCompressFileHandle(compress_spec);
+ if (ctx->blobsTocFH->open_write(fname, "ab", ctx->blobsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +672,8 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compress_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compress_spec);
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,17 +686,18 @@ static void
_EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->blobsTocFH;
char buf[50];
int len;
/* Close the BLOB data file itself */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the blob in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->blobsTocFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
pg_fatal("could not write to blobs TOC file");
}
@@ -706,7 +711,7 @@ _EndBlobs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->blobsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->blobsTocFH) != 0)
pg_fatal("could not close blobs TOC file: %m");
ctx->blobsTocFH = NULL;
}
--
2.34.1
Attachment: v11-0004-Add-LZ4-compression-in-pg_-dump-restore.patch (text/x-patch)
From ca4de793a6ae3ac0bd34e68809a53b8d2b1abb4f Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 29 Nov 2022 11:23:00 +0000
Subject: [PATCH v11 4/4] Add LZ4 compression in pg_{dump|restore}
compress_lz4.{c,h} implements both a streaming API and a file API for
compression. The first is aimed at inlined use cases, so simple lz4.h calls
can be used directly. The second generates output, or parses input, that can
be read or generated with the lz4 utility.

Wherever the LZ4F API does not implement all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 23 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 43 +-
src/bin/pg_dump/compress_lz4.c | 601 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 9 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 12 +-
src/bin/pg_dump/t/001_basic.pl | 2 +-
src/bin/pg_dump/t/002_pg_dump.pl | 69 ++-
10 files changed, 749 insertions(+), 34 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 3fb8fdce81..84d3778c99 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -328,9 +328,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -652,12 +653,12 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression. A compression level can
- be optionally specified, by appending the level number after a colon
- (<literal>:</literal>). If no level is specified, the default compression
- level will be used for the specified method. If only a level is
- specified without mentioning a method, <literal>gzip</literal> compression
- will be used.
+ <literal>lz4</literal> or <literal>none</literal> for no compression. A
+ compression level can optionally be specified by appending the level
+ number after a colon (<literal>:</literal>). If no level is specified,
+ the default compression level will be used for the specified method. If
+ only a level is specified without mentioning a method,
+ <literal>gzip</literal> compression will be used.
</para>
<para>
@@ -665,8 +666,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 29eab02d37..28c1fc27cc 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 56a07e309b..731c0913df 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -38,13 +38,15 @@
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs and custom LZ4 calls which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with the
+ * given name, and if that fails, it tries the same file with the .gz suffix;
+ * failing that, it tries the same file with the .lz4 suffix.
+ * cfopen_write() opens a file for writing; an extra argument specifies the
+ * compression method to use, and the appropriate suffix, .gz or .lz4, is
+ * added to the filename accordingly. This allows you to easily handle both
+ * compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -57,6 +59,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -130,8 +133,9 @@ AllocateCompressor(const pg_compress_specification compress_spec,
InitCompressorGzip(cs, compress_spec.level);
break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
+ InitCompressorLZ4(cs, compress_spec.level);
+ break;
+ default:
pg_fatal("invalid compression method");
break;
}
@@ -181,6 +185,7 @@ free_keep_errno(void *p)
/*
* Compression None implementation
*/
+
static size_t
_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
@@ -317,7 +322,8 @@ InitCompressFileHandle(const pg_compress_specification compress_spec)
InitCompressGzip(CFH, compress_spec.level);
break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ InitCompressLZ4(CFH, compress_spec.level);
+ break;
case PG_COMPRESSION_ZSTD:
pg_fatal("invalid compression method");
break;
@@ -330,12 +336,12 @@ InitCompressFileHandle(const pg_compress_specification compress_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
+ * order.
*
* On failure, return NULL with an error code in errno.
- *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
@@ -371,6 +377,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compress_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compress_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
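As a standalone illustration of the discovery order implemented above (plain file first, then .gz, then .lz4), here is a hedged sketch; discover(), touch(), and the Algorithm enum are invented for the example:

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

typedef enum
{
	ALG_NONE,
	ALG_GZIP,
	ALG_LZ4,
	ALG_MISSING
} Algorithm;

/* Probe "path", then "path.gz", then "path.lz4", mirroring the order
 * used by InitDiscoverCompressFileHandle. The name that matched is
 * written into 'found'; ALG_MISSING means nothing existed. */
static Algorithm
discover(const char *path, char *found, size_t foundlen)
{
	static const struct
	{
		const char *suffix;
		Algorithm	alg;
	}			candidates[] = {
		{"", ALG_NONE}, {".gz", ALG_GZIP}, {".lz4", ALG_LZ4},
	};
	struct stat st;

	for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++)
	{
		snprintf(found, foundlen, "%s%s", path, candidates[i].suffix);
		if (stat(found, &st) == 0)
			return candidates[i].alg;
	}
	return ALG_MISSING;
}

/* Create an empty file so the demo has something to discover. */
static int
touch(const char *path)
{
	FILE	   *fp = fopen(path, "w");

	return fp ? fclose(fp) : -1;
}
```

Note that in the patch each probe is compiled in only when the corresponding library (zlib, lz4) is available, which the sketch omits.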
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..8f93f05e87
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,601 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ if (decBytes < 0)
+ pg_fatal("invalid compressed data");
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private;
+ int compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ /* Will be lazy init'd */
+ cs->private = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). End of file is reached when
+ * there is no decompressed output left in the overflow buffer and the
+ * underlying file has hit EOF.
+ */
+static int
+LZ4File_eof(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
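The overflow handling above can be modelled in isolation. The following sketch is not patch code; buf/len stand in for the overflowbuf/overflowlen fields of LZ4File, and it shows the copy-then-slide behaviour:

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy up to 'size' bytes out of an overflow buffer, optionally stopping
 * after the first newline, and slide any remainder to the front.
 */
static int
consume_overflow(char *buf, size_t *len, char *out, size_t size, int eol_flag)
{
	size_t		readlen;
	char	   *p;

	if (*len == 0)
		return 0;

	readlen = (*len >= size) ? size : *len;

	/* include the line-terminating char when reading line-wise */
	if (eol_flag && (p = memchr(buf, '\n', readlen)) != NULL)
		readlen = (size_t) (p - buf) + 1;

	memcpy(out, buf, readlen);
	*len -= readlen;

	/* keep unread bytes at the start of the buffer for the next call */
	if (*len > 0)
		memmove(buf, buf + readlen, *len);

	return (int) readlen;
}

/* demo state: a full line plus a partial line sitting in the buffer */
static char demo_over[16] = "ab\ncd";
static size_t demo_len = 5;
static char demo_out[16];
```

Reading line-wise from "ab\ncd" returns the three bytes "ab\n" and leaves "cd" at the front of the buffer for the next call.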
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * newline char if found first when the eol_flag is set. The decompressed
+ * output generated by reading any compressed input via the LZ4F API may
+ * exceed 'ptrsize'; any excess is stored in an overflow buffer within
+ * LZ4File. Each call first consumes any decompressed content already
+ * present in the overflow buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * Fill in whatever space is available in ptr. If the eol flag is
+ * set, either skip the copy when a newline was already found, or
+ * copy only up to the newline if one is present in the outbuf.
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? errno : ENOSPC;
+ return -1;
+ }
+
+ ptr = (const char *) ptr + chunk;
+ remaining -= chunk;
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing, it writes any
+ * remaining content and/or the footer generated by the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ LZ4File *lz4fp;
+
+ CFH->open = LZ4File_open;
+ CFH->open_write = LZ4File_open_write;
+ CFH->read = LZ4File_read;
+ CFH->write = LZ4File_write;
+ CFH->gets = LZ4File_gets;
+ CFH->getc = LZ4File_getc;
+ CFH->eof = LZ4File_eof;
+ CFH->close = LZ4File_close;
+ CFH->get_error = LZ4File_get_error;
+
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (compressionLevel >= 0)
+ lz4fp->prefs.compressionLevel = compressionLevel;
+
+ CFH->private = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..fbec9a508d
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, int compressionLevel);
+extern void InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 0c73a4707e..b27e92ffd0 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,6 +1,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -15,7 +16,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -83,7 +84,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 248646143d..51e51748f1 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -395,6 +395,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2074,7 +2078,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2088,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3747,6 +3755,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
{
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4830983c86..78016ba1d2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1272,20 +1272,25 @@ parse_compression(const char *opt, pg_compress_specification *compress_spec)
parse_compress_options(opt, &algorithm_str, &level_str);
if (!parse_compress_algorithm(algorithm_str, &(compress_spec->algorithm)))
{
- pg_log_error("invalid compression method: \"%s\" (gzip, none)",
+ pg_log_error("invalid compression method: \"%s\" (gzip, lz4, none)",
algorithm_str);
return false;
}
/* Switch off unimplemented or unavailable compressions. */
if (compress_spec->algorithm != PG_COMPRESSION_NONE &&
- compress_spec->algorithm != PG_COMPRESSION_GZIP)
+ compress_spec->algorithm != PG_COMPRESSION_GZIP &&
+ compress_spec->algorithm != PG_COMPRESSION_LZ4)
supports_compression = false;
#ifndef HAVE_LIBZ
if (compress_spec->algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
#endif
+#ifndef USE_LZ4
+ if (compress_spec->algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
+#endif
if (!supports_compression)
{
@@ -1308,6 +1313,9 @@ parse_compression(const char *opt, pg_compress_specification *compress_spec)
return false;
}
+ pg_free(algorithm_str);
+ pg_free(level_str);
+
return true;
}
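For reviewers, the "method[:level]" option shape accepted above can be illustrated in isolation. split_spec() is a hypothetical helper for illustration only, not the parse_compress_options() API from common/compression.c:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "method[:level]" into its two parts; a level of -1 means "use
 * the method's default level".
 */
static int
split_spec(const char *opt, char *method, size_t msize, int *level)
{
	const char *colon = strchr(opt, ':');

	*level = -1;
	if (colon == NULL)
	{
		snprintf(method, msize, "%s", opt);
		return 0;
	}
	if ((size_t) (colon - opt) >= msize)
		return -1;				/* method name too long */
	memcpy(method, opt, colon - opt);
	method[colon - opt] = '\0';
	*level = atoi(colon + 1);
	return 0;
}

/* demo state reused by the examples below */
static char g_method[16];
static int g_level;
```

So "--compress=lz4:1" yields method "lz4" with level 1, while "--compress=gzip" yields method "gzip" with the default level.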
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index f8d0b2fce5..2f7f389681 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -122,7 +122,7 @@ command_fails_like(
command_fails_like(
[ 'pg_dump', '--compress', 'garbage' ],
- qr/\Qpg_dump: error: invalid compression method: "garbage" (gzip, none)\E/,
+ qr/\Qpg_dump: error: invalid compression method: "garbage" (gzip, lz4, none)\E/,
'pg_dump: invalid --compress');
command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d604558f03..0bec824836 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -116,6 +116,67 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=1', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4127,11 +4188,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
--
2.34.1
From 1669812c3177a0c54e36d892676a45a51232096a Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 30 Nov 2022 12:58:03 +0000
Subject: [PATCH v11 1/4] Teach pg_dump about compress_spec and use it
throughout.
Align pg_dump with the rest of the binaries which use common compression: teach
pg_dump.c about the common compression definitions and interfaces, then
propagate those throughout the code.
---
doc/src/sgml/ref/pg_dump.sgml | 30 ++++++--
src/bin/pg_dump/compress_io.c | 102 ++++++++++----------------
src/bin/pg_dump/compress_io.h | 20 ++---
src/bin/pg_dump/pg_backup.h | 7 +-
src/bin/pg_dump/pg_backup_archiver.c | 71 ++++++++++++------
src/bin/pg_dump/pg_backup_archiver.h | 10 +--
src/bin/pg_dump/pg_backup_custom.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 13 +++-
src/bin/pg_dump/pg_backup_tar.c | 11 ++-
src/bin/pg_dump/pg_dump.c | 97 ++++++++++++++++++------
src/bin/pg_dump/t/001_basic.pl | 27 ++++++-
src/bin/pg_dump/t/002_pg_dump.pl | 2 +-
12 files changed, 243 insertions(+), 153 deletions(-)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 8b9d9f4cad..3fb8fdce81 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -644,17 +644,31 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>-Z <replaceable class="parameter">0..9</replaceable></option></term>
- <term><option>--compress=<replaceable class="parameter">0..9</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">level</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
+ <term><option>--compress=<replaceable class="parameter">level</replaceable></option></term>
+ <term><option>--compress=<replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
<listitem>
<para>
- Specify the compression level to use. Zero means no compression.
+ Specify the compression method and/or the compression level to use.
+ The compression method can be set to <literal>gzip</literal> or
+ <literal>none</literal> for no compression. A compression level can
+ be optionally specified by appending the level number after a colon
+ (<literal>:</literal>). If no level is specified, the default compression
+ level will be used for the specified method. If only a level is
+ specified without mentioning a method, <literal>gzip</literal> compression
+ will be used.
+ </para>
+
+ <para>
For the custom and directory archive formats, this specifies compression of
- individual table-data segments, and the default is to compress
- at a moderate level.
- For plain text output, setting a nonzero compression level causes
- the entire output file to be compressed, as though it had been
- fed through <application>gzip</application>; but the default is not to compress.
+ individual table-data segments, and the default is to compress using
+ <literal>gzip</literal> at a moderate level. For plain text output,
+ setting a nonzero compression level causes the entire output file to be compressed,
+ as though it had been fed through <application>gzip</application>; but the default
+ is not to compress.
+ </para>
+ <para>
The tar archive format currently does not support compression at all.
</para>
</listitem>
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 62f940ff7a..2c9d730fce 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -64,7 +64,7 @@
/* typedef appears in compress_io.h */
struct CompressorState
{
- CompressionAlgorithm comprAlg;
+ pg_compress_specification compress_spec;
WriteFunc writeF;
#ifdef HAVE_LIBZ
@@ -74,9 +74,6 @@ struct CompressorState
#endif
};
-static void ParseCompressionOption(int compression, CompressionAlgorithm *alg,
- int *level);
-
/* Routines that support zlib compressed data I/O */
#ifdef HAVE_LIBZ
static void InitCompressorZlib(CompressorState *cs, int level);
@@ -93,57 +90,30 @@ static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
const char *data, size_t dLen);
-/*
- * Interprets a numeric 'compression' value. The algorithm implied by the
- * value (zlib or none at the moment), is returned in *alg, and the
- * zlib compression level in *level.
- */
-static void
-ParseCompressionOption(int compression, CompressionAlgorithm *alg, int *level)
-{
- if (compression == Z_DEFAULT_COMPRESSION ||
- (compression > 0 && compression <= 9))
- *alg = COMPR_ALG_LIBZ;
- else if (compression == 0)
- *alg = COMPR_ALG_NONE;
- else
- {
- pg_fatal("invalid compression code: %d", compression);
- *alg = COMPR_ALG_NONE; /* keep compiler quiet */
- }
-
- /* The level is just the passed-in value. */
- if (level)
- *level = compression;
-}
-
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
-AllocateCompressor(int compression, WriteFunc writeF)
+AllocateCompressor(const pg_compress_specification compress_spec,
+ WriteFunc writeF)
{
CompressorState *cs;
- CompressionAlgorithm alg;
- int level;
-
- ParseCompressionOption(compression, &alg, &level);
#ifndef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
+ if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
pg_fatal("not built with zlib support");
#endif
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
cs->writeF = writeF;
- cs->comprAlg = alg;
+ cs->compress_spec = compress_spec;
/*
* Perform compression algorithm specific initialization.
*/
#ifdef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
- InitCompressorZlib(cs, level);
+ if (cs->compress_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorZlib(cs, cs->compress_spec.level);
#endif
return cs;
@@ -154,15 +124,12 @@ AllocateCompressor(int compression, WriteFunc writeF)
* out with ahwrite().
*/
void
-ReadDataFromArchive(ArchiveHandle *AH, int compression, ReadFunc readF)
+ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compress_spec,
+ ReadFunc readF)
{
- CompressionAlgorithm alg;
-
- ParseCompressionOption(compression, &alg, NULL);
-
- if (alg == COMPR_ALG_NONE)
+ if (compress_spec.algorithm == PG_COMPRESSION_NONE)
ReadDataFromArchiveNone(AH, readF);
- if (alg == COMPR_ALG_LIBZ)
+ if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
ReadDataFromArchiveZlib(AH, readF);
@@ -179,18 +146,23 @@ void
WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
- switch (cs->comprAlg)
+ switch (cs->compress_spec.algorithm)
{
- case COMPR_ALG_LIBZ:
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
#else
pg_fatal("not built with zlib support");
#endif
break;
- case COMPR_ALG_NONE:
+ case PG_COMPRESSION_NONE:
WriteDataToArchiveNone(AH, cs, data, dLen);
break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -201,7 +173,7 @@ void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
#ifdef HAVE_LIBZ
- if (cs->comprAlg == COMPR_ALG_LIBZ)
+ if (cs->compress_spec.algorithm == PG_COMPRESSION_GZIP)
EndCompressorZlib(AH, cs);
#endif
free(cs);
@@ -452,21 +424,25 @@ cfp *
cfopen_read(const char *path, const char *mode)
{
cfp *fp;
+ pg_compress_specification compress_spec = {0};
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
#ifdef HAVE_LIBZ
if (hasSuffix(path, ".gz"))
- fp = cfopen(path, mode, 1);
+ fp = cfopen(path, mode, compress_spec);
else
#endif
{
- fp = cfopen(path, mode, 0);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ fp = cfopen(path, mode, compress_spec);
#ifdef HAVE_LIBZ
if (fp == NULL)
{
char *fname;
+ compress_spec.algorithm = PG_COMPRESSION_GZIP;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, 1);
+ fp = cfopen(fname, mode, compress_spec);
free_keep_errno(fname);
}
#endif
@@ -479,26 +455,27 @@ cfopen_read(const char *path, const char *mode)
* be a filemode as accepted by fopen() and gzopen() that indicates writing
* ("w", "wb", "a", or "ab").
*
- * If 'compression' is non-zero, a gzip compressed stream is opened, and
- * 'compression' indicates the compression level used. The ".gz" suffix
- * is automatically added to 'path' in that case.
+ * If 'compress_spec.algorithm' is GZIP, a gzip compressed stream is opened,
+ * and 'compress_spec.level' used. The ".gz" suffix is automatically added to
+ * 'path' in that case.
*
* On failure, return NULL with an error code in errno.
*/
cfp *
-cfopen_write(const char *path, const char *mode, int compression)
+cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compress_spec)
{
cfp *fp;
- if (compression == 0)
- fp = cfopen(path, mode, 0);
+ if (compress_spec.algorithm == PG_COMPRESSION_NONE)
+ fp = cfopen(path, mode, compress_spec);
else
{
#ifdef HAVE_LIBZ
char *fname;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression);
+ fp = cfopen(fname, mode, compress_spec);
free_keep_errno(fname);
#else
pg_fatal("not built with zlib support");
@@ -515,20 +492,21 @@ cfopen_write(const char *path, const char *mode, int compression)
* On failure, return NULL with an error code in errno.
*/
cfp *
-cfopen(const char *path, const char *mode, int compression)
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compress_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression != 0)
+ if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
- if (compression != Z_DEFAULT_COMPRESSION)
+ if (compress_spec.level != Z_DEFAULT_COMPRESSION)
{
/* user has specified a compression level, so tell zlib to use it */
char mode_compression[32];
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression);
+ mode, compress_spec.level);
fp->compressedfp = gzopen(path, mode_compression);
}
else
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index f635787692..d6335fff02 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,12 +21,6 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-typedef enum
-{
- COMPR_ALG_NONE,
- COMPR_ALG_LIBZ
-} CompressionAlgorithm;
-
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -46,8 +40,10 @@ typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
-extern CompressorState *AllocateCompressor(int compression, WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH, int compression,
+extern CompressorState *AllocateCompressor(const pg_compress_specification compress_spec,
+ WriteFunc writeF);
+extern void ReadDataFromArchive(ArchiveHandle *AH,
+ const pg_compress_specification compress_spec,
ReadFunc readF);
extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen);
@@ -56,9 +52,13 @@ extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
typedef struct cfp cfp;
-extern cfp *cfopen(const char *path, const char *mode, int compression);
+extern cfp *cfopen(const char *path, const char *mode,
+ const pg_compress_specification compress_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ pg_compress_specification compress_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode, int compression);
+extern cfp *cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compress_spec);
extern int cfread(void *ptr, int size, cfp *fp);
extern int cfwrite(const void *ptr, int size, cfp *fp);
extern int cfgetc(cfp *fp);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index e8b7898297..61c412c8cb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -23,6 +23,7 @@
#ifndef PG_BACKUP_H
#define PG_BACKUP_H
+#include "common/compression.h"
#include "fe_utils/simple_list.h"
#include "libpq-fe.h"
@@ -143,7 +144,8 @@ typedef struct _restoreOptions
int noDataForFailedTables;
int exit_on_error;
- int compression;
+ pg_compress_specification compress_spec; /* Specification for
+ * compression */
int suppressDumpWarnings; /* Suppress output of WARNING entries
* to stderr */
bool single_txn;
@@ -303,7 +305,8 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
/* Create a new archive */
extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker);
/* The --list option */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index f39c0fa36f..79347d387b 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -70,7 +70,8 @@ typedef struct _parallelReadyList
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr);
static void _getObjectDescription(PQExpBuffer buf, const TocEntry *te);
static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData);
@@ -98,7 +99,8 @@ static int _discoverArchiveFormat(ArchiveHandle *AH);
static int RestoringToDB(ArchiveHandle *AH);
static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
-static void SetOutput(ArchiveHandle *AH, const char *filename, int compression);
+static void SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compress_spec);
static OutputContext SaveOutput(ArchiveHandle *AH);
static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
@@ -239,12 +241,13 @@ setupRestoreWorker(Archive *AHX)
/* Public */
Archive *
CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, dosync,
- mode, setupDumpWorker);
+ ArchiveHandle *AH = _allocAH(FileSpec, fmt, compress_spec,
+ dosync, mode, setupDumpWorker);
return (Archive *) AH;
}
@@ -254,7 +257,12 @@ CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
Archive *
OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, true, archModeRead, setupRestoreWorker);
+ ArchiveHandle *AH;
+ pg_compress_specification compress_spec = {0};
+
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH = _allocAH(FileSpec, fmt, compress_spec, true,
+ archModeRead, setupRestoreWorker);
return (Archive *) AH;
}
@@ -384,7 +392,8 @@ RestoreArchive(Archive *AHX)
* Make sure we won't need (de)compression we haven't got
*/
#ifndef HAVE_LIBZ
- if (AH->compression != 0 && AH->PrintTocDataPtr != NULL)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP &&
+ AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
@@ -459,8 +468,8 @@ RestoreArchive(Archive *AHX)
* Setup the output file if necessary.
*/
sav = SaveOutput(AH);
- if (ropt->filename || ropt->compression)
- SetOutput(AH, ropt->filename, ropt->compression);
+ if (ropt->filename || ropt->compress_spec.algorithm != PG_COMPRESSION_NONE)
+ SetOutput(AH, ropt->filename, ropt->compress_spec);
ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
@@ -739,7 +748,7 @@ RestoreArchive(Archive *AHX)
*/
AH->stage = STAGE_FINALIZING;
- if (ropt->filename || ropt->compression)
+ if (ropt->filename || ropt->compress_spec.algorithm != PG_COMPRESSION_NONE)
RestoreOutput(AH, sav);
if (ropt->useDB)
@@ -969,6 +978,8 @@ NewRestoreOptions(void)
opts->format = archUnknown;
opts->cparams.promptPassword = TRI_DEFAULT;
opts->dumpSections = DUMP_UNSECTIONED;
+ opts->compress_spec.algorithm = PG_COMPRESSION_NONE;
+ opts->compress_spec.level = 0;
return opts;
}
@@ -1115,23 +1126,28 @@ PrintTOCSummary(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
TocEntry *te;
+ pg_compress_specification out_compress_spec = {0};
teSection curSection;
OutputContext sav;
const char *fmtName;
char stamp_str[64];
+ /* TOC is always uncompressed */
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+
sav = SaveOutput(AH);
if (ropt->filename)
- SetOutput(AH, ropt->filename, 0 /* no compression */ );
+ SetOutput(AH, ropt->filename, out_compress_spec);
if (strftime(stamp_str, sizeof(stamp_str), PGDUMP_STRFTIME_FMT,
localtime(&AH->createDate)) == 0)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compress_spec.algorithm));
switch (AH->format)
{
@@ -1485,7 +1501,8 @@ archprintf(Archive *AH, const char *fmt,...)
*******************************/
static void
-SetOutput(ArchiveHandle *AH, const char *filename, int compression)
+SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compress_spec)
{
int fn;
@@ -1508,12 +1525,12 @@ SetOutput(ArchiveHandle *AH, const char *filename, int compression)
/* If compression explicitly requested, use gzopen */
#ifdef HAVE_LIBZ
- if (compression != 0)
+ if (compress_spec.algorithm == PG_COMPRESSION_GZIP)
{
char fmode[14];
/* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression);
+ sprintf(fmode, "wb%d", compress_spec.level);
if (fn >= 0)
AH->OF = gzdopen(dup(fn), fmode);
else
@@ -2198,7 +2215,8 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
static ArchiveHandle *
_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compress_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
@@ -2249,7 +2267,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
AH->toc->prev = AH->toc;
AH->mode = mode;
- AH->compression = compression;
+ AH->compress_spec = compress_spec;
AH->dosync = dosync;
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
@@ -2264,7 +2282,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
* Force stdin/stdout into binary mode if that is what we are using.
*/
#ifdef WIN32
- if ((fmt != archNull || compression != 0) &&
+ if ((fmt != archNull || compress_spec.algorithm != PG_COMPRESSION_NONE) &&
(AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0))
{
if (mode == archModeWrite)
@@ -3669,7 +3687,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- WriteInt(AH, AH->compression);
+ WriteInt(AH, AH->compress_spec.level);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3740,21 +3758,26 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
+ AH->compress_spec.algorithm = PG_COMPRESSION_NONE;
if (AH->version >= K_VERS_1_2)
{
if (AH->version < K_VERS_1_4)
- AH->compression = AH->ReadBytePtr(AH);
+ AH->compress_spec.level = AH->ReadBytePtr(AH);
else
- AH->compression = ReadInt(AH);
+ AH->compress_spec.level = ReadInt(AH);
+
+ if (AH->compress_spec.level != 0)
+ AH->compress_spec.algorithm = PG_COMPRESSION_GZIP;
}
else
- AH->compression = Z_DEFAULT_COMPRESSION;
+ AH->compress_spec.algorithm = PG_COMPRESSION_GZIP;
#ifndef HAVE_LIBZ
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_GZIP)
pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
#endif
+
if (AH->version >= K_VERS_1_4)
{
struct tm crtm;
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 42687c4ec8..d58b96b2dc 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -331,14 +331,8 @@ struct _archiveHandle
DumpId *tableDataId; /* TABLE DATA ids, indexed by table dumpId */
struct _tocEntry *currToc; /* Used when dumping data */
- int compression; /*---------
- * Compression requested on open().
- * Possible values for compression:
- * -1 Z_DEFAULT_COMPRESSION
- * 0 COMPRESSION_NONE
- * 1-9 levels for gzip compression
- *---------
- */
+ pg_compress_specification compress_spec; /* Requested specification for
+ * compression */
bool dosync; /* data requested to be synced on sight */
ArchiveMode mode; /* File mode - r or w */
void *formatData; /* Header data specific to file format */
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index a0a55a1edd..6a2112c45f 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
}
/*
@@ -377,7 +377,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compress_spec, _CustomWriteFunc);
}
/*
@@ -566,7 +566,7 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression, _CustomReadFunc);
+ ReadDataFromArchive(AH, AH->compress_spec, _CustomReadFunc);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 798182b6f7..7d2cddbb2c 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -327,7 +327,8 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
+ AH->compress_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -573,6 +574,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
cfp *tocFH;
+ pg_compress_specification compress_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "toc.dat");
@@ -581,7 +583,8 @@ _CloseArchive(ArchiveHandle *AH)
ctx->pstate = ParallelBackupStart(AH);
/* The TOC is always created uncompressed */
- tocFH = cfopen_write(fname, PG_BINARY_W, 0);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ tocFH = cfopen_write(fname, PG_BINARY_W, compress_spec);
if (tocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -639,12 +642,14 @@ static void
_StartBlobs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ pg_compress_specification compress_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "blobs.toc");
/* The blob TOC file is never compressed */
- ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
+ compress_spec.algorithm = PG_COMPRESSION_NONE;
+ ctx->blobsTocFH = cfopen_write(fname, "ab", compress_spec);
if (ctx->blobsTocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -662,7 +667,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compress_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 402b93c610..87bfed76fd 100644
--- a/src/bin/pg_dump/pg_backup_tar.c
+++ b/src/bin/pg_dump/pg_backup_tar.c
@@ -194,7 +194,7 @@ InitArchiveFmt_Tar(ArchiveHandle *AH)
* possible since gzdopen uses buffered IO which totally screws file
* positioning.
*/
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
}
else
@@ -328,7 +328,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
}
}
- if (AH->compression == 0)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = ctx->tarFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -383,7 +383,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
umask(old_umask);
- if (AH->compression == 0)
+ if (AH->compress_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = tm->tmpFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -401,7 +401,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
static void
tarClose(ArchiveHandle *AH, TAR_MEMBER *th)
{
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
if (th->mode == 'w')
@@ -800,7 +800,6 @@ _CloseArchive(ArchiveHandle *AH)
memcpy(ropt, AH->public.ropt, sizeof(RestoreOptions));
ropt->filename = NULL;
ropt->dropSchema = 1;
- ropt->compression = 0;
ropt->superuser = NULL;
ropt->suppressDumpWarnings = true;
@@ -888,7 +887,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
if (oid == 0)
pg_fatal("invalid OID for large object (%u)", oid);
- if (AH->compression != 0)
+ if (AH->compress_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
sprintf(fname, "blob_%u.dat", oid);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da427f4d4a..4830983c86 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -164,6 +164,8 @@ static void setup_connection(Archive *AH,
const char *dumpencoding, const char *dumpsnapshot,
char *use_role);
static ArchiveFormat parseArchiveFormat(const char *format, ArchiveMode *mode);
+static bool parse_compression(const char *opt,
+ pg_compress_specification *compress_spec);
static void expand_schema_name_patterns(Archive *fout,
SimpleStringList *patterns,
SimpleOidList *oids,
@@ -340,8 +342,9 @@ main(int argc, char **argv)
const char *dumpsnapshot = NULL;
char *use_role = NULL;
int numWorkers = 1;
- int compressLevel = -1;
int plainText = 0;
+ pg_compress_specification compress_spec = {0};
+ bool user_compression_defined = false;
ArchiveFormat archiveFormat = archUnknown;
ArchiveMode archiveMode;
@@ -561,10 +564,10 @@ main(int argc, char **argv)
dopt.aclsSkip = true;
break;
- case 'Z': /* Compression Level */
- if (!option_parse_int(optarg, "-Z/--compress", 0, 9,
- &compressLevel))
+ case 'Z': /* Compression */
+ if (!parse_compression(optarg, &compress_spec))
exit_nicely(1);
+ user_compression_defined = true;
break;
case 0:
@@ -687,23 +690,20 @@ main(int argc, char **argv)
if (archiveFormat == archNull)
plainText = 1;
- /* Custom and directory formats are compressed by default, others not */
- if (compressLevel == -1)
+ /*
+ * Custom and directory formats are compressed by default (zlib), others
+ * not
+ */
+ if (user_compression_defined == false)
{
+ parse_compress_specification(PG_COMPRESSION_NONE, NULL, &compress_spec);
#ifdef HAVE_LIBZ
if (archiveFormat == archCustom || archiveFormat == archDirectory)
- compressLevel = Z_DEFAULT_COMPRESSION;
- else
+ parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+ &compress_spec);
#endif
- compressLevel = 0;
}
-#ifndef HAVE_LIBZ
- if (compressLevel != 0)
- pg_log_warning("requested compression not available in this installation -- archive will be uncompressed");
- compressLevel = 0;
-#endif
-
/*
* If emitting an archive format, we always want to emit a DATABASE item,
* in case --create is specified at pg_restore time.
@@ -716,8 +716,8 @@ main(int argc, char **argv)
pg_fatal("parallel backup only supported by the directory format");
/* Open the output file */
- fout = CreateArchive(filename, archiveFormat, compressLevel, dosync,
- archiveMode, setupDumpWorker);
+ fout = CreateArchive(filename, archiveFormat, compress_spec,
+ dosync, archiveMode, setupDumpWorker);
/* Make dump options accessible right away */
SetArchiveOptions(fout, &dopt, NULL);
@@ -948,10 +948,7 @@ main(int argc, char **argv)
ropt->sequence_data = dopt.sequence_data;
ropt->binary_upgrade = dopt.binary_upgrade;
- if (compressLevel == -1)
- ropt->compression = 0;
- else
- ropt->compression = compressLevel;
+ ropt->compress_spec = compress_spec;
ropt->suppressDumpWarnings = true; /* We've already shown them */
@@ -998,7 +995,8 @@ help(const char *progname)
printf(_(" -j, --jobs=NUM use this many parallel jobs to dump\n"));
printf(_(" -v, --verbose verbose mode\n"));
printf(_(" -V, --version output version information, then exit\n"));
- printf(_(" -Z, --compress=0-9 compression level for compressed formats\n"));
+ printf(_(" -Z, --compress=METHOD[:LEVEL]\n"
+ " compress as specified\n"));
printf(_(" --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n"));
printf(_(" --no-sync do not wait for changes to be written safely to disk\n"));
printf(_(" -?, --help show this help, then exit\n"));
@@ -1258,6 +1256,61 @@ get_synchronized_snapshot(Archive *fout)
return result;
}
+/*
+ * Interprets and validates a compression option using the common compression
+ * parsing functions. If the requested compression is not available then the
+ * archives are uncompressed.
+ */
+static bool
+parse_compression(const char *opt, pg_compress_specification *compress_spec)
+{
+ char *algorithm_str = NULL;
+ char *level_str = NULL;
+ char *validation_error = NULL;
+ bool supports_compression = true;
+
+ parse_compress_options(opt, &algorithm_str, &level_str);
+ if (!parse_compress_algorithm(algorithm_str, &(compress_spec->algorithm)))
+ {
+ pg_log_error("invalid compression method: \"%s\" (gzip, none)",
+ algorithm_str);
+ return false;
+ }
+
+ /* Switch off unimplemented or unavailable compressions. */
+ if (compress_spec->algorithm != PG_COMPRESSION_NONE &&
+ compress_spec->algorithm != PG_COMPRESSION_GZIP)
+ supports_compression = false;
+
+#ifndef HAVE_LIBZ
+ if (compress_spec->algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+
+ if (!supports_compression)
+ {
+ pg_log_warning("requested compression not available in this installation -- archive will be uncompressed");
+ parse_compress_specification(PG_COMPRESSION_NONE, NULL, compress_spec);
+
+ pg_free(algorithm_str);
+ pg_free(level_str);
+
+ return true;
+ }
+
+ /* Parse and validate the rest of the options */
+ parse_compress_specification(compress_spec->algorithm, level_str,
+ compress_spec);
+ validation_error = validate_compress_specification(compress_spec);
+ if (validation_error)
+ {
+ pg_log_error("invalid compression specification: %s", validation_error);
+ return false;
+ }
+
+ return true;
+}
+
static ArchiveFormat
parseArchiveFormat(const char *format, ArchiveMode *mode)
{
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a583c8a6d2..f8d0b2fce5 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -121,16 +121,32 @@ command_fails_like(
'pg_restore: cannot specify both --single-transaction and multiple jobs');
command_fails_like(
- [ 'pg_dump', '-Z', '-1' ],
- qr/\Qpg_dump: error: -Z\/--compress must be in range 0..9\E/,
- 'pg_dump: -Z/--compress must be in range');
+ [ 'pg_dump', '--compress', 'garbage' ],
+ qr/\Qpg_dump: error: invalid compression method: "garbage" (gzip, none)\E/,
+ 'pg_dump: invalid --compress');
+
+command_fails_like(
+ [ 'pg_dump', '--compress', 'none:1' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "none" does not accept a compression level\E/,
+ 'pg_dump: invalid compression specification: compression algorithm "none" does not accept a compression level');
+
if (check_pg_config("#define HAVE_LIBZ 1"))
{
+ command_fails_like(
+ [ 'pg_dump', '-Z', '15' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)\E/,
+ 'pg_dump: invalid compression specification: must be in range');
+
command_fails_like(
[ 'pg_dump', '--compress', '1', '--format', 'tar' ],
qr/\Qpg_dump: error: compression is not supported by tar archive format\E/,
'pg_dump: compression is not supported by tar archive format');
+
+ command_fails_like(
+ [ 'pg_dump', '-Z', 'gzip:nonInt' ],
+ qr/\Qpg_dump: error: invalid compression specification: unrecognized compression option: "nonInt"\E/,
+ 'pg_dump: invalid compression specification: must be an integer');
}
else
{
@@ -139,6 +155,11 @@ else
[ 'pg_dump', '--compress', '1', '--format', 'tar', '-j3' ],
qr/\Qpg_dump: warning: requested compression not available in this installation -- archive will be uncompressed\E/,
'pg_dump: warning: compression not available in this installation');
+
+ command_fails_like(
+ [ 'pg_dump', '-Z', 'gzip:nonInt', '--format', 'tar', '-j2' ],
+ qr/\Qpg_dump: warning: requested compression not available in this installation -- archive will be uncompressed\E/,
+ 'pg_dump: invalid compression specification: must be an integer');
}
command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index fe53ed0f89..d604558f03 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -87,7 +87,7 @@ my %pgdump_runs = (
compile_option => 'gzip',
dump_cmd => [
'pg_dump', '--jobs=2',
- '--format=directory', '--compress=1',
+ '--format=directory', '--compress=gzip:1',
"--file=$tempdir/compression_gzip_dir", 'postgres',
],
# Give coverage for manually compressed blob.toc files during
--
2.34.1
On Wed, Nov 30, 2022 at 05:11:44PM +0000, gkokolatos@pm.me wrote:
Fair enough. The attached v11 does that. 0001 introduces the compression
specification and uses it throughout. 0002 paves the way to the
new interface by homogenizing the use of cfp. 0003 introduces the new
API and stores the compression algorithm in the custom format header
instead of the compression level integer. Finally 0004 adds support for
LZ4.
I have been looking at 0001, and... hmm. I am really wondering
whether it would not be better to just nuke this warning into orbit.
This stuff enforces non-compression even if -Z has been used to a
non-default value. This has been moved to its current location by
cae2bb1 as of this thread:
/messages/by-id/20160526.185551.242041780.horiguchi.kyotaro@lab.ntt.co.jp
However, this is only active if -Z is used when not building with
zlib. At the end, it comes down to whether we want to prioritize the
portability of pg_dump commands specifying a -Z/--compress across
environments knowing that these may or may not be built with zlib,
vs the amount of simplification/uniformity we would get across the
binaries in the tree once we switch everything to use the compression
specifications. Now that pg_basebackup and pg_receivewal are managed
by compression specifications, and that we'd want more compression
options for pg_dump, I would tend to do the latter and from now on
complain if attempting to do a pg_dump -Z under --without-zlib with a
compression level > 0. zlib is also widely available, and we don't
document the fact that non-compression is enforced in this case,
either. (Two TAP tests with the custom format had to be tweaked.)
As per the patch, it is true that we do not need to bump the format of
the dump archives, as we can still store only the compression level
and guess the method from it. I have added some notes about that in
ReadHead and WriteHead so we do not forget.
Most of the changes are really straightforward, and the patch has held up
under my tests, so I think that this is in rather committable shape as-is.
--
Michael
Attachments:
v12-0001-Teach-pg_dump-about-compress_spec-and-use-it-thr.patch (text/x-diff; charset=us-ascii)
From a4fa522d0259e8969cde32798a917321cced0415 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Thu, 1 Dec 2022 11:03:41 +0900
Subject: [PATCH v12] Teach pg_dump about compress_spec and use it throughout.
Align pg_dump with the rest of the binaries that use the common compression
facilities. This teaches pg_dump.c about the common compression definitions
and interfaces, then propagates them throughout the code.
---
src/bin/pg_dump/compress_io.c | 107 ++++++++------------
src/bin/pg_dump/compress_io.h | 20 ++--
src/bin/pg_dump/pg_backup.h | 7 +-
src/bin/pg_dump/pg_backup_archiver.c | 76 +++++++++-----
src/bin/pg_dump/pg_backup_archiver.h | 10 +-
src/bin/pg_dump/pg_backup_custom.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 13 ++-
src/bin/pg_dump/pg_backup_tar.c | 11 +-
src/bin/pg_dump/pg_dump.c | 67 +++++++-----
src/bin/pg_dump/t/001_basic.pl | 34 +++++--
src/bin/pg_dump/t/002_pg_dump.pl | 3 +-
src/test/modules/test_pg_dump/t/001_base.pl | 16 +++
doc/src/sgml/ref/pg_dump.sgml | 34 +++++--
src/tools/pgindent/typedefs.list | 1 -
14 files changed, 242 insertions(+), 163 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 62f940ff7a..8f0d6d6210 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -64,7 +64,7 @@
/* typedef appears in compress_io.h */
struct CompressorState
{
- CompressionAlgorithm comprAlg;
+ pg_compress_specification compression_spec;
WriteFunc writeF;
#ifdef HAVE_LIBZ
@@ -74,9 +74,6 @@ struct CompressorState
#endif
};
-static void ParseCompressionOption(int compression, CompressionAlgorithm *alg,
- int *level);
-
/* Routines that support zlib compressed data I/O */
#ifdef HAVE_LIBZ
static void InitCompressorZlib(CompressorState *cs, int level);
@@ -93,57 +90,30 @@ static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
const char *data, size_t dLen);
-/*
- * Interprets a numeric 'compression' value. The algorithm implied by the
- * value (zlib or none at the moment), is returned in *alg, and the
- * zlib compression level in *level.
- */
-static void
-ParseCompressionOption(int compression, CompressionAlgorithm *alg, int *level)
-{
- if (compression == Z_DEFAULT_COMPRESSION ||
- (compression > 0 && compression <= 9))
- *alg = COMPR_ALG_LIBZ;
- else if (compression == 0)
- *alg = COMPR_ALG_NONE;
- else
- {
- pg_fatal("invalid compression code: %d", compression);
- *alg = COMPR_ALG_NONE; /* keep compiler quiet */
- }
-
- /* The level is just the passed-in value. */
- if (level)
- *level = compression;
-}
-
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
-AllocateCompressor(int compression, WriteFunc writeF)
+AllocateCompressor(const pg_compress_specification compression_spec,
+ WriteFunc writeF)
{
CompressorState *cs;
- CompressionAlgorithm alg;
- int level;
-
- ParseCompressionOption(compression, &alg, &level);
#ifndef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
+ if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
pg_fatal("not built with zlib support");
#endif
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
cs->writeF = writeF;
- cs->comprAlg = alg;
+ cs->compression_spec = compression_spec;
/*
* Perform compression algorithm specific initialization.
*/
#ifdef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
- InitCompressorZlib(cs, level);
+ if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorZlib(cs, cs->compression_spec.level);
#endif
return cs;
@@ -154,15 +124,12 @@ AllocateCompressor(int compression, WriteFunc writeF)
* out with ahwrite().
*/
void
-ReadDataFromArchive(ArchiveHandle *AH, int compression, ReadFunc readF)
+ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compression_spec,
+ ReadFunc readF)
{
- CompressionAlgorithm alg;
-
- ParseCompressionOption(compression, &alg, NULL);
-
- if (alg == COMPR_ALG_NONE)
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
ReadDataFromArchiveNone(AH, readF);
- if (alg == COMPR_ALG_LIBZ)
+ if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
ReadDataFromArchiveZlib(AH, readF);
@@ -179,18 +146,23 @@ void
WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
- switch (cs->comprAlg)
+ switch (cs->compression_spec.algorithm)
{
- case COMPR_ALG_LIBZ:
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
#else
pg_fatal("not built with zlib support");
#endif
break;
- case COMPR_ALG_NONE:
+ case PG_COMPRESSION_NONE:
WriteDataToArchiveNone(AH, cs, data, dLen);
break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -201,7 +173,7 @@ void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
#ifdef HAVE_LIBZ
- if (cs->comprAlg == COMPR_ALG_LIBZ)
+ if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
EndCompressorZlib(AH, cs);
#endif
free(cs);
@@ -453,20 +425,27 @@ cfopen_read(const char *path, const char *mode)
{
cfp *fp;
+ pg_compress_specification compression_spec = {0};
+
#ifdef HAVE_LIBZ
if (hasSuffix(path, ".gz"))
- fp = cfopen(path, mode, 1);
+ {
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ fp = cfopen(path, mode, compression_spec);
+ }
else
#endif
{
- fp = cfopen(path, mode, 0);
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
+ fp = cfopen(path, mode, compression_spec);
#ifdef HAVE_LIBZ
if (fp == NULL)
{
char *fname;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, 1);
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ fp = cfopen(fname, mode, compression_spec);
free_keep_errno(fname);
}
#endif
@@ -479,26 +458,27 @@ cfopen_read(const char *path, const char *mode)
* be a filemode as accepted by fopen() and gzopen() that indicates writing
* ("w", "wb", "a", or "ab").
*
- * If 'compression' is non-zero, a gzip compressed stream is opened, and
- * 'compression' indicates the compression level used. The ".gz" suffix
- * is automatically added to 'path' in that case.
+ * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
+ * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
+ * 'path' in that case.
*
* On failure, return NULL with an error code in errno.
*/
cfp *
-cfopen_write(const char *path, const char *mode, int compression)
+cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
{
cfp *fp;
- if (compression == 0)
- fp = cfopen(path, mode, 0);
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ fp = cfopen(path, mode, compression_spec);
else
{
#ifdef HAVE_LIBZ
char *fname;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression);
+ fp = cfopen(fname, mode, compression_spec);
free_keep_errno(fname);
#else
pg_fatal("not built with zlib support");
@@ -509,26 +489,27 @@ cfopen_write(const char *path, const char *mode, int compression)
}
/*
- * Opens file 'path' in 'mode'. If 'compression' is non-zero, the file
+ * Opens file 'path' in 'mode'. If compression is GZIP, the file
* is opened with libz gzopen(), otherwise with plain fopen().
*
* On failure, return NULL with an error code in errno.
*/
cfp *
-cfopen(const char *path, const char *mode, int compression)
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression != 0)
+ if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
- if (compression != Z_DEFAULT_COMPRESSION)
+ if (compression_spec.level != Z_DEFAULT_COMPRESSION)
{
/* user has specified a compression level, so tell zlib to use it */
char mode_compression[32];
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression);
+ mode, compression_spec.level);
fp->compressedfp = gzopen(path, mode_compression);
}
else
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index f635787692..6fad6c2cd5 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,12 +21,6 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-typedef enum
-{
- COMPR_ALG_NONE,
- COMPR_ALG_LIBZ
-} CompressionAlgorithm;
-
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -46,8 +40,10 @@ typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
-extern CompressorState *AllocateCompressor(int compression, WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH, int compression,
+extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ WriteFunc writeF);
+extern void ReadDataFromArchive(ArchiveHandle *AH,
+ const pg_compress_specification compression_spec,
ReadFunc readF);
extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen);
@@ -56,9 +52,13 @@ extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
typedef struct cfp cfp;
-extern cfp *cfopen(const char *path, const char *mode, int compression);
+extern cfp *cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode, int compression);
+extern cfp *cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compression_spec);
extern int cfread(void *ptr, int size, cfp *fp);
extern int cfwrite(const void *ptr, int size, cfp *fp);
extern int cfgetc(cfp *fp);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index e8b7898297..bc6b6594af 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -23,6 +23,7 @@
#ifndef PG_BACKUP_H
#define PG_BACKUP_H
+#include "common/compression.h"
#include "fe_utils/simple_list.h"
#include "libpq-fe.h"
@@ -143,7 +144,8 @@ typedef struct _restoreOptions
int noDataForFailedTables;
int exit_on_error;
- int compression;
+ pg_compress_specification compression_spec; /* Specification for
+ * compression */
int suppressDumpWarnings; /* Suppress output of WARNING entries
* to stderr */
bool single_txn;
@@ -303,7 +305,8 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
/* Create a new archive */
extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compression_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker);
/* The --list option */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index f39c0fa36f..22238539fd 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -70,7 +70,8 @@ typedef struct _parallelReadyList
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compression_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr);
static void _getObjectDescription(PQExpBuffer buf, const TocEntry *te);
static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData);
@@ -98,7 +99,8 @@ static int _discoverArchiveFormat(ArchiveHandle *AH);
static int RestoringToDB(ArchiveHandle *AH);
static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
-static void SetOutput(ArchiveHandle *AH, const char *filename, int compression);
+static void SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compression_spec);
static OutputContext SaveOutput(ArchiveHandle *AH);
static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
@@ -239,12 +241,13 @@ setupRestoreWorker(Archive *AHX)
/* Public */
Archive *
CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compression_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, dosync,
- mode, setupDumpWorker);
+ ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression_spec,
+ dosync, mode, setupDumpWorker);
return (Archive *) AH;
}
@@ -254,7 +257,12 @@ CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
Archive *
OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, true, archModeRead, setupRestoreWorker);
+ ArchiveHandle *AH;
+ pg_compress_specification compression_spec = {0};
+
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
+ AH = _allocAH(FileSpec, fmt, compression_spec, true,
+ archModeRead, setupRestoreWorker);
return (Archive *) AH;
}
@@ -384,7 +392,8 @@ RestoreArchive(Archive *AHX)
* Make sure we won't need (de)compression we haven't got
*/
#ifndef HAVE_LIBZ
- if (AH->compression != 0 && AH->PrintTocDataPtr != NULL)
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
@@ -459,8 +468,8 @@ RestoreArchive(Archive *AHX)
* Setup the output file if necessary.
*/
sav = SaveOutput(AH);
- if (ropt->filename || ropt->compression)
- SetOutput(AH, ropt->filename, ropt->compression);
+ if (ropt->filename || ropt->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ SetOutput(AH, ropt->filename, ropt->compression_spec);
ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
@@ -739,7 +748,7 @@ RestoreArchive(Archive *AHX)
*/
AH->stage = STAGE_FINALIZING;
- if (ropt->filename || ropt->compression)
+ if (ropt->filename || ropt->compression_spec.algorithm != PG_COMPRESSION_NONE)
RestoreOutput(AH, sav);
if (ropt->useDB)
@@ -969,6 +978,8 @@ NewRestoreOptions(void)
opts->format = archUnknown;
opts->cparams.promptPassword = TRI_DEFAULT;
opts->dumpSections = DUMP_UNSECTIONED;
+ opts->compression_spec.algorithm = PG_COMPRESSION_NONE;
+ opts->compression_spec.level = 0;
return opts;
}
@@ -1115,23 +1126,28 @@ PrintTOCSummary(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
TocEntry *te;
+ pg_compress_specification out_compression_spec = {0};
teSection curSection;
OutputContext sav;
const char *fmtName;
char stamp_str[64];
+ /* TOC is always uncompressed */
+ out_compression_spec.algorithm = PG_COMPRESSION_NONE;
+
sav = SaveOutput(AH);
if (ropt->filename)
- SetOutput(AH, ropt->filename, 0 /* no compression */ );
+ SetOutput(AH, ropt->filename, out_compression_spec);
if (strftime(stamp_str, sizeof(stamp_str), PGDUMP_STRFTIME_FMT,
localtime(&AH->createDate)) == 0)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1485,7 +1501,8 @@ archprintf(Archive *AH, const char *fmt,...)
*******************************/
static void
-SetOutput(ArchiveHandle *AH, const char *filename, int compression)
+SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compression_spec)
{
int fn;
@@ -1508,12 +1525,12 @@ SetOutput(ArchiveHandle *AH, const char *filename, int compression)
/* If compression explicitly requested, use gzopen */
#ifdef HAVE_LIBZ
- if (compression != 0)
+ if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
char fmode[14];
/* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression);
+ sprintf(fmode, "wb%d", compression_spec.level);
if (fn >= 0)
AH->OF = gzdopen(dup(fn), fmode);
else
@@ -2198,7 +2215,8 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
static ArchiveHandle *
_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compression_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
@@ -2249,7 +2267,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
AH->toc->prev = AH->toc;
AH->mode = mode;
- AH->compression = compression;
+ AH->compression_spec = compression_spec;
AH->dosync = dosync;
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
@@ -2264,7 +2282,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
* Force stdin/stdout into binary mode if that is what we are using.
*/
#ifdef WIN32
- if ((fmt != archNull || compression != 0) &&
+ if ((fmt != archNull || compression_spec.algorithm != PG_COMPRESSION_NONE) &&
(AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0))
{
if (mode == archModeWrite)
@@ -3669,7 +3687,12 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- WriteInt(AH, AH->compression);
+ /*
+ * For now the compression type is implied by the level. This will need
+ * to change once support for more compression algorithms is added,
+ * requiring a format bump.
+ */
+ WriteInt(AH, AH->compression_spec.level);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3740,18 +3763,23 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
+ /* Guess the compression method based on the level */
+ AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
if (AH->version >= K_VERS_1_2)
{
if (AH->version < K_VERS_1_4)
- AH->compression = AH->ReadBytePtr(AH);
+ AH->compression_spec.level = AH->ReadBytePtr(AH);
else
- AH->compression = ReadInt(AH);
+ AH->compression_spec.level = ReadInt(AH);
+
+ if (AH->compression_spec.level != 0)
+ AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
else
- AH->compression = Z_DEFAULT_COMPRESSION;
+ AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
#ifndef HAVE_LIBZ
- if (AH->compression != 0)
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
#endif
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 42687c4ec8..a9560c6045 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -331,14 +331,8 @@ struct _archiveHandle
DumpId *tableDataId; /* TABLE DATA ids, indexed by table dumpId */
struct _tocEntry *currToc; /* Used when dumping data */
- int compression; /*---------
- * Compression requested on open().
- * Possible values for compression:
- * -1 Z_DEFAULT_COMPRESSION
- * 0 COMPRESSION_NONE
- * 1-9 levels for gzip compression
- *---------
- */
+ pg_compress_specification compression_spec; /* Requested specification for
+ * compression */
bool dosync; /* data requested to be synced on sight */
ArchiveMode mode; /* File mode - r or w */
void *formatData; /* Header data specific to file format */
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index a0a55a1edd..f413d01fcb 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
}
/*
@@ -377,7 +377,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
}
/*
@@ -566,7 +566,7 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression, _CustomReadFunc);
+ ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 798182b6f7..53ef8db728 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -327,7 +327,8 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
+ AH->compression_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -573,6 +574,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
cfp *tocFH;
+ pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "toc.dat");
@@ -581,7 +583,8 @@ _CloseArchive(ArchiveHandle *AH)
ctx->pstate = ParallelBackupStart(AH);
/* The TOC is always created uncompressed */
- tocFH = cfopen_write(fname, PG_BINARY_W, 0);
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
+ tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
if (tocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -639,12 +642,14 @@ static void
_StartBlobs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "blobs.toc");
/* The blob TOC file is never compressed */
- ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
+ ctx->blobsTocFH = cfopen_write(fname, "ab", compression_spec);
if (ctx->blobsTocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -662,7 +667,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 402b93c610..99f3f5bcae 100644
--- a/src/bin/pg_dump/pg_backup_tar.c
+++ b/src/bin/pg_dump/pg_backup_tar.c
@@ -194,7 +194,7 @@ InitArchiveFmt_Tar(ArchiveHandle *AH)
* possible since gzdopen uses buffered IO which totally screws file
* positioning.
*/
- if (AH->compression != 0)
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
}
else
@@ -328,7 +328,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
}
}
- if (AH->compression == 0)
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = ctx->tarFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -383,7 +383,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
umask(old_umask);
- if (AH->compression == 0)
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = tm->tmpFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -401,7 +401,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
static void
tarClose(ArchiveHandle *AH, TAR_MEMBER *th)
{
- if (AH->compression != 0)
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
if (th->mode == 'w')
@@ -800,7 +800,6 @@ _CloseArchive(ArchiveHandle *AH)
memcpy(ropt, AH->public.ropt, sizeof(RestoreOptions));
ropt->filename = NULL;
ropt->dropSchema = 1;
- ropt->compression = 0;
ropt->superuser = NULL;
ropt->suppressDumpWarnings = true;
@@ -888,7 +887,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
if (oid == 0)
pg_fatal("invalid OID for large object (%u)", oid);
- if (AH->compression != 0)
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
sprintf(fname, "blob_%u.dat", oid);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da427f4d4a..510555ef21 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -105,6 +105,8 @@ static Oid g_last_builtin_oid; /* value of the last builtin oid */
/* The specified names/patterns should to match at least one entity */
static int strict_names = 0;
+static pg_compress_algorithm compression_algorithm = PG_COMPRESSION_NONE;
+
/*
* Object inclusion/exclusion lists
*
@@ -340,10 +342,13 @@ main(int argc, char **argv)
const char *dumpsnapshot = NULL;
char *use_role = NULL;
int numWorkers = 1;
- int compressLevel = -1;
int plainText = 0;
ArchiveFormat archiveFormat = archUnknown;
ArchiveMode archiveMode;
+ pg_compress_specification compression_spec = {0};
+ char *compression_detail = NULL;
+ char *compression_algorithm_str = "none";
+ char *error_detail = NULL;
static DumpOptions dopt;
@@ -561,10 +566,9 @@ main(int argc, char **argv)
dopt.aclsSkip = true;
break;
- case 'Z': /* Compression Level */
- if (!option_parse_int(optarg, "-Z/--compress", 0, 9,
- &compressLevel))
- exit_nicely(1);
+ case 'Z': /* Compression */
+ parse_compress_options(optarg, &compression_algorithm_str,
+ &compression_detail);
break;
case 0:
@@ -687,22 +691,33 @@ main(int argc, char **argv)
if (archiveFormat == archNull)
plainText = 1;
- /* Custom and directory formats are compressed by default, others not */
- if (compressLevel == -1)
- {
-#ifdef HAVE_LIBZ
- if (archiveFormat == archCustom || archiveFormat == archDirectory)
- compressLevel = Z_DEFAULT_COMPRESSION;
- else
-#endif
- compressLevel = 0;
- }
+ /*
+ * Compression options
+ */
+ if (!parse_compress_algorithm(compression_algorithm_str,
+ &compression_algorithm))
+ pg_fatal("unrecognized compression algorithm: \"%s\"",
+ compression_algorithm_str);
-#ifndef HAVE_LIBZ
- if (compressLevel != 0)
- pg_log_warning("requested compression not available in this installation -- archive will be uncompressed");
- compressLevel = 0;
-#endif
+ parse_compress_specification(compression_algorithm, compression_detail,
+ &compression_spec);
+ error_detail = validate_compress_specification(&compression_spec);
+ if (error_detail != NULL)
+ pg_fatal("invalid compression specification: %s",
+ error_detail);
+
+ switch (compression_algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ case PG_COMPRESSION_GZIP:
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ }
/*
* If emitting an archive format, we always want to emit a DATABASE item,
@@ -716,8 +731,8 @@ main(int argc, char **argv)
pg_fatal("parallel backup only supported by the directory format");
/* Open the output file */
- fout = CreateArchive(filename, archiveFormat, compressLevel, dosync,
- archiveMode, setupDumpWorker);
+ fout = CreateArchive(filename, archiveFormat, compression_spec,
+ dosync, archiveMode, setupDumpWorker);
/* Make dump options accessible right away */
SetArchiveOptions(fout, &dopt, NULL);
@@ -948,10 +963,7 @@ main(int argc, char **argv)
ropt->sequence_data = dopt.sequence_data;
ropt->binary_upgrade = dopt.binary_upgrade;
- if (compressLevel == -1)
- ropt->compression = 0;
- else
- ropt->compression = compressLevel;
+ ropt->compression_spec = compression_spec;
ropt->suppressDumpWarnings = true; /* We've already shown them */
@@ -998,7 +1010,8 @@ help(const char *progname)
printf(_(" -j, --jobs=NUM use this many parallel jobs to dump\n"));
printf(_(" -v, --verbose verbose mode\n"));
printf(_(" -V, --version output version information, then exit\n"));
- printf(_(" -Z, --compress=0-9 compression level for compressed formats\n"));
+ printf(_(" -Z, --compress=METHOD[:LEVEL]\n"
+ " compress as specified\n"));
printf(_(" --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n"));
printf(_(" --no-sync do not wait for changes to be written safely to disk\n"));
printf(_(" -?, --help show this help, then exit\n"));
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a583c8a6d2..c8bc02126d 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -121,24 +121,46 @@ command_fails_like(
'pg_restore: cannot specify both --single-transaction and multiple jobs');
command_fails_like(
- [ 'pg_dump', '-Z', '-1' ],
- qr/\Qpg_dump: error: -Z\/--compress must be in range 0..9\E/,
- 'pg_dump: -Z/--compress must be in range');
+ [ 'pg_dump', '--compress', 'garbage' ],
+ qr/\Qpg_dump: error: unrecognized compression algorithm/,
+ 'pg_dump: invalid --compress');
+
+command_fails_like(
+ [ 'pg_dump', '--compress', 'none:1' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "none" does not accept a compression level\E/,
+ 'pg_dump: invalid compression specification: compression algorithm "none" does not accept a compression level'
+);
+
if (check_pg_config("#define HAVE_LIBZ 1"))
{
+ command_fails_like(
+ [ 'pg_dump', '-Z', '15' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)\E/,
+ 'pg_dump: invalid compression specification: must be in range');
+
command_fails_like(
[ 'pg_dump', '--compress', '1', '--format', 'tar' ],
qr/\Qpg_dump: error: compression is not supported by tar archive format\E/,
'pg_dump: compression is not supported by tar archive format');
+
+ command_fails_like(
+ [ 'pg_dump', '-Z', 'gzip:nonInt' ],
+ qr/\Qpg_dump: error: invalid compression specification: unrecognized compression option: "nonInt"\E/,
+ 'pg_dump: invalid compression specification: must be an integer');
}
else
{
# --jobs > 1 forces an error with tar format.
command_fails_like(
- [ 'pg_dump', '--compress', '1', '--format', 'tar', '-j3' ],
- qr/\Qpg_dump: warning: requested compression not available in this installation -- archive will be uncompressed\E/,
- 'pg_dump: warning: compression not available in this installation');
+ [ 'pg_dump', '--format', 'tar', '-j3' ],
+ qr/\Qpg_dump: error: parallel backup only supported by the directory format\E/,
+ 'pg_dump: warning: parallel backup not supported by tar format');
+
+ command_fails_like(
+ [ 'pg_dump', '-Z', 'gzip:nonInt', '--format', 'tar', '-j2' ],
+ qr/\Qpg_dump: error: invalid compression specification: unrecognized compression option\E/,
+ 'pg_dump: invalid compression specification: must be an integer');
}
command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index fe53ed0f89..709db0986d 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -87,7 +87,7 @@ my %pgdump_runs = (
compile_option => 'gzip',
dump_cmd => [
'pg_dump', '--jobs=2',
- '--format=directory', '--compress=1',
+ '--format=directory', '--compress=gzip:1',
"--file=$tempdir/compression_gzip_dir", 'postgres',
],
# Give coverage for manually compressed blob.toc files during
@@ -200,6 +200,7 @@ my %pgdump_runs = (
# Do not use --no-sync to give test coverage for data sync.
defaults_custom_format => {
test_key => 'defaults',
+ compile_option => 'gzip',
dump_cmd => [
'pg_dump', '-Fc', '-Z6',
"--file=$tempdir/defaults_custom_format.dump", 'postgres',
diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl
index f5da6bf46d..19577ce0ea 100644
--- a/src/test/modules/test_pg_dump/t/001_base.pl
+++ b/src/test/modules/test_pg_dump/t/001_base.pl
@@ -20,6 +20,10 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
# to define how each test should (or shouldn't) treat a result
# from a given run.
#
+# compile_option indicates if the commands run depend on a compilation
+# option, if any. This can be used to control if tests should be
+# skipped when a build dependency is not satisfied.
+#
# test_key indicates that a given run should simply use the same
# set of like/unlike tests as another run, and which run that is.
#
@@ -90,6 +94,7 @@ my %pgdump_runs = (
},
defaults_custom_format => {
test_key => 'defaults',
+ compile_option => 'gzip',
dump_cmd => [
'pg_dump', '--no-sync', '-Fc', '-Z6',
"--file=$tempdir/defaults_custom_format.dump", 'postgres',
@@ -749,6 +754,8 @@ $node->start;
my $port = $node->port;
+my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
+
#########################################
# Set up schemas, tables, etc, to be dumped.
@@ -792,6 +799,15 @@ foreach my $run (sort keys %pgdump_runs)
my $test_key = $run;
+ # Skip command-level tests for gzip if there is no support for it.
+ if ( defined($pgdump_runs{$run}->{compile_option})
+ && $pgdump_runs{$run}->{compile_option} eq 'gzip'
+ && !$supports_gzip)
+ {
+ note "$run: skipped due to no gzip support";
+ next;
+ }
+
$node->command_ok(\@{ $pgdump_runs{$run}->{dump_cmd} },
"$run: pg_dump runs");
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 8b9d9f4cad..2a015908f0 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -644,17 +644,35 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>-Z <replaceable class="parameter">0..9</replaceable></option></term>
- <term><option>--compress=<replaceable class="parameter">0..9</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">level</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
+ <term><option>--compress=<replaceable class="parameter">level</replaceable></option></term>
+ <term><option>--compress=<replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
<listitem>
<para>
- Specify the compression level to use. Zero means no compression.
+ Specify the compression method and/or the compression level to use.
+ The compression method can be set to <literal>gzip</literal> or
+ <literal>none</literal> for no compression. A compression level can
+ be optionally specified, by appending the level number after a colon
+ (<literal>:</literal>).
+ </para>
+ <para>
+ If no compression level is specified, the default compression
+ level will be used. If only a level is specified without mentioning
+ an algorithm, <literal>gzip</literal> compression will be used if
+ the level is greater than <literal>0</literal>, and no compression
+ will be used if the level is <literal>0</literal>.
+ </para>
+
+ <para>
For the custom and directory archive formats, this specifies compression of
- individual table-data segments, and the default is to compress
- at a moderate level.
- For plain text output, setting a nonzero compression level causes
- the entire output file to be compressed, as though it had been
- fed through <application>gzip</application>; but the default is not to compress.
+ individual table-data segments, and the default is to compress using
+ <literal>gzip</literal> at a moderate level. For plain text output,
+ setting a nonzero compression level causes the entire output file to be compressed,
+ as though it had been fed through <application>gzip</application>; but the default
+ is not to compress.
+ </para>
+ <para>
The tar archive format currently does not support compression at all.
</para>
</listitem>
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 2f5802195d..58daeca831 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,7 +428,6 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
-CompressionAlgorithm
CompressionLocation
CompressorState
ComputeXidHorizonsResult
--
2.38.1
------- Original Message -------
On Thursday, December 1st, 2022 at 3:05 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Nov 30, 2022 at 05:11:44PM +0000, gkokolatos@pm.me wrote:
Fair enough. The attached v11 does that. 0001 introduces compression
specification and is using it throughout. 0002 paves the way to the
new interface by homogenizing the use of cfp. 0003 introduces the new
API and stores the compression algorithm in the custom format header
instead of the compression level integer. Finally 0004 adds support for
LZ4.

I have been looking at 0001, and.. Hmm. I am really wondering
whether it would not be better to just nuke this warning into orbit.
This stuff enforces non-compression even if -Z has been used to a
non-default value. This has been moved to its current location by
cae2bb1 as of this thread:
/messages/by-id/20160526.185551.242041780.horiguchi.kyotaro@lab.ntt.co.jp

However, this is only active if -Z is used when not building with
zlib. At the end, it comes down to whether we want to prioritize the
portability of pg_dump commands specifying a -Z/--compress across
environments knowing that these may or may not be built with zlib,
vs the amount of simplification/uniformity we would get across the
binaries in the tree once we switch everything to use the compression
specifications. Now that pg_basebackup and pg_receivewal are managed
by compression specifications, and that we'd want more compression
options for pg_dump, I would tend to do the latter and from now on
complain if attempting to do a pg_dump -Z under --without-zlib with a
compression level > 0. zlib is also widely available, and we don't
document the fact that non-compression is enforced in this case,
either. (Two TAP tests with the custom format had to be tweaked.)
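The bare-integer compatibility behaviour the patch documents (a plain `-Z 0` means no compression, any other plain level implies gzip, and anything else is read as METHOD[:DETAIL]) can be sketched with a hypothetical helper — this is not part of the patch, which does the real work in parse_compress_options() and parse_compress_specification():

```shell
# Hypothetical sketch of how the new -Z argument is interpreted.
# A bare integer keeps the historical meaning (0 = none, >0 = gzip at
# that level); anything else is split into METHOD[:DETAIL].
parse_z() {
    arg=$1
    case "$arg" in
        *[!0-9]*)                  # not a plain integer: method[:detail]
            algo=${arg%%:*}
            case "$arg" in
                *:*) detail=${arg#*:} ;;
                *)   detail= ;;
            esac
            ;;
        0)  algo=none; detail= ;;  # -Z 0: historical "no compression"
        *)  algo=gzip; detail=$arg ;;  # -Z N: historical gzip level N
    esac
    echo "$algo${detail:+:$detail}"
}
```

So `parse_z 5` yields `gzip:5`, matching the backward-compatible mapping described in the pg_dump.sgml hunk.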
Fair enough. Thank you for looking. However, I have a small comment
on your new patch.
- /* Custom and directory formats are compressed by default, others not */
- if (compressLevel == -1)
- {
-#ifdef HAVE_LIBZ
- if (archiveFormat == archCustom || archiveFormat == archDirectory)
- compressLevel = Z_DEFAULT_COMPRESSION;
- else
-#endif
- compressLevel = 0;
- }
Nuking the warning from orbit and changing the behaviour around disabling
the requested compression when the libraries are not present should not
mean that we need to change the behaviour of default values for different
formats. Please find v13 attached, which reinstates it.
This in itself got me looking and wondering why the tests succeeded.
The only existing test covering that path is `defaults_dir_format` in
`002_pg_dump.pl`. However, as the test is currently written, it does not
check whether the output was compressed; the restore command would succeed
in either case. A simple `gzip -t -r` against the directory will not
suffice to test it, because there exist files which are never compressed
in this format (.toc). A slightly more involved test case would need
to be written, but before I embark on this journey, I would like to know
whether you would agree to reinstate the defaults for those formats.
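A minimal sketch of such a check — hypothetical, assuming that every file except the TOC files (toc.dat, blobs.toc) should start with the gzip magic bytes 1f 8b in a compressed directory-format dump — could look like:

```shell
# Hypothetical check: verify that the data files of a directory-format
# dump really are gzip-compressed, skipping the TOC files that pg_dump
# always writes uncompressed.  Tests the two gzip magic bytes (1f 8b).
check_dir_dump_compressed() {
    dir=$1
    for f in "$dir"/*; do
        case "$f" in
            *.toc|*/toc.dat) continue ;;   # TOC files are never compressed
        esac
        # Read the first two bytes as hex and strip od's spacing.
        magic=$(od -A n -t x1 -N 2 "$f" | tr -d ' ')
        if [ "$magic" != "1f8b" ]; then
            echo "not compressed: $f"
            return 1
        fi
    done
    echo "all data files compressed"
    return 0
}
```

A TAP-level version of the same idea (reading the first two bytes from Perl) would slot into `002_pg_dump.pl` next to the existing restore check.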
As per the patch, it is true that we do not need to bump the format of
the dump archives, as we can still store only the compression level
and guess the method from it. I have added some notes about that in
ReadHead and WriteHead to not forget.
Agreed. A minor suggestion, if I may.
#ifndef HAVE_LIBZ
- if (AH->compression != 0)
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
#endif
It would seem more consistent to error out in this case. We do error
out in all other cases where the compression is not available.
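For reference, the level-to-method guess that ReadHead performs on archives predating a format bump (as discussed above) is simple enough to state as a sketch (not the patch's code, which is C):

```python
Z_DEFAULT_COMPRESSION = -1  # zlib's sentinel for "use the default level"


def guess_algorithm(level: int) -> str:
    # Older archives store only a compression level, so the method must
    # be inferred from it: 0 means uncompressed, anything else
    # (including Z_DEFAULT_COMPRESSION) means gzip.
    return "none" if level == 0 else "gzip"
```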
Most of the changes are really straightforward, and the patch has
withstood my tests, so I think that this is in rather committable shape as-is.
Thank you.
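For readers following along, the new `-Z`/`--compress` grammar accepts either a bare level or `method[:level]`. A rough sketch of the resolution rules as the updated docs describe them (hypothetical Python; the names here are illustrative, not the patch's `parse_compress_options`):

```python
def resolve_compress_option(optarg: str):
    """Interpret a pg_dump -Z/--compress argument per the new docs:
    a bare level of 0 means no compression, a bare level > 0 implies
    gzip, and "method[:level]" names the algorithm explicitly (level
    None meaning the method's default)."""
    if optarg.isdigit():
        level = int(optarg)
        return ("none", 0) if level == 0 else ("gzip", level)
    method, _, level_str = optarg.partition(":")
    level = int(level_str) if level_str else None
    return (method, level)
```

Validation (e.g. rejecting `none:1` or an out-of-range gzip level) happens in a separate step in the patch, via `validate_compress_specification`.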
Cheers,
//Georgios
--
Michael
Attachments:
v13-0001-Teach-pg_dump-about-compress_spec-and-use-it-thr.patch (text/x-patch)
From 16e10b38cc8eb6eb5b1ffc15365d7e6ce23eef0a Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 1 Dec 2022 08:58:51 +0000
Subject: [PATCH v13] Teach pg_dump about compress_spec and use it throughout.
Align pg_dump with the rest of the binaries which use common compression. It is
teaching pg_dump.c about the common compression definitions and interfaces. Then
it propagates those throughout the code.
---
doc/src/sgml/ref/pg_dump.sgml | 34 +++++--
src/bin/pg_dump/compress_io.c | 107 ++++++++------------
src/bin/pg_dump/compress_io.h | 20 ++--
src/bin/pg_dump/pg_backup.h | 7 +-
src/bin/pg_dump/pg_backup_archiver.c | 78 +++++++++-----
src/bin/pg_dump/pg_backup_archiver.h | 10 +-
src/bin/pg_dump/pg_backup_custom.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 13 ++-
src/bin/pg_dump/pg_backup_tar.c | 11 +-
src/bin/pg_dump/pg_dump.c | 79 ++++++++++-----
src/bin/pg_dump/t/001_basic.pl | 34 +++++--
src/bin/pg_dump/t/002_pg_dump.pl | 3 +-
src/test/modules/test_pg_dump/t/001_base.pl | 16 +++
src/tools/pgindent/typedefs.list | 1 -
14 files changed, 258 insertions(+), 161 deletions(-)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 8b9d9f4cad..2a015908f0 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -644,17 +644,35 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>-Z <replaceable class="parameter">0..9</replaceable></option></term>
- <term><option>--compress=<replaceable class="parameter">0..9</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">level</replaceable></option></term>
+ <term><option>-Z <replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
+ <term><option>--compress=<replaceable class="parameter">level</replaceable></option></term>
+ <term><option>--compress=<replaceable class="parameter">method</replaceable></option>[:<replaceable>level</replaceable>]</term>
<listitem>
<para>
- Specify the compression level to use. Zero means no compression.
+ Specify the compression method and/or the compression level to use.
+ The compression method can be set to <literal>gzip</literal> or
+ <literal>none</literal> for no compression. A compression level can
+ be optionally specified by appending the level number after a colon
+ (<literal>:</literal>).
+ </para>
+ <para>
+ If no compression level is specified, the default compression
+ level will be used. If only a level is specified without mentioning
+ an algorithm, <literal>gzip</literal> compression will be used if
+ the level is greater than <literal>0</literal>, and no compression
+ will be used if the level is <literal>0</literal>.
+ </para>
+
+ <para>
For the custom and directory archive formats, this specifies compression of
- individual table-data segments, and the default is to compress
- at a moderate level.
- For plain text output, setting a nonzero compression level causes
- the entire output file to be compressed, as though it had been
- fed through <application>gzip</application>; but the default is not to compress.
+ individual table-data segments, and the default is to compress using
+ <literal>gzip</literal> at a moderate level. For plain text output,
+ setting a nonzero compression level causes the entire output file to be compressed,
+ as though it had been fed through <application>gzip</application>; but the default
+ is not to compress.
+ </para>
+ <para>
The tar archive format currently does not support compression at all.
</para>
</listitem>
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 62f940ff7a..8f0d6d6210 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -64,7 +64,7 @@
/* typedef appears in compress_io.h */
struct CompressorState
{
- CompressionAlgorithm comprAlg;
+ pg_compress_specification compression_spec;
WriteFunc writeF;
#ifdef HAVE_LIBZ
@@ -74,9 +74,6 @@ struct CompressorState
#endif
};
-static void ParseCompressionOption(int compression, CompressionAlgorithm *alg,
- int *level);
-
/* Routines that support zlib compressed data I/O */
#ifdef HAVE_LIBZ
static void InitCompressorZlib(CompressorState *cs, int level);
@@ -93,57 +90,30 @@ static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
const char *data, size_t dLen);
-/*
- * Interprets a numeric 'compression' value. The algorithm implied by the
- * value (zlib or none at the moment), is returned in *alg, and the
- * zlib compression level in *level.
- */
-static void
-ParseCompressionOption(int compression, CompressionAlgorithm *alg, int *level)
-{
- if (compression == Z_DEFAULT_COMPRESSION ||
- (compression > 0 && compression <= 9))
- *alg = COMPR_ALG_LIBZ;
- else if (compression == 0)
- *alg = COMPR_ALG_NONE;
- else
- {
- pg_fatal("invalid compression code: %d", compression);
- *alg = COMPR_ALG_NONE; /* keep compiler quiet */
- }
-
- /* The level is just the passed-in value. */
- if (level)
- *level = compression;
-}
-
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
-AllocateCompressor(int compression, WriteFunc writeF)
+AllocateCompressor(const pg_compress_specification compression_spec,
+ WriteFunc writeF)
{
CompressorState *cs;
- CompressionAlgorithm alg;
- int level;
-
- ParseCompressionOption(compression, &alg, &level);
#ifndef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
+ if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
pg_fatal("not built with zlib support");
#endif
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
cs->writeF = writeF;
- cs->comprAlg = alg;
+ cs->compression_spec = compression_spec;
/*
* Perform compression algorithm specific initialization.
*/
#ifdef HAVE_LIBZ
- if (alg == COMPR_ALG_LIBZ)
- InitCompressorZlib(cs, level);
+ if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorZlib(cs, cs->compression_spec.level);
#endif
return cs;
@@ -154,15 +124,12 @@ AllocateCompressor(int compression, WriteFunc writeF)
* out with ahwrite().
*/
void
-ReadDataFromArchive(ArchiveHandle *AH, int compression, ReadFunc readF)
+ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compression_spec,
+ ReadFunc readF)
{
- CompressionAlgorithm alg;
-
- ParseCompressionOption(compression, &alg, NULL);
-
- if (alg == COMPR_ALG_NONE)
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
ReadDataFromArchiveNone(AH, readF);
- if (alg == COMPR_ALG_LIBZ)
+ if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
ReadDataFromArchiveZlib(AH, readF);
@@ -179,18 +146,23 @@ void
WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
- switch (cs->comprAlg)
+ switch (cs->compression_spec.algorithm)
{
- case COMPR_ALG_LIBZ:
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
#else
pg_fatal("not built with zlib support");
#endif
break;
- case COMPR_ALG_NONE:
+ case PG_COMPRESSION_NONE:
WriteDataToArchiveNone(AH, cs, data, dLen);
break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -201,7 +173,7 @@ void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
#ifdef HAVE_LIBZ
- if (cs->comprAlg == COMPR_ALG_LIBZ)
+ if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
EndCompressorZlib(AH, cs);
#endif
free(cs);
@@ -453,20 +425,27 @@ cfopen_read(const char *path, const char *mode)
{
cfp *fp;
+ pg_compress_specification compression_spec = {0};
+
#ifdef HAVE_LIBZ
if (hasSuffix(path, ".gz"))
- fp = cfopen(path, mode, 1);
+ {
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ fp = cfopen(path, mode, compression_spec);
+ }
else
#endif
{
- fp = cfopen(path, mode, 0);
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
+ fp = cfopen(path, mode, compression_spec);
#ifdef HAVE_LIBZ
if (fp == NULL)
{
char *fname;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, 1);
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ fp = cfopen(fname, mode, compression_spec);
free_keep_errno(fname);
}
#endif
@@ -479,26 +458,27 @@ cfopen_read(const char *path, const char *mode)
* be a filemode as accepted by fopen() and gzopen() that indicates writing
* ("w", "wb", "a", or "ab").
*
- * If 'compression' is non-zero, a gzip compressed stream is opened, and
- * 'compression' indicates the compression level used. The ".gz" suffix
- * is automatically added to 'path' in that case.
+ * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
+ * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
+ * 'path' in that case.
*
* On failure, return NULL with an error code in errno.
*/
cfp *
-cfopen_write(const char *path, const char *mode, int compression)
+cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
{
cfp *fp;
- if (compression == 0)
- fp = cfopen(path, mode, 0);
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ fp = cfopen(path, mode, compression_spec);
else
{
#ifdef HAVE_LIBZ
char *fname;
fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression);
+ fp = cfopen(fname, mode, compression_spec);
free_keep_errno(fname);
#else
pg_fatal("not built with zlib support");
@@ -509,26 +489,27 @@ cfopen_write(const char *path, const char *mode, int compression)
}
/*
- * Opens file 'path' in 'mode'. If 'compression' is non-zero, the file
+ * Opens file 'path' in 'mode'. If compression is GZIP, the file
* is opened with libz gzopen(), otherwise with plain fopen().
*
* On failure, return NULL with an error code in errno.
*/
cfp *
-cfopen(const char *path, const char *mode, int compression)
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression != 0)
+ if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
- if (compression != Z_DEFAULT_COMPRESSION)
+ if (compression_spec.level != Z_DEFAULT_COMPRESSION)
{
/* user has specified a compression level, so tell zlib to use it */
char mode_compression[32];
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression);
+ mode, compression_spec.level);
fp->compressedfp = gzopen(path, mode_compression);
}
else
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index f635787692..6fad6c2cd5 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,12 +21,6 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-typedef enum
-{
- COMPR_ALG_NONE,
- COMPR_ALG_LIBZ
-} CompressionAlgorithm;
-
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -46,8 +40,10 @@ typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
-extern CompressorState *AllocateCompressor(int compression, WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH, int compression,
+extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ WriteFunc writeF);
+extern void ReadDataFromArchive(ArchiveHandle *AH,
+ const pg_compress_specification compression_spec,
ReadFunc readF);
extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen);
@@ -56,9 +52,13 @@ extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
typedef struct cfp cfp;
-extern cfp *cfopen(const char *path, const char *mode, int compression);
+extern cfp *cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode, int compression);
+extern cfp *cfopen_write(const char *path, const char *mode,
+ const pg_compress_specification compression_spec);
extern int cfread(void *ptr, int size, cfp *fp);
extern int cfwrite(const void *ptr, int size, cfp *fp);
extern int cfgetc(cfp *fp);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index e8b7898297..bc6b6594af 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -23,6 +23,7 @@
#ifndef PG_BACKUP_H
#define PG_BACKUP_H
+#include "common/compression.h"
#include "fe_utils/simple_list.h"
#include "libpq-fe.h"
@@ -143,7 +144,8 @@ typedef struct _restoreOptions
int noDataForFailedTables;
int exit_on_error;
- int compression;
+ pg_compress_specification compression_spec; /* Specification for
+ * compression */
int suppressDumpWarnings; /* Suppress output of WARNING entries
* to stderr */
bool single_txn;
@@ -303,7 +305,8 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
/* Create a new archive */
extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compression_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker);
/* The --list option */
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index f39c0fa36f..ac647662a3 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -70,7 +70,8 @@ typedef struct _parallelReadyList
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compression_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr);
static void _getObjectDescription(PQExpBuffer buf, const TocEntry *te);
static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData);
@@ -98,7 +99,8 @@ static int _discoverArchiveFormat(ArchiveHandle *AH);
static int RestoringToDB(ArchiveHandle *AH);
static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
-static void SetOutput(ArchiveHandle *AH, const char *filename, int compression);
+static void SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compression_spec);
static OutputContext SaveOutput(ArchiveHandle *AH);
static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
@@ -239,12 +241,13 @@ setupRestoreWorker(Archive *AHX)
/* Public */
Archive *
CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compression_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression, dosync,
- mode, setupDumpWorker);
+ ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression_spec,
+ dosync, mode, setupDumpWorker);
return (Archive *) AH;
}
@@ -254,7 +257,12 @@ CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
Archive *
OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
{
- ArchiveHandle *AH = _allocAH(FileSpec, fmt, 0, true, archModeRead, setupRestoreWorker);
+ ArchiveHandle *AH;
+ pg_compress_specification compression_spec = {0};
+
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
+ AH = _allocAH(FileSpec, fmt, compression_spec, true,
+ archModeRead, setupRestoreWorker);
return (Archive *) AH;
}
@@ -384,7 +392,8 @@ RestoreArchive(Archive *AHX)
* Make sure we won't need (de)compression we haven't got
*/
#ifndef HAVE_LIBZ
- if (AH->compression != 0 && AH->PrintTocDataPtr != NULL)
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
@@ -459,8 +468,8 @@ RestoreArchive(Archive *AHX)
* Setup the output file if necessary.
*/
sav = SaveOutput(AH);
- if (ropt->filename || ropt->compression)
- SetOutput(AH, ropt->filename, ropt->compression);
+ if (ropt->filename || ropt->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ SetOutput(AH, ropt->filename, ropt->compression_spec);
ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
@@ -739,7 +748,7 @@ RestoreArchive(Archive *AHX)
*/
AH->stage = STAGE_FINALIZING;
- if (ropt->filename || ropt->compression)
+ if (ropt->filename || ropt->compression_spec.algorithm != PG_COMPRESSION_NONE)
RestoreOutput(AH, sav);
if (ropt->useDB)
@@ -969,6 +978,8 @@ NewRestoreOptions(void)
opts->format = archUnknown;
opts->cparams.promptPassword = TRI_DEFAULT;
opts->dumpSections = DUMP_UNSECTIONED;
+ opts->compression_spec.algorithm = PG_COMPRESSION_NONE;
+ opts->compression_spec.level = 0;
return opts;
}
@@ -1115,23 +1126,28 @@ PrintTOCSummary(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
TocEntry *te;
+ pg_compress_specification out_compression_spec = {0};
teSection curSection;
OutputContext sav;
const char *fmtName;
char stamp_str[64];
+ /* TOC is always uncompressed */
+ out_compression_spec.algorithm = PG_COMPRESSION_NONE;
+
sav = SaveOutput(AH);
if (ropt->filename)
- SetOutput(AH, ropt->filename, 0 /* no compression */ );
+ SetOutput(AH, ropt->filename, out_compression_spec);
if (strftime(stamp_str, sizeof(stamp_str), PGDUMP_STRFTIME_FMT,
localtime(&AH->createDate)) == 0)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1485,7 +1501,8 @@ archprintf(Archive *AH, const char *fmt,...)
*******************************/
static void
-SetOutput(ArchiveHandle *AH, const char *filename, int compression)
+SetOutput(ArchiveHandle *AH, const char *filename,
+ const pg_compress_specification compression_spec)
{
int fn;
@@ -1508,12 +1525,12 @@ SetOutput(ArchiveHandle *AH, const char *filename, int compression)
/* If compression explicitly requested, use gzopen */
#ifdef HAVE_LIBZ
- if (compression != 0)
+ if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
char fmode[14];
/* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression);
+ sprintf(fmode, "wb%d", compression_spec.level);
if (fn >= 0)
AH->OF = gzdopen(dup(fn), fmode);
else
@@ -2198,7 +2215,8 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
static ArchiveHandle *
_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const int compression, bool dosync, ArchiveMode mode,
+ const pg_compress_specification compression_spec,
+ bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
@@ -2249,7 +2267,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
AH->toc->prev = AH->toc;
AH->mode = mode;
- AH->compression = compression;
+ AH->compression_spec = compression_spec;
AH->dosync = dosync;
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
@@ -2264,7 +2282,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
* Force stdin/stdout into binary mode if that is what we are using.
*/
#ifdef WIN32
- if ((fmt != archNull || compression != 0) &&
+ if ((fmt != archNull || compression_spec.algorithm != PG_COMPRESSION_NONE) &&
(AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0))
{
if (mode == archModeWrite)
@@ -3669,7 +3687,12 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- WriteInt(AH, AH->compression);
+ /*
+ * For now the compression type is implied by the level. This will need
+ * to change once support for more compression algorithms is added,
+ * requiring a format bump.
+ */
+ WriteInt(AH, AH->compression_spec.level);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3740,19 +3763,24 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
+ /* Guess the compression method based on the level */
+ AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
if (AH->version >= K_VERS_1_2)
{
if (AH->version < K_VERS_1_4)
- AH->compression = AH->ReadBytePtr(AH);
+ AH->compression_spec.level = AH->ReadBytePtr(AH);
else
- AH->compression = ReadInt(AH);
+ AH->compression_spec.level = ReadInt(AH);
+
+ if (AH->compression_spec.level != 0)
+ AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
else
- AH->compression = Z_DEFAULT_COMPRESSION;
+ AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
#ifndef HAVE_LIBZ
- if (AH->compression != 0)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ pg_fatal("archive is compressed, but this installation does not support compression");
#endif
if (AH->version >= K_VERS_1_4)
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 42687c4ec8..a9560c6045 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -331,14 +331,8 @@ struct _archiveHandle
DumpId *tableDataId; /* TABLE DATA ids, indexed by table dumpId */
struct _tocEntry *currToc; /* Used when dumping data */
- int compression; /*---------
- * Compression requested on open().
- * Possible values for compression:
- * -1 Z_DEFAULT_COMPRESSION
- * 0 COMPRESSION_NONE
- * 1-9 levels for gzip compression
- *---------
- */
+ pg_compress_specification compression_spec; /* Requested specification for
+ * compression */
bool dosync; /* data requested to be synced on sight */
ArchiveMode mode; /* File mode - r or w */
void *formatData; /* Header data specific to file format */
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index a0a55a1edd..f413d01fcb 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
}
/*
@@ -377,7 +377,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
}
/*
@@ -566,7 +566,7 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression, _CustomReadFunc);
+ ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 798182b6f7..53ef8db728 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -327,7 +327,8 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
+ AH->compression_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -573,6 +574,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
cfp *tocFH;
+ pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "toc.dat");
@@ -581,7 +583,8 @@ _CloseArchive(ArchiveHandle *AH)
ctx->pstate = ParallelBackupStart(AH);
/* The TOC is always created uncompressed */
- tocFH = cfopen_write(fname, PG_BINARY_W, 0);
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
+ tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
if (tocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -639,12 +642,14 @@ static void
_StartBlobs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
setFilePath(AH, fname, "blobs.toc");
/* The blob TOC file is never compressed */
- ctx->blobsTocFH = cfopen_write(fname, "ab", 0);
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
+ ctx->blobsTocFH = cfopen_write(fname, "ab", compression_spec);
if (ctx->blobsTocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -662,7 +667,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index 402b93c610..99f3f5bcae 100644
--- a/src/bin/pg_dump/pg_backup_tar.c
+++ b/src/bin/pg_dump/pg_backup_tar.c
@@ -194,7 +194,7 @@ InitArchiveFmt_Tar(ArchiveHandle *AH)
* possible since gzdopen uses buffered IO which totally screws file
* positioning.
*/
- if (AH->compression != 0)
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
}
else
@@ -328,7 +328,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
}
}
- if (AH->compression == 0)
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = ctx->tarFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -383,7 +383,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
umask(old_umask);
- if (AH->compression == 0)
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE)
tm->nFH = tm->tmpFH;
else
pg_fatal("compression is not supported by tar archive format");
@@ -401,7 +401,7 @@ tarOpen(ArchiveHandle *AH, const char *filename, char mode)
static void
tarClose(ArchiveHandle *AH, TAR_MEMBER *th)
{
- if (AH->compression != 0)
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
if (th->mode == 'w')
@@ -800,7 +800,6 @@ _CloseArchive(ArchiveHandle *AH)
memcpy(ropt, AH->public.ropt, sizeof(RestoreOptions));
ropt->filename = NULL;
ropt->dropSchema = 1;
- ropt->compression = 0;
ropt->superuser = NULL;
ropt->suppressDumpWarnings = true;
@@ -888,7 +887,7 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
if (oid == 0)
pg_fatal("invalid OID for large object (%u)", oid);
- if (AH->compression != 0)
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
pg_fatal("compression is not supported by tar archive format");
sprintf(fname, "blob_%u.dat", oid);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index da427f4d4a..f31ed21afa 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -105,6 +105,8 @@ static Oid g_last_builtin_oid; /* value of the last builtin oid */
/* The specified names/patterns should to match at least one entity */
static int strict_names = 0;
+static pg_compress_algorithm compression_algorithm = PG_COMPRESSION_NONE;
+
/*
* Object inclusion/exclusion lists
*
@@ -340,10 +342,14 @@ main(int argc, char **argv)
const char *dumpsnapshot = NULL;
char *use_role = NULL;
int numWorkers = 1;
- int compressLevel = -1;
int plainText = 0;
ArchiveFormat archiveFormat = archUnknown;
ArchiveMode archiveMode;
+ pg_compress_specification compression_spec = {0};
+ char *compression_detail = NULL;
+ char *compression_algorithm_str = "none";
+ char *error_detail = NULL;
+ bool user_compression_defined = false;
static DumpOptions dopt;
@@ -561,10 +567,10 @@ main(int argc, char **argv)
dopt.aclsSkip = true;
break;
- case 'Z': /* Compression Level */
- if (!option_parse_int(optarg, "-Z/--compress", 0, 9,
- &compressLevel))
- exit_nicely(1);
+ case 'Z': /* Compression */
+ parse_compress_options(optarg, &compression_algorithm_str,
+ &compression_detail);
+ user_compression_defined = true;
break;
case 0:
@@ -687,23 +693,50 @@ main(int argc, char **argv)
if (archiveFormat == archNull)
plainText = 1;
- /* Custom and directory formats are compressed by default, others not */
- if (compressLevel == -1)
+ /*
+ * Compression options
+ */
+ if (!parse_compress_algorithm(compression_algorithm_str,
+ &compression_algorithm))
+ pg_fatal("unrecognized compression algorithm: \"%s\"",
+ compression_algorithm_str);
+
+ parse_compress_specification(compression_algorithm, compression_detail,
+ &compression_spec);
+ error_detail = validate_compress_specification(&compression_spec);
+ if (error_detail != NULL)
+ pg_fatal("invalid compression specification: %s",
+ error_detail);
+
+ switch (compression_algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ /* fallthrough */
+ case PG_COMPRESSION_GZIP:
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ }
+
+ /*
+ * Custom and directory formats are compressed by default (zlib), others
+ * not
+ */
+ if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+ !user_compression_defined)
{
#ifdef HAVE_LIBZ
- if (archiveFormat == archCustom || archiveFormat == archDirectory)
- compressLevel = Z_DEFAULT_COMPRESSION;
- else
+ parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+ &compression_spec);
+#else
+ /* no op */
#endif
- compressLevel = 0;
}
-#ifndef HAVE_LIBZ
- if (compressLevel != 0)
- pg_log_warning("requested compression not available in this installation -- archive will be uncompressed");
- compressLevel = 0;
-#endif
-
/*
* If emitting an archive format, we always want to emit a DATABASE item,
* in case --create is specified at pg_restore time.
@@ -716,8 +749,8 @@ main(int argc, char **argv)
pg_fatal("parallel backup only supported by the directory format");
/* Open the output file */
- fout = CreateArchive(filename, archiveFormat, compressLevel, dosync,
- archiveMode, setupDumpWorker);
+ fout = CreateArchive(filename, archiveFormat, compression_spec,
+ dosync, archiveMode, setupDumpWorker);
/* Make dump options accessible right away */
SetArchiveOptions(fout, &dopt, NULL);
@@ -948,10 +981,7 @@ main(int argc, char **argv)
ropt->sequence_data = dopt.sequence_data;
ropt->binary_upgrade = dopt.binary_upgrade;
- if (compressLevel == -1)
- ropt->compression = 0;
- else
- ropt->compression = compressLevel;
+ ropt->compression_spec = compression_spec;
ropt->suppressDumpWarnings = true; /* We've already shown them */
@@ -998,7 +1028,8 @@ help(const char *progname)
printf(_(" -j, --jobs=NUM use this many parallel jobs to dump\n"));
printf(_(" -v, --verbose verbose mode\n"));
printf(_(" -V, --version output version information, then exit\n"));
- printf(_(" -Z, --compress=0-9 compression level for compressed formats\n"));
+ printf(_(" -Z, --compress=METHOD[:LEVEL]\n"
+ " compress as specified\n"));
printf(_(" --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n"));
printf(_(" --no-sync do not wait for changes to be written safely to disk\n"));
printf(_(" -?, --help show this help, then exit\n"));
diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl
index a583c8a6d2..c8bc02126d 100644
--- a/src/bin/pg_dump/t/001_basic.pl
+++ b/src/bin/pg_dump/t/001_basic.pl
@@ -121,24 +121,46 @@ command_fails_like(
'pg_restore: cannot specify both --single-transaction and multiple jobs');
command_fails_like(
- [ 'pg_dump', '-Z', '-1' ],
- qr/\Qpg_dump: error: -Z\/--compress must be in range 0..9\E/,
- 'pg_dump: -Z/--compress must be in range');
+ [ 'pg_dump', '--compress', 'garbage' ],
+ qr/\Qpg_dump: error: unrecognized compression algorithm/,
+ 'pg_dump: invalid --compress');
+
+command_fails_like(
+ [ 'pg_dump', '--compress', 'none:1' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "none" does not accept a compression level\E/,
+ 'pg_dump: invalid compression specification: compression algorithm "none" does not accept a compression level'
+);
+
if (check_pg_config("#define HAVE_LIBZ 1"))
{
+ command_fails_like(
+ [ 'pg_dump', '-Z', '15' ],
+ qr/\Qpg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)\E/,
+ 'pg_dump: invalid compression specification: must be in range');
+
command_fails_like(
[ 'pg_dump', '--compress', '1', '--format', 'tar' ],
qr/\Qpg_dump: error: compression is not supported by tar archive format\E/,
'pg_dump: compression is not supported by tar archive format');
+
+ command_fails_like(
+ [ 'pg_dump', '-Z', 'gzip:nonInt' ],
+ qr/\Qpg_dump: error: invalid compression specification: unrecognized compression option: "nonInt"\E/,
+ 'pg_dump: invalid compression specification: must be an integer');
}
else
{
# --jobs > 1 forces an error with tar format.
command_fails_like(
- [ 'pg_dump', '--compress', '1', '--format', 'tar', '-j3' ],
- qr/\Qpg_dump: warning: requested compression not available in this installation -- archive will be uncompressed\E/,
- 'pg_dump: warning: compression not available in this installation');
+ [ 'pg_dump', '--format', 'tar', '-j3' ],
+ qr/\Qpg_dump: error: parallel backup only supported by the directory format\E/,
+ 'pg_dump: warning: parallel backup not supported by tar format');
+
+ command_fails_like(
+ [ 'pg_dump', '-Z', 'gzip:nonInt', '--format', 'tar', '-j2' ],
+ qr/\Qpg_dump: error: invalid compression specification: unrecognized compression option\E/,
+ 'pg_dump: invalid compression specification: must be an integer');
}
command_fails_like(
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index fe53ed0f89..709db0986d 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -87,7 +87,7 @@ my %pgdump_runs = (
compile_option => 'gzip',
dump_cmd => [
'pg_dump', '--jobs=2',
- '--format=directory', '--compress=1',
+ '--format=directory', '--compress=gzip:1',
"--file=$tempdir/compression_gzip_dir", 'postgres',
],
# Give coverage for manually compressed blob.toc files during
@@ -200,6 +200,7 @@ my %pgdump_runs = (
# Do not use --no-sync to give test coverage for data sync.
defaults_custom_format => {
test_key => 'defaults',
+ compile_option => 'gzip',
dump_cmd => [
'pg_dump', '-Fc', '-Z6',
"--file=$tempdir/defaults_custom_format.dump", 'postgres',
diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl
index f5da6bf46d..19577ce0ea 100644
--- a/src/test/modules/test_pg_dump/t/001_base.pl
+++ b/src/test/modules/test_pg_dump/t/001_base.pl
@@ -20,6 +20,10 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
# to define how each test should (or shouldn't) treat a result
# from a given run.
#
+# compile_option indicates if the commands run depend on a compilation
+# option, if any. This can be used to control if tests should be
+# skipped when a build dependency is not satisfied.
+#
# test_key indicates that a given run should simply use the same
# set of like/unlike tests as another run, and which run that is.
#
@@ -90,6 +94,7 @@ my %pgdump_runs = (
},
defaults_custom_format => {
test_key => 'defaults',
+ compile_option => 'gzip',
dump_cmd => [
'pg_dump', '--no-sync', '-Fc', '-Z6',
"--file=$tempdir/defaults_custom_format.dump", 'postgres',
@@ -749,6 +754,8 @@ $node->start;
my $port = $node->port;
+my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
+
#########################################
# Set up schemas, tables, etc, to be dumped.
@@ -792,6 +799,15 @@ foreach my $run (sort keys %pgdump_runs)
my $test_key = $run;
+ # Skip command-level tests for gzip if there is no support for it.
+ if ( defined($pgdump_runs{$run}->{compile_option})
+ && $pgdump_runs{$run}->{compile_option} eq 'gzip'
+ && !$supports_gzip)
+ {
+ note "$run: skipped due to no gzip support";
+ next;
+ }
+
$node->command_ok(\@{ $pgdump_runs{$run}->{dump_cmd} },
"$run: pg_dump runs");
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 2f5802195d..58daeca831 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,7 +428,6 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
-CompressionAlgorithm
CompressionLocation
CompressorState
ComputeXidHorizonsResult
--
2.34.1
On Thu, Dec 01, 2022 at 02:58:35PM +0000, gkokolatos@pm.me wrote:
Nuking the warning from orbit and changing the behaviour around disabling
the requested compression when the libraries are not present, should not
mean that we need to change the behaviour of default values for different
formats. Please find v13 attached which reinstates it.
Gah, thanks! And this default behavior is documented as dependent on
the compilation as well.
Which in itself got me looking and wondering why the tests succeeded.
The only existing test covering that path is `defaults_dir_format` in
`002_pg_dump.pl`. However as the test is currently written it does not
check whether the output was compressed. The restore command would succeed
in either case. A simple `gzip -t -r` against the directory will not
suffice to test it, because there exist files which are never compressed
in this format (.toc). A somewhat more involved test case would need
to be written, yet before I embark on this journey, I would like to know
if you would agree to reinstate the defaults for those formats.
Off the top of my head, I vaguely recall that -r is not that portable. And
the toc format makes the files generated non-deterministic as these
use OIDs..
[.. thinks ..]
We are going to need a new thing here, as compress_cmd cannot be
directly used. What if we used only an array of glob()-able elements?
Let's say "expected_contents" that could include a "dir_path/*.gz"
conditional on $supports_gzip? glob() can only be calculated when the
test is run as the file names cannot be known beforehand :/
As per the patch, it is true that we do not need to bump the format of
the dump archives, as we can still store only the compression level
and guess the method from it. I have added some notes about that in
ReadHead and WriteHead to not forget.
Agreed. A minor suggestion, if I may.
#ifndef HAVE_LIBZ
-	if (AH->compression != 0)
+	if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
		pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
#endif
It would seem more consistent to error out in this case. We do error
in all other cases where the compression is not available.
Makes sense.
I have gone through the patch again, and applied it. Thanks!
--
Michael
------- Original Message -------
On Friday, December 2nd, 2022 at 2:56 AM, Michael Paquier <michael@paquier.xyz> wrote:
Off the top of my head, I vaguely recall that -r is not that portable. And
the toc format makes the files generated non-deterministic as these
use OIDs..
[.. thinks ..]
We are going to need a new thing here, as compress_cmd cannot be
directly used. What if we used only an array of glob()-able elements?
Let's say "expected_contents" that could include a "dir_path/*.gz"
conditional on $supports_gzip? glob() can only be calculated when the
test is run as the file names cannot be known beforehand :/
You are very correct. However one can glob after the fact. Please find
0001 of the attached v14 which attempts to implement it.
I have gone through the patch again, and applied it. Thanks!
Thank you. Please find the rest of the patchset series rebased on top
of it. I dare say that 0002 is in a state worthy of your consideration.
Cheers,
//Georgios
--
Michael
Attachments:
v14-0001-Provide-coverage-for-pg_dump-default-compression.patch
From fdab9843ba84d64e96e461c9f8e78a932cc366e1 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 2 Dec 2022 16:02:51 +0000
Subject: [PATCH v14 1/4] Provide coverage for pg_dump default compression for
dir format
The restore program will succeed regardless of whether the dumped output was
compressed or not. This commit implements a portable way to check the contents
of the directory via Perl's built-in filename expansion.
---
src/bin/pg_dump/t/002_pg_dump.pl | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 709db0986d..03b5375e70 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -215,6 +215,7 @@ my %pgdump_runs = (
# Do not use --no-sync to give test coverage for data sync.
defaults_dir_format => {
test_key => 'defaults',
+ compile_option => 'gzip',
dump_cmd => [
'pg_dump', '-Fd',
"--file=$tempdir/defaults_dir_format", 'postgres',
@@ -224,6 +225,7 @@ my %pgdump_runs = (
"--file=$tempdir/defaults_dir_format.sql",
"$tempdir/defaults_dir_format",
],
+ glob_pattern => "$tempdir/defaults_dir_format/*.dat.gz"
},
# Do not use --no-sync to give test coverage for data sync.
@@ -4153,6 +4155,13 @@ foreach my $run (sort keys %pgdump_runs)
command_ok(\@full_compress_cmd, "$run: compression commands");
}
+ if ($pgdump_runs{$run}->{glob_pattern})
+ {
+ my $glob_pattern = $pgdump_runs{$run}->{glob_pattern};
+ my @glob_output = glob($glob_pattern);
+ is(scalar(@glob_output) > 0, 1, "glob pattern matched")
+ }
+
if ($pgdump_runs{$run}->{restore_cmd})
{
$node->command_ok(\@{ $pgdump_runs{$run}->{restore_cmd} },
--
2.34.1
v14-0002-Prepare-pg_dump-internals-for-additional-compres.patch
From b87e02c847c84fdd8823b188fd5cf1b28e5b9099 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 2 Dec 2022 16:02:58 +0000
Subject: [PATCH v14 2/4] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression related code and allow for the introduction of additional
archive formats. However, pg_backup_archiver.c was not using that API. This
commit teaches pg_backup_archiver.c about it and is using it throughout.
---
src/bin/pg_dump/compress_io.c | 363 ++++++++++++++++++---------
src/bin/pg_dump/pg_backup_archiver.c | 128 ++++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
3 files changed, 296 insertions(+), 222 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 8f0d6d6210..e453443b6a 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -127,15 +131,23 @@ void
ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compression_spec,
ReadFunc readF)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ switch (compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("not built with zlib support");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -172,11 +184,24 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("not built with zlib support");
#endif
- free(cs);
+ break;
+ case PG_COMPRESSION_NONE:
+ free(cs);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
}
/* Private routines, specific to each compression method. */
@@ -390,10 +415,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_specification compression_spec;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -489,127 +512,195 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ fp->compression_spec = compression_spec;
+
+ switch (compression_spec.algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compression_spec.level);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("not built with zlib support");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("not built with zlib support");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, fp->fp);
+ if (ret != size && !feof(fp->fp))
+ READ_ERROR_EXIT(fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread(fp->fp, ptr, size);
+ if (ret != size && !gzeof(fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite(fp->fp, ptr, size);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc(fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
return ret;
@@ -618,65 +709,113 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets(fp->fp, buf, len);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compression_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose(fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose(fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof(fp->fp);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror(fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("not built with zlib support");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 0081873a72..e1652ad013 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1503,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1565,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2219,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index a9560c6045..ad65693242 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
v14-0003-Introduce-Compressor-API-in-pg_dump.patch
From a8f4dde046c1c955c8c386e554869bb667ddd600 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 2 Dec 2022 16:03:05 +0000
Subject: [PATCH v14 3/4] Introduce Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle is substituting the cfp* family of functions under a
struct of function pointers for opening, writing, etc. The implementor of a new
compression method is now able to "simply" just add those definitions.
Custom compressed archives now need to store the compression algorithm in their
header. This requires a bump in the version number. The level of compression
is no longer stored in the dump as it is irrelevant.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 390 ++++++++++++
src/bin/pg_dump/compress_gzip.h | 9 +
src/bin/pg_dump/compress_io.c | 866 +++++++-------------------
src/bin/pg_dump/compress_io.h | 68 +-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 102 +--
src/bin/pg_dump/pg_backup_archiver.h | 4 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 85 +--
10 files changed, 783 insertions(+), 766 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 9dc5a784dd..29eab02d37 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..bc6d1abc77
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,390 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_gzip.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ int compressionLevel;
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ if (deflateInit(zp, gzipcs->compressionLevel) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ gzipcs->compressionLevel = compressionLevel;
+
+ cs->private = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+typedef struct GzipData
+{
+ gzFile fp;
+ int compressionLevel;
+} GzipData;
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ size_t ret;
+
+ ret = gzread(gd->fp, ptr, size);
+ if (ret != size && !gzeof(gd->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gd->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzwrite(gd->fp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gd->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gd->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzgets(gd->fp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int save_errno;
+ int ret;
+
+ CFH->private = NULL;
+
+ ret = gzclose(gd->fp);
+
+ save_errno = errno;
+ free(gd);
+ errno = save_errno;
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzeof(gd->fp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gd->fp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ char mode_compression[32];
+
+ if (gd->compressionLevel != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, gd->compressionLevel);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gd->fp = gzdopen(dup(fd), mode_compression);
+ else
+ gd->fp = gzopen(path, mode_compression);
+
+ if (gd->fp == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ GzipData *gd;
+
+ CFH->open = Gzip_open;
+ CFH->open_write = Gzip_open_write;
+ CFH->read = Gzip_read;
+ CFH->write = Gzip_write;
+ CFH->gets = Gzip_gets;
+ CFH->getc = Gzip_getc;
+ CFH->close = Gzip_close;
+ CFH->eof = Gzip_eof;
+ CFH->get_error = Gzip_get_error;
+
+ gd = pg_malloc0(sizeof(GzipData));
+ gd->compressionLevel = compressionLevel;
+
+ CFH->private = gd;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..ab0362c1f3
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, int compressionLevel);
+extern void InitCompressGzip(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index e453443b6a..96132116f9 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -51,9 +51,12 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
#include "postgres_fe.h"
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,700 +68,253 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+/* Private routines that support uncompressed data I/O */
+ static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
-/* Public interface routines */
-
-/* Allocate a new compressor */
-CompressorState *
-AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
-{
- CompressorState *cs;
-
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("not built with zlib support");
-#endif
-
- cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
- cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH, pg_compress_specification compression_spec,
- ReadFunc readF)
-{
- switch (compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-}
-
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-}
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
-/*
- * Terminate compression library context and flush its buffers.
- */
-void
-EndCompressor(ArchiveHandle *AH, CompressorState *cs)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- free(cs);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ free(buf);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
+ static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
+ cs->writeF(AH, data, dLen);
}
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+ static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
+ /* no op */
}
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+ static void
+InitCompressorNone(CompressorState *cs)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
}
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
+/* Public interface routines */
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+/* Allocate a new compressor */
+ CompressorState *
+AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF, WriteFunc writeF)
{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
+ CompressorState *cs;
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
+ cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
+ cs->writeF = writeF;
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
+ switch (compression_spec.algorithm)
{
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ case PG_COMPRESSION_NONE:
+ InitCompressorNone(cs);
+ break;
+ case PG_COMPRESSION_GZIP:
+ InitCompressorGzip(cs, compression_spec.level);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
- free(buf);
- free(out);
- free(zp);
+ return cs;
}
-#endif /* HAVE_LIBZ */
-
/*
- * Functions for uncompressed output.
+ * Terminate compression library context and flush its buffers.
*/
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+ void
+EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- cs->writeF(AH, data, dLen);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-
/*----------------------
* Compressed stream API
*----------------------
*/
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+ static int
+hasSuffix(const char *filename, const char *suffix)
{
- pg_compress_specification compression_spec;
- void *fp;
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
-static void
+ static void
free_keep_errno(void *p)
{
- int save_errno = errno;
+ int save_errno = errno;
- free(p);
- errno = save_errno;
+ free(p);
+ errno = save_errno;
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Compression None implementation
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
+ static size_t
+_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
- cfp *fp;
+ FILE *fp = (FILE *) CFH->private;
+ size_t ret;
- pg_compress_specification compression_spec = {0};
+ if (size == 0)
+ return 0;
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+ return ret;
}
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+ static size_t
+_write(const void *ptr, size_t size, CompressFileHandle * CFH)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("not built with zlib support");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ return fwrite(ptr, 1, size, (FILE *) CFH->private);
}
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+ static const char *
+_get_error(CompressFileHandle * CFH)
{
- cfp *fp = pg_malloc(sizeof(cfp));
-
- fp->compression_spec = compression_spec;
-
- switch (compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-
- return fp;
+ return strerror(errno);
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+ static char *
+_gets(char *ptr, int size, CompressFileHandle * CFH)
{
- return cfopen_internal(path, -1, mode, compression_spec);
+ return fgets(ptr, size, (FILE *) CFH->private);
}
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+ static int
+_getc(CompressFileHandle * CFH)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
+ FILE *fp = (FILE *) CFH->private;
+ int ret;
+
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
}
-int
-cfread(void *ptr, int size, cfp *fp)
+ static int
+_close(CompressFileHandle * CFH)
{
- int ret = 0;
-
- if (size == 0)
- return 0;
+ FILE *fp = (FILE *) CFH->private;
+ int ret = 0;
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, fp->fp);
- if (ret != size && !feof(fp->fp))
- READ_ERROR_EXIT(fp->fp);
+ CFH->private = NULL;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzread(fp->fp, ptr, size);
- if (ret != size && !gzeof(fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
+ if (fp)
+ ret = fclose(fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-
- return ret;
+ return ret;
}
-int
-cfwrite(const void *ptr, int size, cfp *fp)
+ static int
+_eof(CompressFileHandle * CFH)
{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite(fp->fp, ptr, size);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-
- return ret;
+ return feof((FILE *) CFH->private);
}
-int
-cfgetc(cfp *fp)
+ static int
+_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
{
- int ret = 0;
+ Assert(CFH->private == NULL);
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc(fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->fp);
+ if (fd >= 0)
+ CFH->private = fdopen(dup(fd), mode);
+ else
+ CFH->private = fopen(path, mode);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof(fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ if (CFH->private == NULL)
+ return 1;
- return ret;
+ return 0;
}
-char *
-cfgets(cfp *fp, char *buf, int len)
+ static int
+_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
{
- char *ret = NULL;
+ Assert(CFH->private == NULL);
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, fp->fp);
+ CFH->private = fopen(path, mode);
+ if (CFH->private == NULL)
+ return 1;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets(fp->fp, buf, len);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return 0;
+}
- return ret;
+ static void
+InitCompressNone(CompressFileHandle * CFH)
+{
+ CFH->open = _open;
+ CFH->open_write = _open_write;
+ CFH->read = _read;
+ CFH->write = _write;
+ CFH->gets = _gets;
+ CFH->getc = _getc;
+ CFH->close = _close;
+ CFH->eof = _eof;
+ CFH->get_error = _get_error;
+
+ CFH->private = NULL;
}
-int
-cfclose(cfp *fp)
+/*
+ * Public interface
+ */
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- int ret = 0;
+ CompressFileHandle *CFH;
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- switch (fp->compression_spec.algorithm)
+ switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ret = fclose(fp->fp);
- fp->fp = NULL;
-
+ InitCompressNone(CFH);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose(fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("not built with zlib support");
-#endif
+ InitCompressGzip(CFH, compression_spec.level);
break;
case PG_COMPRESSION_LZ4:
/* fallthrough */
@@ -767,71 +323,77 @@ cfclose(cfp *fp)
break;
}
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
-int
-cfeof(cfp *fp)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * 'path' doesn't already have it) and try again. So if you pass "foo" as
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ *
+ * On failure, return NULL with an error code in errno.
+ *
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- int ret = 0;
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof(fp->fp);
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof(fp->fp);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- return ret;
-}
+ fname = strdup(path);
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
+ bool exists;
+
+ exists = (stat(path, &st) == 0);
+ /* avoid unused warning if it is not built with compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("not built with zlib support");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
}
- return strerror(errno);
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open(fname, -1, mode, CFH))
+ {
+ free_keep_errno(CFH);
+ CFH = NULL;
+ }
+ free_keep_errno(fname);
+
+ return CFH;
}
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
+int
+DestroyCompressFileHandle(CompressFileHandle * CFH)
{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
+ int ret = 0;
- if (filenamelen < suffixlen)
- return 0;
+ if (CFH->private)
+ ret = CFH->close(CFH);
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
+ free_keep_errno(CFH);
-#endif
+ return ret;
+}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 6fad6c2cd5..1118b7a638 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,60 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ void *private;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open) (const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH);
+ int (*open_write) (const char *path, const char *mode,
+ CompressFileHandle * cxt);
+ size_t (*read) (void *ptr, size_t size, CompressFileHandle * CFH);
+ size_t (*write) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets) (char *s, int size, CompressFileHandle * CFH);
+ int (*getc) (CompressFileHandle * CFH);
+ int (*eof) (CompressFileHandle * CFH);
+ int (*close) (CompressFileHandle * CFH);
+ const char *(*get_error) (CompressFileHandle * CFH);
+
+ void *private;
+};
-typedef struct cfp cfp;
+extern CompressFileHandle * InitCompressFileHandle(const pg_compress_specification compress_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle * InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle * CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index d96e566846..0c73a4707e 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,5 +1,6 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index e1652ad013..989f276301 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle * SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3722,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3737,10 +3740,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index ad65693242..17591af7bc 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,12 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index f413d01fcb..3b461b048a 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 53ef8db728..b03127e720 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,9 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
+ CompressFileHandle *dataFH; /* currently open data file */
- cfp *blobsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *blobsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +198,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +218,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +327,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +346,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -370,7 +371,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +386,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +435,7 @@ _LoadBlobs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +443,14 @@ _LoadBlobs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->blobsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->blobsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->blobsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the blobs TOC file line-by-line, and process each blob */
- while ((cfgets(ctx->blobsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets(line, MAXPGPATH, CFH)) != NULL)
{
char blobfname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +465,11 @@ _LoadBlobs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreBlob(AH, oid);
}
- if (!cfeof(ctx->blobsTocFH))
+ if (!CFH->eof(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->blobsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->blobsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +489,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
return 1;
@@ -512,8 +514,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc(CFH);
}
/*
@@ -524,15 +527,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -545,12 +549,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +578,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +589,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +603,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +654,8 @@ _StartBlobs(ArchiveHandle *AH, TocEntry *te)
/* The blob TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->blobsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->blobsTocFH == NULL)
+ ctx->blobsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->blobsTocFH->open_write(fname, "ab", ctx->blobsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +672,8 @@ _StartBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,17 +686,18 @@ static void
_EndBlob(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->blobsTocFH;
char buf[50];
int len;
/* Close the BLOB data file itself */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the blob in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->blobsTocFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
pg_fatal("could not write to blobs TOC file");
}
@@ -706,7 +711,7 @@ _EndBlobs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->blobsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->blobsTocFH) != 0)
pg_fatal("could not close blobs TOC file: %m");
ctx->blobsTocFH = NULL;
}
--
2.34.1
Attachment: v14-0004-Add-LZ4-compression-in-pg_-dump-restore.patch (text/x-patch)
From 0d16e54023485dcc36ee1252132c444fb4c6aa7f Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 2 Dec 2022 16:03:12 +0000
Subject: [PATCH v14 4/4] Add LZ4 compression in pg_{dump|restore}
Within compress_lz4.{c,h}, both a streaming API and a file API for compression
are implemented. The first is aimed at inlined use cases, so plain lz4.h calls
can be used directly. The second generates output, or parses input, that can
also be produced or read by the lz4 utility.
Wherever the LZ4F API does not provide functionality corresponding to fread(),
fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been implemented
locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 79 ++--
src/bin/pg_dump/compress_lz4.c | 601 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 9 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 69 ++-
9 files changed, 753 insertions(+), 47 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 363d1327e2..ebf3e968d0 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -328,9 +328,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -652,7 +653,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal>, or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -673,8 +674,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 29eab02d37..28c1fc27cc 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 96132116f9..2bf77c693d 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -38,13 +38,15 @@
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs and custom LZ4 calls which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with given
+ * name, and if it fails, it tries to open the same file with the .gz suffix,
+ * failing that it tries to open the same file with the .lz4 suffix.
+ * cfopen_write() opens a file for writing, an extra argument specifies the
+ * method to use should the file be compressed, and adds the appropriate
+ * suffix, .gz or .lz4, to the filename if so. This allows you to easily handle
+ * both compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -57,6 +59,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -115,28 +118,29 @@ InitCompressorNone(CompressorState *cs)
AllocateCompressor(const pg_compress_specification compression_spec,
ReadFunc readF, WriteFunc writeF)
{
- CompressorState *cs;
+ CompressorState *cs;
- cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
- cs->readF = readF;
- cs->writeF = writeF;
+ cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
+ cs->writeF = writeF;
- switch (compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- InitCompressorNone(cs);
- break;
- case PG_COMPRESSION_GZIP:
- InitCompressorGzip(cs, compression_spec.level);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ switch (compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ InitCompressorNone(cs);
+ break;
+ case PG_COMPRESSION_GZIP:
+ InitCompressorGzip(cs, compression_spec.level);
+ break;
+ case PG_COMPRESSION_LZ4:
+ InitCompressorLZ4(cs, compression_spec.level);
+ break;
+ default:
+ pg_fatal("invalid compression method");
+ break;
+ }
- return cs;
+ return cs;
}
/*
@@ -181,7 +185,8 @@ free_keep_errno(void *p)
/*
* Compression None implementation
*/
- static size_t
+
+static size_t
_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
FILE *fp = (FILE *) CFH->private;
@@ -317,7 +322,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressGzip(CFH, compression_spec.level);
break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ InitCompressLZ4(CFH, compression_spec.level);
+ break;
case PG_COMPRESSION_ZSTD:
pg_fatal("invalid compression method");
break;
@@ -330,12 +336,12 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the ".gz" suffix (if
- * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
+ * order.
*
* On failure, return NULL with an error code in errno.
- *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
@@ -371,6 +377,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..8f93f05e87
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,601 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ /* Will be lazy init'd */
+ cs->private = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of the stream is
+ * reached when the overflow buffer holds no decompressed output
+ * and the underlying file has reached EOF.
+ */
+static int
+LZ4File_eof(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * newline char if found first when the eol_flag is set. It is possible that
+ * the decompressed output generated by reading compressed input via the LZ4F
+ * API exceeds 'ptrsize'. Any excess decompressed content is stored in an
+ * overflow buffer within LZ4File. When the function is called, it will first
+ * consume any decompressed content already present in the overflow buffer
+ * before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return -1;
+ }
+
+ /* advance past the chunk just compressed */
+ ptr = ((const char *) ptr) + chunk;
+ remaining -= chunk;
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ LZ4File *lz4fp;
+
+ CFH->open = LZ4File_open;
+ CFH->open_write = LZ4File_open_write;
+ CFH->read = LZ4File_read;
+ CFH->write = LZ4File_write;
+ CFH->gets = LZ4File_gets;
+ CFH->getc = LZ4File_getc;
+ CFH->eof = LZ4File_eof;
+ CFH->close = LZ4File_close;
+ CFH->get_error = LZ4File_get_error;
+
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (compressionLevel >= 0)
+ lz4fp->prefs.compressionLevel = compressionLevel;
+
+ CFH->private = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..fbec9a508d
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, int compressionLevel);
+extern void InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 0c73a4707e..b27e92ffd0 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,6 +1,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -15,7 +16,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -83,7 +84,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 989f276301..6cf745bcc1 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -395,6 +395,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2074,7 +2078,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2088,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3747,6 +3755,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 44e8cd4704..8cb20c1bc0 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -713,13 +713,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 03b5375e70..aedb6c994b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -116,6 +116,67 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=1', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4130,11 +4191,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if (defined($pgdump_runs{$run}->{compile_option}) &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
--
2.34.1
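The trickiest part of the read path above is LZ4File_read_overflow's drain-and-shift handling of leftover decompressed data. Below is a minimal, self-contained sketch of that same pattern, with simplified, hypothetical names rather than the patch's actual code:

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified mirror of the patch's overflow-buffer drain: copy up to
 * "size" bytes of already-decompressed data into "ptr", optionally
 * stopping after the first newline, then shift any unread remainder
 * to the start of the buffer.  Names here are illustrative only.
 */
typedef struct OverflowState
{
	char	   *overflowbuf;
	size_t		overflowlen;
} OverflowState;

static size_t
read_overflow(OverflowState *fs, char *ptr, size_t size, int eol_flag)
{
	char	   *p;
	size_t		readlen;

	if (fs->overflowlen == 0)
		return 0;

	readlen = (fs->overflowlen >= size) ? size : fs->overflowlen;

	/* include the line-terminating char, if requested and present */
	if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)) != NULL)
		readlen = (size_t) (p - fs->overflowbuf) + 1;

	memcpy(ptr, fs->overflowbuf, readlen);
	fs->overflowlen -= readlen;

	/* move what was not consumed to the beginning of the buffer */
	if (fs->overflowlen > 0)
		memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);

	return readlen;
}
```

A gets()-style caller passes eol_flag set and therefore receives at most one line per call, while a read()-style caller drains as many bytes as fit.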
On Fri, Dec 02, 2022 at 04:15:10PM +0000, gkokolatos@pm.me wrote:
You are very correct. However, one can glob after the fact. Please find
0001 of the attached v14 which attempts to implement it.
+ if ($pgdump_runs{$run}->{glob_pattern})
+ {
+ my $glob_pattern = $pgdump_runs{$run}->{glob_pattern};
+ my @glob_output = glob($glob_pattern);
+ is(scalar(@glob_output) > 0, 1, "glob pattern matched")
+ }
While this is correct in checking that the contents are compressed
under --with-zlib, this also removes the coverage where we make sure
that this command is able to complete under --without-zlib without
compressing any of the table data files. Hence my point from
upthread: this test had better not use compile_option, but change
glob_pattern depending on if the build uses zlib or not.
In order to check this behavior with defaults_custom_format, perhaps
we could just remove the -Z6 from it or add an extra command for its
default behavior?
--
Michael
On Sat, Dec 03, 2022 at 11:45:30AM +0900, Michael Paquier wrote:
While this is correct in checking that the contents are compressed
under --with-zlib, this also removes the coverage where we make sure
that this command is able to complete under --without-zlib without
compressing any of the table data files. Hence my point from
upthread: this test had better not use compile_option, but change
glob_pattern depending on whether the build uses zlib or not.
In short, I mean something like the attached. I have named the flag
content_patterns, and switched it to an array so that we can check that
toc.dat is always uncompressed and that the other data files are
compressed or not, as appropriate for the build.
In order to check this behavior with defaults_custom_format, perhaps
we could just remove the -Z6 from it or add an extra command for its
default behavior?
This is slightly more complicated as there is just one file generated
for the compression and non-compression cases, so I have let that as
it is now.
--
Michael
Attachments:
v15-0001-Provide-coverage-for-pg_dump-default-compression.patchtext/x-diff; charset=us-asciiDownload
From 5c583358caed5598fec9abea6750ff7fbd98d269 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Mon, 5 Dec 2022 16:04:57 +0900
Subject: [PATCH v15] Provide coverage for pg_dump default compression for dir
format
The restore program will succeed regardless of whether the dumped output was
compressed or not. This commit implements a portable way to check the contents
of the directory via Perl's built-in filename expansion.
---
src/bin/pg_dump/t/002_pg_dump.pl | 29 +++++++++++++++++++++++++----
1 file changed, 25 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 709db0986d..9796d2667f 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -36,6 +36,9 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
# to test pg_restore's ability to parse manually compressed files
# that otherwise pg_dump does not compress on its own (e.g. *.toc).
#
+# content_patterns is an optional array of patterns that glob() can
+# expand to check the files generated after a dump.
+#
# restore_cmd is the pg_restore command to run, if any. Note
# that this should generally be used when the pg_dump goes to
# a non-text file and that the restore can then be used to
@@ -46,6 +49,10 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
# database and then pg_dump *that* database (or something along
# those lines) to validate that part of the process.
+my $supports_icu = ($ENV{with_icu} eq 'yes');
+my $supports_lz4 = check_pg_config("#define USE_LZ4 1");
+my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
+
my %pgdump_runs = (
binary_upgrade => {
dump_cmd => [
@@ -213,6 +220,9 @@ my %pgdump_runs = (
},
# Do not use --no-sync to give test coverage for data sync.
+ # By default, the directory format compresses its contents
+ # when the code is compiled with gzip support, and leaves them
+ # uncompressed when it is not.
defaults_dir_format => {
test_key => 'defaults',
dump_cmd => [
@@ -224,6 +234,11 @@ my %pgdump_runs = (
"--file=$tempdir/defaults_dir_format.sql",
"$tempdir/defaults_dir_format",
],
+ content_patterns => ["$tempdir/defaults_dir_format/toc.dat",
+ $supports_gzip ?
+ "$tempdir/defaults_dir_format/*.dat.gz" :
+ "$tempdir/defaults_dir_format/*.dat",
+ ],
},
# Do not use --no-sync to give test coverage for data sync.
@@ -3920,10 +3935,6 @@ if ($collation_check_stderr !~ /ERROR: /)
$collation_support = 1;
}
-my $supports_icu = ($ENV{with_icu} eq 'yes');
-my $supports_lz4 = check_pg_config("#define USE_LZ4 1");
-my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
-
# ICU doesn't work with some encodings
my $encoding = $node->safe_psql('postgres', 'show server_encoding');
$supports_icu = 0 if $encoding eq 'SQL_ASCII';
@@ -4153,6 +4164,16 @@ foreach my $run (sort keys %pgdump_runs)
command_ok(\@full_compress_cmd, "$run: compression commands");
}
+ if ($pgdump_runs{$run}->{content_patterns})
+ {
+ my $content_patterns = $pgdump_runs{$run}->{content_patterns};
+ foreach my $content_pattern (@{$content_patterns})
+ {
+ my @glob_output = glob($content_pattern);
+ is(scalar(@glob_output) > 0, 1, "$run: content check for $content_pattern");
+ }
+ }
+
if ($pgdump_runs{$run}->{restore_cmd})
{
$node->command_ok(\@{ $pgdump_runs{$run}->{restore_cmd} },
--
2.38.1
------- Original Message -------
On Monday, December 5th, 2022 at 8:05 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Dec 03, 2022 at 11:45:30AM +0900, Michael Paquier wrote:
While this is correct in checking that the contents are compressed
under --with-zlib, this also removes the coverage where we make sure
that this command is able to complete under --without-zlib without
compressing any of the table data files. Hence my point from
upthread: this test had better not use compile_option, but change
glob_pattern depending on whether the build uses zlib or not.
In short, I mean something like the attached. I have named the flag
content_patterns, and switched it to an array so that we can check that
toc.dat is always uncompressed and that the other data files are
compressed or not, as appropriate for the build.
I see. This approach is much better than my proposal, thanks. If you
allow me, I find 'content_patterns' to be slightly ambiguous. While it is
true that it refers to the contents of a directory, it is not the
contents of the dump that it is examining. I took the liberty of proposing
an alternative name in the attached v16.
I also took the liberty of applying the test pattern when the dump
is explicitly compressed.
In order to check this behavior with defaults_custom_format, perhaps
we could just remove the -Z6 from it or add an extra command for its
default behavior?
This is slightly more complicated
for the compression and non-compression cases, so I have let that as
it is now.
I was thinking a bit more about this. I think that we can use the list
TOC option of pg_restore (-l). This option first prints out the header
info, which contains the compression. The Perl test utilities already
support parsing the generated output of a command. Please find an attempt to do
so in the attached. The benefits of having some testing for this case
become a bit more obvious in 0004 of the patchset, when lz4 is
introduced.
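For reference, the TOC summary that pg_restore -l emits carries a header line of the form `;     Compression: 1`, which is what the quoted regular expressions match on. Extracting the level programmatically takes only a few lines; here is a hypothetical C sketch (the helper name and return convention are made up for illustration):

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan a pg_restore -l TOC listing for its
 * "Compression: N" header line and report the advertised level.
 * Returns 1 on success, 0 if no such line is found.
 */
static int
toc_compression_level(const char *listing, int *level)
{
	const char *p = strstr(listing, "Compression: ");

	if (p == NULL)
		return 0;
	return sscanf(p, "Compression: %d", level) == 1;
}
```

The tests above do the same thing declaratively, through command_like's expected regex.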
Cheers,
//Georgios
--
Michael
Attachments:
v16-0001-Provide-coverage-for-pg_dump-default-compression.patchtext/x-patch; name=v16-0001-Provide-coverage-for-pg_dump-default-compression.patchDownload
From 75619245b02c7cd659d826d6ea8b3964445155a5 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 5 Dec 2022 12:14:22 +0000
Subject: [PATCH v16 1/4] Provide coverage for pg_dump default compression for
dir and custom format
The restore program will succeed regardless of whether the dumped output was
compressed or not. This commit implements a portable way to check the contents
of the directory via Perl's built-in filename expansion. It also implements a
way to check the custom format's data segment compression by examining the
header information in the TOC summary output.
---
src/bin/pg_dump/t/002_pg_dump.pl | 65 +++++++++++++++++++++++++++++---
1 file changed, 59 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index c2da1df39d..248540db8c 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -36,6 +36,9 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
# to test pg_restore's ability to parse manually compressed files
# that otherwise pg_dump does not compress on its own (e.g. *.toc).
#
+# glob_patterns is an optional array of patterns that glob() can
+# expand to check the files generated after a dump.
+#
# restore_cmd is the pg_restore command to run, if any. Note
# that this should generally be used when the pg_dump goes to
# a non-text file and that the restore can then be used to
@@ -46,6 +49,10 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
# database and then pg_dump *that* database (or something along
# those lines) to validate that part of the process.
+my $supports_icu = ($ENV{with_icu} eq 'yes');
+my $supports_lz4 = check_pg_config("#define USE_LZ4 1");
+my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
+
my %pgdump_runs = (
binary_upgrade => {
dump_cmd => [
@@ -79,6 +86,14 @@ my %pgdump_runs = (
"--file=$tempdir/compression_gzip_custom.sql",
"$tempdir/compression_gzip_custom.dump",
],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_gzip_custom.dump",
+ ],
+ expected => qr/Compression: 1/,
+ name => 'data content is gzip compressed'
+ },
},
# Do not use --no-sync to give test coverage for data sync.
@@ -96,6 +111,11 @@ my %pgdump_runs = (
program => $ENV{'GZIP_PROGRAM'},
args => [ '-f', "$tempdir/compression_gzip_dir/blobs.toc", ],
},
+ # Verify that only data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_gzip_dir/toc.dat",
+ "$tempdir/compression_gzip_dir/*.dat.gz",
+ ],
restore_cmd => [
'pg_restore', '--jobs=2',
"--file=$tempdir/compression_gzip_dir.sql",
@@ -200,9 +220,8 @@ my %pgdump_runs = (
# Do not use --no-sync to give test coverage for data sync.
defaults_custom_format => {
test_key => 'defaults',
- compile_option => 'gzip',
dump_cmd => [
- 'pg_dump', '-Fc', '-Z6',
+ 'pg_dump', '-Fc',
"--file=$tempdir/defaults_custom_format.dump", 'postgres',
],
restore_cmd => [
@@ -210,9 +229,22 @@ my %pgdump_runs = (
"--file=$tempdir/defaults_custom_format.sql",
"$tempdir/defaults_custom_format.dump",
],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/defaults_custom_format.dump",
+ ],
+ expected => $supports_gzip ?
+ qr/Compression: -1/ :
+ qr/Compression: 0/,
+ name => 'data content is gzip compressed by default if available'
+ },
},
# Do not use --no-sync to give test coverage for data sync.
+ # By default, the directory format compresses its data files
+ # when the code is compiled with gzip support, and leaves them
+ # uncompressed when it is not.
defaults_dir_format => {
test_key => 'defaults',
dump_cmd => [
@@ -224,6 +256,13 @@ my %pgdump_runs = (
"--file=$tempdir/defaults_dir_format.sql",
"$tempdir/defaults_dir_format",
],
+ glob_patterns => [
+ "$tempdir/defaults_dir_format/toc.dat",
+ "$tempdir/defaults_dir_format/blobs.toc",
+ $supports_gzip ?
+ "$tempdir/defaults_dir_format/*.dat.gz" :
+ "$tempdir/defaults_dir_format/*.dat",
+ ],
},
# Do not use --no-sync to give test coverage for data sync.
@@ -3920,10 +3959,6 @@ if ($collation_check_stderr !~ /ERROR: /)
$collation_support = 1;
}
-my $supports_icu = ($ENV{with_icu} eq 'yes');
-my $supports_lz4 = check_pg_config("#define USE_LZ4 1");
-my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
-
# ICU doesn't work with some encodings
my $encoding = $node->safe_psql('postgres', 'show server_encoding');
$supports_icu = 0 if $encoding eq 'SQL_ASCII';
@@ -4153,6 +4188,24 @@ foreach my $run (sort keys %pgdump_runs)
command_ok(\@full_compress_cmd, "$run: compression commands");
}
+ if ($pgdump_runs{$run}->{glob_patterns})
+ {
+ my $glob_patterns = $pgdump_runs{$run}->{glob_patterns};
+ foreach my $glob_pattern (@{$glob_patterns})
+ {
+ my @glob_output = glob($glob_pattern);
+ is(scalar(@glob_output) > 0, 1, "$run: glob check for $glob_pattern");
+ }
+ }
+
+ if ($pgdump_runs{$run}->{command_like})
+ {
+ my $cmd_like = $pgdump_runs{$run}->{command_like};
+ $node->command_like(\@{ $cmd_like->{command} },
+ $cmd_like->{expected},
+ $cmd_like->{name})
+ }
+
if ($pgdump_runs{$run}->{restore_cmd})
{
$node->command_ok(\@{ $pgdump_runs{$run}->{restore_cmd} },
--
2.34.1
v16-0002-Prepare-pg_dump-internals-for-additional-compres.patchtext/x-patch; name=v16-0002-Prepare-pg_dump-internals-for-additional-compres.patchDownload
From c220ebbc01bf3694dbfb6dff94e3d2bff9b1bbc1 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 2 Dec 2022 16:02:58 +0000
Subject: [PATCH v16 2/4] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression related code and allow for the introduction of additional
archive formats. However, pg_backup_archiver.c was not using that API. This
commit teaches pg_backup_archiver.c about it and is using it throughout.
---
src/bin/pg_dump/compress_io.c | 363 ++++++++++++++++++---------
src/bin/pg_dump/pg_backup_archiver.c | 128 ++++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
3 files changed, 296 insertions(+), 222 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index a7df600cc0..cb59300cb5 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -128,15 +132,23 @@ ReadDataFromArchive(ArchiveHandle *AH,
const pg_compress_specification compression_spec,
ReadFunc readF)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ switch (compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("not built with zlib support");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
}
@@ -173,11 +185,24 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("not built with zlib support");
#endif
- free(cs);
+ break;
+ case PG_COMPRESSION_NONE:
+ free(cs);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
}
/* Private routines, specific to each compression method. */
@@ -391,10 +416,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_specification compression_spec;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -490,127 +513,195 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ fp->compression_spec = compression_spec;
+
+ switch (compression_spec.algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compression_spec.level);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("not built with zlib support");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("not built with zlib support");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, fp->fp);
+ if (ret != size && !feof(fp->fp))
+ READ_ERROR_EXIT(fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread(fp->fp, ptr, size);
+ if (ret != size && !gzeof(fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite(fp->fp, ptr, size);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc(fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
return ret;
@@ -619,65 +710,113 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets(fp->fp, buf, len);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compression_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose(fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose(fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("not built with zlib support");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof(fp->fp);
+#else
+ pg_fatal("not built with zlib support");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ /* fallthrough */
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("invalid compression method");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror(fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("not built with zlib support");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f7a0f1ce7..fb94317ad9 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1503,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1565,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2219,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
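A pattern worth noting in the patch above: `#ifdef HAVE_LIBZ` if/else chains are consistently replaced by exhaustive switch statements over the algorithm enum, with unsupported methods reaching `pg_fatal()`. A minimal sketch of that pattern follows; `my_fatal()` and the `ALG_*` names are hypothetical stand-ins, and the build is assumed to lack `HAVE_LIBZ`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for pg_fatal(): report and exit. */
static void
my_fatal(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(1);
}

typedef enum { ALG_NONE, ALG_GZIP, ALG_LZ4, ALG_ZSTD } alg;

/*
 * Every switch lists all enum values explicitly, so adding a new
 * algorithm makes the compiler flag each call site that has not been
 * taught about it; a build without the needed library fails loudly at
 * runtime instead of silently misbehaving.
 */
static const char *
describe(alg a)
{
    switch (a)
    {
        case ALG_NONE:
            return "none";
        case ALG_GZIP:
#ifdef HAVE_LIBZ
            return "gzip";
#else
            my_fatal("not built with zlib support");
#endif
            break;
        case ALG_LZ4:
            /* fallthrough */
        case ALG_ZSTD:
            my_fatal("invalid compression method");
            break;
    }
    return NULL;                /* unreachable */
}
```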
Attachment: v16-0003-Introduce-Compressor-API-in-pg_dump.patch (text/x-patch)
From 4e5590699a417a080b784429f26056274bf1e142 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 2 Dec 2022 16:03:05 +0000
Subject: [PATCH v16 3/4] Introduce Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementor of a
new compression method now only needs to provide those definitions.
Custom compressed archives now need to store the compression algorithm in
their header, which requires a bump in the archive version number. The
compression level is no longer stored in the dump, as it is irrelevant there.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 390 ++++++++++++
src/bin/pg_dump/compress_gzip.h | 9 +
src/bin/pg_dump/compress_io.c | 839 ++++++--------------------
src/bin/pg_dump/compress_io.h | 68 ++-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 102 ++--
src/bin/pg_dump/pg_backup_archiver.h | 4 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 +--
src/bin/pg_dump/t/002_pg_dump.pl | 6 +-
11 files changed, 777 insertions(+), 760 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
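The function-pointer design the commit message describes can be sketched roughly as follows (a reading aid, not the patch's code): a handle struct carries per-method callbacks plus opaque state, and a per-method `Init` function fills them in, mirroring `InitCompressGzip()`. `my_handle` and its fields are hypothetical stand-ins for CompressFileHandle, showing only the "none" method.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Generic code calls through these pointers and never names a
 * compression library directly.
 */
typedef struct my_handle my_handle;
struct my_handle
{
    size_t      (*write) (const void *ptr, size_t size, my_handle *h);
    int         (*close) (my_handle *h);
    void       *private_data;   /* per-method state, e.g. FILE * or gzFile */
};

/* "None" method: plain stdio, as compress_io.c uses for PG_COMPRESSION_NONE. */
static size_t
none_write(const void *ptr, size_t size, my_handle *h)
{
    return fwrite(ptr, 1, size, (FILE *) h->private_data);
}

static int
none_close(my_handle *h)
{
    int         ret = fclose((FILE *) h->private_data);

    free(h);
    return ret;
}

/*
 * One Init function per method installs the callbacks; adding a new
 * compression method means writing another Init plus its callbacks,
 * with no changes to the callers.
 */
static my_handle *
init_none(const char *path)
{
    my_handle  *h = malloc(sizeof(my_handle));

    h->write = none_write;
    h->close = none_close;
    h->private_data = fopen(path, "wb");
    return h;
}
```

A gzip variant would install callbacks wrapping gzwrite()/gzclose() instead, which is essentially what the new compress_gzip.c below does.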
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 9dc5a784dd..29eab02d37 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..bc6d1abc77
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,390 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_gzip.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ int compressionLevel;
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ if (deflateInit(zp, gzipcs->compressionLevel) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ gzipcs->compressionLevel = compressionLevel;
+
+ cs->private = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+typedef struct GzipData
+{
+ gzFile fp;
+ int compressionLevel;
+} GzipData;
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ size_t ret;
+
+ ret = gzread(gd->fp, ptr, size);
+ if (ret != size && !gzeof(gd->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gd->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzwrite(gd->fp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gd->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gd->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzgets(gd->fp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ int save_errno;
+ int ret;
+
+ CFH->private = NULL;
+
+ ret = gzclose(gd->fp);
+
+ save_errno = errno;
+ free(gd);
+ errno = save_errno;
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+
+ return gzeof(gd->fp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gd->fp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
+{
+ GzipData *gd = (GzipData *) CFH->private;
+ char mode_compression[32];
+
+ if (gd->compressionLevel != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, gd->compressionLevel);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gd->fp = gzdopen(dup(fd), mode_compression);
+ else
+ gd->fp = gzopen(path, mode_compression);
+
+ if (gd->fp == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ GzipData *gd;
+
+ CFH->open = Gzip_open;
+ CFH->open_write = Gzip_open_write;
+ CFH->read = Gzip_read;
+ CFH->write = Gzip_write;
+ CFH->gets = Gzip_gets;
+ CFH->getc = Gzip_getc;
+ CFH->close = Gzip_close;
+ CFH->eof = Gzip_eof;
+ CFH->get_error = Gzip_get_error;
+
+ gd = pg_malloc0(sizeof(GzipData));
+ gd->compressionLevel = compressionLevel;
+
+ CFH->private = gd;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+
+void
+InitCompressGzip(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with zlib support");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..ab0362c1f3
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, int compressionLevel);
+extern void InitCompressGzip(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index cb59300cb5..e7a0e57d8b 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -51,9 +51,12 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
#include "postgres_fe.h"
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,84 +68,67 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+/* Private routines that support uncompressed data I/O */
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+ free(buf);
+}
+
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("not built with zlib support");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("not built with zlib support");
-#endif
+ InitCompressorGzip(cs, compression_spec.level);
break;
case PG_COMPRESSION_LZ4:
/* fallthrough */
@@ -150,33 +136,8 @@ ReadDataFromArchive(ArchiveHandle *AH,
pg_fatal("invalid compression method");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -185,545 +146,177 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- free(cs);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
+/*----------------------
+ * Compressed stream API
+ *----------------------
*/
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
+ if (filenamelen < suffixlen)
+ return 0;
- if (res == Z_STREAM_END)
- break;
- }
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
}
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+/* free() without changing errno; useful in several places below */
+static void
+free_keep_errno(void *p)
{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
+ int save_errno = errno;
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
+ free(p);
+ errno = save_errno;
}
-#endif /* HAVE_LIBZ */
-
/*
- * Functions for uncompressed output.
+ * Compression None implementation
*/
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
+static size_t
+_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
- size_t cnt;
- char *buf;
- size_t buflen;
+ FILE *fp = (FILE *) CFH->private;
+ size_t ret;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ if (size == 0)
+ return 0;
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
- free(buf);
+ return ret;
}
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+static size_t
+_write(const void *ptr, size_t size, CompressFileHandle * CFH)
{
- cs->writeF(AH, data, dLen);
+ return fwrite(ptr, 1, size, (FILE *) CFH->private);
}
-
-/*----------------------
- * Compressed stream API
- *----------------------
- */
-
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
-{
- pg_compress_specification compression_spec;
- void *fp;
-};
-
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
-
-/* free() without changing errno; useful in several places below */
-static void
-free_keep_errno(void *p)
+static const char *
+_get_error(CompressFileHandle * CFH)
{
- int save_errno = errno;
-
- free(p);
- errno = save_errno;
+ return strerror(errno);
}
-/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_read(const char *path, const char *mode)
+static char *
+_gets(char *ptr, int size, CompressFileHandle * CFH)
{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+ return fgets(ptr, size, (FILE *) CFH->private);
}
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static int
+_getc(CompressFileHandle * CFH)
{
- cfp *fp;
+ FILE *fp = (FILE *) CFH->private;
+ int ret;
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
+ ret = fgetc(fp);
+ if (ret == EOF)
{
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("not built with zlib support");
- fp = NULL; /* keep compiler quiet */
-#endif
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
- return fp;
+
+ return ret;
}
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+static int
+_close(CompressFileHandle * CFH)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ FILE *fp = (FILE *) CFH->private;
+ int ret = 0;
- fp->compression_spec = compression_spec;
+ CFH->private = NULL;
- switch (compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp)
+ ret = fclose(fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-
- return fp;
+ return ret;
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+static int
+_eof(CompressFileHandle * CFH)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
+ return feof((FILE *) CFH->private);
}
-int
-cfread(void *ptr, int size, cfp *fp)
+static int
+_open(const char *path, int fd, const char *mode, CompressFileHandle * CFH)
{
- int ret = 0;
+ Assert(CFH->private == NULL);
- if (size == 0)
- return 0;
+ if (fd >= 0)
+ CFH->private = fdopen(dup(fd), mode);
+ else
+ CFH->private = fopen(path, mode);
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, fp->fp);
- if (ret != size && !feof(fp->fp))
- READ_ERROR_EXIT(fp->fp);
+ if (CFH->private == NULL)
+ return 1;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzread(fp->fp, ptr, size);
- if (ret != size && !gzeof(fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-
- return ret;
+ return 0;
}
-int
-cfwrite(const void *ptr, int size, cfp *fp)
+static int
+_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
{
- int ret = 0;
+ Assert(CFH->private == NULL);
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite(fp->fp, ptr, size);
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ CFH->private = fopen(path, mode);
+ if (CFH->private == NULL)
+ return 1;
- return ret;
+ return 0;
}
-int
-cfgetc(cfp *fp)
+static void
+InitCompressNone(CompressFileHandle * CFH)
{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc(fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->fp);
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof(fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-
- return ret;
+ CFH->open = _open;
+ CFH->open_write = _open_write;
+ CFH->read = _read;
+ CFH->write = _write;
+ CFH->gets = _gets;
+ CFH->getc = _getc;
+ CFH->close = _close;
+ CFH->eof = _eof;
+ CFH->get_error = _get_error;
+
+ CFH->private = NULL;
}
-char *
-cfgets(cfp *fp, char *buf, int len)
+/*
+ * Public interface
+ */
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- char *ret = NULL;
+ CompressFileHandle *CFH;
- switch (fp->compression_spec.algorithm)
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
+
+ switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, fp->fp);
-
+ InitCompressNone(CFH);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets(fp->fp, buf, len);
-#else
- pg_fatal("not built with zlib support");
-#endif
+ InitCompressGzip(CFH, compression_spec.level);
break;
case PG_COMPRESSION_LZ4:
/* fallthrough */
@@ -732,107 +325,77 @@ cfgets(cfp *fp, char *buf, int len)
break;
}
- return ret;
+ return CFH;
}
-int
-cfclose(cfp *fp)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * 'path' doesn't already have it) and try again. So if you pass "foo" as
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- int ret = 0;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fclose(fp->fp);
- fp->fp = NULL;
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose(fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("not built with zlib support");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
- free_keep_errno(fp);
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- return ret;
-}
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
-int
-cfeof(cfp *fp)
-{
- int ret = 0;
+ fname = pg_strdup(path);
- switch (fp->compression_spec.algorithm)
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
- case PG_COMPRESSION_NONE:
- ret = feof(fp->fp);
+ bool exists;
- break;
- case PG_COMPRESSION_GZIP:
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when not built with compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- ret = gzeof(fp->fp);
-#else
- pg_fatal("not built with zlib support");
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
}
- return ret;
-}
-
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open(fname, -1, mode, CFH))
{
-#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
-
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("not built with zlib support");
-#endif
+ free_keep_errno(CFH);
+ CFH = NULL;
}
+ free_keep_errno(fname);
- return strerror(errno);
+ return CFH;
}
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
+int
+DestroyCompressFileHandle(CompressFileHandle * CFH)
{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
+ int ret = 0;
- if (filenamelen < suffixlen)
- return 0;
+ if (CFH->private)
+ ret = CFH->close(CFH);
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
+ free_keep_errno(CFH);
-#endif
+ return ret;
+}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 6fad6c2cd5..1118b7a638 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,60 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ void *private;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open) (const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH);
+ int (*open_write) (const char *path, const char *mode,
+ CompressFileHandle * cxt);
+ size_t (*read) (void *ptr, size_t size, CompressFileHandle * CFH);
+ size_t (*write) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets) (char *s, int size, CompressFileHandle * CFH);
+ int (*getc) (CompressFileHandle * CFH);
+ int (*eof) (CompressFileHandle * CFH);
+ int (*close) (CompressFileHandle * CFH);
+ const char *(*get_error) (CompressFileHandle * CFH);
+
+ void *private;
+};
-typedef struct cfp cfp;
+extern CompressFileHandle * InitCompressFileHandle(const pg_compress_specification compression_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle * InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle * CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index d96e566846..0c73a4707e 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,5 +1,6 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index fb94317ad9..dbd698027c 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle * SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle * savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3722,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3737,10 +3740,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..9e97e871f0 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,12 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index f6aee775eb..4182718b0a 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 248540db8c..e8246a3d4c 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -91,7 +91,7 @@ my %pgdump_runs = (
'pg_restore',
'-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip compressed'
},
},
@@ -235,8 +235,8 @@ my %pgdump_runs = (
'-l', "$tempdir/defaults_custom_format.dump",
],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip compressed by default if available'
},
},
--
2.34.1
Attachment: v16-0004-Add-LZ4-compression-in-pg_-dump-restore.patch (text/x-patch)
From 53904d54638ccc3dfbbcc844d97b8f2900c083a8 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 2 Dec 2022 16:03:12 +0000
Subject: [PATCH v16 4/4] Add LZ4 compression in pg_{dump|restore}
compress_lz4.{c,h} implements both the streaming API and the file API for
compression. The first is aimed at inlined use cases, so simple lz4.h calls
can be used directly. The second generates output, or parses input, that can
be read or written by the lz4 utility.
Wherever the LZ4F API does not provide functionality corresponding to
fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 43 +-
src/bin/pg_dump/compress_lz4.c | 601 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 9 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
9 files changed, 748 insertions(+), 29 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 29eab02d37..28c1fc27cc 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index e7a0e57d8b..a1058ff2fe 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -38,13 +38,15 @@
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs and custom LZ4 calls which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with given
+ * name, and if it fails, it tries to open the same file with the .gz suffix,
+ * failing that it tries to open the same file with the .lz4 suffix.
+ * cfopen_write() opens a file for writing, an extra argument specifies the
+ * method to use should the file be compressed, and adds the appropriate
+ * suffix, .gz or .lz4, to the filename if so. This allows you to easily handle
+ * both compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -57,6 +59,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -131,7 +134,8 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorGzip(cs, compression_spec.level);
break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ InitCompressorLZ4(cs, compression_spec.level);
+ break;
case PG_COMPRESSION_ZSTD:
pg_fatal("invalid compression method");
break;
@@ -182,7 +186,8 @@ free_keep_errno(void *p)
/*
* Compression None implementation
*/
- static size_t
+
+static size_t
_read(void *ptr, size_t size, CompressFileHandle * CFH)
{
FILE *fp = (FILE *) CFH->private;
@@ -319,7 +324,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressGzip(CFH, compression_spec.level);
break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ InitCompressLZ4(CFH, compression_spec.level);
+ break;
case PG_COMPRESSION_ZSTD:
pg_fatal("invalid compression method");
break;
@@ -332,12 +338,12 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the ".gz" suffix (if
- * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
+ * order.
*
* On failure, return NULL with an error code in errno.
- *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
@@ -373,6 +379,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..8f93f05e87
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,601 @@
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ /* Will be lazy init'd */
+ cs->private = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurance of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer, is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * newline char if found first when the eol_flag is set. It is possible that
+ * the decompressed
+ * if found first when the eol_flag is set. It is possible that the decompressed
+ * output generated by reading any compressed input via the LZ4F API, exceeds
+ * 'ptrsize'. Any exceeding decompressed content is stored at an overflow
+ * buffer within LZ4File. Of course, when the function is called, it will first
+ * try to consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle * CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int dsize; /* signed: LZ4File_read_internal() returns -1 on error */
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle * CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle * CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ LZ4File *lz4fp;
+
+ CFH->open = LZ4File_open;
+ CFH->open_write = LZ4File_open_write;
+ CFH->read = LZ4File_read;
+ CFH->write = LZ4File_write;
+ CFH->gets = LZ4File_gets;
+ CFH->getc = LZ4File_getc;
+ CFH->eof = LZ4File_eof;
+ CFH->close = LZ4File_close;
+ CFH->get_error = LZ4File_get_error;
+
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (compressionLevel >= 0)
+ lz4fp->prefs.compressionLevel = compressionLevel;
+
+ CFH->private = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+
+void
+InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel)
+{
+ pg_fatal("not built with LZ4 support");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..fbec9a508d
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,9 @@
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, int compressionLevel);
+extern void InitCompressLZ4(CompressFileHandle * CFH, int compressionLevel);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 0c73a4707e..b27e92ffd0 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,6 +1,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -15,7 +16,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -83,7 +84,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index dbd698027c..9bf64b7fa9 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -395,6 +395,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2074,7 +2078,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2088,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3747,6 +3755,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ad6693c358..40f4949a9f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index e8246a3d4c..20f8a8c539 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -136,6 +136,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blobs.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4163,11 +4237,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
--
2.34.1
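The glob_patterns entries added above drive a simple existence check: each pattern must match at least one file in the dump directory, proving that the data files came out compressed. A minimal, self-contained sketch of that check (file names here are illustrative, not taken from a real dump):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory standing in for the pg_dump output directory.
my $tempdir = tempdir(CLEANUP => 1);

# Simulate the files a directory-format lz4 dump leaves behind: an
# uncompressed TOC plus compressed table-data files.
for my $name ('toc.dat', '3071.dat.lz4', '3072.dat.lz4')
{
	open my $fh, '>', "$tempdir/$name" or die $!;
	close $fh;
}

# The check glob_patterns drives: every pattern must match at least one file.
my $ok_count = 0;
for my $pattern ("$tempdir/toc.dat", "$tempdir/*.dat.lz4")
{
	my @matches = glob $pattern;
	if (@matches)
	{
		$ok_count++;
		print "ok\n";
	}
	else
	{
		print "missing\n";
	}
}
```

This mirrors the harness's behavior only in spirit; the real test also runs the dump and restore commands around it.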
On Mon, Dec 05, 2022 at 12:48:28PM +0000, gkokolatos@pm.me wrote:
I also took the liberty of applying the test pattern when the dump
is explicitly compressed.
Sticking with glob_patterns is fine by me.
I was thinking a bit more about this. I think that we can use the list
TOC option of pg_restore. This option will first print out the header
info which contains the compression. Perl utils already support to
parse the generated output of a command. Please find an attempt to do
so in the attached. The benefits of having some testing for this case
become a bit more obvious in 0004 of the patchset, when lz4 is
introduced.
This is where the fun is. What you are doing here is more complete,
and we would make sure that the custom and data directory would always
see their contents compressed by default. And it would have caught
the bug you mentioned upthread for the custom format.
I have kept things as you proposed at the end, added a few comments,
documented the new command_like and an extra command_like for
defaults_dir_format. Glad to see this addressed, thanks!
--
Michael
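The approach discussed above, listing the TOC with pg_restore -l and matching the compression line in the header, can be sketched without a live cluster. The listing text below is hand-written in the shape of that header output, not captured from a real run:

```perl
use strict;
use warnings;

# Canned text shaped like the header block `pg_restore -l` prints for a
# custom-format archive; the exact field values are illustrative.
my $toc_listing = <<'EOF';
;
; Archive created at 2022-12-06 15:42:03 UTC
;     dbname: postgres
;     Format: CUSTOM
;     Compression: lz4
;
EOF

# The same kind of regex match a command_like entry applies to the output.
my $compressed = ($toc_listing =~ /Compression: lz4/) ? 1 : 0;
print $compressed ? "data content is lz4 compressed\n" : "not lz4\n";
```

In the test suite the regex runs against the actual output of the pg_restore command; here it only demonstrates the matching step.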
------- Original Message -------
On Tuesday, December 6th, 2022 at 1:22 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Dec 05, 2022 at 12:48:28PM +0000, gkokolatos@pm.me wrote:
This is where the fun is. What you are doing here is more complete,
and we would make sure that the custom and data directory would always
see their contents compressed by default. And it would have caught
the bug you mentioned upthread for the custom format.
Thank you very much Michael.
I have kept things as you proposed at the end, added a few comments,
documented the new command_like and an extra command_like for
defaults_dir_format. Glad to see this addressed, thanks!
Please find attached v17, which builds on top of what is already
committed. I dare to think 0001 as ready to be reviewed. 0002 is
also complete albeit with some documentation gaps.
Cheers,
//Georgios
Attachments:
v17-0001-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 859a9ca65ddafaf98854b81639d310c11c5cae9c Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 6 Dec 2022 15:42:03 +0000
Subject: [PATCH v17 1/3] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about it and uses it throughout.
---
src/bin/pg_dump/compress_io.c | 396 +++++++++++++++++++--------
src/bin/pg_dump/pg_backup_archiver.c | 128 +++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
3 files changed, 323 insertions(+), 228 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index a7df600cc0..b893aca1ed 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -101,7 +105,7 @@ AllocateCompressor(const pg_compress_specification compression_spec,
#ifndef HAVE_LIBZ
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("not built with zlib support");
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
@@ -128,15 +132,25 @@ ReadDataFromArchive(ArchiveHandle *AH,
const pg_compress_specification compression_spec,
ReadFunc readF)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ switch (compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
}
@@ -149,20 +163,22 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
{
switch (cs->compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ WriteDataToArchiveNone(AH, cs, data, dLen);
+ break;
case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
}
@@ -173,10 +189,26 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
free(cs);
}
@@ -391,10 +423,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_specification compression_spec;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -490,127 +520,203 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ fp->compression_spec = compression_spec;
+
+ switch (compression_spec.algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compression_spec.level);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("not built with zlib support");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, fp->fp);
+ if (ret != size && !feof(fp->fp))
+ READ_ERROR_EXIT(fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread(fp->fp, ptr, size);
+ if (ret != size && !gzeof(fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite(fp->fp, ptr, size);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc(fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof(fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return ret;
@@ -619,65 +725,119 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets(fp->fp, buf, len);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compression_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose(fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose(fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof(fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof(fp->fp);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror(fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f7a0f1ce7..fb94317ad9 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1503,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1565,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2219,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
v17-0002-Introduce-Compressor-API-in-pg_dump.patch (text/x-patch)
From f7915e5dcbb44f323f13d33213723490174a3c1a Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 6 Dec 2022 15:42:11 +0000
Subject: [PATCH v17 2/3] Introduce Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementor of a
new compression method now only needs to supply those definitions.
Custom compressed archives now need to store the compression algorithm in their
header. This requires a bump in the version number. The level of compression
is no longer stored in the dump as it is irrelevant.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 394 +++++++++++
src/bin/pg_dump/compress_gzip.h | 22 +
src/bin/pg_dump/compress_io.c | 897 +++++++-------------------
src/bin/pg_dump/compress_io.h | 71 +-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 102 +--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 +--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/tools/pgindent/typedefs.list | 2 +
12 files changed, 823 insertions(+), 799 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 9dc5a784dd..29eab02d37 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..6a89deaa4b
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,394 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_gzip.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+	gzFile		gzfp = (gzFile) CFH->private;
+
+	CFH->private = NULL;
+
+	return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ CFH->open = Gzip_open;
+ CFH->open_write = Gzip_open_write;
+ CFH->read = Gzip_read;
+ CFH->write = Gzip_write;
+ CFH->gets = Gzip_gets;
+ CFH->getc = Gzip_getc;
+ CFH->close = Gzip_close;
+ CFH->eof = Gzip_eof;
+ CFH->get_error = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..6dfd0eb04d
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * Interface to compress_gzip.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index b893aca1ed..c33736dd49 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,30 +9,30 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims to produce files that
+ * can be easily manipulated with an external compression utility
+ * program.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then call readData, which reads the whole compressed
+ * stream by repeatedly calling the given ReadFunc. ReadFunc returns the
+ * compressed data one chunk at a time, and readData decompresses it and
+ * passes the decompressed data to ahwrite(), until ReadFunc returns 0 to
+ * signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
@@ -51,9 +51,12 @@
*
*-------------------------------------------------------------------------
*/
+#include <sys/stat.h>
+#include <unistd.h>
#include "postgres_fe.h"
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,85 +68,70 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+/* Private routines that support uncompressed data I/O */
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
+ InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -152,35 +140,8 @@ ReadDataFromArchive(ArchiveHandle *AH,
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ return cs;
}
/*
@@ -189,247 +150,28 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- free(cs);
-}
-
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- pg_compress_specification compression_spec;
- void *fp;
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -442,418 +184,223 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Compression None implementation
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
+static size_t
+_read(void *ptr, size_t size, CompressFileHandle *CFH)
{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
+ FILE *fp = (FILE *) CFH->private;
+ size_t ret;
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
+ if (size == 0)
+ return 0;
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- cfp *fp;
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("not built with zlib support");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ return ret;
}
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+static size_t
+_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- cfp *fp = pg_malloc(sizeof(cfp));
-
- fp->compression_spec = compression_spec;
-
- switch (compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return fp;
+ return fwrite(ptr, 1, size, (FILE *) CFH->private);
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static const char *
+_get_error(CompressFileHandle *CFH)
{
- return cfopen_internal(path, -1, mode, compression_spec);
+ return strerror(errno);
}
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+static char *
+_gets(char *ptr, int size, CompressFileHandle *CFH)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
+ return fgets(ptr, size, (FILE *) CFH->private);
}
-int
-cfread(void *ptr, int size, cfp *fp)
+static int
+_getc(CompressFileHandle *CFH)
{
- int ret = 0;
-
- if (size == 0)
- return 0;
+ FILE *fp = (FILE *) CFH->private;
+ int ret;
- switch (fp->compression_spec.algorithm)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, fp->fp);
- if (ret != size && !feof(fp->fp))
- READ_ERROR_EXIT(fp->fp);
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzread(fp->fp, ptr, size);
- if (ret != size && !gzeof(fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
return ret;
}
-int
-cfwrite(const void *ptr, int size, cfp *fp)
+static int
+_close(CompressFileHandle *CFH)
{
+ FILE *fp = (FILE *) CFH->private;
int ret = 0;
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite(fp->fp, ptr, size);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ CFH->private = NULL;
+
+ if (fp)
+ ret = fclose(fp);
return ret;
}
-int
-cfgetc(cfp *fp)
+
+static int
+_eof(CompressFileHandle *CFH)
{
- int ret = 0;
+ return feof((FILE *) CFH->private);
+}
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc(fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->fp);
+static int
+_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private == NULL);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof(fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ if (fd >= 0)
+ CFH->private = fdopen(dup(fd), mode);
+ else
+ CFH->private = fopen(path, mode);
- return ret;
+ if (CFH->private == NULL)
+ return 1;
+
+ return 0;
}
-char *
-cfgets(cfp *fp, char *buf, int len)
+static int
+_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
{
- char *ret = NULL;
+ Assert(CFH->private == NULL);
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, fp->fp);
+ CFH->private = fopen(path, mode);
+ if (CFH->private == NULL)
+ return 1;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets(fp->fp, buf, len);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ return 0;
+}
- return ret;
+static void
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open = _open;
+ CFH->open_write = _open_write;
+ CFH->read = _read;
+ CFH->write = _write;
+ CFH->gets = _gets;
+ CFH->getc = _getc;
+ CFH->close = _close;
+ CFH->eof = _eof;
+ CFH->get_error = _get_error;
+
+ CFH->private = NULL;
}
-int
-cfclose(cfp *fp)
+/*
+ * Public interface
+ */
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- int ret = 0;
+ CompressFileHandle *CFH;
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- switch (fp->compression_spec.algorithm)
+ switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ret = fclose(fp->fp);
- fp->fp = NULL;
-
+ InitCompressNone(CFH, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose(fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
+ InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
+ /* fallthrough */
case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
+ pg_fatal("invalid compression method");
break;
}
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
-int
-cfeof(cfp *fp)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * 'path' doesn't already have it) and try again. So if you pass "foo" as
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- int ret = 0;
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof(fp->fp);
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof(fp->fp);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- return ret;
-}
+	fname = pg_strdup(path);
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
+ bool exists;
+
+ exists = (stat(path, &st) == 0);
+ /* avoid unused warning if it is not built with compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror(fp->fp, &errnum);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
}
- return strerror(errno);
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open(fname, -1, mode, CFH))
+ {
+ free_keep_errno(CFH);
+ CFH = NULL;
+ }
+ free_keep_errno(fname);
+
+ return CFH;
}
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
+int
+DestroyCompressFileHandle(CompressFileHandle *CFH)
{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
+ int ret = 0;
- if (filenamelen < suffixlen)
- return 0;
+ if (CFH->private)
+ ret = CFH->close(CFH);
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
+ free_keep_errno(CFH);
-#endif
+ return ret;
+}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 6fad6c2cd5..3053dc43dd 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,63 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ pg_compress_specification compression_spec;
+ void *private;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+ int (*open_write) (const char *path, const char *mode,
+ CompressFileHandle *cxt);
+ size_t (*read) (void *ptr, size_t size, CompressFileHandle *CFH);
+ size_t (*write) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets) (char *s, int size, CompressFileHandle *CFH);
+ int (*getc) (CompressFileHandle *CFH);
+ int (*eof) (CompressFileHandle *CFH);
+ int (*close) (CompressFileHandle *CFH);
+ const char *(*get_error) (CompressFileHandle *CFH);
+
+ pg_compress_specification compression_spec;
+ void *private;
+};
-typedef struct cfp cfp;
+extern CompressFileHandle *InitCompressFileHandle(
+ const pg_compress_specification compression_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index d96e566846..0c73a4707e 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,5 +1,6 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index fb94317ad9..1a3b309484 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3722,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3737,10 +3740,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index f6aee775eb..4182718b0a 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 6656222363..22a7c5c37c 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 58daeca831..29266b5de9 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1033,6 +1034,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
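The patch above replaces pg_dump's cfopen()/cfread() family with a CompressFileHandle whose operations are dispatched through function pointers. The pattern can be sketched standalone; the names below (FileHandle, init_none_handle, bytes_written) are hypothetical simplifications for illustration, not the patch's actual definitions, and only the no-compression backend is shown:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Simplified stand-in for the CompressFileHandle vtable: callers dispatch
 * through function pointers instead of switching on the compression
 * algorithm at every call site.
 */
typedef struct FileHandle FileHandle;
struct FileHandle
{
    size_t (*write) (const void *ptr, size_t size, FileHandle *fh);
    int    (*close) (FileHandle *fh);
    void   *private_data;       /* backend-specific state */
};

/* "none" backend: counts bytes written instead of touching any file. */
static size_t
none_write(const void *ptr, size_t size, FileHandle *fh)
{
    (void) ptr;
    *(size_t *) fh->private_data += size;
    return size;
}

static int
none_close(FileHandle *fh)
{
    free(fh->private_data);
    fh->private_data = NULL;
    return 0;
}

/* Mirrors the role of InitCompressFileHandle() for the no-compression case. */
FileHandle *
init_none_handle(void)
{
    FileHandle *fh = calloc(1, sizeof(FileHandle));

    fh->write = none_write;
    fh->close = none_close;
    fh->private_data = calloc(1, sizeof(size_t));
    return fh;
}

size_t
bytes_written(const FileHandle *fh)
{
    return *(size_t *) fh->private_data;
}
```

A gzip or LZ4 backend would install its own write/close pointers in the same struct, which is why call sites such as ahwrite() in the patch reduce to `CFH->write(ptr, len, CFH)` with no per-algorithm branching.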
Attachment: v17-0003-Add-LZ4-compression-in-pg_-dump-restore.patch (text/x-patch)
From f20ba2ccb17ab5f8e75f9b73df03e3186f4d6024 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 6 Dec 2022 15:42:31 +0000
Subject: [PATCH v17 3/3] Add LZ4 compression in pg_{dump|restore}
Within compress_lz4.{c,h}, both a streaming API and a file API for compression
are implemented. The first is aimed at inlined use cases, so plain lz4.h calls
can be used directly. The second generates output, or parses input, that can be
written or read with the lz4 utility.
Wherever the LZ4F API does not implement all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 39 +-
src/bin/pg_dump/compress_lz4.c | 618 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pgindent/typedefs.list | 1 +
10 files changed, 776 insertions(+), 28 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
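The patch below extends the file-discovery fallback so that opening "foo" also tries "foo.gz" and then "foo.lz4". That lookup order can be sketched independently of any compression library; `discover_existing` is a hypothetical helper name, and the real InitDiscoverCompressFileHandle() additionally selects the matching compression backend:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Try "path" as-is, then "path.gz", then "path.lz4", in that order, and
 * return the first candidate that exists (caller frees), or NULL if none
 * does (errno is left set by the last stat() call).
 */
char *
discover_existing(const char *path)
{
    static const char *const suffixes[] = {"", ".gz", ".lz4"};
    struct stat st;

    for (size_t i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++)
    {
        size_t  len = strlen(path) + strlen(suffixes[i]) + 1;
        char   *candidate = malloc(len);

        snprintf(candidate, len, "%s%s", path, suffixes[i]);
        if (stat(candidate, &st) == 0)
            return candidate;
        free(candidate);
    }
    return NULL;
}
```

In the patch itself each suffix probe is guarded by the corresponding build flag (HAVE_LIBZ, USE_LZ4), so a build without a given library never reports that library's suffix.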
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 29eab02d37..28c1fc27cc 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index c33736dd49..3c2a297350 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -38,13 +38,15 @@
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs, plus custom LZ4 calls which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with the
+ * given name; if that fails, it tries the same file with the .gz suffix,
+ * and failing that, with the .lz4 suffix.
+ * cfopen_write() opens a file for writing, an extra argument specifies the
+ * method to use should the file be compressed, and adds the appropriate
+ * suffix, .gz or .lz4, to the filename if so. This allows you to easily handle
+ * both compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -57,6 +59,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -134,7 +137,7 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressorLZ4(cs, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -324,7 +327,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ InitCompressLZ4(CFH, compression_spec);
+ break;
case PG_COMPRESSION_ZSTD:
pg_fatal("invalid compression method");
break;
@@ -337,12 +341,12 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the ".gz" suffix (if
+ * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz", trying in that order.
+ * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
+ * order.
*
* On failure, return NULL with an error code in errno.
- *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
@@ -378,6 +382,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..25bc9108dd
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,618 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file is reached when
+ * there is no decompressed output left in the overflow buffer and the
+ * underlying file has reached EOF.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' decompressed content, or up to the new line char
+ * if found first when the eol_flag is set. It is possible that the decompressed
+ * output generated by reading any compressed input via the LZ4F API, exceeds
+ * 'ptrsize'. Any exceeding decompressed content is stored at an overflow
+ * buffer within LZ4File. Of course, when the function is called, it will first
+ * try to consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open = LZ4File_open;
+ CFH->open_write = LZ4File_open_write;
+ CFH->read = LZ4File_read;
+ CFH->write = LZ4File_write;
+ CFH->gets = LZ4File_gets;
+ CFH->getc = LZ4File_getc;
+ CFH->eof = LZ4File_eof;
+ CFH->close = LZ4File_close;
+ CFH->get_error = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..74595db1b9
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * Interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 0c73a4707e..b27e92ffd0 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,6 +1,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -15,7 +16,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -83,7 +84,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 1a3b309484..c0f031b052 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -395,6 +395,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2074,7 +2078,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2088,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3747,6 +3755,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ad6693c358..40f4949a9f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 22a7c5c37c..d1a9e1d45b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 29266b5de9..0e144bceaf 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1380,6 +1380,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
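As an aid for reviewers, the suffix-discovery behavior the patch adds to InitDiscoverCompressFileHandle() (try the plain path first, then the ".gz" and ".lz4" variants, in that order) can be modeled outside of C. A minimal Python sketch of the same lookup order, with hypothetical names:

```python
import os

def discover_compressed(path, suffixes=("", ".gz", ".lz4")):
    """Return the first existing candidate among path, path.gz, path.lz4.

    Mirrors the lookup order the patch gives InitDiscoverCompressFileHandle();
    returns None when no variant exists.
    """
    for suffix in suffixes:
        candidate = path + suffix
        if os.path.exists(candidate):
            return candidate
    return None
```

As in the C code, an uncompressed file shadows a compressed one of the same base name, so a directory containing both toc.dat and toc.dat.lz4 resolves to the plain file.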
001: still refers to "gzip", which is correct for -Fp and -Fd but not
for -Fc, for which it's more correct to say "zlib". That affects the
name of the function, structures, comments, etc. I'm not sure if it's
an issue to re-use the basebackup compression routines here. Maybe we
should accept "-Zn" for zlib output (-Fc), but reject "gzip:9", which
I'm sure some will find confusing, as it does not output. Maybe 001
should be split into a patch to re-use the existing "cfp" interface
(which is a clear win), and 002 to re-use the basebackup interfaces for
user input and constants, etc.
001 still doesn't compile on freebsd, and 002 doesn't compile on
windows. Have you checked test results from cirrusci on your private
github account ?
002 says:
+ save_errno = errno;
+ errno = save_errno;
I suppose that's intended to wrap the preceding library call.
002 breaks "pg_dump -Fc -Z2" because (I think) AllocateCompressor()
doesn't store the passed-in compression_spec.
003 still uses <lz4.h> and not "lz4.h".
Earlier this year I also suggested to include an 999 patch to change to
use LZ4 as the default compression, to exercise the new code under CI.
I suggest to re-open the cf patch entry after that passes tests on all
platforms and when it's ready for more review.
BTW, some of these review comments are the same as what I sent earlier
this year.
/messages/by-id/20220326162156.GI28503@telsasoft.com
/messages/by-id/20220705151328.GQ13040@telsasoft.com
--
Justin
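For reviewers without the C context handy: the trickiest part of the patch's read path is that one chunk of compressed input can decompress to more bytes than the caller asked for, so the surplus must be parked in an overflow buffer for the next call. A rough Python model of that pattern, using zlib's streaming decompressor as a stand-in for the LZ4F API (class and method names invented for illustration):

```python
import zlib

class OverflowReader:
    """Chunked decompression that never returns more than 'size' bytes,
    parking any surplus output in an overflow buffer -- the same idea as
    the patch's LZ4File_read_internal, with zlib standing in for LZ4F."""

    def __init__(self, compressed, chunk=8):
        self._src = memoryview(compressed)
        self._pos = 0
        self._chunk = chunk
        self._dctx = zlib.decompressobj()
        self._overflow = b""

    def read(self, size):
        # First consume already-decompressed content from the overflow buffer.
        out = self._overflow[:size]
        self._overflow = self._overflow[size:]
        while len(out) < size and self._pos < len(self._src):
            raw = self._src[self._pos:self._pos + self._chunk]
            self._pos += len(raw)
            data = self._dctx.decompress(raw)
            take = size - len(out)
            out += data[:take]
            self._overflow += data[take:]  # surplus waits for the next call
        return bytes(out)
```

The C version has the same two phases: drain the overflow buffer, then decompress more input only while the caller's buffer still has room.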
On Sat, Dec 17, 2022 at 05:26:15PM -0600, Justin Pryzby wrote:
001: still refers to "gzip", which is correct for -Fp and -Fd but not
for -Fc, for which it's more correct to say "zlib".
Or should we begin by changing all these existing "not built with zlib
support" error strings to the more generic "this build does not
support compression with %s" to reduce the number of messages to
translate? That would bring consistency with the other tools dealing
with compression.
That affects the
name of the function, structures, comments, etc. I'm not sure if it's
an issue to re-use the basebackup compression routines here. Maybe we
should accept "-Zn" for zlib output (-Fc), but reject "gzip:9", which
I'm sure some will find confusing, as it does not output. Maybe 001
should be split into a patch to re-use the existing "cfp" interface
(which is a clear win), and 002 to re-use the basebackup interfaces for
user input and constants, etc.

001 still doesn't compile on freebsd, and 002 doesn't compile on
windows. Have you checked test results from cirrusci on your private
github account ?
FYI, I have re-added an entry to the CF app to get some automated
coverage:
https://commitfest.postgresql.org/41/3571/
On MinGW, a complain about the open() callback, which I guess ought to
be avoided with a rename:
[00:16:37.254] compress_gzip.c:356:38: error: macro "open" passed 4 arguments, but takes just 3
[00:16:37.254] 356 | ret = CFH->open(fname, -1, mode, CFH);
[00:16:37.254] | ^
[00:16:37.254] In file included from ../../../src/include/c.h:1309,
[00:16:37.254] from ../../../src/include/postgres_fe.h:25,
[00:16:37.254] from compress_gzip.c:15:
On MSVC, some declaration conflicts, for a similar issue:
[00:12:31.966] ../src/bin/pg_dump/compress_io.c(193): error C2371: '_read': redefinition; different basic types
[00:12:31.966] C:\Program Files (x86)\Windows Kits\10\include\10.0.20348.0\ucrt\corecrt_io.h(252): note: see declaration of '_read'
[00:12:31.966] ../src/bin/pg_dump/compress_io.c(210): error C2371: '_write': redefinition; different basic types
[00:12:31.966] C:\Program Files (x86)\Windows Kits\10\include\10.0.20348.0\ucrt\corecrt_io.h(294): note: see declaration of '_write'
002 breaks "pg_dump -Fc -Z2" because (I think) AllocateCompressor()
doesn't store the passed-in compression_spec.
Hmm. This looks like a gap in the existing tests that we'd better fix
first. This CI is green on Linux.
003 still uses <lz4.h> and not "lz4.h".
This should be <lz4.h>, not "lz4.h".
--
Michael
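One more note on the read path under review: the gets()-style reads in the patch stop at the first newline found in the already-decompressed overflow buffer, including the newline itself. The consumption rule of LZ4File_read_overflow() is small enough to model directly; a Python sketch with invented names:

```python
def consume_overflow(overflow, size, eol_flag):
    """Take up to 'size' bytes from the front of 'overflow'; when eol_flag
    is set, stop right after the first newline (the newline is included,
    as in the patch's LZ4File_read_overflow). Returns (taken, remainder)."""
    readlen = min(size, len(overflow))
    if eol_flag:
        nl = overflow.find(b"\n", 0, readlen)
        if nl != -1:
            readlen = nl + 1  # include the line-terminating char
    return overflow[:readlen], overflow[readlen:]
```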
------- Original Message -------
On Monday, December 19th, 2022 at 5:06 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Dec 17, 2022 at 05:26:15PM -0600, Justin Pryzby wrote:
Thank you for the comments, please find v18 attached.
001: still refers to "gzip", which is correct for -Fp and -Fd but not
for -Fc, for which it's more correct to say "zlib".

Or should we begin by changing all these existing "not built with zlib
support" error strings to the more generic "this build does not
support compression with %s" to reduce the number of messages to
translate? That would bring consistency with the other tools dealing
with compression.
This has been the approach from 0002 on-wards. In the attached it is also
applied on the remaining location in 0001.
That affects the
name of the function, structures, comments, etc. I'm not sure if it's
an issue to re-use the basebackup compression routines here. Maybe we
should accept "-Zn" for zlib output (-Fc), but reject "gzip:9", which
I'm sure some will find confusing, as it does not output. Maybe 001
should be split into a patch to re-use the existing "cfp" interface
(which is a clear win), and 002 to re-use the basebackup interfaces for
user input and constants, etc.

001 still doesn't compile on freebsd, and 002 doesn't compile on
windows. Have you checked test results from cirrusci on your private
github account ?
There are still known gaps in 0002 and 0003, for example documentation,
and I have not been focusing too much on those. You are right, it is helpful
and kind to try to reduce the noise. The attached should have hopefully
tackled the ci errors.
FYI, I have re-added an entry to the CF app to get some automated
coverage:
https://commitfest.postgresql.org/41/3571/
Much obliged. Should I change the state to "ready for review" when I post a
new version, or should I leave that to the senior personnel?
On MinGW, a complain about the open() callback, which I guess ought to
be avoided with a rename:
[00:16:37.254] compress_gzip.c:356:38: error: macro "open" passed 4 arguments, but takes just 3
[00:16:37.254] 356 | ret = CFH->open(fname, -1, mode, CFH);
[00:16:37.254] | ^
[00:16:37.254] In file included from ../../../src/include/c.h:1309,
[00:16:37.254] from ../../../src/include/postgres_fe.h:25,
[00:16:37.254] from compress_gzip.c:15:

On MSVC, some declaration conflicts, for a similar issue:
[00:12:31.966] ../src/bin/pg_dump/compress_io.c(193): error C2371: '_read': redefinition; different basic types
[00:12:31.966] C:\Program Files (x86)\Windows Kits\10\include\10.0.20348.0\ucrt\corecrt_io.h(252): note: see declaration of '_read'
[00:12:31.966] ../src/bin/pg_dump/compress_io.c(210): error C2371: '_write': redefinition; different basic types
[00:12:31.966] C:\Program Files (x86)\Windows Kits\10\include\10.0.20348.0\ucrt\corecrt_io.h(294): note: see declaration of '_write'
A rename was enough.
002 breaks "pg_dump -Fc -Z2" because (I think) AllocateCompressor()
doesn't store the passed-in compression_spec.
I am afraid I have not been able to reproduce this error. I tried both
debian and freebsd after I addressed the compilation warnings. Which
error did you get? Is it still present in the attached?
Hmm. This looks like a gap in the existing tests that we'd better fix
first. This CI is green on Linux.
As the code stands, the compression level is not stored in the custom
format's header as it is no longer relevant information. We can decide
to make it relevant for the tests only on the expense of increasing
dump size by four bytes. In either case this is not applicable in
current head and can wait for 0002's turn.
Cheers,
//Georgios
003 still uses <lz4.h> and not "lz4.h".
This should be <lz4.h>, not "lz4.h".
--
Michael
Attachments:
v18-0001-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 3f319d62fb98f1bb6c60eef48b8fd04a1adca289 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 19 Dec 2022 15:02:17 +0000
Subject: [PATCH v18 1/3] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression related code and allow for the introduction of additional
archive formats. However, pg_backup_archiver.c was not using that API. This
commit teaches pg_backup_archiver.c about it and is using it throughout.
---
src/bin/pg_dump/compress_io.c | 398 +++++++++++++++++++--------
src/bin/pg_dump/pg_backup_archiver.c | 128 +++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
3 files changed, 324 insertions(+), 229 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index a7df600cc0..bbac154669 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -101,7 +105,7 @@ AllocateCompressor(const pg_compress_specification compression_spec,
#ifndef HAVE_LIBZ
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("not built with zlib support");
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
@@ -128,15 +132,25 @@ ReadDataFromArchive(ArchiveHandle *AH,
const pg_compress_specification compression_spec,
ReadFunc readF)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ switch (compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
}
@@ -149,20 +163,22 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
{
switch (cs->compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ WriteDataToArchiveNone(AH, cs, data, dLen);
+ break;
case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
}
@@ -173,10 +189,26 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
free(cs);
}
@@ -391,10 +423,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_specification compression_spec;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -482,7 +512,7 @@ cfopen_write(const char *path, const char *mode,
fp = cfopen(fname, mode, compression_spec);
free_keep_errno(fname);
#else
- pg_fatal("not built with zlib support");
+ pg_fatal("this build does not support compression with %s", "gzip");
fp = NULL; /* keep compiler quiet */
#endif
}
@@ -490,127 +520,203 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ fp->compression_spec = compression_spec;
+
+ switch (compression_spec.algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compression_spec.level);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("not built with zlib support");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, (FILE *) fp->fp);
+ if (ret != size && !feof((FILE *) fp->fp))
+ READ_ERROR_EXIT((FILE *) fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread((gzFile) fp->fp, ptr, size);
+ if (ret != size && !gzeof((gzFile) fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite((gzFile) fp->fp, ptr, size);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc((FILE *) fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof((gzFile) fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return ret;
@@ -619,65 +725,119 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, (FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets((gzFile) fp->fp, buf, len);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compression_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose((FILE *) fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose((gzFile) fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof((gzFile) fp->fp);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f7a0f1ce7..fb94317ad9 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1503,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1565,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2219,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
Attachment: v18-0002-Introduce-Compressor-API-in-pg_dump.patch (text/x-patch)
From b6f74632599b30ee5479b548e4a906839eb308e7 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 19 Dec 2022 15:16:45 +0000
Subject: [PATCH v18 2/3] Introduce Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementor of a
new compression method now only needs to provide those definitions.
Custom compressed archives now store the compression algorithm in their
header, which requires a bump in the archive version number. The compression
level is no longer stored in the dump, since it is not needed to decompress
the data.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 398 +++++++++++
src/bin/pg_dump/compress_gzip.h | 22 +
src/bin/pg_dump/compress_io.c | 918 +++++++-------------------
src/bin/pg_dump/compress_io.h | 71 +-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 102 +--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 +--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 2 +
13 files changed, 846 insertions(+), 802 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 9dc5a784dd..29eab02d37 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..95e1d6c276
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,398 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int save_errno;
+ int ret;
+
+ CFH->private_data = NULL;
+
+ ret = gzclose(gzfp);
+
+ save_errno = errno;
+ errno = save_errno;
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..6dfd0eb04d
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * Interface to compress_gzip.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index bbac154669..4fe5b262ad 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,44 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims to produce files that
+ * can be easily manipulated with an external compression utility
+ * program.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read the whole compressed stream by calling
+ * readData, which repeatedly calls the given ReadFunc. ReadFunc returns a
+ * chunk of compressed data at a time, and readData decompresses it and
+ * passes the decompressed data to ahwrite(), until ReadFunc returns 0 to
+ * signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs and custom LZ4 calls, which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. InitDiscoverCompressFileHandle() first tries to
+ * open the file with the given name; if that fails, it tries to open the
+ * same file with the .gz suffix, and failing that, with the .lz4 suffix.
+ * InitCompressFileHandle() creates a handle for writing; the compression
+ * specification selects the method to use, and the appropriate suffix, .gz
+ * or .lz4, is appended to the filename for compressed output. This allows
+ * you to easily handle both compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,7 +55,11 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,85 +71,70 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+/* Private routines that support uncompressed data I/O */
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
+ InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -152,35 +143,8 @@ ReadDataFromArchive(ArchiveHandle *AH,
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ return cs;
}
/*
@@ -189,402 +153,178 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
+/*----------------------
+ * Compressed stream API
+ *----------------------
*/
-static void
-InitCompressorZlib(CompressorState *cs, int level)
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
+
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
}
+/* free() without changing errno; useful in several places below */
static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+free_keep_errno(void *p)
{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
+ int save_errno = errno;
- free(cs->zlibOut);
- free(cs->zp);
+ free(p);
+ errno = save_errno;
}
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+/*
+ * Compression None implementation
+ */
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
+ if (size == 0)
+ return 0;
- if (res == Z_STREAM_END)
- break;
- }
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
}
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
}
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+static const char *
+get_error_none(CompressFileHandle *CFH)
{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ return strerror(errno);
+}
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
+ return ret;
}
-#endif /* HAVE_LIBZ */
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
+static int
+close_none(CompressFileHandle *CFH)
{
- size_t cnt;
- char *buf;
- size_t buflen;
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ CFH->private_data = NULL;
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
+ if (fp)
+ ret = fclose(fp);
- free(buf);
+ return ret;
}
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+
+static int
+eof_none(CompressFileHandle *CFH)
{
- cs->writeF(AH, data, dLen);
+ return feof((FILE *) CFH->private_data);
}
-
-/*----------------------
- * Compressed stream API
- *----------------------
- */
-
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
- pg_compress_specification compression_spec;
- void *fp;
-};
+ Assert(CFH->private_data == NULL);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
-/* free() without changing errno; useful in several places below */
-static void
-free_keep_errno(void *p)
-{
- int save_errno = errno;
+ if (CFH->private_data == NULL)
+ return 1;
- free(p);
- errno = save_errno;
+ return 0;
}
-/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_read(const char *path, const char *mode)
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
+ Assert(CFH->private_data == NULL);
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+ return 0;
}
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static void
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
}
/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ CompressFileHandle *CFH;
- fp->compression_spec = compression_spec;
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
+ InitCompressNone(CFH, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
+ InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -594,266 +334,88 @@ cfopen_internal(const char *path, int fd, const char *mode,
break;
}
- return fp;
+ return CFH;
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the ".gz" or ".lz4" suffix
+ * (if 'path' doesn't already have one) and try again. So if you pass "foo"
+ * as 'path', this will open "foo", "foo.gz", or "foo.lz4", trying in that
+ * order.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret = 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- if (size == 0)
- return 0;
+ fname = pg_strdup(path);
- switch (fp->compression_spec.algorithm)
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, (FILE *) fp->fp);
- if (ret != size && !feof((FILE *) fp->fp))
- READ_ERROR_EXIT((FILE *) fp->fp);
+ bool exists;
- break;
- case PG_COMPRESSION_GZIP:
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning if built without compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- ret = gzread((gzFile) fp->fp, ptr, size);
- if (ret != size && !gzeof((gzFile) fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite((gzFile) fp->fp, ptr, size);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc((FILE *) fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT((FILE *) fp->fp);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof((gzFile) fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
- char *ret = NULL;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, (FILE *) fp->fp);
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets((gzFile) fp->fp, buf, len);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfclose(cfp *fp)
-{
- int ret = 0;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
}
- switch (fp->compression_spec.algorithm)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- case PG_COMPRESSION_NONE:
- ret = fclose((FILE *) fp->fp);
- fp->fp = NULL;
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose((gzFile) fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
+ free_keep_errno(fname);
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
int
-cfeof(cfp *fp)
+DestroyCompressFileHandle(CompressFileHandle *CFH)
{
int ret = 0;
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof((FILE *) fp->fp);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof((gzFile) fp->fp);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ free_keep_errno(CFH);
return ret;
}
-
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-
- return strerror(errno);
-}
-
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 6fad6c2cd5..62e3da1b1d 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,63 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ size_t (*write_func) (const void *ptr, size_t size,
+ CompressFileHandle *CFH);
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+ int (*getc_func) (CompressFileHandle *CFH);
+ int (*eof_func) (CompressFileHandle *CFH);
+ int (*close_func) (CompressFileHandle *CFH);
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
-typedef struct cfp cfp;
+extern CompressFileHandle *InitCompressFileHandle(
+ const pg_compress_specification compression_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index d96e566846..0c73a4707e 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,5 +1,6 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index fb94317ad9..1f207c6f4d 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3722,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3737,10 +3740,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index f6aee775eb..b6d025576f 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 1c7fc728c2..39daa1fc43 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index b393f2a2ea..8805237edb 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,6 +150,7 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 60c71d05fe..81a451641a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Attachment: v18-0003-Add-LZ4-compression-in-pg_-dump-restore.patch (text/x-patch)
From 03c101e84b74078b3bb4a7b85637db71157fd2be Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 19 Dec 2022 15:43:56 +0000
Subject: [PATCH v18 3/3] Add LZ4 compression in pg_{dump|restore}
compress_lz4.{c,h} implement both a streaming API and a file API for
compression. The first is aimed at inlined use cases, where simple lz4.h
calls can be used directly. The second generates output, or parses input,
that can be read or generated with the lz4 command-line utility.

Wherever the LZ4F API does not provide all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 11 +-
src/bin/pg_dump/compress_lz4.c | 618 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 756 insertions(+), 21 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 29eab02d37..28c1fc27cc 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 4fe5b262ad..fa971fa95c 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -60,6 +60,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -137,7 +138,7 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressorLZ4(cs, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -327,7 +328,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressLZ4(CFH, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -341,12 +342,12 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (i
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
- * order.
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
+ *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..c97e16187a
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,618 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached when there is no decompressed output left in the
+ * overflow buffer and the underlying file has reached EOF.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * newline char if found first when the eol_flag is set. The decompressed
+ * output generated by reading any compressed input via the LZ4F API may
+ * exceed 'ptrsize'. Any excess decompressed content is stored in an
+ * overflow buffer within LZ4File. When the function is called, it will
+ * first consume any decompressed content already present in the overflow
+ * buffer before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ {
+ pg_free(readbuf);
+ return -1;
+ }
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ pg_free(readbuf);
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * Fill in whatever space is available in ptr. If the eol flag is
+ * set, either skip copying when a newline was already found, or
+ * copy only up to the newline if one is present in the outbuf.
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && !eol_found);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? errno : ENOSPC;
+ return -1;
+ }
+
+ /* advance past the chunk that was just consumed */
+ ptr = (const char *) ptr + chunk;
+ remaining -= chunk;
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, true);
+ if (ret < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (ret == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..74595db1b9
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * Interface to compress_lz4.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 0c73a4707e..b27e92ffd0 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,6 +1,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -15,7 +16,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -83,7 +84,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 1f207c6f4d..119b7f2553 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -395,6 +395,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2074,7 +2078,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2088,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3747,6 +3755,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 44d957c038..1bb874b8e3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 39daa1fc43..d3f28dfba7 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if (defined($pgdump_runs{$run}->{compile_option}) &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index 8805237edb..d44ce64fd7 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 81a451641a..0e25c7f58a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1381,6 +1381,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
On Mon, Dec 19, 2022 at 05:03:21PM +0000, gkokolatos@pm.me wrote:

001 still doesn't compile on freebsd, and 002 doesn't compile on
windows. Have you checked test results from cirrusci on your private
github account ?
There are still known gaps in 0002 and 0003, for example documentation,
and I have not been focusing too much on those. You are right, it is helpful
and kind to try to reduce the noise. The attached should have hopefully
tackled the ci errors.
Yep. Are you using cirrusci under your github account ?
FYI, I have re-added an entry to the CF app to get some automated
coverage:
https://commitfest.postgresql.org/41/3571/
Much obliged. Should I change the state to "ready for review" when I post a
new version or should I leave that to the senior personnel?
It's better to update it to reflect what you think its current status
is. If you think it's ready for review.
002 breaks "pg_dump -Fc -Z2" because (I think) AllocateCompressor()
doesn't store the passed-in compression_spec.
I am afraid I have not been able to reproduce this error. I tried both
debian and freebsd after I addressed the compilation warnings. Which
error did you get? Is it still present in the attached?
It's not that there's an error - it's that compression isn't working.
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fp regression |wc -c
659956
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fp regression |wc -c
637192
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fc regression |wc -c
1954890
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fc regression |wc -c
1954890
--
Justin
On Mon, Dec 19, 2022 at 01:06:00PM +0900, Michael Paquier wrote:
On Sat, Dec 17, 2022 at 05:26:15PM -0600, Justin Pryzby wrote:
001: still refers to "gzip", which is correct for -Fp and -Fd but not
for -Fc, for which it's more correct to say "zlib".
Or should we begin by changing all these existing "not built with zlib
support" error strings to the more generic "this build does not
support compression with %s" to reduce the number of messages to
translate? That would bring consistency with the other tools dealing
with compression.
That's fine, but it doesn't touch on the issue I'm talking about, which
is that zlib != gzip.
BTW I noticed that that also affects the pg_dump file itself; 002
changes the file format to say "gzip", but that's wrong for -Fc, which
does not use gzip headers, which could be surprising to someone who
specified "gzip".
--
Justin
------- Original Message -------
On Monday, December 19th, 2022 at 6:27 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Dec 19, 2022 at 05:03:21PM +0000, gkokolatos@pm.me wrote:
001 still doesn't compile on freebsd, and 002 doesn't compile on
windows. Have you checked test results from cirrusci on your private
github account ?
There are still known gaps in 0002 and 0003, for example documentation,
and I have not been focusing too much on those. You are right, it is helpful
and kind to try to reduce the noise. The attached should have hopefully
tackled the ci errors.
Yep. Are you using cirrusci under your github account ?
Thank you. To be very honest, I am not using github exclusively to post patches.
Sometimes I do, sometimes I do not. Is github a requirement?
To answer your question, some of my github accounts are integrated with cirrusci,
others are not.
The current cfbot build is green for what is worth.
https://cirrus-ci.com/build/5934319840002048
FYI, I have re-added an entry to the CF app to get some automated
coverage:
https://commitfest.postgresql.org/41/3571/
Much obliged. Should I change the state to "ready for review" when I post a
new version or should I leave that to the senior personnel?
It's better to update it to reflect what you think its current status
is. If you think it's ready for review.
Thank you.
002 breaks "pg_dump -Fc -Z2" because (I think) AllocateCompressor()
doesn't store the passed-in compression_spec.
I am afraid I have not been able to reproduce this error. I tried both
debian and freebsd after I addressed the compilation warnings. Which
error did you get? Is it still present in the attached?
It's not that there's an error - it's that compression isn't working.
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fp regression |wc -c
659956
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fp regression |wc -c
637192
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fc regression |wc -c
1954890
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fc regression |wc -c
1954890
Thank you. Now I understand what you mean. Trying the same on top of v18-0003
on Ubuntu 22.04 yields:
$ for compression in none gzip:1 gzip:6 gzip:9; do \
pg_dump --format=custom --compress="$compression" -f regression."$compression".dump -d regression; \
wc -c regression."$compression".dump; \
done;
14963753 regression.none.dump
3600183 regression.gzip:1.dump
3223755 regression.gzip:6.dump
3196903 regression.gzip:9.dump
and on FreeBSD 13.1
$ for compression in none gzip:1 gzip:6 gzip:9; do \
pg_dump --format=custom --compress="$compression" -f regression."$compression".dump -d regression; \
wc -c regression."$compression".dump; \
done;
14828822 regression.none.dump
3584304 regression.gzip:1.dump
3208548 regression.gzip:6.dump
3182044 regression.gzip:9.dump
Although there are some variations between the installations, within the same
installation the size of the dump file is shrinking as expected.
Investigating a bit further on the issue, you are correct in identifying an
issue in v17. Up until v16, the compressor function looked like:
+InitCompressorGzip(CompressorState *cs, int compressionLevel)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ gzipcs->compressionLevel = compressionLevel;
V17 considered that more options could become available in the future
and changed the signature of the relevant Init functions to:
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
V18 reinstated the assignment in similar fashion to InitCompressorNone and
InitCompressorLz4:
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
A test case can be added which performs a check similar to the loop above.
Create a custom dump with the least and most compression for each method.
Then verify that the output sizes differ as expected. This addition could
become 0001 in the current series.
Thoughts?
Cheers,
//Georgios
--
Justin
On Tue, Dec 20, 2022 at 11:19:15AM +0000, gkokolatos@pm.me wrote:
------- Original Message -------
On Monday, December 19th, 2022 at 6:27 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Dec 19, 2022 at 05:03:21PM +0000, gkokolatos@pm.me wrote:
001 still doesn't compile on freebsd, and 002 doesn't compile on
windows. Have you checked test results from cirrusci on your private
github account ?
There are still known gaps in 0002 and 0003, for example documentation,
and I have not been focusing too much on those. You are right, it is helpful
and kind to try to reduce the noise. The attached should have hopefully
tackled the ci errors.
Yep. Are you using cirrusci under your github account ?
Thank you. To be very honest, I am not using github exclusively to post patches.
Sometimes I do, sometimes I do not. Is github a requirement?
Github isn't a requirement for postgres (but cirrusci only supports
github). I wasn't trying to say that it's required, only trying to
make sure that you (and others) know that it's available, since our
cirrus.yml is relatively new.
002 breaks "pg_dump -Fc -Z2" because (I think) AllocateCompressor()
doesn't store the passed-in compression_spec.
I am afraid I have not been able to reproduce this error. I tried both
debian and freebsd after I addressed the compilation warnings. Which
error did you get? Is it still present in the attached?
It's not that there's an error - it's that compression isn't working.
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fp regression |wc -c
659956
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fp regression |wc -c
637192
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fc regression |wc -c
1954890
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fc regression |wc -c
1954890
Thank you. Now I understand what you mean. Trying the same on top of v18-0003
on Ubuntu 22.04 yields:
You're right; this seems to be fixed in v18. Thanks.
It looks like I'd forgotten to run "meson test tmp_install", so had
retested v17...
--
Justin
------- Original Message -------
On Tuesday, December 20th, 2022 at 4:26 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Tue, Dec 20, 2022 at 11:19:15AM +0000, gkokolatos@pm.me wrote:
------- Original Message -------
On Monday, December 19th, 2022 at 6:27 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Dec 19, 2022 at 05:03:21PM +0000, gkokolatos@pm.me wrote:
001 still doesn't compile on freebsd, and 002 doesn't compile on
windows. Have you checked test results from cirrusci on your private
github account ?
There are still known gaps in 0002 and 0003, for example documentation,
and I have not been focusing too much on those. You are right, it is helpful
and kind to try to reduce the noise. The attached should have hopefully
tackled the ci errors.
Yep. Are you using cirrusci under your github account ?
Thank you. To be very honest, I am not using github exclusively to post patches.
Sometimes I do, sometimes I do not. Is github a requirement?
Github isn't a requirement for postgres (but cirrusci only supports
github). I wasn't trying to say that it's required, only trying to
make sure that you (and others) know that it's available, since our
cirrus.yml is relatively new.
Got it. Thank you very much for spreading the word. It is a useful feature which
should be known.
002 breaks "pg_dump -Fc -Z2" because (I think) AllocateCompressor()
doesn't store the passed-in compression_spec.
I am afraid I have not been able to reproduce this error. I tried both
debian and freebsd after I addressed the compilation warnings. Which
error did you get? Is it still present in the attached?
It's not that there's an error - it's that compression isn't working.
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fp regression |wc -c
659956
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fp regression |wc -c
637192
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fc regression |wc -c
1954890
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fc regression |wc -c
1954890
Thank you. Now I understand what you mean. Trying the same on top of v18-0003
on Ubuntu 22.04 yields:
You're right; this seems to be fixed in v18.
Great. Still there was a bug in v17 which you discovered. Thank you for the review
effort.
Please find in the attached v19 an extra check right before calling deflateInit().
This check will verify that only compressed output will be generated for this
method.
Also v19 is rebased on top f450695e889 and applies cleanly.
Cheers.
//Georgios
--
Justin
Attachments:
v19-0001-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 6e3c760dfef68e805b26fba1b499b9f5ef46d86b Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 21 Dec 2022 09:49:23 +0000
Subject: [PATCH v19 1/3] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about it and uses it throughout.
---
src/bin/pg_dump/compress_io.c | 389 +++++++++++++++++++--------
src/bin/pg_dump/pg_backup_archiver.c | 128 +++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
3 files changed, 318 insertions(+), 226 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 26967eb618..4bad69f4bd 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -128,15 +132,24 @@ ReadDataFromArchive(ArchiveHandle *AH,
const pg_compress_specification compression_spec,
ReadFunc readF)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ switch (compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("this build does not support compression with %s", "gzip");
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
}
@@ -149,6 +162,9 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
{
switch (cs->compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ WriteDataToArchiveNone(AH, cs, data, dLen);
+ break;
case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
@@ -156,13 +172,11 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
pg_fatal("this build does not support compression with %s", "gzip");
#endif
break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
}
@@ -173,10 +187,26 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
free(cs);
}
@@ -391,10 +421,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_specification compression_spec;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -490,127 +518,202 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ fp->compression_spec = compression_spec;
+
+ switch (compression_spec.algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compression_spec.level);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, (FILE *) fp->fp);
+ if (ret != size && !feof((FILE *) fp->fp))
+ READ_ERROR_EXIT((FILE *) fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread((gzFile) fp->fp, ptr, size);
+ if (ret != size && !gzeof((gzFile) fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite((gzFile) fp->fp, ptr, size);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc((FILE *) fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof((gzFile) fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return ret;
@@ -619,65 +722,119 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, (FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets((gzFile) fp->fp, buf, len);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compression_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose((FILE *) fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose((gzFile) fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof((gzFile) fp->fp);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f7a0f1ce7..fb94317ad9 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (!supports_compression)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1503,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1565,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2219,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
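For readers skimming the first patch: its core change is that cfp stops carrying two mutually exclusive members (uncompressedfp and compressedfp) and instead keeps the compression spec plus one untyped stream pointer, with each operation switching on the algorithm. A minimal self-contained sketch of that dispatch shape follows; the mini_* names are illustrative only, not identifiers from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative stand-in for the patch's cfp: a handle that remembers
 * its algorithm and a single untyped stream pointer. */
typedef enum { MINI_NONE, MINI_GZIP } mini_algorithm;

typedef struct
{
	mini_algorithm algorithm;
	void	   *fp;				/* FILE * or gzFile, depending on algorithm */
} mini_cfp;

/* One switch per operation replaces #ifdef HAVE_LIBZ blocks scattered
 * through every helper (cfread, cfwrite, cfgetc, ...). */
static int
mini_cfread(void *ptr, int size, mini_cfp *fp)
{
	switch (fp->algorithm)
	{
		case MINI_NONE:
			return (int) fread(ptr, 1, size, (FILE *) fp->fp);
		case MINI_GZIP:
			/* a zlib-enabled build would call gzread((gzFile) fp->fp, ptr, size) */
			return -1;
	}
	return -1;
}
```

Adding LZ4 or ZSTD then means adding one case label per operation, which is why the patch converts every helper to this shape up front.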
Attachment: v19-0002-Introduce-Compressor-API-in-pg_dump.patch (text/x-patch)
From 5406ebc73d7e6c068b9cb10a05210ee852685b73 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 21 Dec 2022 09:49:31 +0000
Subject: [PATCH v19 2/3] Introduce Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementor of a
new compression method now only needs to supply those definitions.
Custom compressed archives now need to store the compression algorithm in their
header. This requires a bump in the version number. The level of compression
is no longer stored in the dump as it is irrelevant.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 405 ++++++++++++
src/bin/pg_dump/compress_gzip.h | 22 +
src/bin/pg_dump/compress_io.c | 914 +++++++-------------------
src/bin/pg_dump/compress_io.h | 71 +-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 102 +--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 +--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 2 +
13 files changed, 852 insertions(+), 799 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
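The commit message's "struct of function pointers" can be pictured with a small stand-alone sketch. The field and function names below are illustrative only; the real CompressFileHandle layout is the one this patch defines in compress_io.h.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MiniFileHandle MiniFileHandle;

/* Each compression method fills in its own implementations, so callers
 * never branch on the algorithm themselves. */
struct MiniFileHandle
{
	size_t		(*write_func) (const void *ptr, size_t size, MiniFileHandle *fh);
	void	   *private_data;	/* method-specific state */
};

/* A trivial "method" whose write just counts bytes, standing in for a
 * gzip or LZ4 backend that would compress and forward them. */
static size_t
counting_write(const void *ptr, size_t size, MiniFileHandle *fh)
{
	(void) ptr;
	*(size_t *) fh->private_data += size;
	return size;
}

/* An Init function wires up the pointers, mirroring the role that
 * InitCompressGzip() plays in the patch's compress_gzip.c. */
static void
init_counting(MiniFileHandle *fh, size_t *counter)
{
	fh->write_func = counting_write;
	fh->private_data = counter;
}
```

With this shape, "adding a compression method" reduces to writing one Init function plus its callbacks, instead of touching every call site.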
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 9dc5a784dd..29eab02d37 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..60fb95d7b7
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,405 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write gzip compressed data
+ * streams.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize)) ||
+ (zp->avail_out == 0) || (zp->avail_in != 0))
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at a time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ CFH->private_data = NULL;
+
+ ret = gzclose(gzfp);
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..6dfd0eb04d
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * Interface to compress_gzip.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 4bad69f4bd..6975ea6920 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,44 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims to produce files that
+ * can be easily manipulated with an external compression utility
+ * program.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read all the data by calling readData, which
+ * reads the whole compressed stream by repeatedly calling the given
+ * ReadFunc. ReadFunc returns the compressed data one chunk at a time,
+ * and readData decompresses it
+ * and passes the decompressed data to ahwrite(), until ReadFunc returns 0 to
+ * signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs, and custom LZ4 calls that provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with the
+ * given name; if that fails, it tries the same file with the .gz suffix, and
+ * failing that, with the .lz4 suffix. cfopen_write() opens a file for
+ * writing; an extra argument specifies the compression method to use, and
+ * the appropriate suffix, .gz or .lz4, is added to the filename if the file
+ * is to be compressed. This allows you to easily handle both compressed and
+ * uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,7 +55,11 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,84 +71,69 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -151,34 +142,8 @@ ReadDataFromArchive(ArchiveHandle *AH,
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ return cs;
}
/*
@@ -187,401 +152,178 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
+/*----------------------
+ * Compressed stream API
+ *----------------------
*/
-static void
-InitCompressorZlib(CompressorState *cs, int level)
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
+
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
}
+/* free() without changing errno; useful in several places below */
static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+free_keep_errno(void *p)
{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
+ int save_errno = errno;
- free(cs->zlibOut);
- free(cs->zp);
+ free(p);
+ errno = save_errno;
}
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+/*
+ * Compression None implementation
+ */
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
+ if (size == 0)
+ return 0;
- if (res == Z_STREAM_END)
- break;
- }
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
}
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
}
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+static const char *
+get_error_none(CompressFileHandle *CFH)
{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ return strerror(errno);
+}
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
+ return ret;
}
-#endif /* HAVE_LIBZ */
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
+static int
+close_none(CompressFileHandle *CFH)
{
- size_t cnt;
- char *buf;
- size_t buflen;
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ CFH->private_data = NULL;
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
+ if (fp)
+ ret = fclose(fp);
- free(buf);
+ return ret;
}
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+
+static int
+eof_none(CompressFileHandle *CFH)
{
- cs->writeF(AH, data, dLen);
+ return feof((FILE *) CFH->private_data);
}
-
-/*----------------------
- * Compressed stream API
- *----------------------
- */
-
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
- pg_compress_specification compression_spec;
- void *fp;
-};
+ Assert(CFH->private_data == NULL);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
-/* free() without changing errno; useful in several places below */
-static void
-free_keep_errno(void *p)
-{
- int save_errno = errno;
+ if (CFH->private_data == NULL)
+ return 1;
- free(p);
- errno = save_errno;
+ return 0;
}
-/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_read(const char *path, const char *mode)
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
+ Assert(CFH->private_data == NULL);
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+ return 0;
}
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static void
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
}
/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ CompressFileHandle *CFH;
- fp->compression_spec = compression_spec;
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
+ InitCompressNone(CFH, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -591,266 +333,88 @@ cfopen_internal(const char *path, int fd, const char *mode,
break;
}
- return fp;
+ return CFH;
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the ".gz" or ".lz4" suffix
+ * (if 'path' doesn't already have one) and try again. So if you pass "foo"
+ * as 'path', this will open "foo", "foo.gz", or "foo.lz4", trying in that
+ * order.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret = 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- if (size == 0)
- return 0;
+ fname = pg_strdup(path);
- switch (fp->compression_spec.algorithm)
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, (FILE *) fp->fp);
- if (ret != size && !feof((FILE *) fp->fp))
- READ_ERROR_EXIT((FILE *) fp->fp);
+ bool exists;
- break;
- case PG_COMPRESSION_GZIP:
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when built without compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- ret = gzread((gzFile) fp->fp, ptr, size);
- if (ret != size && !gzeof((gzFile) fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite((gzFile) fp->fp, ptr, size);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc((FILE *) fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT((FILE *) fp->fp);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof((gzFile) fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
- char *ret = NULL;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, (FILE *) fp->fp);
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets((gzFile) fp->fp, buf, len);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfclose(cfp *fp)
-{
- int ret = 0;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
}
- switch (fp->compression_spec.algorithm)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- case PG_COMPRESSION_NONE:
- ret = fclose((FILE *) fp->fp);
- fp->fp = NULL;
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose((gzFile) fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
+ free_keep_errno(fname);
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
int
-cfeof(cfp *fp)
+DestroyCompressFileHandle(CompressFileHandle *CFH)
{
int ret = 0;
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof((FILE *) fp->fp);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof((gzFile) fp->fp);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ free_keep_errno(CFH);
return ret;
}
-
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-
- return strerror(errno);
-}
-
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 6fad6c2cd5..62e3da1b1d 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,63 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *cxt);
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+ int (*getc_func) (CompressFileHandle *CFH);
+ int (*eof_func) (CompressFileHandle *CFH);
+ int (*close_func) (CompressFileHandle *CFH);
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
-typedef struct cfp cfp;
+extern CompressFileHandle *InitCompressFileHandle(
+ const pg_compress_specification compression_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 5537cda3cc..714e492c60 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -2,6 +2,7 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index fb94317ad9..1f207c6f4d 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3722,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3737,10 +3740,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index f6aee775eb..b6d025576f 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 7c3067a3f4..1110ffa5e2 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index b393f2a2ea..8805237edb 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,6 +150,7 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 60c71d05fe..81a451641a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Attachment: v19-0003-Add-LZ4-compression-in-pg_-dump-restore.patch (text/x-patch)
From e211681ca4d5d770d503cee9f5a4c7d44f2b1444 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 21 Dec 2022 09:49:36 +0000
Subject: [PATCH v19 3/3] Add LZ4 compression in pg_{dump|restore}
Within compress_lz4.{c,h} both a streaming API and a file API for compression
are implemented. The first is aimed at inlined use cases, so simple lz4.h
calls can be used directly. The second generates output, or parses input,
that can be read or generated with the lz4 command-line utility.
Wherever the LZ4F API does not provide functionality corresponding to
fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 11 +-
src/bin/pg_dump/compress_lz4.c | 618 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 756 insertions(+), 21 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 29eab02d37..28c1fc27cc 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 6975ea6920..81760600fc 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -60,6 +60,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -136,7 +137,7 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressorLZ4(cs, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -326,7 +327,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressLZ4(CFH, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -340,12 +341,12 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (i
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
- * order.
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..c97e16187a
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,618 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' decompressed content, or up to the new line char
+ * if found first when the eol_flag is set. It is possible that the decompressed
+ * output generated by reading any compressed input via the LZ4F API, exceeds
+ * 'ptrsize'. Any exceeding decompressed content is stored at an overflow
+ * buffer within LZ4File. Of course, when the function is called, it will first
+ * try to consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+	/* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+			/* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..74595db1b9
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * Interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 714e492c60..712e08aa02 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -17,7 +18,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -85,7 +86,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 1f207c6f4d..119b7f2553 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -395,6 +395,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2074,7 +2078,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2088,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3747,6 +3755,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 44d957c038..1bb874b8e3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 1110ffa5e2..d4b9b3652d 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files where compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index 8805237edb..d44ce64fd7 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 81a451641a..0e25c7f58a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1381,6 +1381,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
On Thu, Dec 22, 2022 at 11:08:59AM -0600, Justin Pryzby wrote:
There's a couple of lz4 bits which shouldn't be present in 002: file
extension and comments.
There were "LZ4" comments and file extension stuff in the preparatory
commit. But now it seems like you *removed* them in the LZ4 commit
(where it actually belongs) rather than *moving* it from the
prior/parent commit *to* the lz4 commit. I recommend to run something
like "git diff @{1}" whenever doing this kind of patch surgery.
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
This looks wrong/redundant. The gzip part should be removed, right ?
Maybe other places that check if (compression==PG_COMPRESSION_GZIP)
should maybe change to say compression!=NONE?
_PrepParallelRestore() references ".gz", so I think it needs to be
retrofitted to handle .lz4. Ideally, that's built into a struct or list
of file extensions to try. Maybe compression.h should have a function
to return the file extension of a given algorithm. I'm planning to send
a patch for zstd, and hoping its changes will be minimized by these
preparatory commits.
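A minimal sketch of the extension-lookup idea floated above; the enum and function names here are invented for illustration and are not the actual compression.h API:

```c
#include <string.h>

/* Stand-in for pg_compress_algorithm, for illustration only. */
typedef enum
{
	ALG_NONE,
	ALG_GZIP,
	ALG_LZ4
} alg_t;

/*
 * Map an algorithm to its on-disk file suffix, so restore code can
 * probe each candidate suffix instead of hard-coding ".gz".
 */
static const char *
alg_extension(alg_t alg)
{
	switch (alg)
	{
		case ALG_GZIP:
			return ".gz";
		case ALG_LZ4:
			return ".lz4";
		default:
			return "";
	}
}
```

With a helper like this, _PrepParallelRestore() could loop over the known algorithms and try `path + alg_extension(alg)` for each, which is also what would keep a later zstd patch small.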
+ errno = errno ? : ENOSPC;
"?:" is a GNU extension (not the ternary operator, but the ternary
operator with only 2 args). It's not in use anywhere else in postgres.
You could instead write it with 3 "errno"s, or as "if (errno == 0)
errno = ENOSPC".
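As a sketch of the portable spelling suggested above (the helper name is invented for illustration, not part of the patch):

```c
#include <errno.h>

/*
 * Portable replacement for the GNU two-operand conditional
 * "errno = errno ? : ENOSPC": return the first value if it is
 * nonzero, otherwise the fallback.
 */
static int
nonzero_or(int value, int fallback)
{
	return value != 0 ? value : fallback;
}
```

Callers would then write `errno = nonzero_or(errno, ENOSPC);`, which compiles the same way on every C compiler.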
You wrote "eol_flag == false", "eol_flag == 0", and "eol_flag == true" in
different places. It's cleaner to test it as a boolean: if (eol_flag) /
if (!eol_flag).
Both LZ4File_init() and its callers check "inited". Better to do it in
one place than 3. It's a static function, so I think there's no
performance concern.
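The consolidation suggested above can be sketched like this (the struct and names are illustrative, not the patch's actual LZ4File type):

```c
#include <stdbool.h>

/* Illustrative stand-in for a lazily initialized stream state. */
typedef struct
{
	bool		inited;
	int			init_count;		/* counts real one-time setup work */
} state_t;

/*
 * Keep the "inited" test inside the init routine itself, so callers
 * can invoke it unconditionally and the check lives in one place.
 */
static void
state_init(state_t *st)
{
	if (st->inited)
		return;
	st->init_count++;			/* stand-in for real setup */
	st->inited = true;
}
```

Since the function is static and the early-return branch is trivial, repeating the call at each call site costs essentially nothing.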
Gzip_close() still has a useless save_errno (or rebase issue?).
I think it's confusing to have two functions, one named
InitCompressLZ4() and InitCompressorLZ4().
pg_compress_specification is being passed by value, but I think it
should be passed as a pointer, as is done everywhere else.
pg_compress_algorithm is being written directly into the pg_dump header.
Currently, I think that's not an externally-visible value (it could be
renumbered, theoretically even in a minor release). Maybe there should
be a "private" enum for encoding the pg_dump header, similar to
WAL_COMPRESSION_LZ4 vs BKPIMAGE_COMPRESS_LZ4 ? Or else a comment there
should warn that the values are encoded in pg_dump, and must never be
changed.
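A hypothetical sketch of that "private enum" idea: on-disk codes for the pg_dump header that are fixed forever, mapped explicitly from the in-memory enum so the latter can be renumbered freely. All names here are invented for illustration:

```c
/* In-memory algorithm enum; free to be renumbered across releases. */
typedef enum
{
	IN_NONE,
	IN_GZIP,
	IN_LZ4
} in_mem_alg;

/* On-disk header codes; these values must never change. */
typedef enum
{
	ONDISK_NONE = 0,
	ONDISK_GZIP = 1,
	ONDISK_LZ4 = 2
} ondisk_alg;

/* Explicit mapping written into / read from the dump header. */
static ondisk_alg
encode_header_alg(in_mem_alg a)
{
	switch (a)
	{
		case IN_GZIP:
			return ONDISK_GZIP;
		case IN_LZ4:
			return ONDISK_LZ4;
		default:
			return ONDISK_NONE;
	}
}
```

This mirrors how WAL_COMPRESSION_LZ4 and BKPIMAGE_COMPRESS_LZ4 are kept as separate enums in the backend.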
+ Verify that data files where compressed
typo: s/where/were/
Also:
s/occurance/occurrence/
s/begining/beginning/
s/Verfiy/Verify/
s/nessary/necessary/
BTW I noticed that cfdopen() was accidentally committed to compress_io.h
in master without being defined anywhere.
--
Justin
On Wed, 21 Dec 2022 at 15:40, <gkokolatos@pm.me> wrote:

------- Original Message -------
On Tuesday, December 20th, 2022 at 4:26 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Dec 20, 2022 at 11:19:15AM +0000, gkokolatos@pm.me wrote:

------- Original Message -------
On Monday, December 19th, 2022 at 6:27 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Mon, Dec 19, 2022 at 05:03:21PM +0000, gkokolatos@pm.me wrote:

001 still doesn't compile on freebsd, and 002 doesn't compile on
windows. Have you checked test results from cirrusci on your private
github account ?

There are still known gaps in 0002 and 0003, for example documentation,
and I have not been focusing too much on those. You are right, it is helpful
and kind to try to reduce the noise. The attached should have hopefully
tackled the ci errors.

Yep. Are you using cirrusci under your github account ?

Thank you. To be very honest, I am not using github exclusively to post patches.
Sometimes I do, sometimes I do not. Is github a requirement?

Github isn't a requirement for postgres (but cirrusci only supports
github). I wasn't trying to say that it's required, only trying to
make sure that you (and others) know that it's available, since our
cirrus.yml is relatively new.

Got it. Thank you very much for spreading the word. It is a useful feature which
should be known.

002 breaks "pg_dump -Fc -Z2" because (I think) AllocateCompressor()
doesn't store the passed-in compression_spec.

I am afraid I have not been able to reproduce this error. I tried both
debian and freebsd after I addressed the compilation warnings. Which
error did you get? Is it still present in the attached?

It's not that there's an error - it's that compression isn't working.

$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fp regression |wc -c
659956
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fp regression |wc -c
637192

$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fc regression |wc -c
1954890
$ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fc regression |wc -c
1954890

Thank you. Now I understand what you mean. Trying the same on top of v18-0003
on Ubuntu 22.04 yields:

You're right; this seems to be fixed in v18. Thanks.

Great. Still there was a bug in v17 which you discovered. Thank you for the review
effort.

Please find in the attached v19 an extra check right before calling deflateInit().
This check will verify that only compressed output will be generated for this
method.

Also v19 is rebased on top of f450695e889 and applies cleanly.
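The -Fc/-Z2 issue discussed above is a general bug class: an allocator that takes a compression spec must copy it into its own state, or later writes silently fall back to defaults. A minimal sketch (the struct and function names are illustrative, not pg_dump's real API):

```c
#include <stdlib.h>

/* Illustrative stand-ins for pg_compress_specification and CompressorState. */
typedef struct
{
	int			algorithm;
	int			level;
} spec_t;

typedef struct
{
	spec_t		spec;			/* the allocator must populate this */
} compressor_t;

/*
 * Allocate a compressor and retain the caller's spec.  Forgetting the
 * assignment below leaves spec zero-initialized, so a requested level
 * (e.g. -Z2) would be ignored at write time.
 */
static compressor_t *
alloc_compressor(const spec_t *spec)
{
	compressor_t *cs = calloc(1, sizeof(*cs));

	if (cs == NULL)
		return NULL;
	cs->spec = *spec;			/* keep the passed-in spec */
	return cs;
}
```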
The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
ff23b592ad6621563d3128b26860bcb41daf9542 ===
=== applying patch ./v19-0002-Introduce-Compressor-API-in-pg_dump.patch
patching file src/bin/pg_dump/compress_io.h
Hunk #1 FAILED at 37.
1 out of 1 hunk FAILED -- saving rejects to file
src/bin/pg_dump/compress_io.h.rej
[1]: http://cfbot.cputube.org/patch_41_3571.log
Regards,
Vignesh
On Sun, Jan 08, 2023 at 01:45:25PM -0600, Justin Pryzby wrote:
On Thu, Dec 22, 2022 at 11:08:59AM -0600, Justin Pryzby wrote:
There's a couple of lz4 bits which shouldn't be present in 002: file
extension and comments.
BTW I noticed that cfdopen() was accidentally committed to compress_io.h
in master without being defined anywhere.
This was resolved in 69fb29d1a (so now needs to be re-added for this
patch series).
pg_compress_specification is being passed by value, but I think it
should be passed as a pointer, as is done everywhere else.
ISTM that was an issue with 5e73a6048, affecting a few public and
private functions. I wrote a pre-preparatory patch which changes to
pass by reference.
And addressed a handful of other issues I reported as separate fixup
commits. And changed to use LZ4 by default for CI.
I also rebased my 2 year old patch to support zstd in pg_dump. I hope
it can finally added for v16. I'll send it for the next CF if these
patches progress.
One more thing: some comments still refer to the cfopen API, which this
patch removes.
There were "LZ4" comments and file extension stuff in the preparatory
commit. But now it seems like you *removed* them in the LZ4 commit
(where it actually belongs) rather than *moving* it from the
prior/parent commit *to* the lz4 commit. I recommend to run something
like "git diff @{1}" whenever doing this kind of patch surgery.
TODO
Maybe other places that check if (compression==PG_COMPRESSION_GZIP)
should maybe change to say compression!=NONE?

_PrepParallelRestore() references ".gz", so I think it needs to be
retrofitted to handle .lz4. Ideally, that's built into a struct or list
of file extensions to try. Maybe compression.h should have a function
to return the file extension of a given algorithm. I'm planning to send
a patch for zstd, and hoping its changes will be minimized by these
preparatory commits.
TODO
I think it's confusing to have two functions, one named
InitCompressLZ4() and InitCompressorLZ4().
TODO
pg_compress_algorithm is being written directly into the pg_dump header.
Currently, I think that's not an externally-visible value (it could be
renumbered, theoretically even in a minor release). Maybe there should
be a "private" enum for encoding the pg_dump header, similar to
WAL_COMPRESSION_LZ4 vs BKPIMAGE_COMPRESS_LZ4 ? Or else a comment there
should warn that the values are encoded in pg_dump, and must never be
changed.
Michael, WDYT ?
--
Justin
Attachments:
0001-pg_dump-pass-pg_compress_specification-as-a-pointer.patch (text/x-diff; charset=us-ascii)
From 3105d480ab82093ca2873e423782f5b2edd9fbb7 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 14 Jan 2023 10:58:23 -0600
Subject: [PATCH 1/7] pg_dump: pass pg_compress_specification as a pointer..
..as is done everywhere else.
---
src/bin/pg_dump/compress_io.c | 30 +++++++++++++--------------
src/bin/pg_dump/compress_io.h | 8 +++----
src/bin/pg_dump/pg_backup.h | 2 +-
src/bin/pg_dump/pg_backup_archiver.c | 12 +++++------
src/bin/pg_dump/pg_backup_custom.c | 6 +++---
src/bin/pg_dump/pg_backup_directory.c | 8 +++----
src/bin/pg_dump/pg_dump.c | 2 +-
7 files changed, 34 insertions(+), 34 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4c..62a9527fa48 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -94,19 +94,19 @@ static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
/* Allocate a new compressor */
CompressorState *
-AllocateCompressor(const pg_compress_specification compression_spec,
+AllocateCompressor(const pg_compress_specification *compression_spec,
WriteFunc writeF)
{
CompressorState *cs;
#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
pg_fatal("this build does not support compression with %s", "gzip");
#endif
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
+ cs->compression_spec = *compression_spec; // XXX
/*
* Perform compression algorithm specific initialization.
@@ -125,12 +125,12 @@ AllocateCompressor(const pg_compress_specification compression_spec,
*/
void
ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
ReadFunc readF)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ if (compression_spec->algorithm == PG_COMPRESSION_NONE)
ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
ReadDataFromArchiveZlib(AH, readF);
@@ -432,13 +432,13 @@ cfopen_read(const char *path, const char *mode)
if (hasSuffix(path, ".gz"))
{
compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
+ fp = cfopen(path, mode, &compression_spec);
}
else
#endif
{
compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
+ fp = cfopen(path, mode, &compression_spec);
#ifdef HAVE_LIBZ
if (fp == NULL)
{
@@ -446,7 +446,7 @@ cfopen_read(const char *path, const char *mode)
fname = psprintf("%s.gz", path);
compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
+ fp = cfopen(fname, mode, &compression_spec);
free_keep_errno(fname);
}
#endif
@@ -467,11 +467,11 @@ cfopen_read(const char *path, const char *mode)
*/
cfp *
cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+ const pg_compress_specification *compression_spec)
{
cfp *fp;
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ if (compression_spec->algorithm == PG_COMPRESSION_NONE)
fp = cfopen(path, mode, compression_spec);
else
{
@@ -497,20 +497,20 @@ cfopen_write(const char *path, const char *mode,
*/
cfp *
cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+ const pg_compress_specification *compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+ if (compression_spec->level != Z_DEFAULT_COMPRESSION)
{
/* user has specified a compression level, so tell zlib to use it */
char mode_compression[32];
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
+ mode, compression_spec->level);
fp->compressedfp = gzopen(path, mode_compression);
}
else
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789d..34f4e5e1e14 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -40,10 +40,10 @@ typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
-extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+extern CompressorState *AllocateCompressor(const pg_compress_specification *compression_spec,
WriteFunc writeF);
extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
ReadFunc readF);
extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen);
@@ -53,10 +53,10 @@ extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
+ const pg_compress_specification *compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
+ const pg_compress_specification *compression_spec);
extern int cfread(void *ptr, int size, cfp *fp);
extern int cfwrite(const void *ptr, int size, cfp *fp);
extern int cfgetc(cfp *fp);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index aba780ef4b1..216e24e7ec5 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -305,7 +305,7 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
/* Create a new archive */
extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f7a0f1ce7b..0d91b75c748 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -70,7 +70,7 @@ typedef struct _parallelReadyList
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr);
static void _getObjectDescription(PQExpBuffer buf, const TocEntry *te);
@@ -241,7 +241,7 @@ setupRestoreWorker(Archive *AHX)
/* Public */
Archive *
CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker)
@@ -261,7 +261,7 @@ OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
pg_compress_specification compression_spec = {0};
compression_spec.algorithm = PG_COMPRESSION_NONE;
- AH = _allocAH(FileSpec, fmt, compression_spec, true,
+ AH = _allocAH(FileSpec, fmt, &compression_spec, true,
archModeRead, setupRestoreWorker);
return (Archive *) AH;
@@ -2214,7 +2214,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
static ArchiveHandle *
_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr)
{
@@ -2266,7 +2266,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
AH->toc->prev = AH->toc;
AH->mode = mode;
- AH->compression_spec = compression_spec;
+ AH->compression_spec = *compression_spec; // XXX
AH->dosync = dosync;
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
@@ -2281,7 +2281,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
* Force stdin/stdout into binary mode if that is what we are using.
*/
#ifdef WIN32
- if ((fmt != archNull || compression_spec.algorithm != PG_COMPRESSION_NONE) &&
+ if ((fmt != archNull || compression_spec->algorithm != PG_COMPRESSION_NONE) &&
(AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0))
{
if (mode == archModeWrite)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a94..0e87444de85 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(&AH->compression_spec, _CustomWriteFunc);
}
/*
@@ -377,7 +377,7 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(&AH->compression_spec, _CustomWriteFunc);
}
/*
@@ -566,7 +566,7 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ ReadDataFromArchive(AH, &AH->compression_spec, _CustomReadFunc);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3cceef..ffb8a0e4dd7 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -328,7 +328,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
+ &AH->compression_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -584,7 +584,7 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
+ tocFH = cfopen_write(fname, PG_BINARY_W, &compression_spec);
if (tocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -649,7 +649,7 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
+ ctx->LOsTocFH = cfopen_write(fname, "ab", &compression_spec);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,7 +667,7 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W, &AH->compression_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2c0a9699729..20f73729fac 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -751,7 +751,7 @@ main(int argc, char **argv)
pg_fatal("parallel backup only supported by the directory format");
/* Open the output file */
- fout = CreateArchive(filename, archiveFormat, compression_spec,
+ fout = CreateArchive(filename, archiveFormat, &compression_spec,
dosync, archiveMode, setupDumpWorker);
/* Make dump options accessible right away */
--
2.25.1
Attachment: 0002-Prepare-pg_dump-internals-for-additional-compression.patch (text/x-diff; charset=us-ascii)
From 6a8f2cd926be4f0b83f2d2d5170cf02a2a825036 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 21 Dec 2022 09:49:23 +0000
Subject: [PATCH 2/7] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about it and uses it throughout.
---
src/bin/pg_dump/compress_io.c | 391 +++++++++++++++++++--------
src/bin/pg_dump/compress_io.h | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 136 ++++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
4 files changed, 326 insertions(+), 230 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 62a9527fa48..97b18337578 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -128,15 +132,24 @@ ReadDataFromArchive(ArchiveHandle *AH,
const pg_compress_specification *compression_spec,
ReadFunc readF)
{
- if (compression_spec->algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
+ switch (compression_spec->algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("this build does not support compression with %s", "gzip");
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
}
@@ -149,6 +162,9 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
{
switch (cs->compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ WriteDataToArchiveNone(AH, cs, data, dLen);
+ break;
case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
@@ -156,13 +172,11 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
pg_fatal("this build does not support compression with %s", "gzip");
#endif
break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
}
@@ -173,10 +187,26 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
free(cs);
}
@@ -391,10 +421,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_specification compression_spec;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -490,127 +518,204 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification *compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ const pg_compress_specification *compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
+ fp->compression_spec = *compression_spec;
+
+ switch (compression_spec->algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression_spec->level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec->level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compression_spec->level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compression_spec->level);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification *compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification *compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, (FILE *) fp->fp);
+ if (ret != size && !feof((FILE *) fp->fp))
+ READ_ERROR_EXIT((FILE *) fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread((gzFile) fp->fp, ptr, size);
+ if (ret != size && !gzeof((gzFile) fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite((gzFile) fp->fp, ptr, size);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc((FILE *) fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof((gzFile) fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return ret;
@@ -619,65 +724,119 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, (FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets((gzFile) fp->fp, buf, len);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compression_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose((FILE *) fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose((gzFile) fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof((gzFile) fp->fp);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 34f4e5e1e14..768096c820d 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -54,6 +54,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification *compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification *compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification *compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 0d91b75c748..cbe110c917a 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -100,9 +94,9 @@ static int RestoringToDB(ArchiveHandle *AH);
static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
- const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+ const pg_compress_specification *compression_spec);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -469,7 +468,7 @@ RestoreArchive(Archive *AHX)
*/
sav = SaveOutput(AH);
if (ropt->filename || ropt->compression_spec.algorithm != PG_COMPRESSION_NONE)
- SetOutput(AH, ropt->filename, ropt->compression_spec);
+ SetOutput(AH, ropt->filename, &ropt->compression_spec);
ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1137,7 +1136,7 @@ PrintTOCSummary(Archive *AHX)
sav = SaveOutput(AH);
if (ropt->filename)
- SetOutput(AH, ropt->filename, out_compression_spec);
+ SetOutput(AH, ropt->filename, &out_compression_spec);
if (strftime(stamp_str, sizeof(stamp_str), PGDUMP_STRFTIME_FMT,
localtime(&AH->createDate)) == 0)
@@ -1501,60 +1500,34 @@ archprintf(Archive *AH, const char *fmt,...)
static void
SetOutput(ArchiveHandle *AH, const char *filename,
- const pg_compress_specification compression_spec)
+ const pg_compress_specification *compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1565,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2219,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, &out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b4..4725e49747b 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.25.1
Attachment: 0003-Introduce-Compressor-API-in-pg_dump.patch (text/x-diff; charset=us-ascii)
From 6bcd16aa5d94b2fee99ed34fc8a76a757f569cb6 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 21 Dec 2022 09:49:31 +0000
Subject: [PATCH 3/7] Introduce Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementor of a
new compression method now only needs to supply those definitions.
Custom compressed archives now need to store the compression algorithm in
their header. This requires a bump in the archive version number. The
compression level is no longer stored in the dump, as it is irrelevant for
decompression.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 405 ++++++++++++
src/bin/pg_dump/compress_gzip.h | 22 +
src/bin/pg_dump/compress_io.c | 916 +++++++-------------------
src/bin/pg_dump/compress_io.h | 71 +-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 102 +--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 +--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 2 +
13 files changed, 852 insertions(+), 801 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e51..7a19f5d6172 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 00000000000..37c841c5a9b
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,405 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at the time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification *compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = *compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ CFH->private_data = NULL;
+
+ ret = gzclose(gzfp);
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification *compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = *compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification *compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification *compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 00000000000..a1fc3595e51
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * Gzip interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification *compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification *compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 97b18337578..576a8653193 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,44 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API is designed so that the
+ * resulting files can be easily manipulated with an external
+ * compression utility program.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then call readData, which reads the whole compressed
+ * stream by repeatedly calling the given ReadFunc. ReadFunc returns one
+ * chunk of compressed data at a time, and readData decompresses it and
+ * passes the decompressed data to ahwrite(), until ReadFunc returns 0 to
+ * signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs and custom LZ4 calls which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with the
+ * given name; if that fails, it tries the same file with the .gz suffix, and
+ * failing that, with the .lz4 suffix. cfopen_write() opens a file for
+ * writing; an extra argument specifies the compression method to use, and
+ * the matching suffix, .gz or .lz4, is added to the filename accordingly.
+ * This allows you to easily handle both compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,7 +55,11 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,84 +71,70 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification *compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = *compression_spec;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification *compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
cs->compression_spec = *compression_spec; // XXX
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification *compression_spec,
- ReadFunc readF)
-{
switch (compression_spec->algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -151,34 +143,8 @@ ReadDataFromArchive(ArchiveHandle *AH,
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ return cs;
}
/*
@@ -187,403 +153,177 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
+/*----------------------
+ * Compressed stream API
+ *----------------------
*/
-static void
-InitCompressorZlib(CompressorState *cs, int level)
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
+
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
}
+/* free() without changing errno; useful in several places below */
static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+free_keep_errno(void *p)
{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
+ int save_errno = errno;
- free(cs->zlibOut);
- free(cs->zp);
+ free(p);
+ errno = save_errno;
}
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+/*
+ * Compression None implementation
+ */
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
+ if (size == 0)
+ return 0;
- if (res == Z_STREAM_END)
- break;
- }
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
}
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
}
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+static const char *
+get_error_none(CompressFileHandle *CFH)
{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ return strerror(errno);
+}
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
+ return ret;
}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
+static int
+close_none(CompressFileHandle *CFH)
{
- size_t cnt;
- char *buf;
- size_t buflen;
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ CFH->private_data = NULL;
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
+ if (fp)
+ ret = fclose(fp);
- free(buf);
+ return ret;
}
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+
+static int
+eof_none(CompressFileHandle *CFH)
{
- cs->writeF(AH, data, dLen);
+ return feof((FILE *) CFH->private_data);
}
-
-/*----------------------
- * Compressed stream API
- *----------------------
- */
-
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
- pg_compress_specification compression_spec;
- void *fp;
-};
+ Assert(CFH->private_data == NULL);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
-/* free() without changing errno; useful in several places below */
-static void
-free_keep_errno(void *p)
-{
- int save_errno = errno;
+ if (CFH->private_data == NULL)
+ return 1;
- free(p);
- errno = save_errno;
+ return 0;
}
-/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_read(const char *path, const char *mode)
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
{
- cfp *fp;
+ Assert(CFH->private_data == NULL);
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, &compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, &compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, &compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+ return 0;
}
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification *compression_spec)
+static void
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification *compression_spec)
{
- cfp *fp;
-
- if (compression_spec->algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
}
/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- const pg_compress_specification *compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification *compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ CompressFileHandle *CFH;
- fp->compression_spec = *compression_spec;
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
switch (compression_spec->algorithm)
{
case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
+ InitCompressNone(CFH, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compression_spec->level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec->level);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- // fp->compressedfp = gzopen(path, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- // fp->compressedfp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -593,266 +333,88 @@ cfopen_internal(const char *path, int fd, const char *mode,
break;
}
- return fp;
+ return CFH;
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification *compression_spec)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (if
+ * 'path' doesn't already have it) and try again. So if you pass "foo" as
+ * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in
+ * that order.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification *compression_spec)
-{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret = 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- if (size == 0)
- return 0;
+ fname = pg_strdup(path);
- switch (fp->compression_spec.algorithm)
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, (FILE *) fp->fp);
- if (ret != size && !feof((FILE *) fp->fp))
- READ_ERROR_EXIT((FILE *) fp->fp);
+ bool exists;
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzread((gzFile) fp->fp, ptr, size);
- if (ret != size && !gzeof((gzFile) fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
+ exists = (stat(path, &st) == 0);
+ /* avoid unused-variable warning when built without compression support */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- ret = gzwrite((gzFile) fp->fp, ptr, size);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc((FILE *) fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT((FILE *) fp->fp);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof((gzFile) fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
- char *ret = NULL;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, (FILE *) fp->fp);
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets((gzFile) fp->fp, buf, len);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
}
- return ret;
-}
-
-int
-cfclose(cfp *fp)
-{
- int ret = 0;
-
- if (fp == NULL)
+ CFH = InitCompressFileHandle(&compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- errno = EBADF;
- return EOF;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
+ free_keep_errno(fname);
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fclose((FILE *) fp->fp);
- fp->fp = NULL;
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose((gzFile) fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
int
-cfeof(cfp *fp)
+DestroyCompressFileHandle(CompressFileHandle *CFH)
{
int ret = 0;
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof((FILE *) fp->fp);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof((gzFile) fp->fp);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ free_keep_errno(CFH);
return ret;
}
-
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-
- return strerror(errno);
-}
-
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 768096c820d..afe6b22efaf 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,63 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification *compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification *compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *cxt);
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+ int (*getc_func) (CompressFileHandle *CFH);
+ int (*eof_func) (CompressFileHandle *CFH);
+ int (*close_func) (CompressFileHandle *CFH);
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
-typedef struct cfp cfp;
+extern CompressFileHandle *InitCompressFileHandle(
+ const pg_compress_specification *compression_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification *compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification *compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification *compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a3740..aa2c91829c0 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -2,6 +2,7 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index cbe110c917a..06f0b46cbfc 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification *compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification *compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, &out_compress_spec);
+ CFH = InitCompressFileHandle(&out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3722,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3737,10 +3740,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747b..18b38c17abc 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 0e87444de85..40cd90b7325 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(&AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(&AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(&AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(&AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, &AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(&AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index ffb8a0e4dd7..2d4baf58c22 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- &AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(&AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, &compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(&compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", &compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(&compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, &AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(&AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 2eeef2a4783..f3ba9263213 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index fe432e2cccc..62f3e9a81d4 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,6 +150,7 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 23bafec5f79..840191d680b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.25.1
Attachment: 0004-f.patch (text/x-diff; charset=us-ascii)
From 100ff6665ddc4965d9cc4c0f2cd03d9b17a46099 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 14 Jan 2023 10:31:47 -0600
Subject: [PATCH 4/7] f!
---
src/bin/pg_dump/compress_gzip.c | 9 +--------
src/bin/pg_dump/pg_backup_archiver.c | 1 -
2 files changed, 1 insertion(+), 9 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 37c841c5a9b..b00be32f2e9 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -291,17 +291,10 @@ static int
Gzip_close(CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
- int save_errno;
- int ret;
CFH->private_data = NULL;
- ret = gzclose(gzfp);
-
- save_errno = errno;
- errno = save_errno;
-
- return ret;
+ return gzclose(gzfp);
}
static int
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 06f0b46cbfc..7f06beff61c 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -385,7 +385,6 @@ RestoreArchive(Archive *AHX)
*/
supports_compression = true;
if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
- AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
--
2.25.1
Attachment: 0005-Add-LZ4-compression-in-pg_-dump-restore.patch (text/x-diff; charset=us-ascii)
From efba6161c6d6849e5ff7cd922b1572a0e27e76b7 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 21 Dec 2022 09:49:36 +0000
Subject: [PATCH 5/7] Add LZ4 compression in pg_{dump|restore}
Within compress_lz4.{c,h}, both a streaming API and a file API for compression
are implemented. The first is aimed at inlined use cases, so simple lz4.h
calls can be used directly. The second generates output, or parses input,
that can be read or generated via the lz4 utility.
Wherever the LZ4F API does not provide functionality corresponding to
fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 11 +-
src/bin/pg_dump/compress_lz4.c | 618 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 756 insertions(+), 21 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e14..49d218905fb 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 7a19f5d6172..a1401377ab9 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 576a8653193..8ebefd1ed13 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -60,6 +60,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -137,7 +138,7 @@ AllocateCompressor(const pg_compress_specification *compression_spec,
InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressorLZ4(cs, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -326,7 +327,7 @@ InitCompressFileHandle(const pg_compress_specification *compression_spec)
InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressLZ4(CFH, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -340,12 +341,12 @@ InitCompressFileHandle(const pg_compress_specification *compression_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (i
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
- * order.
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
+ *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 00000000000..c97e16187a0
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,618 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the nessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurance of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer, is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' decompressed content, or up to the new line char
+ * if found first when the eol_flag is set. It is possible that the decompressed
+ * output generated by reading any compressed input via the LZ4F API, exceeds
+ * 'ptrsize'. Any exceeding decompressed content is stored at an overflow
+ * buffer within LZ4File. Of course, when the function is called, it will first
+ * try to consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verfiy that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, at the begining of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 00000000000..74595db1b98
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * Interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index aa2c91829c0..473d40d456f 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -17,7 +18,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -85,7 +86,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f06beff61c..2d406a5f0e3 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -394,6 +394,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2073,7 +2077,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2083,6 +2087,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3746,6 +3754,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 20f73729fac..224d2c900ce 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index f3ba9263213..f497ec60407 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files where compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index 62f3e9a81d4..2b461e797c6 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 840191d680b..232228d427c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1381,6 +1381,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.25.1
0006-f.patch (text/x-diff; charset=us-ascii)
From cf1efb67c49c5e31d77049f5469967dd750ee8c9 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 14 Jan 2023 10:45:02 -0600
Subject: [PATCH 6/7] f!
---
src/bin/pg_dump/compress_lz4.c | 34 ++++++++++++++++----------------
src/bin/pg_dump/compress_lz4.h | 4 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 2 +-
3 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index c97e16187a0..0e259a6251a 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -117,13 +117,13 @@ EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
/* Public routines that support LZ4 compressed data I/O */
void
-InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification *compression_spec)
{
cs->readData = ReadDataFromArchiveLZ4;
cs->writeData = WriteDataToArchiveLZ4;
cs->end = EndCompressorLZ4;
- cs->compression_spec = compression_spec;
+ cs->compression_spec = *compression_spec;
/* Will be lazy init'd */
cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
@@ -189,7 +189,7 @@ LZ4File_get_error(CompressFileHandle *CFH)
/*
* Prepare an already alloc'ed LZ4File struct for subsequent calls.
*
- * It creates the nessary contexts for the operations. When compressing,
+ * It creates the necessary contexts for the operations. When compressing,
* it additionally writes the LZ4 header in the output stream.
*/
static int
@@ -228,7 +228,7 @@ LZ4File_init(LZ4File * fs, int size, bool compressing)
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
- errno = errno ? : ENOSPC;
+ errno = errno ? errno : ENOSPC;
return 1;
}
}
@@ -255,7 +255,7 @@ LZ4File_init(LZ4File * fs, int size, bool compressing)
/*
* Read already decompressed content from the overflow buffer into 'ptr' up to
* 'size' bytes, if available. If the eol_flag is set, then stop at the first
- * occurance of the new line char prior to 'size' bytes.
+ * occurrence of the new line char prior to 'size' bytes.
*
* Any unread content in the overflow buffer, is moved to the beginning.
*/
@@ -309,10 +309,10 @@ LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
void *readbuf;
/* Lazy init */
- if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ if (LZ4File_init(fs, size, false /* decompressing */ ))
return -1;
- /* Verfiy that there is enough space in the outbuf */
+ /* Verify that there is enough space in the outbuf */
if (size > fs->buflen)
{
fs->buflen = size;
@@ -363,10 +363,10 @@ LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
if (outlen > 0 && dsize < size && eol_found == false)
{
char *p;
- size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t lib = eol_flag ? size - 1 - dsize : size - dsize ;
size_t len = outlen < lib ? outlen : lib;
- if (eol_flag == true &&
+ if (eol_flag &&
(p = memchr(fs->buffer, '\n', outlen)) &&
(size_t) (p - fs->buffer + 1) <= len)
{
@@ -377,7 +377,7 @@ LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
memcpy((char *) ptr + dsize, fs->buffer, len);
dsize += len;
- /* move what did not fit, if any, at the begining of the buf */
+ /* move what did not fit, if any, at the beginning of the buf */
if (len < outlen)
memmove(fs->buffer, fs->buffer + len, outlen - len);
outlen -= len;
@@ -414,7 +414,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
size_t status;
int remaining = size;
- if (!fs->inited && LZ4File_init(fs, size, true))
+ if (LZ4File_init(fs, size, true))
return -1;
while (remaining > 0)
@@ -433,7 +433,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
- errno = errno ? : ENOSPC;
+ errno = errno ? errno : ENOSPC;
return 1;
}
}
@@ -520,7 +520,7 @@ LZ4File_close(CompressFileHandle *CFH)
LZ4F_getErrorName(status));
else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
{
- errno = errno ? : ENOSPC;
+ errno = errno ? errno : ENOSPC;
WRITE_ERROR_EXIT;
}
@@ -582,7 +582,7 @@ LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
}
void
-InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification *compression_spec)
{
LZ4File *lz4fp;
@@ -596,7 +596,7 @@ InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compres
CFH->close_func = LZ4File_close;
CFH->get_error_func = LZ4File_get_error;
- CFH->compression_spec = compression_spec;
+ CFH->compression_spec = *compression_spec;
lz4fp = pg_malloc0(sizeof(*lz4fp));
if (CFH->compression_spec.level >= 0)
lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
@@ -605,13 +605,13 @@ InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compres
}
#else /* USE_LZ4 */
void
-InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification *compression_spec)
{
pg_fatal("this build does not support compression with %s", "LZ4");
}
void
-InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification *compression_spec)
{
pg_fatal("this build does not support compression with %s", "LZ4");
}
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
index 74595db1b98..69a3d9c171f 100644
--- a/src/bin/pg_dump/compress_lz4.h
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -16,7 +16,7 @@
#include "compress_io.h"
-extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
-extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification *compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification *compression_spec);
#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index f497ec60407..263995a2b7a 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -183,7 +183,7 @@ my %pgdump_runs = (
"$tempdir/compression_lz4_dir/blobs.toc.lz4",
],
},
- # Verify that data files where compressed
+ # Verify that data files were compressed
glob_patterns => [
"$tempdir/compression_lz4_dir/toc.dat",
"$tempdir/compression_lz4_dir/*.dat.lz4",
--
2.25.1
0007-TMP-pg_dump-use-lz4-by-default-for-CI-only.patch (text/x-diff; charset=us-ascii)
From d2fe3c9c5bb1fdce5c58af2773c603358085a0bb Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 4 Jan 2023 21:21:53 -0600
Subject: [PATCH 7/7] TMP: pg_dump: use lz4 by default, for CI only
---
src/bin/pg_dump/pg_dump.c | 7 +++++--
src/bin/pg_dump/t/002_pg_dump.pl | 8 ++++----
2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 224d2c900ce..cf5083c432f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -733,8 +733,11 @@ main(int argc, char **argv)
#ifdef HAVE_LIBZ
parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
&compression_spec);
-#else
- /* Nothing to do in the default case */
+#endif
+
+#ifdef USE_LZ4
+ parse_compress_specification(PG_COMPRESSION_LZ4, NULL,
+ &compression_spec);
#endif
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 263995a2b7a..3485ebca57d 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -313,9 +313,9 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: gzip/ :
+ qr/Compression: lz4/ :
qr/Compression: none/,
- name => 'data content is gzip-compressed by default if available',
+ name => 'data content is lz4-compressed by default if available',
},
},
@@ -338,7 +338,7 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: gzip/ :
+ qr/Compression: lz4/ :
qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
@@ -346,7 +346,7 @@ my %pgdump_runs = (
"$tempdir/defaults_dir_format/toc.dat",
"$tempdir/defaults_dir_format/blobs.toc",
$supports_gzip ?
- "$tempdir/defaults_dir_format/*.dat.gz" :
+ "$tempdir/defaults_dir_format/*.dat.lz4" :
"$tempdir/defaults_dir_format/*.dat",
],
},
--
2.25.1
On Sat, Jan 14, 2023 at 03:43:08PM -0600, Justin Pryzby wrote:
On Sun, Jan 08, 2023 at 01:45:25PM -0600, Justin Pryzby wrote:
pg_compress_specification is being passed by value, but I think it
should be passed as a pointer, as is done everywhere else.

ISTM that was an issue with 5e73a6048, affecting a few public and
private functions. I wrote a pre-preparatory patch which changes them to
pass by reference.
I updated 001 to change SetOutput() to pass by reference, too (before,
that ended up in the 002 patch).
I can't see any issue in 002 other than the == GZIP change (the fix for
which I'd previously included in a later patch).
One more thing: some comments still refer to the cfopen API, which this
patch removes.

There were "LZ4" comments and file extension stuff in the preparatory
commit. But now it seems like you *removed* them in the LZ4 commit
(where they actually belong) rather than *moving* them from the
prior/parent commit *to* the lz4 commit. I recommend running something
like "git diff @{1}" whenever doing this kind of patch surgery.

TODO
I addressed that in the fixup commits 005 and 007.
--
Justin
Attachments:
0001-pg_dump-pass-pg_compress_specification-as-a-pointer.patch (text/x-diff; charset=us-ascii)
From b822e6ace433235232a74cbe50af514271f9e49d Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 14 Jan 2023 10:58:23 -0600
Subject: [PATCH 1/8] pg_dump: pass pg_compress_specification as a pointer..
..as is done everywhere else.
---
src/bin/pg_dump/compress_io.c | 30 +++++++++++++--------------
src/bin/pg_dump/compress_io.h | 8 +++----
src/bin/pg_dump/pg_backup.h | 2 +-
src/bin/pg_dump/pg_backup_archiver.c | 24 ++++++++++-----------
src/bin/pg_dump/pg_backup_custom.c | 6 +++---
src/bin/pg_dump/pg_backup_directory.c | 8 +++----
src/bin/pg_dump/pg_dump.c | 2 +-
7 files changed, 40 insertions(+), 40 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4c..e5107c75874 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -94,19 +94,19 @@ static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
/* Allocate a new compressor */
CompressorState *
-AllocateCompressor(const pg_compress_specification compression_spec,
+AllocateCompressor(const pg_compress_specification *compression_spec,
WriteFunc writeF)
{
CompressorState *cs;
#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
pg_fatal("this build does not support compression with %s", "gzip");
#endif
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
+ cs->compression_spec = *compression_spec;
/*
* Perform compression algorithm specific initialization.
@@ -125,12 +125,12 @@ AllocateCompressor(const pg_compress_specification compression_spec,
*/
void
ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
ReadFunc readF)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ if (compression_spec->algorithm == PG_COMPRESSION_NONE)
ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
ReadDataFromArchiveZlib(AH, readF);
@@ -432,13 +432,13 @@ cfopen_read(const char *path, const char *mode)
if (hasSuffix(path, ".gz"))
{
compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
+ fp = cfopen(path, mode, &compression_spec);
}
else
#endif
{
compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
+ fp = cfopen(path, mode, &compression_spec);
#ifdef HAVE_LIBZ
if (fp == NULL)
{
@@ -446,7 +446,7 @@ cfopen_read(const char *path, const char *mode)
fname = psprintf("%s.gz", path);
compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
+ fp = cfopen(fname, mode, &compression_spec);
free_keep_errno(fname);
}
#endif
@@ -467,11 +467,11 @@ cfopen_read(const char *path, const char *mode)
*/
cfp *
cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+ const pg_compress_specification *compression_spec)
{
cfp *fp;
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ if (compression_spec->algorithm == PG_COMPRESSION_NONE)
fp = cfopen(path, mode, compression_spec);
else
{
@@ -497,20 +497,20 @@ cfopen_write(const char *path, const char *mode,
*/
cfp *
cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+ const pg_compress_specification *compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
{
#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+ if (compression_spec->level != Z_DEFAULT_COMPRESSION)
{
/* user has specified a compression level, so tell zlib to use it */
char mode_compression[32];
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
+ mode, compression_spec->level);
fp->compressedfp = gzopen(path, mode_compression);
}
else
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789d..34f4e5e1e14 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -40,10 +40,10 @@ typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
-extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+extern CompressorState *AllocateCompressor(const pg_compress_specification *compression_spec,
WriteFunc writeF);
extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
ReadFunc readF);
extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen);
@@ -53,10 +53,10 @@ extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
+ const pg_compress_specification *compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
+ const pg_compress_specification *compression_spec);
extern int cfread(void *ptr, int size, cfp *fp);
extern int cfwrite(const void *ptr, int size, cfp *fp);
extern int cfgetc(cfp *fp);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index aba780ef4b1..216e24e7ec5 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -305,7 +305,7 @@ extern Archive *OpenArchive(const char *FileSpec, const ArchiveFormat fmt);
/* Create a new archive */
extern Archive *CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f7a0f1ce7b..b82bad107f8 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -70,7 +70,7 @@ typedef struct _parallelReadyList
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr);
static void _getObjectDescription(PQExpBuffer buf, const TocEntry *te);
@@ -100,7 +100,7 @@ static int RestoringToDB(ArchiveHandle *AH);
static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
- const pg_compress_specification compression_spec);
+ const pg_compress_specification *compression_spec);
static OutputContext SaveOutput(ArchiveHandle *AH);
static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
@@ -241,7 +241,7 @@ setupRestoreWorker(Archive *AHX)
/* Public */
Archive *
CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker)
@@ -261,7 +261,7 @@ OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
pg_compress_specification compression_spec = {0};
compression_spec.algorithm = PG_COMPRESSION_NONE;
- AH = _allocAH(FileSpec, fmt, compression_spec, true,
+ AH = _allocAH(FileSpec, fmt, &compression_spec, true,
archModeRead, setupRestoreWorker);
return (Archive *) AH;
@@ -469,7 +469,7 @@ RestoreArchive(Archive *AHX)
*/
sav = SaveOutput(AH);
if (ropt->filename || ropt->compression_spec.algorithm != PG_COMPRESSION_NONE)
- SetOutput(AH, ropt->filename, ropt->compression_spec);
+ SetOutput(AH, ropt->filename, &ropt->compression_spec);
ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
@@ -1137,7 +1137,7 @@ PrintTOCSummary(Archive *AHX)
sav = SaveOutput(AH);
if (ropt->filename)
- SetOutput(AH, ropt->filename, out_compression_spec);
+ SetOutput(AH, ropt->filename, &out_compression_spec);
if (strftime(stamp_str, sizeof(stamp_str), PGDUMP_STRFTIME_FMT,
localtime(&AH->createDate)) == 0)
@@ -1501,7 +1501,7 @@ archprintf(Archive *AH, const char *fmt,...)
static void
SetOutput(ArchiveHandle *AH, const char *filename,
- const pg_compress_specification compression_spec)
+ const pg_compress_specification *compression_spec)
{
int fn;
@@ -1524,12 +1524,12 @@ SetOutput(ArchiveHandle *AH, const char *filename,
/* If compression explicitly requested, use gzopen */
#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
{
char fmode[14];
/* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
+ sprintf(fmode, "wb%d", compression_spec->level);
if (fn >= 0)
AH->OF = gzdopen(dup(fn), fmode);
else
@@ -2214,7 +2214,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
static ArchiveHandle *
_allocAH(const char *FileSpec, const ArchiveFormat fmt,
- const pg_compress_specification compression_spec,
+ const pg_compress_specification *compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr)
{
@@ -2266,7 +2266,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
AH->toc->prev = AH->toc;
AH->mode = mode;
- AH->compression_spec = compression_spec;
+ AH->compression_spec = *compression_spec;
AH->dosync = dosync;
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
@@ -2281,7 +2281,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
* Force stdin/stdout into binary mode if that is what we are using.
*/
#ifdef WIN32
- if ((fmt != archNull || compression_spec.algorithm != PG_COMPRESSION_NONE) &&
+ if ((fmt != archNull || compression_spec->algorithm != PG_COMPRESSION_NONE) &&
(AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0))
{
if (mode == archModeWrite)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a94..0e87444de85 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(&AH->compression_spec, _CustomWriteFunc);
}
/*
@@ -377,7 +377,7 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(&AH->compression_spec, _CustomWriteFunc);
}
/*
@@ -566,7 +566,7 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ ReadDataFromArchive(AH, &AH->compression_spec, _CustomReadFunc);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3cceef..ffb8a0e4dd7 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -328,7 +328,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
+ &AH->compression_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -584,7 +584,7 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
+ tocFH = cfopen_write(fname, PG_BINARY_W, &compression_spec);
if (tocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -649,7 +649,7 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
+ ctx->LOsTocFH = cfopen_write(fname, "ab", &compression_spec);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,7 +667,7 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
+ ctx->dataFH = cfopen_write(fname, PG_BINARY_W, &AH->compression_spec);
if (ctx->dataFH == NULL)
pg_fatal("could not open output file \"%s\": %m", fname);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2c0a9699729..20f73729fac 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -751,7 +751,7 @@ main(int argc, char **argv)
pg_fatal("parallel backup only supported by the directory format");
/* Open the output file */
- fout = CreateArchive(filename, archiveFormat, compression_spec,
+ fout = CreateArchive(filename, archiveFormat, &compression_spec,
dosync, archiveMode, setupDumpWorker);
/* Make dump options accessible right away */
--
2.25.1
Attachment: 0002-Prepare-pg_dump-internals-for-additional-compression.patch (text/x-diff; charset=us-ascii)
From 7f14216f4c46017f48efc73cb4e3021b573c5391 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 21 Dec 2022 09:49:23 +0000
Subject: [PATCH 2/8] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression related code and allow for the introduction of additional
archive formats. However, pg_backup_archiver.c was not using that API. This
commit teaches pg_backup_archiver.c about it and is using it throughout.
---
src/bin/pg_dump/compress_io.c | 389 +++++++++++++++++++--------
src/bin/pg_dump/compress_io.h | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 128 +++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
4 files changed, 320 insertions(+), 226 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index e5107c75874..8d0bec08d7c 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -128,15 +132,24 @@ ReadDataFromArchive(ArchiveHandle *AH,
const pg_compress_specification *compression_spec,
ReadFunc readF)
{
- if (compression_spec->algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
+ switch (compression_spec->algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("this build does not support compression with %s", "gzip");
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
}
@@ -149,6 +162,9 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
{
switch (cs->compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ WriteDataToArchiveNone(AH, cs, data, dLen);
+ break;
case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
@@ -156,13 +172,11 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
pg_fatal("this build does not support compression with %s", "gzip");
#endif
break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
}
@@ -173,10 +187,26 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
free(cs);
}
@@ -391,10 +421,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_specification compression_spec;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -490,127 +518,202 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification *compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ const pg_compress_specification *compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
+ fp->compression_spec = *compression_spec;
+
+ switch (compression_spec->algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression_spec->level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec->level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compression_spec->level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compression_spec->level);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification *compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification *compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, (FILE *) fp->fp);
+ if (ret != size && !feof((FILE *) fp->fp))
+ READ_ERROR_EXIT((FILE *) fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread((gzFile) fp->fp, ptr, size);
+ if (ret != size && !gzeof((gzFile) fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite((gzFile) fp->fp, ptr, size);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc((FILE *) fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof((gzFile) fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return ret;
@@ -619,65 +722,119 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, (FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets((gzFile) fp->fp, buf, len);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compression_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose((FILE *) fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose((gzFile) fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof((gzFile) fp->fp);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 34f4e5e1e14..768096c820d 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -54,6 +54,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification *compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification *compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification *compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index b82bad107f8..5164b57f042 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification *compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1503,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification *compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec->level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1565,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2219,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, &out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b4..4725e49747b 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.25.1
Attachment: 0003-f.patch (text/x-diff, charset=us-ascii)
From 5105ac46180d287f8df61a58bd8042f5f6aed1e0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sun, 15 Jan 2023 16:27:28 -0600
Subject: [PATCH 3/8] f!
---
src/bin/pg_dump/pg_backup_archiver.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 5164b57f042..d4f1e09fce6 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -385,7 +385,6 @@ RestoreArchive(Archive *AHX)
*/
supports_compression = true;
if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
- AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
--
2.25.1
Attachment: 0004-Introduce-Compressor-API-in-pg_dump.patch (text/x-diff, charset=us-ascii)
From a9db15f09c8cbf71603bddc7d0b411e6c91411c8 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 21 Dec 2022 09:49:31 +0000
Subject: [PATCH 4/8] Introduce Compressor API in pg_dump
The purpose of this API is to allow easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementer of a
new compression method now only needs to supply those definitions.
Custom compressed archives now need to store the compression algorithm in
their header, which requires a bump in the version number. The compression
level is no longer stored in the dump, as it is irrelevant there.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 405 ++++++++++++
src/bin/pg_dump/compress_gzip.h | 22 +
src/bin/pg_dump/compress_io.c | 914 +++++++-------------------
src/bin/pg_dump/compress_io.h | 71 +-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 102 +--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 +--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 2 +
13 files changed, 852 insertions(+), 799 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e51..7a19f5d6172 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 00000000000..37c841c5a9b
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,405 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at a time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification *compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = *compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ CFH->private_data = NULL;
+
+ ret = gzclose(gzfp);
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification *compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = *compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification *compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification *compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 00000000000..a1fc3595e51
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * Interface to compress_gzip.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification *compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification *compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 8d0bec08d7c..c4ac4042794 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,44 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims for the resulting
+ * files to be easily manipulated with an external compression
+ * utility program.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read all the data by calling readData to read the
+ * whole compressed stream which repeatedly calls the given ReadFunc. ReadFunc
+ * returns the compressed data chunk at a time, and readData decompresses it
+ * and passes the decompressed data to ahwrite(), until ReadFunc returns 0 to
+ * signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs and custom LZ4 calls which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with given
+ * name, and if it fails, it tries to open the same file with the .gz suffix,
+ * failing that it tries to open the same file with the .lz4 suffix.
+ * cfopen_write() opens a file for writing, an extra argument specifies the
+ * method to use should the file be compressed, and adds the appropriate
+ * suffix, .gz or .lz4, to the filename if so. This allows you to easily handle
+ * both compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,7 +55,11 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,84 +71,70 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification *compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = *compression_spec;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification *compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec->algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
cs->compression_spec = *compression_spec;
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification *compression_spec,
- ReadFunc readF)
-{
switch (compression_spec->algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -151,34 +143,8 @@ ReadDataFromArchive(ArchiveHandle *AH,
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ return cs;
}
/*
@@ -187,401 +153,177 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
+/*----------------------
+ * Compressed stream API
+ *----------------------
*/
-static void
-InitCompressorZlib(CompressorState *cs, int level)
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
+
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
}
+/* free() without changing errno; useful in several places below */
static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+free_keep_errno(void *p)
{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
+ int save_errno = errno;
- free(cs->zlibOut);
- free(cs->zp);
+ free(p);
+ errno = save_errno;
}
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+/*
+ * Compression None implementation
+ */
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
+ if (size == 0)
+ return 0;
- if (res == Z_STREAM_END)
- break;
- }
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
}
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
}
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+static const char *
+get_error_none(CompressFileHandle *CFH)
{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ return strerror(errno);
+}
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
+ return ret;
}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
+static int
+close_none(CompressFileHandle *CFH)
{
- size_t cnt;
- char *buf;
- size_t buflen;
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ CFH->private_data = NULL;
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
+ if (fp)
+ ret = fclose(fp);
- free(buf);
+ return ret;
}
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+
+static int
+eof_none(CompressFileHandle *CFH)
{
- cs->writeF(AH, data, dLen);
+ return feof((FILE *) CFH->private_data);
}
-
-/*----------------------
- * Compressed stream API
- *----------------------
- */
-
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
- pg_compress_specification compression_spec;
- void *fp;
-};
+ Assert(CFH->private_data == NULL);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
-/* free() without changing errno; useful in several places below */
-static void
-free_keep_errno(void *p)
-{
- int save_errno = errno;
+ if (CFH->private_data == NULL)
+ return 1;
- free(p);
- errno = save_errno;
+ return 0;
}
-/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_read(const char *path, const char *mode)
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
+ Assert(CFH->private_data == NULL);
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, &compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, &compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, &compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+ return 0;
}
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification *compression_spec)
+static void
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification *compression_spec)
{
- cfp *fp;
-
- if (compression_spec->algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
}
/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- const pg_compress_specification *compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification *compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ CompressFileHandle *CFH;
- fp->compression_spec = *compression_spec;
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
switch (compression_spec->algorithm)
{
case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
+ InitCompressNone(CFH, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compression_spec->level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec->level);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -591,266 +333,88 @@ cfopen_internal(const char *path, int fd, const char *mode,
break;
}
- return fp;
+ return CFH;
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification *compression_spec)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (if
+ * 'path' doesn't already have it) and try again. So if you pass "foo" as
+ * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
+ * order.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification *compression_spec)
-{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret = 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- if (size == 0)
- return 0;
+ fname = pg_strdup(path);
- switch (fp->compression_spec.algorithm)
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, (FILE *) fp->fp);
- if (ret != size && !feof((FILE *) fp->fp))
- READ_ERROR_EXIT((FILE *) fp->fp);
+ bool exists;
- break;
- case PG_COMPRESSION_GZIP:
+ exists = (stat(path, &st) == 0);
+ /* avoid unused warning if it is not built with compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- ret = gzread((gzFile) fp->fp, ptr, size);
- if (ret != size && !gzeof((gzFile) fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite((gzFile) fp->fp, ptr, size);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc((FILE *) fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT((FILE *) fp->fp);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof((gzFile) fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
- char *ret = NULL;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, (FILE *) fp->fp);
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets((gzFile) fp->fp, buf, len);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfclose(cfp *fp)
-{
- int ret = 0;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
}
- switch (fp->compression_spec.algorithm)
+ CFH = InitCompressFileHandle(&compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- case PG_COMPRESSION_NONE:
- ret = fclose((FILE *) fp->fp);
- fp->fp = NULL;
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose((gzFile) fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
+ free_keep_errno(fname);
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
int
-cfeof(cfp *fp)
+DestroyCompressFileHandle(CompressFileHandle *CFH)
{
int ret = 0;
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof((FILE *) fp->fp);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof((gzFile) fp->fp);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ free_keep_errno(CFH);
return ret;
}
-
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-
- return strerror(errno);
-}
-
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 768096c820d..afe6b22efaf 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,63 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification *compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification *compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *cxt);
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+ int (*getc_func) (CompressFileHandle *CFH);
+ int (*eof_func) (CompressFileHandle *CFH);
+ int (*close_func) (CompressFileHandle *CFH);
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
-typedef struct cfp cfp;
+extern CompressFileHandle *InitCompressFileHandle(
+ const pg_compress_specification *compression_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification *compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification *compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification *compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a3740..aa2c91829c0 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -2,6 +2,7 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index d4f1e09fce6..9e9f1d626b5 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification *compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1126,7 +1126,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1142,9 +1142,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1501,6 +1502,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification *compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1523,33 +1525,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1688,7 +1689,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2030,6 +2035,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2060,26 +2077,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2177,6 +2180,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2232,7 +2236,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, &out_compress_spec);
+ CFH = InitCompressFileHandle(&out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3645,12 +3652,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3721,10 +3723,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3736,10 +3739,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747b..18b38c17abc 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 0e87444de85..40cd90b7325 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(&AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(&AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(&AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(&AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, &AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(&AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index ffb8a0e4dd7..2d4baf58c22 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- &AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(&AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, &compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(&compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", &compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(&compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, &AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(&AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 2eeef2a4783..f3ba9263213 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index fe432e2cccc..62f3e9a81d4 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,6 +150,7 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 23bafec5f79..840191d680b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.25.1
Attachment: 0005-f.patch (text/x-diff; charset=us-ascii)
From 84404669ee9473d4ef7c24127453b37143324450 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 14 Jan 2023 10:31:47 -0600
Subject: [PATCH 5/8] f!
---
src/bin/pg_dump/compress_gzip.c | 13 +----------
src/bin/pg_dump/compress_io.c | 40 ++++++++++-----------------------
2 files changed, 13 insertions(+), 40 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 37c841c5a9b..a021d414624 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -220,8 +220,6 @@ InitCompressorGzip(CompressorState *cs, const pg_compress_specification *compres
cs->writeData = WriteDataToArchiveGzip;
cs->end = EndCompressorGzip;
- cs->compression_spec = *compression_spec;
-
gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
cs->private_data = gzipcs;
@@ -291,17 +289,10 @@ static int
Gzip_close(CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
- int save_errno;
- int ret;
CFH->private_data = NULL;
- ret = gzclose(gzfp);
-
- save_errno = errno;
- errno = save_errno;
-
- return ret;
+ return gzclose(gzfp);
}
static int
@@ -386,8 +377,6 @@ InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification *compr
CFH->eof_func = Gzip_eof;
CFH->get_error_func = Gzip_get_error;
- CFH->compression_spec = *compression_spec;
-
CFH->private_data = NULL;
}
#else /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index c4ac4042794..7a1fd318c12 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -39,14 +39,13 @@
*
* The compressed stream API is a wrapper around the C standard fopen() and
* libz's gzopen() APIs and custom LZ4 calls which provide similar
- * functionality. It allows you to use the same functions for compressed and
- * uncompressed streams. cfopen_read() first tries to open the file with given
- * name, and if it fails, it tries to open the same file with the .gz suffix,
- * failing that it tries to open the same file with the .lz4 suffix.
- * cfopen_write() opens a file for writing, an extra argument specifies the
- * method to use should the file be compressed, and adds the appropriate
- * suffix, .gz or .lz4, to the filename if so. This allows you to easily handle
- * both compressed and uncompressed files.
+ * libz's gzopen() APIs. It allows you to use the same functions for
+ * compressed and uncompressed streams. cfopen_read() first tries to open
+ * the file with given name, and if it fails, it tries to open the same
+ * file with the .gz suffix. cfopen_write() opens a file for writing, an
+ * extra argument specifies if the file should be compressed, and adds the
+ * .gz suffix to the filename if so. This allows you to easily handle both
+ * compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -110,8 +109,6 @@ InitCompressorNone(CompressorState *cs,
cs->readData = ReadDataFromArchiveNone;
cs->writeData = WriteDataToArchiveNone;
cs->end = EndCompressorNone;
-
- cs->compression_spec = *compression_spec;
}
/* Public interface routines */
@@ -126,7 +123,8 @@ AllocateCompressor(const pg_compress_specification *compression_spec,
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = *compression_spec;
+ cs->compression_spec = *compression_spec; // XXX: should do this here rather than every compressor ?
+ // does it even need to be passed at all ?
switch (compression_spec->algorithm)
{
@@ -340,10 +338,9 @@ InitCompressFileHandle(const pg_compress_specification *compression_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (i
- * 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
- * order.
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
+ * doesn't already have it) and try again. So if you pass "foo" as 'path',
+ * this will open either "foo" or "foo.gz".
*
* On failure, return NULL with an error code in errno.
*/
@@ -355,8 +352,6 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
char *fname;
pg_compress_specification compression_spec = {0};
- // compression_spec.algorithm = PG_COMPRESSION_NONE;
-
Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
fname = strdup(path);
@@ -381,17 +376,6 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
-#endif
-#ifdef USE_LZ4
- if (!exists)
- {
- free_keep_errno(fname);
- fname = psprintf("%s.lz4", path);
- exists = (stat(fname, &st) == 0);
-
- if (exists)
- compression_spec.algorithm = PG_COMPRESSION_LZ4;
- }
#endif
}
--
2.25.1
Attachment: 0006-Add-LZ4-compression-in-pg_-dump-restore.patch (text/x-diff; charset=us-ascii)
From 701484d05cab87147843b21971581bd61a0e01db Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 21 Dec 2022 09:49:36 +0000
Subject: [PATCH 6/8] Add LZ4 compression in pg_{dump|restore}
Within compress_lz4.{c,h} both a streaming API and a file API for compression
are implemented. The first is aimed at inlined use cases, so simple lz4.h
calls can be used directly. The second generates output, or parses input,
which can be read/generated via the lz4 utility.
Wherever the LZ4F API does not implement all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 13 +-
src/bin/pg_dump/compress_lz4.c | 618 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 758 insertions(+), 21 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e14..49d218905fb 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 7a19f5d6172..a1401377ab9 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a1fd318c12..95b18843080 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -59,6 +59,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -135,7 +136,7 @@ AllocateCompressor(const pg_compress_specification *compression_spec,
InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressorLZ4(cs, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -324,7 +325,7 @@ InitCompressFileHandle(const pg_compress_specification *compression_spec)
InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressLZ4(CFH, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -338,11 +339,13 @@ InitCompressFileHandle(const pg_compress_specification *compression_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
+ * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (if
+ * 'path' doesn't already have it) and try again. So if you pass "foo" as
+ * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
+ * order.
*
* On failure, return NULL with an error code in errno.
+ *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 00000000000..c97e16187a0
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,618 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to read or write an LZ4 compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * new line char if found first when the eol_flag is set. It is possible that
+ * the decompressed output generated by reading any compressed input via the
+ * LZ4F API exceeds 'ptrsize'. Any surplus decompressed content is stored in
+ * an overflow buffer within LZ4File. When the function is called, it will
+ * first consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing, it will write any
+ * remaining content and/or the footer generated by the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 00000000000..74595db1b98
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * LZ4 interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index aa2c91829c0..473d40d456f 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -17,7 +18,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -85,7 +86,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 9e9f1d626b5..40e96c93ebc 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -394,6 +394,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2073,7 +2077,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2083,6 +2087,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3746,6 +3754,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 20f73729fac..224d2c900ce 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index f3ba9263213..f497ec60407 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files where compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index 62f3e9a81d4..2b461e797c6 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 840191d680b..232228d427c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1381,6 +1381,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.25.1
Attachment: 0007-f.patch (text/x-diff; charset=us-ascii)
From bc66e2b035ebd24de71ea41c909c62ca1aae2e2d Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sat, 14 Jan 2023 10:45:02 -0600
Subject: [PATCH 7/8] f!
---
src/bin/pg_dump/compress_io.c | 42 ++++++++++++++-------------
src/bin/pg_dump/compress_io.h | 3 +-
src/bin/pg_dump/compress_lz4.c | 34 +++++++++++-----------
src/bin/pg_dump/compress_lz4.h | 4 +--
src/bin/pg_dump/pg_backup_directory.c | 10 +++++--
src/bin/pg_dump/t/002_pg_dump.pl | 2 +-
6 files changed, 51 insertions(+), 44 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 95b18843080..01bf3df0db0 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -39,13 +39,14 @@
*
* The compressed stream API is a wrapper around the C standard fopen() and
* libz's gzopen() APIs and custom LZ4 calls which provide similar
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with given
+ * name, and if it fails, it tries to open the same file with the .gz suffix,
+ * failing that it tries to open the same file with the .lz4 suffix.
+ * cfopen_write() opens a file for writing, an extra argument specifies the
+ * method to use should the file be compressed, and adds the appropriate
+ * suffix, .gz or .lz4, to the filename if so. This allows you to easily handle
+ * both compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -335,6 +336,14 @@ InitCompressFileHandle(const pg_compress_specification *compression_spec)
return CFH;
}
+static bool
+check_compressed_file(const char *path, char **fname, char *ext)
+{
+ free_keep_errno(*fname);
+ *fname = psprintf("%s.%s", path, ext);
+ return (access(*fname, F_OK) == 0);
+}
+
/*
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
@@ -348,7 +357,7 @@ InitCompressFileHandle(const pg_compress_specification *compression_spec)
*
*/
CompressFileHandle *
-InitDiscoverCompressFileHandle(const char *path, const char *mode)
+InitDiscoverCompressFileHandle(const char *path, const char *mode, pg_compress_algorithm alg)
{
CompressFileHandle *CFH = NULL;
struct stat st;
@@ -366,20 +375,13 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
bool exists;
exists = (stat(path, &st) == 0);
- /* avoid unused warning if it is not build with compression */
+ /* avoid unused warning if it is not built with compression */
if (exists)
compression_spec.algorithm = PG_COMPRESSION_NONE;
-#ifdef HAVE_LIBZ
- if (!exists)
- {
- free_keep_errno(fname);
- fname = psprintf("%s.gz", path);
- exists = (stat(fname, &st) == 0);
-
- if (exists)
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- }
-#endif
+ else if (alg == PG_COMPRESSION_GZIP && check_compressed_file(path, &fname, "gz"))
+ compression_spec.algorithm = alg;
+ else if (alg == PG_COMPRESSION_LZ4 && check_compressed_file(path, &fname, "lz4"))
+ compression_spec.algorithm = alg;
}
CFH = InitCompressFileHandle(&compression_spec);
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index afe6b22efaf..2600182c469 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -93,7 +93,8 @@ extern CompressFileHandle *InitCompressFileHandle(
const pg_compress_specification *compression_spec);
extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
- const char *mode);
+ const char *mode,
+ pg_compress_algorithm alg);
extern int DestroyCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index c97e16187a0..0e259a6251a 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -117,13 +117,13 @@ EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
/* Public routines that support LZ4 compressed data I/O */
void
-InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification *compression_spec)
{
cs->readData = ReadDataFromArchiveLZ4;
cs->writeData = WriteDataToArchiveLZ4;
cs->end = EndCompressorLZ4;
- cs->compression_spec = compression_spec;
+ cs->compression_spec = *compression_spec;
/* Will be lazy init'd */
cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
@@ -189,7 +189,7 @@ LZ4File_get_error(CompressFileHandle *CFH)
/*
* Prepare an already alloc'ed LZ4File struct for subsequent calls.
*
- * It creates the nessary contexts for the operations. When compressing,
+ * It creates the necessary contexts for the operations. When compressing,
* it additionally writes the LZ4 header in the output stream.
*/
static int
@@ -228,7 +228,7 @@ LZ4File_init(LZ4File * fs, int size, bool compressing)
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
- errno = errno ? : ENOSPC;
+ errno = errno ? errno : ENOSPC;
return 1;
}
}
@@ -255,7 +255,7 @@ LZ4File_init(LZ4File * fs, int size, bool compressing)
/*
* Read already decompressed content from the overflow buffer into 'ptr' up to
* 'size' bytes, if available. If the eol_flag is set, then stop at the first
- * occurance of the new line char prior to 'size' bytes.
+ * occurrence of the new line char prior to 'size' bytes.
*
* Any unread content in the overflow buffer, is moved to the beginning.
*/
@@ -309,10 +309,10 @@ LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
void *readbuf;
/* Lazy init */
- if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ if (LZ4File_init(fs, size, false /* decompressing */ ))
return -1;
- /* Verfiy that there is enough space in the outbuf */
+ /* Verify that there is enough space in the outbuf */
if (size > fs->buflen)
{
fs->buflen = size;
@@ -363,10 +363,10 @@ LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
if (outlen > 0 && dsize < size && eol_found == false)
{
char *p;
- size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t lib = eol_flag ? size - 1 - dsize : size - dsize ;
size_t len = outlen < lib ? outlen : lib;
- if (eol_flag == true &&
+ if (eol_flag &&
(p = memchr(fs->buffer, '\n', outlen)) &&
(size_t) (p - fs->buffer + 1) <= len)
{
@@ -377,7 +377,7 @@ LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
memcpy((char *) ptr + dsize, fs->buffer, len);
dsize += len;
- /* move what did not fit, if any, at the begining of the buf */
+ /* move what did not fit, if any, at the beginning of the buf */
if (len < outlen)
memmove(fs->buffer, fs->buffer + len, outlen - len);
outlen -= len;
@@ -414,7 +414,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
size_t status;
int remaining = size;
- if (!fs->inited && LZ4File_init(fs, size, true))
+ if (LZ4File_init(fs, size, true))
return -1;
while (remaining > 0)
@@ -433,7 +433,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
- errno = errno ? : ENOSPC;
+ errno = errno ? errno : ENOSPC;
return 1;
}
}
@@ -520,7 +520,7 @@ LZ4File_close(CompressFileHandle *CFH)
LZ4F_getErrorName(status));
else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
{
- errno = errno ? : ENOSPC;
+ errno = errno ? errno : ENOSPC;
WRITE_ERROR_EXIT;
}
@@ -582,7 +582,7 @@ LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
}
void
-InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification *compression_spec)
{
LZ4File *lz4fp;
@@ -596,7 +596,7 @@ InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compres
CFH->close_func = LZ4File_close;
CFH->get_error_func = LZ4File_get_error;
- CFH->compression_spec = compression_spec;
+ CFH->compression_spec = *compression_spec;
lz4fp = pg_malloc0(sizeof(*lz4fp));
if (CFH->compression_spec.level >= 0)
lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
@@ -605,13 +605,13 @@ InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compres
}
#else /* USE_LZ4 */
void
-InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification *compression_spec)
{
pg_fatal("this build does not support compression with %s", "LZ4");
}
void
-InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification *compression_spec)
{
pg_fatal("this build does not support compression with %s", "LZ4");
}
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
index 74595db1b98..69a3d9c171f 100644
--- a/src/bin/pg_dump/compress_lz4.h
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -16,7 +16,7 @@
#include "compress_io.h"
-extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
-extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification *compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification *compression_spec);
#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 2d4baf58c22..4845fd9368c 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -201,7 +201,8 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
setFilePath(AH, fname, "toc.dat");
- tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R,
+ PG_COMPRESSION_NONE);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -390,7 +391,8 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
if (!filename)
return;
- CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R,
+ AH->compression_spec.algorithm);
if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
@@ -442,7 +444,8 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R,
+ AH->compression_spec.algorithm);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
@@ -783,6 +786,7 @@ _PrepParallelRestore(ArchiveHandle *AH)
{
/* It might be compressed */
strlcat(fname, ".gz", sizeof(fname));
+ // XXX
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index f497ec60407..263995a2b7a 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -183,7 +183,7 @@ my %pgdump_runs = (
"$tempdir/compression_lz4_dir/blobs.toc.lz4",
],
},
- # Verify that data files where compressed
+ # Verify that data files were compressed
glob_patterns => [
"$tempdir/compression_lz4_dir/toc.dat",
"$tempdir/compression_lz4_dir/*.dat.lz4",
--
2.25.1
Attachment: 0008-TMP-pg_dump-use-lz4-by-default-for-CI-only.patch (text/x-diff; charset=us-ascii)
From 6c21acef733a52aa95fd517d73033d0fe9e5efbd Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 4 Jan 2023 21:21:53 -0600
Subject: [PATCH 8/8] TMP: pg_dump: use lz4 by default, for CI only
---
src/bin/pg_dump/pg_dump.c | 7 +++++--
src/bin/pg_dump/t/002_pg_dump.pl | 8 ++++----
2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 224d2c900ce..cf5083c432f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -733,8 +733,11 @@ main(int argc, char **argv)
#ifdef HAVE_LIBZ
parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
&compression_spec);
-#else
- /* Nothing to do in the default case */
+#endif
+
+#ifdef USE_LZ4
+ parse_compress_specification(PG_COMPRESSION_LZ4, NULL,
+ &compression_spec);
#endif
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 263995a2b7a..3485ebca57d 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -313,9 +313,9 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: gzip/ :
+ qr/Compression: lz4/ :
qr/Compression: none/,
- name => 'data content is gzip-compressed by default if available',
+ name => 'data content is lz4-compressed by default if available',
},
},
@@ -338,7 +338,7 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: gzip/ :
+ qr/Compression: lz4/ :
qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
@@ -346,7 +346,7 @@ my %pgdump_runs = (
"$tempdir/defaults_dir_format/toc.dat",
"$tempdir/defaults_dir_format/blobs.toc",
$supports_gzip ?
- "$tempdir/defaults_dir_format/*.dat.gz" :
+ "$tempdir/defaults_dir_format/*.dat.lz4" :
"$tempdir/defaults_dir_format/*.dat",
],
},
--
2.25.1
On Sat, Jan 14, 2023 at 03:43:09PM -0600, Justin Pryzby wrote:
On Sun, Jan 08, 2023 at 01:45:25PM -0600, Justin Pryzby wrote:
pg_compress_specification is being passed by value, but I think it
should be passed as a pointer, as is done everywhere else.
ISTM that was an issue with 5e73a6048, affecting a few public and
private functions. I wrote a pre-preparatory patch which changes to
pass by reference.
The functions changed by 0001 are cfopen[_write](),
AllocateCompressor() and ReadDataFromArchive(). Why is it a good idea
to change these interfaces which basically exist to handle inputs? Is
there some benefit in changing compression_spec within the internals
of these routines before going back one layer down to their callers?
Changing the compression_spec on-the-fly in these internal paths could
be risky, actually, no?
And addressed a handful of other issues I reported as separate fixup
commits. And changed to use LZ4 by default for CI.
Are your slight changes shaped as of 0003-f.patch, 0005-f.patch and
0007-f.patch on top of the original patches sent by Georgios?
I also rebased my 2 year old patch to support zstd in pg_dump. I hope
it can finally be added for v16. I'll send it for the next CF if these
patches progress.
Good idea to see if what you have done for zstd fits with what's
presented here.
pg_compress_algorithm is being written directly into the pg_dump header.
Do you mean that this is what happens once the patch series 0001~0008
sent upthread is applied on HEAD?
Currently, I think that's not an externally-visible value (it could be
renumbered, theoretically even in a minor release). Maybe there should
be a "private" enum for encoding the pg_dump header, similar to
WAL_COMPRESSION_LZ4 vs BKPIMAGE_COMPRESS_LZ4 ? Or else a comment there
should warn that the values are encoded in pg_dump, and must never be
changed.Michael, WDYT ?
Changing the order of the members in an enum would cause an ABI
breakage, so that would not happen, and we tend to be very careful
about that. Appending new members would be fine, though. FWIW, I'd
rather avoid adding more enums that would just be exact maps to
pg_compress_algorithm.
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
I may be missing something here, but it seems to me that you ought to
store as well the level in the dump header, or it would not be
possible to report in the dump's description what was used? Hence,
K_VERS_1_15 should imply that we have both the method compression and
the compression level.
--
Michael
On Mon, Jan 16, 2023 at 10:28:50AM +0900, Michael Paquier wrote:
On Sat, Jan 14, 2023 at 03:43:09PM -0600, Justin Pryzby wrote:
On Sun, Jan 08, 2023 at 01:45:25PM -0600, Justin Pryzby wrote:
pg_compress_specification is being passed by value, but I think it
should be passed as a pointer, as is done everywhere else.ISTM that was an issue with 5e73a6048, affecting a few public and
private functions. I wrote a pre-preparatory patch which changes to
pass by reference.The functions changed by 0001 are cfopen[_write](),
AllocateCompressor() and ReadDataFromArchive(). Why is it a good idea
to change these interfaces which basically exist to handle inputs?
I changed to pass pg_compress_specification as a pointer, since that's
the usual convention for structs, as followed by the existing uses of
pg_compress_specification.
Is there some benefit in changing compression_spec within the
internals of these routines before going back one layer down to their
callers? Changing the compression_spec on-the-fly in these internal
paths could be risky, actually, no?
I think what you're saying is that if the spec is passed as a pointer,
then the called functions shouldn't set spec->algorithm=something.
I agree that if they need to do that, they should use a local variable.
Which looks to be true for the functions that were changed in 001.
And addressed a handful of other issues I reported as separate fixup
commits. And changed to use LZ4 by default for CI.
Are your slight changes shaped as of 0003-f.patch, 0005-f.patch and
0007-f.patch on top of the original patches sent by Georgios?
Yes, the original patches, rebased as needed on top of HEAD and 001...
pg_compress_algorithm is being written directly into the pg_dump header.
Do you mean that this is what happens once the patch series 0001~0008
sent upthread is applied on HEAD?
Yes
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
I may be missing something here, but it seems to me that you ought to
store as well the level in the dump header, or it would not be
possible to report in the dump's description what was used? Hence,
K_VERS_1_15 should imply that we have both the method compression and
the compression level.
Maybe. But the "level" isn't needed for decompression for any case I'm
aware of.
Also, dumps with the default compression level currently say:
"Compression: -1", which doesn't seem valuable.
--
Justin
Oh, I didn’t realize you took over, Justin? Why? After almost a year of work?
This is rather disheartening.
On Mon, Jan 16, 2023 at 02:56, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Jan 16, 2023 at 02:27:56AM +0000, gkokolatos@pm.me wrote:
Oh, I didn’t realize you took over Justin? Why? After almost a year of work?
This is rather disheartening.
I believe you've misunderstood my intent here. I sent rebased versions
of your patches with fixup commits implementing fixes that I'd
previously sent. I don't think that's unusual. I hope your patches
will be included in v16, and I hope to facilitate that. I don't mean
any offense. Actually, the fixups are provided as separate patches so
you can adopt the changes easily into your branch.
--
Justin
On Sun, Jan 15, 2023 at 07:56:25PM -0600, Justin Pryzby wrote:
On Mon, Jan 16, 2023 at 10:28:50AM +0900, Michael Paquier wrote:
The functions changed by 0001 are cfopen[_write](),
AllocateCompressor() and ReadDataFromArchive(). Why is it a good idea
to change these interfaces which basically exist to handle inputs?
I changed to pass pg_compress_specification as a pointer, since that's
the usual convention for structs, as followed by the existing uses of
pg_compress_specification.
Okay, but what do we gain here? It seems to me that this introduces
the risk that a careless change in one of the internal routines if
they change slight;ly compress_spec, hence impacting any of their
callers? Or is that fixing an actual bug (except if I am missing your
point, that does not seem to be the case)?
Is there some benefit in changing compression_spec within the
internals of these routines before going back one layer down to their
callers? Changing the compression_spec on-the-fly in these internal
paths could be risky, actually, no?
I think what you're saying is that if the spec is passed as a pointer,
then the called functions shouldn't set spec->algorithm=something.
Yes. HEAD makes sure of that, 0001 would not prevent that. So I am a
bit confused in seeing how this is a benefit.
--
Michael
Hi,
I admit I am completely at a loss as to what is expected of me anymore.
I had posted v19-0001 for a committer's consideration and v19-000{2,3} for completeness.
Please find a rebased v20 attached.
Also please let me know if I should silently step away from it and let other people lead
it. I would be glad to comply either way.
Cheers,
//Georgios
------- Original Message -------
On Monday, January 16th, 2023 at 3:54 AM, Michael Paquier <michael@paquier.xyz> wrote:
Attachments:
Attachment: v20-0001-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 348a306b47148790c602b2c208ff3345befa8eb7 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 16 Jan 2023 14:56:31 +0000
Subject: [PATCH v20 1/3] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression related code and allow for the introduction of additional
archive formats. However, pg_backup_archiver.c was not using that API. This
commit teaches pg_backup_archiver.c about it and is using it throughout.
---
src/bin/pg_dump/compress_io.c | 389 +++++++++++++++++++--------
src/bin/pg_dump/pg_backup_archiver.c | 128 +++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
3 files changed, 318 insertions(+), 226 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..1db973b6a2 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -128,15 +132,24 @@ ReadDataFromArchive(ArchiveHandle *AH,
const pg_compress_specification compression_spec,
ReadFunc readF)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ switch (compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("this build does not support compression with %s", "gzip");
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
}
@@ -149,6 +162,9 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
{
switch (cs->compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ WriteDataToArchiveNone(AH, cs, data, dLen);
+ break;
case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
WriteDataToArchiveZlib(AH, cs, data, dLen);
@@ -156,13 +172,11 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
pg_fatal("this build does not support compression with %s", "gzip");
#endif
break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
case PG_COMPRESSION_LZ4:
- /* fallthrough */
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
}
@@ -173,10 +187,26 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
free(cs);
}
@@ -391,10 +421,8 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
+ pg_compress_specification compression_spec;
+ void *fp;
};
#ifdef HAVE_LIBZ
@@ -490,127 +518,202 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ fp->compression_spec = compression_spec;
+
+ switch (compression_spec.algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->fp = fdopen(fd, mode);
+ else
+ fp->fp = fopen(path, mode);
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compression_spec.level);
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode_compression);
+ else
+ fp->fp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->fp = gzdopen(fd, mode);
+ else
+ fp->fp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return fp;
}
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, (FILE *) fp->fp);
+ if (ret != size && !feof((FILE *) fp->fp))
+ READ_ERROR_EXIT((FILE *) fp->fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread((gzFile) fp->fp, ptr, size);
+ if (ret != size && !gzeof((gzFile) fp->fp))
+ {
+ int errnum;
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite((gzFile) fp->fp, ptr, size);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc((FILE *) fp->fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc((gzFile) fp->fp);
+ if (ret == EOF)
+ {
+ if (!gzeof((gzFile) fp->fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
return ret;
@@ -619,65 +722,119 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, (FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets((gzFile) fp->fp, buf, len);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compression_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose((FILE *) fp->fp);
+ fp->fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose((gzFile) fp->fp);
+ fp->fp = NULL;
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof((FILE *) fp->fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof((gzFile) fp->fp);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f7a0f1ce7..fb94317ad9 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1503,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1565,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2219,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
Attachment: v20-0003-THIS-IS-WIP-PATCH-PRESENTED-HERE-FOR-COMPLETENES.patch (text/x-patch)
From 3c7e5b8ebe3c4d063f8d281a4c9d854fef6578b0 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 16 Jan 2023 15:01:05 +0000
Subject: [PATCH v20 3/3] THIS IS WIP PATCH PRESENTED HERE FOR COMPLETENESS:
LZ4 compression
Please be aware that comments, references, naming, etc., are not scrutinized.
Those will be addressed once 0001 of the series gets closer to a committable
state. However, the patch should apply cleanly and the tests should be passing.
Within compress_lz4.{c,h}, both a streaming API and a file API for compression
are implemented. The first is aimed at inlined use cases, so simple lz4.h
calls can be used directly. The second generates output, or parses input, that
can be read or generated via the lz4 utility.
Wherever the LZ4F API does not implement all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 11 +-
src/bin/pg_dump/compress_lz4.c | 618 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 756 insertions(+), 21 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 7a19f5d617..a1401377ab 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7274039e87..43bd4c1a2c 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -60,6 +60,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -136,7 +137,7 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressorLZ4(cs, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -326,7 +327,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
+ InitCompressLZ4(CFH, compression_spec);
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
@@ -340,12 +341,12 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
- * If the file at 'path' does not exist, we append the "{.gz,.lz4}" suffix (i
+ * If the file at 'path' does not exist, we append the ".gz" suffix (if
* 'path' doesn't already have it) and try again. So if you pass "foo" as
- * 'path', this will open either "foo" or "foo.gz" or "foo.lz4", trying in that
- * order.
+ * 'path', this will open either "foo" or "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
+ *
*/
CompressFileHandle *
InitDiscoverCompressFileHandle(const char *path, const char *mode)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..c97e16187a
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,618 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * newline char if found first when the eol_flag is set. It is possible that
+ * the decompressed output generated by reading any compressed input via the
+ * LZ4F API exceeds 'ptrsize'. Any excess decompressed content is stored in an
+ * overflow buffer within LZ4File. Of course, when the function is called, it
+ * will first try to consume any decompressed content already present in the
+ * overflow buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the output buffer */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ {
+ pg_free(readbuf);
+ return -1;
+ }
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ pg_free(readbuf);
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * Fill in whatever space is available in ptr. If the eol flag is
+ * set, either skip when a newline has already been found, or fill
+ * only up to the newline if one is present in the output buffer.
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = eol_flag ? size - 1 - dsize : size - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && !eol_found);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return -1;
+ }
+
+ ptr = (const char *) ptr + chunk;
+ remaining -= chunk;
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing, it will write any
+ * remaining content and/or the generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
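[Reviewer note: the chunked-write pattern that LZ4File_write relies on, feeding fixed-size pieces of the input to the compressor while advancing the input pointer, can be sketched with plain stdlib calls and no liblz4 dependency. All names here are illustrative; fake_compress() merely copies bytes and stands in for LZ4F_compressUpdate().]

```c
#include <assert.h>
#include <string.h>

#define CHUNK_SIZE 4			/* stands in for LZ4_IN_SIZE */

/* Stand-in for LZ4F_compressUpdate(): "compress" by copying. */
static size_t
fake_compress(char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);
	return len;
}

/*
 * Feed 'size' bytes from 'ptr' to the compressor in CHUNK_SIZE pieces,
 * advancing the input pointer after each chunk.  Returns bytes consumed.
 */
static size_t
write_in_chunks(char *out, const char *ptr, size_t size)
{
	size_t		remaining = size;
	size_t		written = 0;

	while (remaining > 0)
	{
		size_t		chunk = remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE;

		written += fake_compress(out + written, ptr, chunk);
		ptr += chunk;			/* advance the input between iterations */
		remaining -= chunk;
	}
	return written;
}
```

Forgetting to advance 'ptr' between iterations would silently compress the first chunk repeatedly, which is easy to miss since the frame still decompresses cleanly.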
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..74595db1b9
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * Interface to compress_lz4.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index aa2c91829c..473d40d456 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -17,7 +18,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -85,7 +86,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 1f207c6f4d..119b7f2553 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -395,6 +395,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2074,7 +2078,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2088,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3747,6 +3755,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 2c0a969972..f392760e06 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index f3ba926321..f497ec6040 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if (defined($pgdump_runs{$run}->{compile_option}) &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index da8e4861f4..1f7f817e4d 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 840191d680..232228d427 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1381,6 +1381,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
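[Reviewer note: the overflow-buffer bookkeeping in LZ4File_read_overflow, consume up to 'size' bytes (or up to the first newline), then shift the remainder to the front, is the subtle part of this patch. A stdlib-only sketch of that technique, with made-up names and a fixed-size buffer standing in for the LZ4File state:]

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Minimal stand-in for the LZ4File overflow state; names are illustrative. */
typedef struct
{
	char		buf[64];
	size_t		len;			/* bytes of decompressed data held */
} Overflow;

/*
 * Copy up to 'size' bytes out of the overflow buffer, stopping after the
 * first newline when eol_flag is set, then shift any remainder to the front.
 */
static size_t
read_overflow(Overflow *ov, char *ptr, size_t size, bool eol_flag)
{
	size_t		readlen;
	char	   *p;

	if (ov->len == 0)
		return 0;

	readlen = ov->len >= size ? size : ov->len;

	if (eol_flag && (p = memchr(ov->buf, '\n', readlen)))
		readlen = p - ov->buf + 1;	/* include the newline */

	memcpy(ptr, ov->buf, readlen);
	ov->len -= readlen;
	if (ov->len > 0)
		memmove(ov->buf, ov->buf + readlen, ov->len);

	return readlen;
}
```

The memmove (not memcpy) matters because source and destination overlap when only part of the buffer is consumed.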
Attachment: v20-0002-THIS-IS-WIP-PATCH-PRESENTED-HERE-FOR-COMPLETENES.patch (text/x-patch)
From 7c27acd1faaf4a640a1d0ba7989dec39c853660f Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 16 Jan 2023 14:59:41 +0000
Subject: [PATCH v20 2/3] THIS IS WIP PATCH PRESENTED HERE FOR COMPLETENESS:
Compressor API
Please be aware that comments, references, naming, etc., are not yet
scrutinized. Those will be addressed once 0001 of the series is getting
closer to a committable state. However, the patch should apply cleanly and
the tests should pass.
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementor of a
new compression method now only needs to supply those definitions.
Custom compressed archives now need to store the compression algorithm in
their header. This requires a bump in the archive version number. The
compression level is no longer stored in the dump, as it is irrelevant.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 405 ++++++++++++
src/bin/pg_dump/compress_gzip.h | 22 +
src/bin/pg_dump/compress_io.c | 914 +++++++-------------------
src/bin/pg_dump/compress_io.h | 68 +-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 102 +--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 +--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 2 +
13 files changed, 851 insertions(+), 797 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
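[Reviewer note: the function-pointer dispatch that CompressFileHandle introduces can be sketched in a few lines. This is a simplified illustration with made-up names (an in-memory sink instead of a FILE), not the patch's actual struct layout:]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for CompressFileHandle; field names are illustrative. */
typedef struct FileHandle
{
	size_t		(*write_func) (const void *ptr, size_t size,
							   struct FileHandle *fh);
	void	   *private_data;
} FileHandle;

/* A "none" implementation that appends into an in-memory buffer. */
typedef struct
{
	char		data[64];
	size_t		used;
} MemSink;

static size_t
none_write(const void *ptr, size_t size, FileHandle *fh)
{
	MemSink    *sink = (MemSink *) fh->private_data;

	memcpy(sink->data + sink->used, ptr, size);
	sink->used += size;
	return size;
}

/* Each compression method fills in its own callbacks at init time. */
static void
init_none(FileHandle *fh, MemSink *sink)
{
	fh->write_func = none_write;
	fh->private_data = sink;
}
```

Callers only ever go through fh->write_func, so adding a new compression method means supplying one more init function, with no changes at the call sites.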
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..7a19f5d617 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..60fb95d7b7
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,405 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at a time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ CFH->private_data = NULL;
+
+ return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * The user specified a compression level, so tell zlib to use it.
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..6dfd0eb04d
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * Interface to compress_gzip.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 1db973b6a2..7274039e87 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,44 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims to make the resulting
+ * files easy to manipulate with an external compression utility
+ * program.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read all the data by calling readData, which
+ * reads the whole compressed stream by repeatedly calling the given
+ * ReadFunc. ReadFunc returns the compressed data one chunk at a time, and
+ * readData decompresses it and passes the decompressed data to ahwrite(),
+ * until ReadFunc returns 0 to signal EOF. The interface is the same for
+ * compressed and uncompressed streams.
*
* Compressed stream API
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * libz's gzopen() APIs and custom LZ4 calls which provide similar
+ * functionality. It allows you to use the same functions for compressed and
+ * uncompressed streams. cfopen_read() first tries to open the file with the
+ * given name, and if that fails, it tries to open the same file with the .gz
+ * suffix; failing that, it tries the .lz4 suffix. cfopen_write() opens a
+ * file for writing; an extra argument specifies the compression method to
+ * use, and the appropriate suffix, .gz or .lz4, is added to the filename if
+ * so. This allows you to easily handle both compressed and uncompressed
+ * files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,7 +55,11 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
@@ -65,84 +71,69 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+static void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
/* Public interface routines */
/* Allocate a new compressor */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressorGzip(cs, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -151,34 +142,8 @@ ReadDataFromArchive(ArchiveHandle *AH,
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ return cs;
}
/*
@@ -187,401 +152,178 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
+/*----------------------
+ * Compressed stream API
+ *----------------------
*/
-static void
-InitCompressorZlib(CompressorState *cs, int level)
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
+
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
}
+/* free() without changing errno; useful in several places below */
static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+free_keep_errno(void *p)
{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
+ int save_errno = errno;
- free(cs->zlibOut);
- free(cs->zp);
+ free(p);
+ errno = save_errno;
}
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+/*
+ * Compression None implementation
+ */
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
+ if (size == 0)
+ return 0;
- if (res == Z_STREAM_END)
- break;
- }
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
}
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
}
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+static const char *
+get_error_none(CompressFileHandle *CFH)
{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ return strerror(errno);
+}
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
+ return ret;
}
-#endif /* HAVE_LIBZ */
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
+static int
+close_none(CompressFileHandle *CFH)
{
- size_t cnt;
- char *buf;
- size_t buflen;
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ CFH->private_data = NULL;
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
+ if (fp)
+ ret = fclose(fp);
- free(buf);
+ return ret;
}
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+
+static int
+eof_none(CompressFileHandle *CFH)
{
- cs->writeF(AH, data, dLen);
+ return feof((FILE *) CFH->private_data);
}
-
-/*----------------------
- * Compressed stream API
- *----------------------
- */
-
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
- pg_compress_specification compression_spec;
- void *fp;
-};
+ Assert(CFH->private_data == NULL);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
-/* free() without changing errno; useful in several places below */
-static void
-free_keep_errno(void *p)
-{
- int save_errno = errno;
+ if (CFH->private_data == NULL)
+ return 1;
- free(p);
- errno = save_errno;
+ return 0;
}
-/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_read(const char *path, const char *mode)
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
+ Assert(CFH->private_data == NULL);
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+ return 0;
}
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static void
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
}
/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ CompressFileHandle *CFH;
- fp->compression_spec = compression_spec;
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->fp = fdopen(fd, mode);
- else
- fp->fp = fopen(path, mode);
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
+ InitCompressNone(CFH, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode_compression);
- else
- fp->fp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->fp = gzdopen(fd, mode);
- else
- fp->fp = gzopen(path, mode);
- }
-
- if (fp->fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressGzip(CFH, compression_spec);
break;
case PG_COMPRESSION_LZ4:
pg_fatal("compression with %s is not yet supported", "LZ4");
@@ -591,266 +333,88 @@ cfopen_internal(const char *path, int fd, const char *mode,
break;
}
- return fp;
+ return CFH;
}
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+/*
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
+ *
+ * If the file at 'path' does not exist, we append the ".gz" or ".lz4" suffix
+ * (if 'path' doesn't already have one) and try again. So if you pass "foo"
+ * as 'path', this will open "foo", "foo.gz", or "foo.lz4", trying in that
+ * order.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret = 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- if (size == 0)
- return 0;
+ fname = strdup(path);
- switch (fp->compression_spec.algorithm)
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, (FILE *) fp->fp);
- if (ret != size && !feof((FILE *) fp->fp))
- READ_ERROR_EXIT((FILE *) fp->fp);
+ bool exists;
- break;
- case PG_COMPRESSION_GZIP:
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when built without compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- ret = gzread((gzFile) fp->fp, ptr, size);
- if (ret != size && !gzeof((gzFile) fp->fp))
- {
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, (FILE *) fp->fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite((gzFile) fp->fp, ptr, size);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret = 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgetc((FILE *) fp->fp);
- if (ret == EOF)
- READ_ERROR_EXIT((FILE *) fp->fp);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgetc((gzFile) fp->fp);
- if (ret == EOF)
- {
- if (!gzeof((gzFile) fp->fp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
- char *ret = NULL;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, (FILE *) fp->fp);
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets((gzFile) fp->fp, buf, len);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
-
- return ret;
-}
-
-int
-cfclose(cfp *fp)
-{
- int ret = 0;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
}
- switch (fp->compression_spec.algorithm)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- case PG_COMPRESSION_NONE:
- ret = fclose((FILE *) fp->fp);
- fp->fp = NULL;
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose((gzFile) fp->fp);
- fp->fp = NULL;
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
+ free_keep_errno(fname);
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
int
-cfeof(cfp *fp)
+DestroyCompressFileHandle(CompressFileHandle *CFH)
{
int ret = 0;
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof((FILE *) fp->fp);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof((gzFile) fp->fp);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
- case PG_COMPRESSION_ZSTD:
- pg_fatal("compression with %s is not yet supported", "ZSTD");
- break;
- }
+ free_keep_errno(CFH);
return ret;
}
-
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror((gzFile) fp->fp, &errnum);
-
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-
- return strerror(errno);
-}
-
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
-
-#endif
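Stepping back from the diff above: the core of this refactoring is replacing the per-call `switch (algorithm)` dispatch with function pointers bound once at allocation time. A self-contained sketch of that pattern (the names and the trivial "none" implementation are illustrative stand-ins, not pg_dump's actual symbols):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Minimal stand-in for the patch's CompressorState: the compression
 * behavior is chosen once, when the struct is initialized, by filling
 * in function pointers; callers then invoke cs->writeData()/cs->end()
 * without switching on the algorithm at every call site.
 */
typedef struct Compressor Compressor;
struct Compressor
{
	void		(*writeData) (Compressor *cs, const char *data, size_t len);
	void		(*end) (Compressor *cs);
	char		buf[256];		/* captures output for the demo */
	size_t		used;
};

/* "none" compression: pass the bytes through unchanged */
static void
write_none(Compressor *cs, const char *data, size_t len)
{
	memcpy(cs->buf + cs->used, data, len);
	cs->used += len;
}

/* nothing to flush, mirroring EndCompressorNone() */
static void
end_none(Compressor *cs)
{
}

/* mirrors AllocateCompressor() binding the implementation up front */
static Compressor *
allocate_compressor(void)
{
	Compressor *cs = calloc(1, sizeof(Compressor));

	cs->writeData = write_none;
	cs->end = end_none;
	return cs;
}
```

A gzip or LZ4 variant would plug in the same way, by assigning different functions during init, which is what InitCompressorNone()/InitCompressorGzip() do in the patch.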
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..4b4a00c010 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,32 +37,62 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *cxt);
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+ int (*getc_func) (CompressFileHandle *CFH);
+ int (*eof_func) (CompressFileHandle *CFH);
+ int (*close_func) (CompressFileHandle *CFH);
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
-typedef struct cfp cfp;
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int DestroyCompressFileHandle(CompressFileHandle *CFH);
#endif
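The new `InitDiscoverCompressFileHandle()` declared above chooses the algorithm from an explicit file suffix before falling back to stat()ing `path`, `path.gz`, then `path.lz4` on disk. A hedged sketch of just the suffix logic (the enum names are made up for illustration; the real code uses `pg_compress_algorithm`):

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-ins for pg_compress_algorithm values */
typedef enum
{
	COMPR_NONE,
	COMPR_GZIP,
	COMPR_LZ4
} compr_algorithm;

/*
 * Same logic as compress_io.c's hasSuffix(): compare the tail of the
 * filename against the suffix.
 */
static int
has_suffix(const char *filename, const char *suffix)
{
	size_t		filenamelen = strlen(filename);
	size_t		suffixlen = strlen(suffix);

	if (filenamelen < suffixlen)
		return 0;
	return memcmp(filename + filenamelen - suffixlen,
				  suffix, suffixlen) == 0;
}

/*
 * Sketch of the suffix-first step of discovery: an explicit .gz or
 * .lz4 suffix decides the algorithm; otherwise the caller would go on
 * to probe path, path.gz, and path.lz4 on disk, in that order.
 */
static compr_algorithm
algorithm_from_suffix(const char *path)
{
	if (has_suffix(path, ".gz"))
		return COMPR_GZIP;
	if (has_suffix(path, ".lz4"))
		return COMPR_LZ4;
	return COMPR_NONE;
}
```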
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..aa2c91829c 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -2,6 +2,7 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index fb94317ad9..1f207c6f4d 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = DestroyCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3722,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3737,10 +3740,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..e2c6e6ecd0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (DestroyCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (DestroyCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (DestroyCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (DestroyCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 2eeef2a478..f3ba926321 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..da8e4861f4 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,6 +150,7 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 23bafec5f7..840191d680 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Hi,
On 1/16/23 16:14, gkokolatos@pm.me wrote:
Hi,
I admit I am completely at lost as to what is expected from me anymore.
:-(
I understand it's frustrating not to know why a patch is not moving
forward. Particularly when it seems fairly straightforward ...
Let me briefly explain my personal (and admittedly very subjective) view
on picking what patches to review/commit. I'm sure other committers have
other criteria, but maybe this will help.
There are always more patches than I can review/commit, so I have to
prioritize, and pick which patches to look at. For me, it's mostly about
cost/benefit of the patch. The cost is e.g. the amount of time I need to
spend to review/commit the stuff, maybe read the thread, etc. The benefit
is mainly the new features/improvements.
It's oversimplified, we could talk about various bits that contribute to
the costs and benefits, but this is what it boils down to.
There's always the aspect of time - if patches A and B have roughly the
same benefits, but with A we get it "immediately" while B requires
additional parts that we don't have ready yet (and if they don't make it
we get no benefit), I'll probably pick A.
Unfortunately, this plays against this patch - I'm certainly in favor of
adding lz4 (and other compression algos) into pg_dump, but if I commit
0001 we get little benefit, and the other parts actually adding lz4/zstd
are treated as "WIP / for completeness" so it's unclear when we'd get to
commit them.
So if I could recommend one thing, it'd be to get at least one of those
WIP patches into a shape that's likely committable right after 0001.
I had posted v19-0001 for a committer's consideration and v19-000{2,3} for completeness.
Please find a rebased v20 attached.
I took a quick look at 0001, so a couple comments (sorry if some of this
was already discussed in the thread):
1) I don't think a "refactoring" patch should reference particular
compression algorithms (lz4/zstd), and in particular I don't think we
should have "not yet implemented" messages. We only have a couple other
places doing that, when we didn't have a better choice. But here we can
simply reject the algorithm when parsing the options, we don't need to
do that in a dozen other places.
2) I wouldn't reorder the cases in WriteDataToArchive, i.e. I'd keep
"none" at the end. It might make backpatches harder.
3) While building, I get bunch of warnings about missing cfdopen()
prototype and pg_backup_archiver.c not knowing about cfdopen() and
adding an implicit prototype (so I doubt it actually works).
4) "cfp" struct no longer wraps gzFile, but the comment was not updated.
FWIW I'm not sure switching to "void *" is an improvement, maybe it'd be
better to have a "union" of correct types?
5) cfopen/cfdopen are missing comments. cfopen_internal has an updated
comment, but that's a static function while cfopen/cfdopen are the
actual API.
Also please let me know if I should silently step away from it and let other people lead
it. I would be glad to comply either way.
Please don't. I promise to take a look at this patch again.
Thanks for doing all the work.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Wednesday, January 18th, 2023 at 3:00 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
Hi,
On 1/16/23 16:14, gkokolatos@pm.me wrote:
Hi,
I admit I am completely at lost as to what is expected from me anymore.
<snip>
Unfortunately, this plays against this patch - I'm certainly in favor of
adding lz4 (and other compression algos) into pg_dump, but if I commit
0001 we get little benefit, and the other parts actually adding lz4/zstd
are treated as "WIP / for completeness" so it's unclear when we'd get to
commit them.
Thank you for your kindness and for taking the time to explain.
So if I could recommend one thing, it'd be to get at least one of those
WIP patches into a shape that's likely committable right after 0001.
This was clearly my fault. I misunderstood a suggestion upthread to focus
on the first patch of the series and ignore documentation and comments on
the rest.
Please find v21 to contain 0002 and 0003 in a state which I no longer consider
as WIP but worthy of proper consideration. Some guidance on where is best to add
documentation in 0002 for the function pointers in CompressFileHandle will
be welcomed.
I had posted v19-0001 for a committer's consideration and v19-000{2,3} for completeness.
Please find a rebased v20 attached.
I took a quick look at 0001, so a couple comments (sorry if some of this
was already discussed in the thread):
Much appreciated!
1) I don't think a "refactoring" patch should reference particular
compression algorithms (lz4/zstd), and in particular I don't think we
should have "not yet implemented" messages. We only have a couple other
places doing that, when we didn't have a better choice. But here we can
simply reject the algorithm when parsing the options, we don't need to
do that in a dozen other places.
I have now removed lz4/zstd from where they were present with the exception
of pg_dump.c which is responsible for parsing.
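A parse-time check of the kind described above could look roughly like this. The helper name is hypothetical and the sketch is not the patch's actual code; `HAVE_LIBZ` gates gzip the way a configure-time feature macro would.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical sketch: decide once, while parsing the command line, whether
 * the requested algorithm is usable, so the rest of the code never needs
 * "not yet implemented" branches.
 */
static bool
compression_algorithm_supported(const char *name)
{
	if (strcmp(name, "none") == 0)
		return true;
#ifdef HAVE_LIBZ
	if (strcmp(name, "gzip") == 0)
		return true;
#endif
	/* lz4/zstd would be accepted here once those patches land */
	return false;
}
```

Rejecting the algorithm here means every later call site can assume the compression spec is valid.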
2) I wouldn't reorder the cases in WriteDataToArchive, i.e. I'd keep
"none" at the end. It might make backpatches harder.
Agreed. However a 'default' is needed in order to avoid compilation warnings.
Also note that 0002 completely does away with cases within WriteDataToArchive.
3) While building, I get bunch of warnings about missing cfdopen()
prototype and pg_backup_archiver.c not knowing about cfdopen() and
adding an implicit prototype (so I doubt it actually works).
Fixed. cfdopen() got prematurely introduced in 5e73a6048 and then got removed
in 69fb29d1af. v20 failed to properly take 69fb29d1af into consideration. Note
that cfdopen is removed in 0002 which explains why cfbot didn't complain.
4) "cfp" struct no longer wraps gzFile, but the comment was not updated.
FWIW I'm not sure switching to "void *" is an improvement, maybe it'd be
better to have a "union" of correct types?
Please find an updated comment and a union in place of the void *. Also
note that 0002 completely does away with cfp in favour of a new struct
CompressFileHandle. I maintained the void * there because it is used by
private methods of the compressors. 0003 contains such an example with
LZ4CompressorState.
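The `private_data` pattern described here can be sketched as below. Everything in this sketch is hypothetical (a dummy "compressor" state stands in for something like `LZ4CompressorState`); it only illustrates how the generic handle carries state that just the compressor's own callbacks interpret.

```c
#include <stdlib.h>

/*
 * Hypothetical sketch of the private_data pattern: the generic handle keeps
 * a void * that only the compressor's own callbacks cast back to their
 * private state type.
 */
typedef struct CompressFileHandleSketch
{
	int			(*close_func) (struct CompressFileHandleSketch *CFH);
	void	   *private_data;	/* compressor-specific state */
} CompressFileHandleSketch;

typedef struct DummyCompressorState
{
	int			bytes_seen;		/* stand-in for real compressor state */
} DummyCompressorState;

static int
Dummy_close(CompressFileHandleSketch *CFH)
{
	DummyCompressorState *st = (DummyCompressorState *) CFH->private_data;
	int			seen = st->bytes_seen;

	free(st);
	CFH->private_data = NULL;
	return seen;
}

static CompressFileHandleSketch *
InitDummyCompressor(void)
{
	CompressFileHandleSketch *CFH = malloc(sizeof(*CFH));
	DummyCompressorState *st = malloc(sizeof(*st));

	st->bytes_seen = 0;
	CFH->close_func = Dummy_close;
	CFH->private_data = st;
	return CFH;
}
```

Only `Dummy_close` ever casts `private_data`, so callers of the handle never need to know the concrete state type.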
5) cfopen/cfdopen are missing comments. cfopen_internal has an updated
comment, but that's a static function while cfopen/cfdopen are the
actual API.
Added comments to cfopen/cfdopen.
Also please let me know if I should silently step away from it and let other people lead
it. I would be glad to comply either way.
Please don't. I promise to take a look at this patch again.
Thank you very much.
Thanks for doing all the work.
Thank you.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v21-0002-Introduce-Compressor-API-in-pg_dump.patch (text/x-patch)
From 2a7e38bf3613f86bad1c089123e1651972119502 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 18 Jan 2023 18:24:31 +0000
Subject: [PATCH v21 2/3] Introduce Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle is substituting the cfp* family of functions under a
struct of function pointers for opening, writing, etc. The implementor of a new
compression method is now able to "simply" just add those definitions.
Custom compressed archives now need to store the compression algorithm in their
header. This requires a bump in the version number. The level of compression
is no longer stored in the dump as it is now irrelevant.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 406 +++++++++++
src/bin/pg_dump/compress_gzip.h | 22 +
src/bin/pg_dump/compress_io.c | 933 +++++++-------------------
src/bin/pg_dump/compress_io.h | 72 +-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 102 +--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 +--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 2 +
13 files changed, 868 insertions(+), 804 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..7a19f5d617 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..f365b93b76
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,406 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at the time.
+ * This is probably not what the user wanted when calling this
+ * interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ CFH->private_data = NULL;
+
+ /* gzclose() reports errors via its return value; no errno juggling needed */
+ return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
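As a side note on Gzip_open() above: when a non-default level is requested, the digit is appended to the stdio-style mode string before handing it to zlib ("wb" plus level 9 becomes "wb9"); otherwise the mode is passed through unchanged. A minimal standalone sketch of that logic, assuming zlib's Z_DEFAULT_COMPRESSION (-1) as the default level (build_gzip_mode() is an illustrative name, not part of the patch):

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a zlib mode string the way Gzip_open() does: append the
 * compression level digit only when it differs from the default.
 */
void
build_gzip_mode(char *dst, size_t dstlen, const char *mode, int level,
                int default_level)
{
	if (level != default_level)
		snprintf(dst, dstlen, "%s%d", mode, level);
	else
		snprintf(dst, dstlen, "%s", mode);
}
```

So gzopen("dump.gz", "wb9") gets maximum compression, while the plain "wb" path lets zlib pick its default.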
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..6dfd0eb04d
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * Gzip interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
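The Init* functions declared here fill in a table of function pointers, after which callers go through the pointers only and never name the concrete implementation. The dispatch pattern can be sketched in miniature over plain FILE I/O (the names DemoFileHandle/demo_roundtrip are illustrative, not the patch's API):

```c
#include <stdio.h>
#include <string.h>

/*
 * Miniature of the CompressFileHandle pattern: a handle carries
 * function pointers plus opaque private_data, and callers only ever
 * invoke the pointers.
 */
typedef struct DemoFileHandle DemoFileHandle;
struct DemoFileHandle
{
	size_t		(*read_func) (void *ptr, size_t size, DemoFileHandle *h);
	size_t		(*write_func) (const void *ptr, size_t size, DemoFileHandle *h);
	void	   *private_data;
};

static size_t
demo_read(void *ptr, size_t size, DemoFileHandle *h)
{
	return fread(ptr, 1, size, (FILE *) h->private_data);
}

static size_t
demo_write(const void *ptr, size_t size, DemoFileHandle *h)
{
	return fwrite(ptr, 1, size, (FILE *) h->private_data);
}

/*
 * Round-trip a string through a temporary file via the handle;
 * returns the number of bytes read back.
 */
size_t
demo_roundtrip(const char *msg, char *out, size_t outlen)
{
	DemoFileHandle h;
	size_t		n;

	h.read_func = demo_read;
	h.write_func = demo_write;
	h.private_data = tmpfile();
	if (h.private_data == NULL)
		return 0;

	h.write_func(msg, strlen(msg), &h);
	rewind((FILE *) h.private_data);
	n = h.read_func(out, outlen, &h);
	fclose((FILE *) h.private_data);
	return n;
}
```

Swapping in a gzip implementation then only means pointing read_func/write_func at gzread/gzwrite wrappers, which is exactly what InitCompressGzip() does for the real handle.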
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 44d9b0d4a5..d60111b2b8 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,54 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API is designed so that the
+ * resulting files can be manipulated easily with an external compression
+ * utility program.
+ *
+ * This file also includes the implementation of both APIs for the case
+ * where no compression is used.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read all the data by calling readData to read the
+ * whole compressed stream which repeatedly calls the given ReadFunc. ReadFunc
+ * returns the compressed data one chunk at a time. Then readData decompresses
+ * it and passes the decompressed data to ahwrite(), until ReadFunc returns 0
+ * to signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API provides a set of function pointers for
+ * opening, reading, writing, and closing files. The implemented function
+ * pointers are documented in the corresponding header file and are common
+ * to all streams. This allows the caller to use the same functions for
+ * both compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix does not identify a known compression
+ * method, it first tries to open the file as is, and if that fails, it
+ * tries to open the same file with the .gz suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,128 +65,94 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
/*----------------------
* Compressor API
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+/* Private routines */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+/* Compressor API None implementation */
-/* Public interface routines */
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
-/* Allocate a new compressor */
-CompressorState *
-AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
{
- CompressorState *cs;
+ /* no op */
+}
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+static void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
- cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
- cs->writeF = writeF;
cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
}
+/* Public interface routines */
+
/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
+ * Allocate a new compressor.
*/
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
+CompressorState *
+AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF, WriteFunc writeF)
{
+ CompressorState *cs;
+
+ cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
+ cs->writeF = writeF;
+
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- ReadDataFromArchiveNone(AH, readF);
+ InitCompressorNone(cs, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressorGzip(cs, compression_spec);
break;
default:
pg_fatal("invalid compression method \"%s\"",
get_compress_algorithm_name(compression_spec.algorithm));
break;
}
-}
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- default:
- pg_fatal("invalid compression method \"%s\"",
- get_compress_algorithm_name(cs->compression_spec.algorithm));
- break;
- }
+ return cs;
}
/*
@@ -183,404 +161,183 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- EndCompressorZlib(AH, cs);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- default:
- pg_fatal("invalid compression method \"%s\"",
- get_compress_algorithm_name(cs->compression_spec.algorithm));
- break;
- }
-
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
+/*----------------------
+ * Compressed stream API
+ *----------------------
*/
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
+/* Private routines */
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
+ if (filenamelen < suffixlen)
+ return 0;
- free(cs->zlibOut);
- free(cs->zp);
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
}
+/* free() without changing errno; useful in several places below */
static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+free_keep_errno(void *p)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
+ int save_errno = errno;
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
+ free(p);
+ errno = save_errno;
}
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+/*
+ * Compressed stream API None implementation
+ */
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
+ if (size == 0)
+ return 0;
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
+ return ret;
+}
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
+}
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
+ ret = fgetc(fp);
+ if (ret == EOF)
{
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
}
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
+ return ret;
}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
+static int
+close_none(CompressFileHandle *CFH)
{
- size_t cnt;
- char *buf;
- size_t buflen;
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ CFH->private_data = NULL;
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
+ if (fp)
+ ret = fclose(fp);
- free(buf);
+ return ret;
}
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+static int
+eof_none(CompressFileHandle *CFH)
{
- cs->writeF(AH, data, dLen);
+ return feof((FILE *) CFH->private_data);
}
-
-/*----------------------
- * Compressed stream API
- *----------------------
- */
-
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
- pg_compress_specification compression_spec;
- union {
- FILE *fp;
-#ifdef HAVE_LIBZ
- gzFile gzfp;
-#endif
- } file;
-};
+ Assert(CFH->private_data == NULL);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
-/* free() without changing errno; useful in several places below */
-static void
-free_keep_errno(void *p)
-{
- int save_errno = errno;
+ if (CFH->private_data == NULL)
+ return 1;
- free(p);
- errno = save_errno;
+ return 0;
}
-/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_read(const char *path, const char *mode)
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
{
- cfp *fp;
+ Assert(CFH->private_data == NULL);
- pg_compress_specification compression_spec = {0};
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
+ return 0;
+}
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+static void
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
}
/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ CompressFileHandle *CFH;
- fp->compression_spec = compression_spec;
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
switch (compression_spec.algorithm)
{
case PG_COMPRESSION_NONE:
- if (fd >= 0)
- fp->file.fp = fdopen(fd, mode);
- else
- fp->file.fp = fopen(path, mode);
- if (fp->file.fp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-
+ InitCompressNone(CFH, compression_spec);
break;
case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /*
- * user has specified a compression level, so tell zlib to use
- * it
- */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->file.gzfp = gzdopen(fd, mode_compression);
- else
- fp->file.gzfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->file.gzfp = gzdopen(fd, mode);
- else
- fp->file.gzfp = gzopen(path, mode);
- }
-
- if (fp->file.gzfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ InitCompressGzip(CFH, compression_spec);
break;
default:
pg_fatal("invalid compression method \"%s\"",
@@ -588,268 +345,84 @@ cfopen_internal(const char *path, int fd, const char *mode,
break;
}
- return fp;
+ return CFH;
}
/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
+ * If the file at 'path' has the suffix of a supported compression method
+ * (currently only ".gz"), then that compression is used throughout.
+ * Otherwise the compression is discovered by iteratively trying to open
+ * the file at 'path', first as is, then with known compression suffixes
+ * appended. So if you pass "foo" as 'path', this will open either "foo"
+ * or "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
-
-int
-cfread(void *ptr, int size, cfp *fp)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- int ret = 0;
-
- if (size == 0)
- return 0;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fread(ptr, 1, size, fp->file.fp);
- if (ret != size && !feof(fp->file.fp))
- READ_ERROR_EXIT(fp->file.fp);
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzread(fp->file.gzfp, ptr, size);
- if (ret != size && !gzeof(fp->file.gzfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->file.gzfp, &errnum);
-
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- default:
- pg_fatal("invalid compression method \"%s\"",
- get_compress_algorithm_name(fp->compression_spec.algorithm));
- break;
- }
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
- return ret;
-}
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
- int ret = 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fwrite(ptr, 1, size, fp->file.fp);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzwrite(fp->file.gzfp, ptr, size);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- default:
- pg_fatal("invalid compression method \"%s\"",
- get_compress_algorithm_name(fp->compression_spec.algorithm));
- break;
- }
-
- return ret;
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret = 0;
+ fname = pg_strdup(path);
- switch (fp->compression_spec.algorithm)
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ else
{
- case PG_COMPRESSION_NONE:
- ret = fgetc(fp->file.fp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->file.fp);
+ bool exists;
- break;
- case PG_COMPRESSION_GZIP:
+ exists = (stat(path, &st) == 0);
+ /* avoid unused-variable warning when built without compression support */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- ret = gzgetc(fp->file.gzfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->file.gzfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- default:
- pg_fatal("invalid compression method \"%s\"",
- get_compress_algorithm_name(fp->compression_spec.algorithm));
- break;
- }
-
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
- char *ret = NULL;
-
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fgets(buf, len, fp->file.fp);
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzgets(fp->file.gzfp, buf, len);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ }
#endif
- break;
- default:
- pg_fatal("invalid compression method \"%s\"",
- get_compress_algorithm_name(fp->compression_spec.algorithm));
- break;
}
- return ret;
-}
-
-int
-cfclose(cfp *fp)
-{
- int ret = 0;
-
- if (fp == NULL)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- errno = EBADF;
- return EOF;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
+ free_keep_errno(fname);
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = fclose(fp->file.fp);
- fp->file.fp = NULL;
-
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzclose(fp->file.gzfp);
- fp->file.gzfp = NULL;
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- default:
- pg_fatal("invalid compression method \"%s\"",
- get_compress_algorithm_name(fp->compression_spec.algorithm));
- break;
- }
-
- free_keep_errno(fp);
-
- return ret;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
int ret = 0;
- switch (fp->compression_spec.algorithm)
- {
- case PG_COMPRESSION_NONE:
- ret = feof(fp->file.fp);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- break;
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- ret = gzeof(fp->file.gzfp);
-#else
- pg_fatal("this build does not support compression with %s",
- "gzip");
-#endif
- break;
- default:
- pg_fatal("invalid compression method \"%s\"",
- get_compress_algorithm_name(fp->compression_spec.algorithm));
- break;
- }
+ free_keep_errno(CFH);
return ret;
}
-
-const char *
-get_cfp_error(cfp *fp)
-{
- if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- int errnum;
- const char *errmsg = gzerror(fp->file.gzfp, &errnum);
-
- if (errnum != Z_ERRNO)
- return errmsg;
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-
- return strerror(errno);
-}
-
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
-}
-
-#endif
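For reference, the suffix check that drives InitDiscoverCompressFileHandle() is a plain length-guarded memcmp against the tail of the filename; the patch's hasSuffix() helper behaves like this when lifted out standalone:

```c
#include <string.h>

/* Standalone copy of the patch's hasSuffix() helper. */
int
hasSuffix(const char *filename, const char *suffix)
{
	int			filenamelen = strlen(filename);
	int			suffixlen = strlen(suffix);

	if (filenamelen < suffixlen)
		return 0;

	return memcmp(&filename[filenamelen - suffixlen],
				  suffix,
				  suffixlen) == 0;
}
```

The length guard matters: without it, a filename shorter than the suffix would index before the start of the string.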
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a6cdf588dd..bd2b5623a5 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,34 +37,64 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
-typedef struct cfp cfp;
+struct CompressFileHandle
+{
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *cxt);
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+ int (*getc_func) (CompressFileHandle *CFH);
+ int (*eof_func) (CompressFileHandle *CFH);
+ int (*close_func) (CompressFileHandle *CFH);
+ const char *(*get_error_func) (CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..aa2c91829c 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -2,6 +2,7 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index fb94317ad9..cc3b7f0992 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3722,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3737,10 +3740,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..a2f88995c0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 2eeef2a478..f3ba926321 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..da8e4861f4 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,6 +150,7 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 23bafec5f7..840191d680 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Attachment: v21-0001-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 44a3850423887d9623336cf623b2ee702d891432 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 18 Jan 2023 18:20:43 +0000
Subject: [PATCH v21 1/3] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about it and uses it throughout.
---
src/bin/pg_dump/compress_io.c | 382 +++++++++++++++++++--------
src/bin/pg_dump/compress_io.h | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 128 +++------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-
4 files changed, 316 insertions(+), 223 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..44d9b0d4a5 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -128,15 +132,22 @@ ReadDataFromArchive(ArchiveHandle *AH,
const pg_compress_specification compression_spec,
ReadFunc readF)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ switch (compression_spec.algorithm)
{
+ case PG_COMPRESSION_NONE:
+ ReadDataFromArchiveNone(AH, readF);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
+ ReadDataFromArchiveZlib(AH, readF);
#else
- pg_fatal("this build does not support compression with %s", "gzip");
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(compression_spec.algorithm));
+ break;
}
}
@@ -159,10 +170,9 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
case PG_COMPRESSION_NONE:
WriteDataToArchiveNone(AH, cs, data, dLen);
break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(cs->compression_spec.algorithm));
break;
}
}
@@ -173,10 +183,24 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
+ switch (cs->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
+ EndCompressorZlib(AH, cs);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(cs->compression_spec.algorithm));
+ break;
+ }
+
free(cs);
}
@@ -391,10 +415,13 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
*/
struct cfp
{
- FILE *uncompressedfp;
+ pg_compress_specification compression_spec;
+ union {
+ FILE *fp;
#ifdef HAVE_LIBZ
- gzFile compressedfp;
+ gzFile gzfp;
#endif
+ } file;
};
#ifdef HAVE_LIBZ
@@ -490,127 +517,208 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
cfp *fp = pg_malloc(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ fp->compression_spec = compression_spec;
+
+ switch (compression_spec.algorithm)
{
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ case PG_COMPRESSION_NONE:
+ if (fd >= 0)
+ fp->file.fp = fdopen(fd, mode);
+ else
+ fp->file.fp = fopen(path, mode);
+ if (fp->file.fp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use
+ * it
+ */
+ char mode_compression[32];
+
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, compression_spec.level);
+ if (fd >= 0)
+ fp->file.gzfp = gzdopen(fd, mode_compression);
+ else
+ fp->file.gzfp = gzopen(path, mode_compression);
+ }
+ else
+ {
+ /* don't specify a level, just use the zlib default */
+ if (fd >= 0)
+ fp->file.gzfp = gzdopen(fd, mode);
+ else
+ fp->file.gzfp = gzopen(path, mode);
+ }
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ if (fp->file.gzfp == NULL)
+ {
+ free_keep_errno(fp);
+ fp = NULL;
+ }
#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(compression_spec.algorithm));
+ break;
}
return fp;
}
+/*
+ * Opens file 'path' in 'mode' and compression as defined in
+ * compression_spec. The caller must verify that the compression
+ * is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+/*
+ * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
+ * and compression as defined in compression_spec. The caller must
+ * verify that the compression is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
{
- int ret;
+ int ret = 0;
if (size == 0)
return 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ case PG_COMPRESSION_NONE:
+ ret = fread(ptr, 1, size, fp->file.fp);
+ if (ret != size && !feof(fp->file.fp))
+ READ_ERROR_EXIT(fp->file.fp);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzread(fp->file.gzfp, ptr, size);
+ if (ret != size && !gzeof(fp->file.gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(fp->file.gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(fp->compression_spec.algorithm));
+ break;
}
+
return ret;
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fwrite(ptr, 1, size, fp->file.fp);
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
+ ret = gzwrite(fp->file.gzfp, ptr, size);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(fp->compression_spec.algorithm));
+ break;
+ }
+
+ return ret;
}
int
cfgetc(cfp *fp)
{
- int ret;
+ int ret = 0;
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ switch (fp->compression_spec.algorithm)
{
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fgetc(fp->file.fp);
+ if (ret == EOF)
+ READ_ERROR_EXIT(fp->file.fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzgetc(fp->file.gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(fp->file.gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(fp->compression_spec.algorithm));
+ break;
}
return ret;
@@ -619,65 +727,113 @@ cfgetc(cfp *fp)
char *
cfgets(cfp *fp, char *buf, int len)
{
+ char *ret = NULL;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = fgets(buf, len, fp->file.fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
+ ret = gzgets(fp->file.gzfp, buf, len);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return fgets(buf, len, fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(fp->compression_spec.algorithm));
+ break;
+ }
+
+ return ret;
}
int
cfclose(cfp *fp)
{
- int result;
+ int ret = 0;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+
+ switch (fp->compression_spec.algorithm)
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
+ case PG_COMPRESSION_NONE:
+ ret = fclose(fp->file.fp);
+ fp->file.fp = NULL;
+
+ break;
+ case PG_COMPRESSION_GZIP:
+#ifdef HAVE_LIBZ
+ ret = gzclose(fp->file.gzfp);
+ fp->file.gzfp = NULL;
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(fp->compression_spec.algorithm));
+ break;
}
+
free_keep_errno(fp);
- return result;
+ return ret;
}
int
cfeof(cfp *fp)
{
+ int ret = 0;
+
+ switch (fp->compression_spec.algorithm)
+ {
+ case PG_COMPRESSION_NONE:
+ ret = feof(fp->file.fp);
+
+ break;
+ case PG_COMPRESSION_GZIP:
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
+ ret = gzeof(fp->file.gzfp);
+#else
+ pg_fatal("this build does not support compression with %s",
+ "gzip");
#endif
- return feof(fp->uncompressedfp);
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(fp->compression_spec.algorithm));
+ break;
+ }
+
+ return ret;
}
const char *
get_cfp_error(cfp *fp)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ if (fp->compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
+#ifdef HAVE_LIBZ
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ const char *errmsg = gzerror(fp->file.gzfp, &errnum);
if (errnum != Z_ERRNO)
return errmsg;
- }
+#else
+ pg_fatal("this build does not support compression with %s", "gzip");
#endif
+ }
+
return strerror(errno);
}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..a6cdf588dd 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -54,6 +54,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f7a0f1ce7..fb94317ad9 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1503,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1565,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2219,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
Attachment: v21-0003-Add-LZ4-compression-to-pg_dump.patch (text/x-patch)
From 854f2ec500300dfff8ad7eda04300a83adfd3f46 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 18 Jan 2023 18:15:42 +0000
Subject: [PATCH v21 3/3] Add LZ4 compression to pg_dump
This is mostly done within pg_dump's compression streaming and file APIs,
and is confined to the newly introduced compress_lz4.{c,h} files.
The streaming API is aimed at inlined use cases, so simple lz4.h calls
can be used directly. The file API generates output, and parses input,
that can be read or generated with the lz4 command-line utility.
Wherever the LZ4F API does not implement all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 22 +-
src/bin/pg_dump/compress_lz4.c | 618 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 770 insertions(+), 18 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tools.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 7a19f5d617..a1401377ab 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index d60111b2b8..96947cd3ea 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,7 +56,7 @@
* InitDiscoverCompressFileHandle tries to infer the compression from the
* filename suffix. If the suffix is not yet known, then it tries to simply
* open the file, and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -70,6 +70,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
/*----------------------
@@ -146,6 +147,9 @@ AllocateCompressor(const pg_compress_specification compression_spec,
case PG_COMPRESSION_GZIP:
InitCompressorGzip(cs, compression_spec);
break;
+ case PG_COMPRESSION_LZ4:
+ InitCompressorLZ4(cs, compression_spec);
+ break;
default:
pg_fatal("invalid compression method \"%s\"",
get_compress_algorithm_name(compression_spec.algorithm));
@@ -339,6 +343,9 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
case PG_COMPRESSION_GZIP:
InitCompressGzip(CFH, compression_spec);
break;
+ case PG_COMPRESSION_LZ4:
+ InitCompressLZ4(CFH, compression_spec);
+ break;
default:
pg_fatal("invalid compression method \"%s\"",
get_compress_algorithm_name(compression_spec.algorithm));
@@ -357,7 +364,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
+ * throughout. Otherwise the compression will be inferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -395,6 +402,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..c97e16187a
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,618 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). EOF is reached when there is
+ * no decompressed output left in the overflow buffer and the
+ * underlying file has reached its end.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
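For reviewers following the overflow-buffer logic, the copy/compact behavior above can be exercised in isolation. The following is a standalone sketch with hypothetical names (a plain in-memory buffer, no LZ4 involved) of the same semantics as LZ4File_read_overflow:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Minimal, hypothetical stand-in for the overflow part of LZ4File */
typedef struct OverflowBuf
{
	size_t		len;			/* bytes of decompressed data pending */
	char		data[64];
} OverflowBuf;

/*
 * Mirrors LZ4File_read_overflow: copy up to 'size' pending bytes into
 * 'ptr' (stopping after the first newline when eol_flag is set) and
 * slide any remainder to the front of the buffer.
 */
static int
overflow_read(OverflowBuf *ob, void *ptr, int size, bool eol_flag)
{
	char	   *p;
	int			readlen;

	if (ob->len == 0)
		return 0;

	readlen = ((size_t) size <= ob->len) ? size : (int) ob->len;

	if (eol_flag && (p = memchr(ob->data, '\n', readlen)))
		readlen = (int) (p - ob->data + 1); /* include the newline */

	memcpy(ptr, ob->data, readlen);
	ob->len -= readlen;
	if (ob->len > 0)
		memmove(ob->data, ob->data + readlen, ob->len);

	return readlen;
}
```

With "ab\ncd" pending, an eol-aware read of 4 bytes returns 3 ("ab\n") and leaves "cd" at the front of the buffer for the next call.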
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * newline char if found first when the eol_flag is set. The decompressed
+ * output generated by reading any compressed input via the LZ4F API may
+ * exceed 'ptrsize'; any excess decompressed content is stored in an overflow
+ * buffer within LZ4File. When the function is called, it first consumes any
+ * decompressed content already present in the overflow buffer before
+ * decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ {
+ pg_free(readbuf);
+ return -1;
+ }
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ pg_free(readbuf);
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * Fill in as much of ptr as there is space for. If the eol flag
+ * is set, stop copying once a newline has been found, or copy only
+ * up to and including the newline if one is present in the outbuf.
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
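The overflow buffer above grows geometrically (doubling) until the pending output fits. The same policy in isolation, as a hypothetical stdlib-only helper mirroring the realloc loop in LZ4File_read_internal:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Grow *buf (currently *alloclen bytes) by doubling until it can hold
 * 'used' existing bytes plus 'incoming' new ones. Note: a production
 * version would check realloc() for failure.
 */
static char *
grow_to_fit(char *buf, size_t *alloclen, size_t used, size_t incoming)
{
	while (used + incoming > *alloclen)
	{
		*alloclen *= 2;
		buf = realloc(buf, *alloclen);
	}
	return buf;
}
```

Starting from 8 bytes, fitting 26 bytes doubles twice (8 → 16 → 32); a later request that already fits leaves the allocation unchanged.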
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? errno : ENOSPC;
+ return -1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..74595db1b9
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * Interface to compress_lz4.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index aa2c91829c..473d40d456 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -17,7 +18,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -85,7 +86,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index cc3b7f0992..7005ffcd5b 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -395,6 +395,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2074,7 +2078,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2088,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3747,6 +3755,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..08105337b1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index f3ba926321..f497ec6040 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blobs.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if (defined($pgdump_runs{$run}->{compile_option})
+ && (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip)
+ || ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index da8e4861f4..1f7f817e4d 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 840191d680..232228d427 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1381,6 +1381,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
On 1/18/23 20:05, gkokolatos@pm.me wrote:
------- Original Message -------
On Wednesday, January 18th, 2023 at 3:00 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
Hi,
On 1/16/23 16:14, gkokolatos@pm.me wrote:
Hi,
I admit I am completely at a loss as to what is expected from me anymore.
<snip>
Unfortunately, this plays against this patch - I'm certainly in favor of
adding lz4 (and other compression algos) into pg_dump, but if I commit
0001 we get little benefit, and the other parts actually adding lz4/zstd
are treated as "WIP / for completeness" so it's unclear when we'd get to
commit them.
Thank you for your kindness and for taking the time to explain.
So if I could recommend one thing, it'd be to get at least one of those
WIP patches into a shape that's likely committable right after 0001.
This was clearly my fault. I misunderstood a suggestion upthread to focus
on the first patch of the series and ignore documentation and comments on
the rest.
Please find v21 to contain 0002 and 0003 in a state which I no longer consider
as WIP but worthy of proper consideration. Some guidance on where is best to add
documentation in 0002 for the function pointers in CompressFileHandle will
be welcomed.
This is internal-only API, not meant for use by regular users and/or
extension authors, so I don't think we need sgml docs. I'd just add
regular code-level documentation to compress_io.h.
For inspiration see docs for "struct ReorderBuffer" in reorderbuffer.h,
or "struct _archiveHandle" in pg_backup_archiver.h.
Or what other kind of documentation you had in mind?
I had posted v19-0001 for a committer's consideration and v19-000{2,3} for completeness.
Please find a rebased v20 attached.
I took a quick look at 0001, so a couple comments (sorry if some of this
was already discussed in the thread):
Much appreciated!
1) I don't think a "refactoring" patch should reference particular
compression algorithms (lz4/zstd), and in particular I don't think we
should have "not yet implemented" messages. We only have a couple other
places doing that, when we didn't have a better choice. But here we can
simply reject the algorithm when parsing the options, we don't need to
do that in a dozen other places.
I have now removed lz4/zstd from where they were present with the exception
of pg_dump.c which is responsible for parsing.
I'm not sure I understand why leave the lz4/zstd in this place?
2) I wouldn't reorder the cases in WriteDataToArchive, i.e. I'd keep
"none" at the end. It might make backpatches harder.
Agreed. However a 'default' is needed in order to avoid compilation warnings.
Also note that 0002 completely does away with cases within WriteDataToArchive.
OK, although that's also a consequence of using a "switch" instead of
plain "if" branches.
Furthermore, I'm not sure we really need the pg_fatal() about invalid
compression method in these default blocks. I mean, how could we even
get to these places when the build does not support the algorithm? All
of this (ReadDataFromArchive, WriteDataToArchive, EndCompressor, ...)
happens looong after the compressor was initialized and the method
checked, no? So maybe either this should simply do Assert(false) or use
a different error message.
3) While building, I get a bunch of warnings about missing cfdopen()
prototype and pg_backup_archiver.c not knowing about cfdopen() and
adding an implicit prototype (so I doubt it actually works).
Fixed. cfdopen() got prematurely introduced in 5e73a6048 and then got removed
in 69fb29d1af. v20 failed to properly take 69fb29d1af in consideration. Note
that cfdopen is removed in 0002 which explains why cfbot didn't complain.
OK.
4) "cfp" struct no longer wraps gzFile, but the comment was not updated.
FWIW I'm not sure switching to "void *" is an improvement, maybe it'd be
better to have a "union" of correct types?
Please find an updated comment and a union in place of the void *. Also
note that 0002 completely does away with cfp in favour of a new struct
CompressFileHandle. I maintained the void * there because it is used by
private methods of the compressors. 0003 contains such an example with
LZ4CompressorState.
I wonder if this (and also the previous item) makes sense to keep 0001
and 0002 or to combine them. The "intermediate" state is a bit annoying.
5) cfopen/cfdopen are missing comments. cfopen_internal has an updated
comment, but that's a static function while cfopen/cfdopen are the
actual API.
Added comments to cfopen/cfdopen.
OK.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Thursday, January 19th, 2023 at 4:45 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 1/18/23 20:05, gkokolatos@pm.me wrote:
------- Original Message -------
On Wednesday, January 18th, 2023 at 3:00 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
Hi,
On 1/16/23 16:14, gkokolatos@pm.me wrote:
Hi,
I admit I am completely at a loss as to what is expected from me anymore.
<snip>
Unfortunately, this plays against this patch - I'm certainly in favor of
adding lz4 (and other compression algos) into pg_dump, but if I commit
0001 we get little benefit, and the other parts actually adding lz4/zstd
are treated as "WIP / for completeness" so it's unclear when we'd get to
commit them.
Thank you for your kindness and for taking the time to explain.
So if I could recommend one thing, it'd be to get at least one of those
WIP patches into a shape that's likely committable right after 0001.
This was clearly my fault. I misunderstood a suggestion upthread to focus
on the first patch of the series and ignore documentation and comments on
the rest.
Please find v21 to contain 0002 and 0003 in a state which I no longer consider
as WIP but worthy of proper consideration. Some guidance on where is best to add
documentation in 0002 for the function pointers in CompressFileHandle will
be welcomed.
This is internal-only API, not meant for use by regular users and/or
extension authors, so I don't think we need sgml docs. I'd just add
regular code-level documentation to compress_io.h.
For inspiration see docs for "struct ReorderBuffer" in reorderbuffer.h,
or "struct _archiveHandle" in pg_backup_archiver.h.
Or what other kind of documentation you had in mind?
This is exactly what I was after. I was between compress_io.c and compress_io.h.
Thank you.
I had posted v19-0001 for a committer's consideration and v19-000{2,3} for completeness.
Please find a rebased v20 attached.
I took a quick look at 0001, so a couple comments (sorry if some of this
was already discussed in the thread):
Much appreciated!
1) I don't think a "refactoring" patch should reference particular
compression algorithms (lz4/zstd), and in particular I don't think we
should have "not yet implemented" messages. We only have a couple other
places doing that, when we didn't have a better choice. But here we can
simply reject the algorithm when parsing the options, we don't need to
do that in a dozen other places.
I have now removed lz4/zstd from where they were present with the exception
of pg_dump.c which is responsible for parsing.
I'm not sure I understand why leave the lz4/zstd in this place?
You are right, it is not obvious. Those were added in 5e73a60488 which is
already committed in master and I didn't want to backtrack. Of course, I am
not opposing in doing so if you wish.
2) I wouldn't reorder the cases in WriteDataToArchive, i.e. I'd keep
"none" at the end. It might make backpatches harder.
Agreed. However a 'default' is needed in order to avoid compilation warnings.
Also note that 0002 completely does away with cases within WriteDataToArchive.
OK, although that's also a consequence of using a "switch" instead of
plain "if" branches.
Furthermore, I'm not sure we really need the pg_fatal() about invalid
compression method in these default blocks. I mean, how could we even
get to these places when the build does not support the algorithm? All
of this (ReadDataFromArchive, WriteDataToArchive, EndCompressor, ...)
happens looong after the compressor was initialized and the method
checked, no? So maybe either this should simply do Assert(false) or use
a different error message.
I like Assert(false).
3) While building, I get a bunch of warnings about missing cfdopen()
prototype and pg_backup_archiver.c not knowing about cfdopen() and
adding an implicit prototype (so I doubt it actually works).
Fixed. cfdopen() got prematurely introduced in 5e73a6048 and then got removed
in 69fb29d1af. v20 failed to properly take 69fb29d1af in consideration. Note
that cfdopen is removed in 0002 which explains why cfbot didn't complain.
OK.
4) "cfp" struct no longer wraps gzFile, but the comment was not updated.
FWIW I'm not sure switching to "void *" is an improvement, maybe it'd be
better to have a "union" of correct types?
Please find an updated comment and a union in place of the void *. Also
note that 0002 completely does away with cfp in favour of a new struct
CompressFileHandle. I maintained the void * there because it is used by
private methods of the compressors. 0003 contains such an example with
LZ4CompressorState.
I wonder if this (and also the previous item) makes sense to keep 0001
and 0002 or to combine them. The "intermediate" state is a bit annoying.
Agreed. It was initially submitted as one patch. Then it was requested to be
split up in two parts, one to expand the use of the existing API and one to
replace with the new interface. Unfortunately the expansion of usage of the
existing API requires some tweaking, but that is not a very good reason for
the current patch set. I should have done a better job there.
Please find v22 attach which combines back 0001 and 0002. It is missing the
documentation that was discussed above as I wanted to give a quick feedback.
Let me know if you think that the combined version is the one to move forward
with.
Cheers,
//Georgios
5) cfopen/cfdopen are missing comments. cfopen_internal has an updated
comment, but that's a static function while cfopen/cfdopen are the
actual API.
Added comments to cfopen/cfdopen.
OK.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v22-0002-Add-LZ4-compression-to-pg_dump.patch (text/x-patch)
From 48451d2316a7016bfb2824c33ba38594ec03953e Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 18 Jan 2023 18:15:42 +0000
Subject: [PATCH v22 2/2] Add LZ4 compression to pg_dump
This is mostly done within pg_dump's compression streaming and file APIs.
It is confined within the newly introduced compress_lz4.{c,h} files.
The first, the streaming API, is aimed at inlined use cases, where simple
lz4.h calls can be used directly. The second, the file API, generates output,
or parses input, that can be read/generated via the lz4 utility.
Wherever the LZ4F API does not implement all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), it has been
implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 22 +-
src/bin/pg_dump/compress_lz4.c | 618 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 770 insertions(+), 18 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 7a19f5d617..a1401377ab 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -25,6 +26,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
compress_gzip.o \
+ compress_lz4.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index d60111b2b8..96947cd3ea 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,7 +56,7 @@
* InitDiscoverCompressFileHandle tries to deffer the compression by the
* filename suffix. If the suffix is not yet known, then it tries to simply
* open the file, and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -70,6 +70,7 @@
#include "compress_io.h"
#include "compress_gzip.h"
+#include "compress_lz4.h"
#include "pg_backup_utils.h"
/*----------------------
@@ -146,6 +147,9 @@ AllocateCompressor(const pg_compress_specification compression_spec,
case PG_COMPRESSION_GZIP:
InitCompressorGzip(cs, compression_spec);
break;
+ case PG_COMPRESSION_LZ4:
+ InitCompressorLZ4(cs, compression_spec);
+ break;
default:
pg_fatal("invalid compression method \"%s\"",
get_compress_algorithm_name(compression_spec.algorithm));
@@ -339,6 +343,9 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
case PG_COMPRESSION_GZIP:
InitCompressGzip(CFH, compression_spec);
break;
+ case PG_COMPRESSION_LZ4:
+ InitCompressLZ4(CFH, compression_spec);
+ break;
default:
pg_fatal("invalid compression method \"%s\"",
get_compress_algorithm_name(compression_spec.algorithm));
@@ -357,7 +364,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* throughout. Otherwise the compression will be deferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -395,6 +402,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..c97e16187a
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,618 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/* Public routines that support LZ4 compressed data I/O */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' decompressed content, or up to the new line char
+ * if found first when the eol_flag is set. It is possible that the decompressed
+ * output generated by reading any compressed input via the LZ4F API, exceeds
+ * 'ptrsize'. Any exceeding decompressed content is stored at an overflow
+ * buffer within LZ4File. Of course, when the function is called, it will first
+ * try to consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return -1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..74595db1b9
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * Interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index aa2c91829c..473d40d456 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_io.c',
'compress_gzip.c',
+ 'compress_lz4.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
@@ -17,7 +18,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -85,7 +86,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index cc3b7f0992..7005ffcd5b 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -395,6 +395,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2074,7 +2078,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2088,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3747,6 +3755,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..08105337b1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index f3ba926321..f497ec6040 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if (defined($pgdump_runs{$run}->{compile_option}) &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index da8e4861f4..1f7f817e4d 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 840191d680..232228d427 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1381,6 +1381,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
From 1fc195f9c37a8f175b7f2843ccd46953f6b19b0f Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 18 Jan 2023 18:20:43 +0000
Subject: [PATCH v22 1/2] Introduce Compressor API in pg_dump and use it
throughout
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for opening, writing, etc. The implementor of a
new compression method now only needs to supply those definitions.
Custom compressed archives now need to store the compression algorithm in their
header. This requires a bump in the version number. The level of compression
is no longer stored in the dump as it is now irrelevant.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of additional
archive formats. However, pg_backup_archiver.c was not using that API. This
commit teaches pg_backup_archiver.c about it.
---
src/bin/pg_dump/Makefile | 1 +
src/bin/pg_dump/compress_gzip.c | 406 +++++++++++++
src/bin/pg_dump/compress_gzip.h | 22 +
src/bin/pg_dump/compress_io.c | 815 +++++++++-----------------
src/bin/pg_dump/compress_io.h | 70 ++-
src/bin/pg_dump/meson.build | 1 +
src/bin/pg_dump/pg_backup_archiver.c | 198 +++----
src/bin/pg_dump/pg_backup_archiver.h | 32 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 +--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 2 +
13 files changed, 916 insertions(+), 759 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..7a19f5d617 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,6 +24,7 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..f365b93b76
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,406 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to write an uncompressed or compressed data
+ * stream.
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at a time.
+ * This is probably not what the user wanted when calling this
+ * interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ CFH->private_data = NULL;
+
+ ret = gzclose(gzfp);
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..6dfd0eb04d
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * Interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..d60111b2b8 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,54 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims to allow the resulting
+ * files to be easily manipulated with an external compression utility
+ * program.
+ *
+ * This file also includes the implementations of both APIs for the case
+ * of no compression.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read all the data by calling readData, which
+ * repeatedly calls the given ReadFunc. ReadFunc returns the compressed data
+ * one chunk at a time, and readData decompresses each chunk and passes the
+ * decompressed data to ahwrite(), until ReadFunc returns 0 to signal EOF.
+ * The interface is the same for compressed and uncompressed streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API provides a set of function pointers for
+ * opening, reading, writing, and finally closing files. The implemented
+ * function pointers are documented in the corresponding header file and are
+ * common to all streams. This allows the caller to use the same functions for
+ * both compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix is not recognized, it first tries to open
+ * the file as is, and if that fails, it tries to open the same file with
+ * the .gz suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,7 +65,11 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
/*----------------------
@@ -61,110 +77,82 @@
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
+/* Private routines */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
+ free(buf);
+}
-/* Public interface routines */
+/* Compressor API None implementation */
-/* Allocate a new compressor */
-CompressorState *
-AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
{
- CompressorState *cs;
-
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
- cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
- cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
+ cs->writeF(AH, data, dLen);
+}
- return cs;
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
+static void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
}
+/* Public interface routines */
+
/*
- * Compress and write data to the output stream (via writeF).
+ * Allocate a new compressor.
*/
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
+CompressorState *
+AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF, WriteFunc writeF)
{
- switch (cs->compression_spec.algorithm)
+ CompressorState *cs;
+
+ cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
+ cs->writeF = writeF;
+
+ switch (compression_spec.algorithm)
{
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
+ InitCompressorNone(cs, compression_spec);
+ break;
+ case PG_COMPRESSION_GZIP:
+ InitCompressorGzip(cs, compression_spec);
break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(compression_spec.algorithm));
break;
}
+
+ return cs;
}
/*
@@ -173,527 +161,268 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
+/*----------------------
+ * Compressed stream API
+ *----------------------
*/
-static void
-InitCompressorZlib(CompressorState *cs, int level)
+/* Private routines */
+
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
+
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
}
+/* free() without changing errno; useful in several places below */
static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+free_keep_errno(void *p)
{
- z_streamp zp = cs->zp;
+ int save_errno = errno;
- zp->next_in = NULL;
- zp->avail_in = 0;
+ free(p);
+ errno = save_errno;
+}
+
+/*
+ * Compressed stream API None implementation
+ */
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
+ if (size == 0)
+ return 0;
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
- free(cs->zlibOut);
- free(cs->zp);
+ return ret;
}
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
- if (res == Z_STREAM_END)
- break;
- }
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
}
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
+ return fgets(ptr, size, (FILE *) CFH->private_data);
}
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+static int
+getc_none(CompressFileHandle *CFH)
{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
+ return ret;
+}
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
+static int
+close_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
+ CFH->private_data = NULL;
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
+ if (fp)
+ ret = fclose(fp);
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ return ret;
+}
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
+static int
+eof_none(CompressFileHandle *CFH)
+{
+ return feof((FILE *) CFH->private_data);
+}
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
+ if (CFH->private_data == NULL)
+ return 1;
- free(buf);
- free(out);
- free(zp);
+ return 0;
}
-#endif /* HAVE_LIBZ */
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ Assert(CFH->private_data == NULL);
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
- free(buf);
+ return 0;
}
static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
{
- cs->writeF(AH, data, dLen);
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
}
-
-/*----------------------
- * Compressed stream API
- *----------------------
- */
-
/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
+ * Public interface
*/
-struct cfp
-{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
-
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
-
-/* free() without changing errno; useful in several places below */
-static void
-free_keep_errno(void *p)
-{
- int save_errno = errno;
-
- free(p);
- errno = save_errno;
-}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp;
+ CompressFileHandle *CFH;
- pg_compress_specification compression_spec = {0};
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
+ switch (compression_spec.algorithm)
{
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
+ case PG_COMPRESSION_NONE:
+ InitCompressNone(CFH, compression_spec);
+ break;
+ case PG_COMPRESSION_GZIP:
+ InitCompressGzip(CFH, compression_spec);
+ break;
+ default:
+ pg_fatal("invalid compression method \"%s\"",
+ get_compress_algorithm_name(compression_spec.algorithm));
+ break;
}
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+ return CFH;
}
/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
+ * If the file at 'path' contains the suffix of a supported compression
+ * method, currently only ".gz", then that compression will be used
+ * throughout. Otherwise the compression is inferred by iteratively trying
+ * to open the file at 'path', first as is, then by appending known
+ * compression suffixes. So if you pass "foo" as 'path', this will open
+ * either "foo" or "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- cfp *fp = pg_malloc(sizeof(cfp));
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
- }
+ fname = pg_strdup(path);
- fp->uncompressedfp = NULL;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
else
{
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
-#endif
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
- }
-
- return fp;
-}
-
-
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret;
-
- if (size == 0)
- return 0;
+ bool exists;
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning if not built with compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
+ if (!exists)
{
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
- }
- else
#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
}
- return ret;
-}
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret;
-
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
- {
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
- }
- else
-#endif
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
+ free_keep_errno(CFH);
+ CFH = NULL;
}
+ free_keep_errno(fname);
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
-
-int
-cfclose(cfp *fp)
-{
- int result;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
-
- return result;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
-
-const char *
-get_cfp_error(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
-
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
-}
+ int ret = 0;
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- if (filenamelen < suffixlen)
- return 0;
+ free_keep_errno(CFH);
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ return ret;
}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..bd2b5623a5 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -37,32 +37,64 @@ typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ ReadFunc readF;
+ WriteFunc writeF;
+
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
-typedef struct cfp cfp;
+struct CompressFileHandle
+{
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *cxt);
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+ int (*getc_func) (CompressFileHandle *CFH);
+ int (*eof_func) (CompressFileHandle *CFH);
+ int (*close_func) (CompressFileHandle *CFH);
+ const char *(*get_error_func) (CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ pg_compress_specification compression_spec;
+ void *private_data;
+};
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..aa2c91829c 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -2,6 +2,7 @@
pg_dump_common_sources = files(
'compress_io.c',
+ 'compress_gzip.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7f7a0f1ce7..cc3b7f0992 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,24 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
+ supports_compression = true;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE &&
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1128,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1144,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1503,95 +1503,60 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ CompressFileHandle *CFH;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
-
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ mode = PG_BINARY_W;
- if (!AH->OF)
+ CFH = InitCompressFileHandle(compression_spec);
+
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static OutputContext
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1715,21 +1680,20 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
{
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
}
if (bytes_written != size * nmemb)
@@ -2072,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2102,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2219,6 +2181,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2272,8 +2236,11 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3686,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3762,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3777,10 +3740,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -89,10 +65,13 @@ typedef z_stream *z_streamp;
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
@@ -319,8 +298,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..a2f88995c0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 2eeef2a478..f3ba926321 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..da8e4861f4 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,6 +150,7 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 23bafec5f7..840191d680 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Hi,
On 1/19/23 17:42, gkokolatos@pm.me wrote:
------- Original Message -------
On Thursday, January 19th, 2023 at 4:45 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 1/18/23 20:05, gkokolatos@pm.me wrote:
------- Original Message -------
On Wednesday, January 18th, 2023 at 3:00 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
I'm not sure I understand why leave the lz4/zstd in this place?
You are right, it is not obvious. Those were added in 5e73a60488 which is
already committed in master and I didn't want to backtrack. Of course, I am
not opposing in doing so if you wish.
Ah, I didn't realize it was already added by earlier commit. In that
case let's not worry about it.
2) I wouldn't reorder the cases in WriteDataToArchive, i.e. I'd keep
"none" at the end. It might make backpatches harder.
Agreed. However a 'default' is needed in order to avoid compilation warnings.
Also note that 0002 completely does away with cases within WriteDataToArchive.
OK, although that's also a consequence of using a "switch" instead of
plain "if" branches.
Furthermore, I'm not sure we really need the pg_fatal() about invalid
compression method in these default blocks. I mean, how could we even
get to these places when the build does not support the algorithm? All
of this (ReadDataFromArchive, WriteDataToArchive, EndCompressor, ...)
happens looong after the compressor was initialized and the method
checked, no? So maybe either this should simply do Assert(false) or use
a different error message.
I like Assert(false).
OK, good. Do you agree we should never actually get there, if the
earlier checks work correctly?
4) "cfp" struct no longer wraps gzFile, but the comment was not updated.
FWIW I'm not sure switching to "void *" is an improvement, maybe it'd be
better to have a "union" of correct types?
Please find an updated comment and a union in place of the void *. Also
note that 0002 completely does away with cfp in favour of a new struct
CompressFileHandle. I maintained the void * there because it is used by
private methods of the compressors. 0003 contains such an example with
LZ4CompressorState.
I wonder if this (and also the previous item) makes sense to keep 0001
and 0002 or to combine them. The "intermediate" state is a bit annoying.
Agreed. It was initially submitted as one patch. Then it was requested to be
split up in two parts, one to expand the use of the existing API and one to
replace with the new interface. Unfortunately the expansion of usage of the
existing API requires some tweaking, but that is not a very good reason for
the current patch set. I should have done a better job there.
Please find v22 attached which combines back 0001 and 0002. It is missing the
documentation that was discussed above as I wanted to give a quick feedback.
Let me know if you think that the combined version is the one to move forward
with.
Thanks, I'll take a look.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 1/19/23 18:55, Tomas Vondra wrote:
Hi,
On 1/19/23 17:42, gkokolatos@pm.me wrote:
...
Agreed. It was initially submitted as one patch. Then it was requested to be
split up in two parts, one to expand the use of the existing API and one to
replace with the new interface. Unfortunately the expansion of usage of the
existing API requires some tweaking, but that is not a very good reason for
the current patch set. I should have done a better job there.
Please find v22 attached which combines back 0001 and 0002. It is missing the
documentation that was discussed above as I wanted to give a quick feedback.
Let me know if you think that the combined version is the one to move forward
with.
Thanks, I'll take a look.
After taking a look and thinking about it a bit more, I think we should
keep the two parts separate. I think Michael (or whoever proposed) the
split was right, it makes the patches easier to grok.
Sorry for the noise, hopefully we can just revert to the last version.
While reading the thread, I also noticed this:
By the way, I think that this 0002 should drop all the default clauses
in the switches for the compression method so as we'd catch any
missing code paths with compiler warnings if a new compression method
is added in the future.
Now I realize why there were "not yet implemented" errors for lz4/zstd
in all the switches, and why after removing them you had to add a
default branch.
We DON'T want a default branch, because the idea is that after adding a
new compression algorithm, we get warnings about switches not handling
it correctly.
So I guess we should walk back this change too :-( It's probably easier
to go back to v20 from January 16, and redo the couple remaining things
I commented on.
FWIW I think this is a hint that adding LZ4/ZSTD options, in 5e73a6048,
but without implementation, was not a great idea. It mostly defeats the
idea of getting the compiler warnings - all the places already handle
PG_COMPRESSION_LZ4/PG_COMPRESSION_ZSTD by throwing a pg_fatal. So you'd
have to grep for the options, inspect all the places or something like
that anyway. The warnings would only work for entirely new methods.
However, I now also realize the compressor API in 0002 replaces all of
this with calls to a generic API callback, so trying to improve this was
pretty silly from me.
Please, fix the couple remaining details in v20, add the docs for the
callbacks, and I'll try to polish it and get it committed.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Friday, January 20th, 2023 at 12:34 AM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 1/19/23 18:55, Tomas Vondra wrote:
Hi,
On 1/19/23 17:42, gkokolatos@pm.me wrote:
...
Agreed. It was initially submitted as one patch. Then it was requested to be
split up in two parts, one to expand the use of the existing API and one to
replace with the new interface. Unfortunately the expansion of usage of the
existing API requires some tweaking, but that is not a very good reason for
the current patch set. I should have done a better job there.
Please find v22 attached which combines back 0001 and 0002. It is missing the
documentation that was discussed above as I wanted to give a quick feedback.
Let me know if you think that the combined version is the one to move forward
with.
Thanks, I'll take a look.
After taking a look and thinking about it a bit more, I think we should
keep the two parts separate. I think Michael (or whoever proposed) the
split was right, it makes the patches easier to grok.
Excellent. I will attempt a better split this time round.
While reading the thread, I also noticed this:
By the way, I think that this 0002 should drop all the default clauses
in the switches for the compression method so as we'd catch any
missing code paths with compiler warnings if a new compression method
is added in the future.
Now I realize why there were "not yet implemented" errors for lz4/zstd
in all the switches, and why after removing them you had to add a
default branch.
We DON'T want a default branch, because the idea is that after adding a
new compression algorithm, we get warnings about switches not handling
it correctly.
So I guess we should walk back this change too :-( It's probably easier
to go back to v20 from January 16, and redo the couple remaining things
I commented on.
Sure.
FWIW I think this is a hint that adding LZ4/ZSTD options, in 5e73a6048,
but without implementation, was not a great idea. It mostly defeats the
idea of getting the compiler warnings - all the places already handle
PG_COMPRESSION_LZ4/PG_COMPRESSION_ZSTD by throwing a pg_fatal. So you'd
have to grep for the options, inspect all the places or something like
that anyway. The warnings would only work for entirely new methods.
However, I now also realize the compressor API in 0002 replaces all of
this with calls to a generic API callback, so trying to improve this was
pretty silly from me.
I can try to do a better job at splitting things up.
Please, fix the couple remaining details in v20, add the docs for the
callbacks, and I'll try to polish it and get it committed.
Excellent. Allow me an attempt to polish and expect a new version soon.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Friday, January 20th, 2023 at 12:34 AM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 1/19/23 18:55, Tomas Vondra wrote:
Hi,
On 1/19/23 17:42, gkokolatos@pm.me wrote:
...
Agreed. It was initially submitted as one patch. Then it was requested to be
split up in two parts, one to expand the use of the existing API and one to
replace with the new interface. Unfortunately the expansion of usage of the
existing API requires some tweaking, but that is not a very good reason for
the current patch set. I should have done a better job there.
Please find v22 attached which combines back 0001 and 0002. It is missing the
documentation that was discussed above as I wanted to give a quick feedback.
Let me know if you think that the combined version is the one to move forward
with.
Thanks, I'll take a look.
After taking a look and thinking about it a bit more, I think we should
keep the two parts separate. I think Michael (or whoever proposed) the
split was right, it makes the patches easier to grok.
Please find attached v23 which reintroduces the split.
0001 is reworked to have a reduced footprint than before. Also in an attempt
to facilitate the readability, 0002 splits the API's and the uncompressed
implementation in separate files.
While reading the thread, I also noticed this:
By the way, I think that this 0002 should drop all the default clauses
in the switches for the compression method so as we'd catch any
missing code paths with compiler warnings if a new compression method
is added in the future.
Now I realize why there were "not yet implemented" errors for lz4/zstd
in all the switches, and why after removing them you had to add a
default branch.
We DON'T want a default branch, because the idea is that after adding a
new compression algorithm, we get warnings about switches not handling
it correctly.
So I guess we should walk back this change too :-( It's probably easier
to go back to v20 from January 16, and redo the couple remaining things
I commented on.
No problem.
FWIW I think this is a hint that adding LZ4/ZSTD options, in 5e73a6048,
but without implementation, was not a great idea. It mostly defeats the
idea of getting the compiler warnings - all the places already handle
PG_COMPRESSION_LZ4/PG_COMPRESSION_ZSTD by throwing a pg_fatal. So you'd
have to grep for the options, inspect all the places or something like
that anyway. The warnings would only work for entirely new methods.
However, I now also realize the compressor API in 0002 replaces all of
this with calls to a generic API callback, so trying to improve this was
pretty silly from me.
Please, fix the couple remaining details in v20, add the docs for the
callbacks, and I'll try to polish it and get it committed.
Thank you very much. Please find an attempt to comply with the requested
changes in the attached.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v23-0001-Prepare-pg_dump-internals-for-additional-compres.patchtext/x-patch; name=v23-0001-Prepare-pg_dump-internals-for-additional-compres.patchDownload
From 4861f0cac1ef181bccb37278ec74ec43d3f0df5f Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 23 Jan 2023 13:24:22 +0000
Subject: [PATCH v23 1/3] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression related code and allow for the introduction of additional
archive formats. However, pg_backup_archiver.c was not using that API. This
commit teaches pg_backup_archiver.c about it and is using it throughout.
---
src/bin/pg_dump/compress_io.c | 66 +++++++++++---
src/bin/pg_dump/compress_io.h | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 129 ++++++++++-----------------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-----
4 files changed, 102 insertions(+), 122 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..ef033914ba 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -490,16 +494,19 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ cfp *fp = pg_malloc0(sizeof(cfp));
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
@@ -511,15 +518,20 @@ cfopen(const char *path, const char *mode,
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode_compression);
+ else
+ fp->compressedfp = gzopen(path, mode_compression);
}
else
{
/* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode);
+ else
+ fp->compressedfp = gzopen(path, mode);
}
- fp->uncompressedfp = NULL;
if (fp->compressedfp == NULL)
{
free_keep_errno(fp);
@@ -531,10 +543,11 @@ cfopen(const char *path, const char *mode,
}
else
{
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
-#endif
- fp->uncompressedfp = fopen(path, mode);
+ if (fd >= 0)
+ fp->uncompressedfp = fdopen(fd, mode);
+ else
+ fp->uncompressedfp = fopen(path, mode);
+
if (fp->uncompressedfp == NULL)
{
free_keep_errno(fp);
@@ -545,6 +558,33 @@ cfopen(const char *path, const char *mode,
return fp;
}
+/*
+ * Opens file 'path' in 'mode' and compression as defined in
+ * compression_spec. The caller must verify that the compression
+ * is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+/*
+ * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
+ * and compression as defined in compression_spec. The caller must
+ * verify that the compression is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..a6cdf588dd 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -54,6 +54,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index ba5e6acbbb..ddb712d042 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,27 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
+ supports_compression = false;
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = true;
+
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1133,7 +1135,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1508,58 +1510,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1570,33 +1546,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1720,22 +1687,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2224,6 +2186,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2277,8 +2240,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
v23-0002-Introduce-Compress-and-Compressor-API-in-pg_dump.patch (text/x-patch)
From 656dfad3ababfdd4665880494bec0271004702ae Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 23 Jan 2023 14:29:02 +0000
Subject: [PATCH v23 2/3] Introduce Compress and Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for file manipulation; the implementor of a new
compression method need "simply" provide those definitions.
Additionally, custom compressed archives now store the compression algorithm
in their header instead of the compression level, and the header version
number is bumped.
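The function-pointer layout this commit message describes can be illustrated with a hypothetical miniature; only one callback is shown here, and the names are invented for illustration, whereas the real CompressFileHandle in the patch carries open/open_write/read/write/gets/getc/close/eof/get_error members.

```c
#include <stddef.h>
#include <assert.h>

/* Callers go through function pointers; each compression method
 * installs its own implementations at init time. */
typedef struct MiniFileHandle MiniFileHandle;
struct MiniFileHandle
{
	size_t		(*write_func) (const void *ptr, size_t size,
							   MiniFileHandle *fh);
	size_t		bytes_written;	/* stand-in for private_data */
};

/* "none" method: a real implementation would call fwrite(); the gzip
 * one would call gzwrite() on the handle's private gzFile. */
static size_t
none_write(const void *ptr, size_t size, MiniFileHandle *fh)
{
	(void) ptr;
	fh->bytes_written += size;
	return size;
}

/* Analogue of an Init*() routine: install the method's callbacks. */
static void
init_mini_none(MiniFileHandle *fh)
{
	fh->write_func = none_write;
	fh->bytes_written = 0;
}
```

Callers then write via `fh->write_func(buf, len, fh)` without knowing which method sits behind the pointer, which is what lets a new algorithm such as LZ4 be added without touching the call sites.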
---
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_gzip.c | 404 ++++++++++++++
src/bin/pg_dump/compress_gzip.h | 24 +
src/bin/pg_dump/compress_io.c | 746 +++++---------------------
src/bin/pg_dump/compress_io.h | 166 +++++-
src/bin/pg_dump/compress_none.c | 206 +++++++
src/bin/pg_dump/compress_none.h | 24 +
src/bin/pg_dump/meson.build | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 102 ++--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/tools/pginclude/cpluspluscheck | 2 +
src/tools/pgindent/typedefs.list | 2 +
15 files changed, 1060 insertions(+), 752 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
create mode 100644 src/bin/pg_dump/compress_none.c
create mode 100644 src/bin/pg_dump/compress_none.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..0013bc080c 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,7 +24,9 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
+ compress_none.o \
dumputils.o \
parallel.o \
pg_backup_archiver.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..ffc4e6d56b
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,404 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at a time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int save_errno;
+ int ret;
+
+ CFH->private_data = NULL;
+
+ ret = gzclose(gzfp);
+
+ save_errno = errno;
+ errno = save_errno;
+
+ return ret;
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..45fdf01f7d
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * GZIP interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ef033914ba..ea4c266e08 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,54 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims to produce files that
+ * can be easily manipulated with an external compression utility
+ * program.
+ *
+ * This file also includes the implementation when compression is none for
+ * both API's.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
+ *
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read all the data by calling readData to read the
+ * whole compressed stream which repeatedly calls the given ReadFunc. ReadFunc
+ * returns the compressed data one chunk at a time. Then readData decompresses
+ * it and passes the decompressed data to ahwrite(), until ReadFunc returns 0
+ * to signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API is providing a set of function pointers for
+ * opening, reading, writing, and finally closing files. The implemented
+ * function pointers are documented in the corresponding header file and are
+ * common for all streams. It allows the caller to use the same functions for
+ * both compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix is not yet known, then it tries to simply
+ * open the file, and if it fails, it tries to open the same file with the .gz
+ * suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,122 +65,38 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_none.h"
#include "pg_backup_utils.h"
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
/*----------------------
* Compressor API
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
-{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
-
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-
-/* Public interface routines */
-
-/* Allocate a new compressor */
+/*
+ * Allocate a new compressor.
+ */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-}
+ InitCompressorNone(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorGzip(cs, compression_spec);
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -177,233 +105,31 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
+ * Private routines
*/
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -416,324 +142,102 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
-{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
-
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc0(sizeof(cfp));
+ CompressFileHandle *CFH;
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode_compression);
- else
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode);
- else
- fp->compressedfp = gzopen(path, mode);
- }
-
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
- if (fd >= 0)
- fp->uncompressedfp = fdopen(fd, mode);
- else
- fp->uncompressedfp = fopen(path, mode);
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
- }
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ InitCompressNone(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressGzip(CFH, compression_spec);
- return fp;
+ return CFH;
}
/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
+ * If the file at 'path' contains the suffix of a supported compression method,
+ * currently this includes only ".gz", then this compression will be used
+ * throughout. Otherwise the compression will be inferred by iteratively trying
+ * to open the file at 'path', first as is, then by appending known compression
+ * suffixes. So if you pass "foo" as 'path', this will open either "foo" or
+ * "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret;
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (size == 0)
- return 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ fname = pg_strdup(path);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
else
-#endif
{
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret;
+ bool exists;
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when built without compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
+ if (!exists)
{
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
- }
- else
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
}
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
-
-int
-cfclose(cfp *fp)
-{
- int result;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
+ free_keep_errno(fname);
- return result;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
+ int ret = 0;
-const char *
-get_cfp_error(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
-}
+ free_keep_errno(CFH);
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ return ret;
}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a6cdf588dd..a7a4fec036 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,50 +21,160 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-/* Prototype for callback function to WriteDataToArchive() */
+/*
+ * Prototype for callback function used in writeData()
+ */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
/*
- * Prototype for callback function to ReadDataFromArchive()
+ * Prototype for callback function used in readData()
*
- * ReadDataFromArchive will call the read function repeatedly, until it
- * returns 0 to signal EOF. ReadDataFromArchive passes a buffer to read the
- * data into in *buf, of length *buflen. If that's not big enough for the
- * callback function, it can free() it and malloc() a new one, returning the
- * new buffer and its size in *buf and *buflen.
+ * readData will call the read function repeatedly, until it returns 0 to signal
+ * EOF. readData passes a buffer to read the data into in *buf, of length
+ * *buflen. If that's not big enough for the callback function, it can free() it
+ * and malloc() a new one, returning the new buffer and its size in *buf and
+ * *buflen.
*
* Returns the number of bytes read into *buf, or 0 on EOF.
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Callback function to read from an already processed input stream
+ */
+ ReadFunc readF;
+
+ /*
+ * Callback function to write an already processed chunk of data.
+ */
+ WriteFunc writeF;
+
+ /*
+ * Compression specification for this state.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ /*
+ * Open a file in the specified mode.
+ *
+ * Pass either 'path' or 'fd' depending on whether a file path or a file
+ * descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
+ * 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ */
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+
+ /*
+ * Open a file for writing.
+ *
+ * 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
+ * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
-typedef struct cfp cfp;
+ /*
+ * Read 'size' bytes of data from the file and store them into 'ptr'.
+ */
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ /*
+ * Write 'size' bytes of data into the file from 'ptr'.
+ */
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ /*
+ * Read at most size - 1 characters from the compress file handle into
+ * 's'.
+ *
+ * Stop if an EOF or a newline is found first. 's' is always null
+ * terminated and contains the newline if it was found.
+ */
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+
+ /*
+ * Read the next character from the compress file handle as 'unsigned
+ * char' cast into 'int'.
+ */
+ int (*getc_func) (CompressFileHandle *CFH);
+
+ /*
+ * Test if EOF is reached in the compress file handle.
+ */
+ int (*eof_func) (CompressFileHandle *CFH);
+
+ /*
+ * Close an open file handle.
+ */
+ int (*close_func) (CompressFileHandle *CFH);
+
+ /*
+ * Get a pointer to a string that describes an error that occurred during a
+ * compress file handle operation.
+ */
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ /*
+ * Compression specification for this file handle.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
+
+/*
+ * Initialize a compress file handle with the requested compression.
+ */
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+
+/*
+ * Initialize a compress file stream. Infer the compression algorithm
+ * from 'path', either by examining its suffix or by appending the supported
+ * suffixes to 'path'.
+ */
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
new file mode 100644
index 0000000000..3a36f94898
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.c
@@ -0,0 +1,206 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.c
+ * Routines for archivers to read or write an uncompressed stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_none.h"
+#include "pg_backup_utils.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
+
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
+
+ if (size == 0)
+ return 0;
+
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
+}
+
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
+
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
+}
+
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
+
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
+
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static int
+close_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
+
+ CFH->private_data = NULL;
+
+ if (fp)
+ ret = fclose(fp);
+
+ return ret;
+}
+
+static int
+eof_none(CompressFileHandle *CFH)
+{
+ return feof((FILE *) CFH->private_data);
+}
+
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
+
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
+}
diff --git a/src/bin/pg_dump/compress_none.h b/src/bin/pg_dump/compress_none.h
new file mode 100644
index 0000000000..0e704c5491
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.h
+ * Uncompressed interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_NONE_H_
+#define _COMPRESS_NONE_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_NONE_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..84e9f0defa 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,7 +1,9 @@
# Copyright (c) 2022-2023, PostgreSQL Global Development Group
pg_dump_common_sources = files(
+ 'compress_gzip.c',
'compress_io.c',
+ 'compress_none.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index ddb712d042..febdd14885 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1135,7 +1135,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1151,9 +1151,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1510,6 +1511,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1532,33 +1534,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1697,7 +1698,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2039,6 +2044,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2069,26 +2086,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2186,6 +2189,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2241,7 +2245,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3654,12 +3661,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3730,10 +3732,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3745,10 +3748,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..a2f88995c0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d92247c915..78454928cc 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..db429474a2 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,7 +150,9 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 51484ca7e2..7e62c6ef3d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Attachment: v23-0003-Add-LZ4-compression-to-pg_dump.patch (text/x-patch)
From 978ed90ea8044dda8c04977c10a1b0c8a557ce76 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 23 Jan 2023 14:51:06 +0000
Subject: [PATCH v23 3/3] Add LZ4 compression to pg_dump
This is mostly done within pg_dump's compression streaming and file APIs.
It is confined within the newly introduced compress_lz4.{c,h} files.
The first is aimed at inlined use cases, so simple lz4.h calls can be used
directly. The second generates output, or parses input, that can be read or
generated by the lz4 utility.
The LZ4F API does not implement all of the functionality corresponding to
fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(); where functionality
was missing from the official API, it has been implemented locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 20 +-
src/bin/pg_dump/compress_lz4.c | 622 +++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 14 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 772 insertions(+), 18 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 0013bc080c..eb8f59459a 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -26,6 +27,7 @@ OBJS = \
$(WIN32RES) \
compress_gzip.o \
compress_io.o \
+ compress_lz4.o \
compress_none.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ea4c266e08..4ec25e352c 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,7 +56,7 @@
* InitDiscoverCompressFileHandle tries to deffer the compression by the
* filename suffix. If the suffix is not yet known, then it tries to simply
* open the file, and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -70,6 +70,7 @@
#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_lz4.h"
#include "compress_none.h"
#include "pg_backup_utils.h"
@@ -95,6 +96,8 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressorLZ4(cs, compression_spec);
return cs;
}
@@ -159,6 +162,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressGzip(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressLZ4(CFH, compression_spec);
return CFH;
}
@@ -172,7 +177,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* throughout. Otherwise the compression will be deferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -210,6 +215,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..e99893d777
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,622 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write a LZ4 compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/*
+ * Public routines that support LZ4 compressed data I/O
+ */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the newline char
+ * if found first when the eol_flag is set. It is possible that the decompressed
+ * output generated by reading any compressed input via the LZ4F API, exceeds
+ * 'ptrsize'. Any exceeding decompressed content is stored at an overflow
+ * buffer within LZ4File. Of course, when the function is called, it will first
+ * try to consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (!fs->inited && LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (eol_flag == 0) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag == true &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ if (!fs->inited && LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = errno ? : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing, write out any
+ * remaining content and the footer generated by the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = errno ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+/*
+ * Public routines
+ */
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..0fe8c4e524
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * LZ4 interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 84e9f0defa..0da476a4c3 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_gzip.c',
'compress_io.c',
+ 'compress_lz4.c',
'compress_none.c',
'dumputils.c',
'parallel.c',
@@ -18,7 +19,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -86,7 +87,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index febdd14885..4c5c4ed9be 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -398,6 +398,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2082,7 +2086,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2092,6 +2096,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3755,6 +3763,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..08105337b1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 78454928cc..1777a4e5ac 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blobs.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index db429474a2..2c5042eb41 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 7e62c6ef3d..692c5ceb0b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1383,6 +1383,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
On Mon, Jan 23, 2023 at 05:31:55PM +0000, gkokolatos@pm.me wrote:
Please find attached v23 which reintroduces the split.
0001 is reworked to have a smaller footprint than before. Also, in an attempt
to improve readability, 0002 splits the APIs and the uncompressed
implementation into separate files.
Thanks for updating the patch. Could you address the review comments I
sent here ?
/messages/by-id/20230108194524.GA27637@telsasoft.com
Thanks,
--
Justin
------- Original Message -------
On Monday, January 23rd, 2023 at 7:00 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Jan 23, 2023 at 05:31:55PM +0000, gkokolatos@pm.me wrote:
Please find attached v23 which reintroduces the split.
0001 is reworked to have a smaller footprint than before. Also, in an attempt
to improve readability, 0002 splits the APIs and the uncompressed
implementation into separate files.
Thanks for updating the patch. Could you address the review comments I
sent here ?
/messages/by-id/20230108194524.GA27637@telsasoft.com
Please find v24 attached.
Cheers,
//Georgios
Attachments:
v24-0001-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From af03b49cd4deedfcd05c468a74e40732650ef681 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 24 Jan 2023 15:37:59 +0000
Subject: [PATCH v24 1/3] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about it and uses it throughout.
---
src/bin/pg_dump/compress_io.c | 66 +++++++++++---
src/bin/pg_dump/compress_io.h | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 132 ++++++++++-----------------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-----
4 files changed, 103 insertions(+), 124 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..ef033914ba 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -490,16 +494,19 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() and cfdopen(). It opens the file 'path',
+ * or associates a stream with the descriptor 'fd' if 'fd' is valid, in 'mode'.
+ * The descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the compression specified in 'compression_spec'
+ * is supported by the current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ cfp *fp = pg_malloc0(sizeof(cfp));
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
@@ -511,15 +518,20 @@ cfopen(const char *path, const char *mode,
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode_compression);
+ else
+ fp->compressedfp = gzopen(path, mode_compression);
}
else
{
/* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode);
+ else
+ fp->compressedfp = gzopen(path, mode);
}
- fp->uncompressedfp = NULL;
if (fp->compressedfp == NULL)
{
free_keep_errno(fp);
@@ -531,10 +543,11 @@ cfopen(const char *path, const char *mode,
}
else
{
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
-#endif
- fp->uncompressedfp = fopen(path, mode);
+ if (fd >= 0)
+ fp->uncompressedfp = fdopen(fd, mode);
+ else
+ fp->uncompressedfp = fopen(path, mode);
+
if (fp->uncompressedfp == NULL)
{
free_keep_errno(fp);
@@ -545,6 +558,33 @@ cfopen(const char *path, const char *mode,
return fp;
}
+/*
+ * Opens the file 'path' in 'mode', with compression as defined in
+ * compression_spec. The caller must verify that the compression
+ * is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+/*
+ * Associates a stream with descriptor 'fd', if 'fd' is valid, in 'mode'
+ * and compression as defined in compression_spec. The caller must
+ * verify that the compression is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..a6cdf588dd 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -54,6 +54,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index ba5e6acbbb..5f52ea40e8 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,26 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
- AH->PrintTocDataPtr != NULL)
+ supports_compression = false;
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = true;
+
+ if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1133,7 +1134,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1508,58 +1509,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1570,33 +1545,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1720,22 +1686,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2224,6 +2185,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2277,8 +2239,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
v24-0002-Introduce-Compress-and-Compressor-API-in-pg_dump.patch (text/x-patch)
From 4900a66813144ca42ba343370c52259ca8afd491 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 24 Jan 2023 15:40:12 +0000
Subject: [PATCH v24 2/3] Introduce Compress and Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for file manipulation. The implementor of a new
compression method now only needs to supply those definitions.
Additionally, custom compressed archives store the compression algorithm in
their header instead of the compression level. The header version number is
bumped.
---
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_gzip.c | 397 ++++++++++++++
src/bin/pg_dump/compress_gzip.h | 24 +
src/bin/pg_dump/compress_io.c | 746 +++++---------------------
src/bin/pg_dump/compress_io.h | 166 +++++-
src/bin/pg_dump/compress_none.c | 206 +++++++
src/bin/pg_dump/compress_none.h | 24 +
src/bin/pg_dump/meson.build | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 102 ++--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/include/common/compression.h | 4 +
src/tools/pginclude/cpluspluscheck | 2 +
src/tools/pgindent/typedefs.list | 2 +
16 files changed, 1057 insertions(+), 752 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
create mode 100644 src/bin/pg_dump/compress_none.c
create mode 100644 src/bin/pg_dump/compress_none.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..0013bc080c 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,7 +24,9 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
+ compress_none.o \
dumputils.o \
parallel.o \
pg_backup_archiver.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..59da4a955e
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,397 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at the time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ CFH->private_data = NULL;
+
+ return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..45fdf01f7d
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * GZIP interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ef033914ba..ea4c266e08 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,54 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API is designed so that the
+ * resulting files can be easily manipulated with an external compression
+ * utility program.
+ *
+ * This file also includes the uncompressed ("none") implementation of
+ * both APIs.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
+ *
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then call readData, which reads the whole compressed
+ * stream by repeatedly calling the given ReadFunc. ReadFunc returns the
+ * compressed data one chunk at a time; readData decompresses each chunk
+ * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
+ * to signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API provides a set of function pointers for
+ * opening, reading, writing, and finally closing files. The function
+ * pointers are documented in the corresponding header file and are common
+ * for all streams. This allows the caller to use the same functions for
+ * both compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix is not known, then it tries to simply
+ * open the file and, if that fails, tries to open the same file with the
+ * .gz suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,122 +65,38 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_none.h"
#include "pg_backup_utils.h"
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
/*----------------------
* Compressor API
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
-{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
-
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-
-/* Public interface routines */
-
-/* Allocate a new compressor */
+/*
+ * Allocate a new compressor.
+ */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-}
+ InitCompressorNone(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorGzip(cs, compression_spec);
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -177,233 +105,31 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
+ * Private routines
*/
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -416,324 +142,102 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
-{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
-
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc0(sizeof(cfp));
+ CompressFileHandle *CFH;
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode_compression);
- else
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode);
- else
- fp->compressedfp = gzopen(path, mode);
- }
-
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
- if (fd >= 0)
- fp->uncompressedfp = fdopen(fd, mode);
- else
- fp->uncompressedfp = fopen(path, mode);
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
- }
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ InitCompressNone(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressGzip(CFH, compression_spec);
- return fp;
+ return CFH;
}
/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
+ * If the file at 'path' has the suffix of a supported compression method,
+ * currently only ".gz", then that compression will be used throughout.
+ * Otherwise the compression is discovered by iteratively trying to open
+ * the file at 'path', first as is, then by appending known compression
+ * suffixes. So if you pass "foo" as 'path', this will open either "foo" or
+ * "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret;
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (size == 0)
- return 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ fname = pg_strdup(path);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
else
-#endif
{
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret;
+ bool exists;
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when built without compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
+ if (!exists)
{
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
- }
- else
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
}
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
-
-int
-cfclose(cfp *fp)
-{
- int result;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
+ free_keep_errno(fname);
- return result;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
+ int ret = 0;
-const char *
-get_cfp_error(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
-}
+ free_keep_errno(CFH);
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ return ret;
}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a6cdf588dd..a7a4fec036 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,50 +21,160 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-/* Prototype for callback function to WriteDataToArchive() */
+/*
+ * Prototype for callback function used in writeData()
+ */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
/*
- * Prototype for callback function to ReadDataFromArchive()
+ * Prototype for callback function used in readData()
*
- * ReadDataFromArchive will call the read function repeatedly, until it
- * returns 0 to signal EOF. ReadDataFromArchive passes a buffer to read the
- * data into in *buf, of length *buflen. If that's not big enough for the
- * callback function, it can free() it and malloc() a new one, returning the
- * new buffer and its size in *buf and *buflen.
+ * readData will call the read function repeatedly, until it returns 0 to signal
+ * EOF. readData passes a buffer to read the data into in *buf, of length
+ * *buflen. If that's not big enough for the callback function, it can free() it
+ * and malloc() a new one, returning the new buffer and its size in *buf and
+ * *buflen.
*
* Returns the number of bytes read into *buf, or 0 on EOF.
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Callback function to read from an already processed input stream
+ */
+ ReadFunc readF;
+
+ /*
+ * Callback function to write an already processed chunk of data.
+ */
+ WriteFunc writeF;
+
+ /*
+ * Compression specification for this state.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ /*
+ * Open a file in mode.
+ *
+ * Pass either 'path' or 'fd' depending on whether a filepath or a file
+ * descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
+ * 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ */
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+
+ /*
+ * Open a file for writing.
+ *
+	 * 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
+	 * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
-typedef struct cfp cfp;
+ /*
+ * Read 'size' bytes of data from the file and store them into 'ptr'.
+ */
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ /*
+ * Write 'size' bytes of data into the file from 'ptr'.
+ */
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ /*
+ * Read at most size - 1 characters from the compress file handle into
+ * 's'.
+ *
+	 * Stop if an EOF or a newline is found first. 's' is always
+	 * null-terminated and contains the newline if it was found.
+ */
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+
+ /*
+ * Read the next character from the compress file handle as 'unsigned
+ * char' cast into 'int'.
+ */
+ int (*getc_func) (CompressFileHandle *CFH);
+
+ /*
+ * Test if EOF is reached in the compress file handle.
+ */
+ int (*eof_func) (CompressFileHandle *CFH);
+
+ /*
+ * Close an open file handle.
+ */
+ int (*close_func) (CompressFileHandle *CFH);
+
+ /*
+	 * Get a pointer to a string that describes an error that occurred during
+	 * a compress file handle operation.
+ */
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ /*
+ * Compression specification for this file handle.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
+
+/*
+ * Initialize a compress file handle with the requested compression.
+ */
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+
+/*
+ * Initialize a compress file stream. Infer the compression algorithm
+ * from 'path', either by examining its suffix or by appending the
+ * supported suffixes to 'path'.
+ */
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
new file mode 100644
index 0000000000..3a36f94898
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.c
@@ -0,0 +1,206 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.c
+ * Routines for archivers to read or write an uncompressed stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_none.h"
+#include "pg_backup_utils.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
+
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
+
+ if (size == 0)
+ return 0;
+
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
+}
+
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
+
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
+}
+
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
+
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
+
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static int
+close_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
+
+ CFH->private_data = NULL;
+
+ if (fp)
+ ret = fclose(fp);
+
+ return ret;
+}
+
+static int
+eof_none(CompressFileHandle *CFH)
+{
+ return feof((FILE *) CFH->private_data);
+}
+
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
+
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
+}
diff --git a/src/bin/pg_dump/compress_none.h b/src/bin/pg_dump/compress_none.h
new file mode 100644
index 0000000000..0e704c5491
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.h
+ * Uncompressed interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_NONE_H_
+#define _COMPRESS_NONE_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_NONE_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..84e9f0defa 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,7 +1,9 @@
# Copyright (c) 2022-2023, PostgreSQL Global Development Group
pg_dump_common_sources = files(
+ 'compress_gzip.c',
'compress_io.c',
+ 'compress_none.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 5f52ea40e8..04cf887526 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1134,7 +1134,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1150,9 +1150,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1509,6 +1510,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1531,33 +1533,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1696,7 +1697,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2038,6 +2043,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2068,26 +2085,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2185,6 +2188,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2240,7 +2244,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3653,12 +3660,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3729,10 +3731,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3744,10 +3747,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..a2f88995c0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d92247c915..78454928cc 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 9a471e40aa..d44ebb06cc 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -14,6 +14,10 @@
#ifndef PG_COMPRESSION_H
#define PG_COMPRESSION_H
+/*
+ * These values are stored on disk, for example in files generated by pg_dump.
+ * Create the necessary backwards compatibility layers if their order changes.
+ */
typedef enum pg_compress_algorithm
{
PG_COMPRESSION_NONE,
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..db429474a2 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,7 +150,9 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 51484ca7e2..7e62c6ef3d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Attachment: v24-0003-Add-LZ4-compression-to-pg_dump.patch (text/x-patch)
From c1d868f9ac967b800c20bb4c588a41883356a652 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 24 Jan 2023 15:42:43 +0000
Subject: [PATCH v24 3/3] Add LZ4 compression to pg_dump
This is mostly done within pg_dump's compression streaming and file APIs,
confined to the newly introduced compress_lz4.{c,h} files.

The streaming API is aimed at inlined use cases, so simple lz4.h calls can
be used directly. The file API generates output, or parses input, that can
also be read or generated via the lz4 utility.

The LZ4F API does not implement all of the functionality corresponding to
fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(). Where such
functionality was missing from the official API, it has been implemented
locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 20 +-
src/bin/pg_dump/compress_lz4.c | 623 ++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 22 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 17 +-
src/bin/pg_dump/pg_backup_directory.c | 9 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
12 files changed, 781 insertions(+), 22 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+          <application>lz4</application> tools.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 0013bc080c..eb8f59459a 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -26,6 +27,7 @@ OBJS = \
$(WIN32RES) \
compress_gzip.o \
compress_io.o \
+ compress_lz4.o \
compress_none.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ea4c266e08..4ec25e352c 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,7 +56,7 @@
* InitDiscoverCompressFileHandle tries to infer the compression from the
* filename suffix. If the suffix is not yet known, then it tries to simply
* open the file, and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -70,6 +70,7 @@
#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_lz4.h"
#include "compress_none.h"
#include "pg_backup_utils.h"
@@ -95,6 +96,8 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressorLZ4(cs, compression_spec);
return cs;
}
@@ -159,6 +162,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressGzip(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressLZ4(CFH, compression_spec);
return CFH;
}
@@ -172,7 +177,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* throughout. Otherwise the compression will be deferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -210,6 +215,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..644107f124
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,623 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write an LZ4 compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/*
+ * Public routines that support LZ4 compressed data I/O
+ */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file is reached when
+ * there is no unread decompressed content left in the overflow buffer
+ * and the underlying file has reached end of file.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * first new line char when the eol_flag is set. The decompressed output
+ * generated by reading any compressed input via the LZ4F API may exceed
+ * 'ptrsize'; any excess decompressed content is stored in an overflow
+ * buffer within LZ4File. When the function is called, it will first
+ * consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (!eol_flag) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, at the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+/*
+ * Public routines
+ */
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..0fe8c4e524
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,22 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * LZ4 interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 84e9f0defa..0da476a4c3 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_gzip.c',
'compress_io.c',
+ 'compress_lz4.c',
'compress_none.c',
'dumputils.c',
'parallel.c',
@@ -18,7 +19,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -86,7 +87,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 04cf887526..52c9a4a634 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -385,7 +385,8 @@ RestoreArchive(Archive *AHX)
*/
supports_compression = false;
if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
- AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP ||
+ AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
supports_compression = true;
if (AH->PrintTocDataPtr != NULL)
@@ -397,6 +398,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2081,7 +2086,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2091,6 +2096,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3754,6 +3763,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index a2f88995c0..ce2a0838fa 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -779,10 +779,13 @@ _PrepParallelRestore(ArchiveHandle *AH)
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
- else
+ else if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
{
- /* It might be compressed */
- strlcat(fname, ".gz", sizeof(fname));
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ strlcat(fname, ".gz", sizeof(fname));
+ else if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ strlcat(fname, ".lz4", sizeof(fname));
+
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..08105337b1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 78454928cc..72b19ee6cd 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if (defined($pgdump_runs{$run}->{compile_option}) &&
(($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index db429474a2..2c5042eb41 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 7e62c6ef3d..692c5ceb0b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1383,6 +1383,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
On Tue, Jan 24, 2023 at 03:56:20PM +0000, gkokolatos@pm.me wrote:
On Monday, January 23rd, 2023 at 7:00 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Jan 23, 2023 at 05:31:55PM +0000, gkokolatos@pm.me wrote:
Please find attached v23 which reintroduces the split.
0001 is reworked to have a smaller footprint than before. Also, in an attempt
to facilitate readability, 0002 splits the APIs and the uncompressed
implementation in separate files.
Thanks for updating the patch. Could you address the review comments I
sent here ?
/messages/by-id/20230108194524.GA27637@telsasoft.com
Please find v24 attached.
Thanks for updating the patch.
In 001, RestoreArchive() does:
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
- AH->PrintTocDataPtr != NULL)
+ supports_compression = false;
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = true;
+
+ if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
This first checks if the algorithm is implemented, and then checks if
the algorithm is supported by the current build - that confused me for a
bit. It seems unnecessary to check for unimplemented algorithms before
looping. That also requires referencing both GZIP and LZ4 in two
places.
I think it could be written to avoid the need to change for added
compression algorithms:
+ if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
+ {
+ /* Check if the compression algorithm is supported */
+ pg_compress_specification spec;
+ parse_compress_specification(AH->compression_spec.algorithm, NULL, &spec);
+ if (spec->parse_error != NULL)
+ pg_fatal(spec->parse_error);
+ }
Or maybe add a new function to compression.c to indicate whether a given
algorithm is supported.
That would also indicate *which* compression library isn't supported.
Other than that, I think 001 is ready.
002/003 use these names, which I think are too similar - initially I
didn't even realize there were two separate functions (each with a
second stub function to handle the case of unsupported compression):
+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
typo:
s/not build with/not built with/
Should AllocateCompressor() set cs->compression_spec, rather than doing
it in each compressor ?
Thanks for considering.
--
Justin
------- Original Message -------
On Wednesday, January 25th, 2023 at 2:42 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Tue, Jan 24, 2023 at 03:56:20PM +0000, gkokolatos@pm.me wrote:
On Monday, January 23rd, 2023 at 7:00 PM, Justin Pryzby pryzby@telsasoft.com wrote:
On Mon, Jan 23, 2023 at 05:31:55PM +0000, gkokolatos@pm.me wrote:
Please find attached v23 which reintroduces the split.
0001 is reworked to have a smaller footprint than before. Also, in an attempt
to facilitate readability, 0002 splits the APIs and the uncompressed
implementation in separate files.
Thanks for updating the patch. Could you address the review comments I
sent here ?
/messages/by-id/20230108194524.GA27637@telsasoft.com
Please find v24 attached.
Thanks for updating the patch.
In 001, RestoreArchive() does:
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
- AH->PrintTocDataPtr != NULL)
+ supports_compression = false;
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = true;
+
+ if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
This first checks if the algorithm is implemented, and then checks if
the algorithm is supported by the current build - that confused me for a
bit. It seems unnecessary to check for unimplemented algorithms before
looping. That also requires referencing both GZIP and LZ4 in two
places.
I am not certain that it is unnecessary, at least not in the way that is
described. The idea is that new compression methods can be added, without
changing the archive's version number. It is very possible that it is
requested to restore an archive compressed with a method not implemented
in the current binary. The first check takes care of that and sets
supports_compression only for the implemented methods. It is possible to
enter the loop with supports_compression already set to false, for example
because the archive was compressed with ZSTD, triggering the fatal error.
Of course, one can throw the error before entering the loop, yet I think
that it does not help the readability of the code. IMHO it is easier to
follow if the error is thrown once during that check.
I think it could be written to avoid the need to change for added
compression algorithms:

+ if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
+ {
+ /* Check if the compression algorithm is supported */
+ pg_compress_specification spec;
+ parse_compress_specification(AH->compression_spec.algorithm, NULL, &spec);
+ if (spec->parse_error != NULL)
+ pg_fatal(spec->parse_error);
+ }
I am not certain how that would work in the example with ZSTD above.
If I am not wrong, parse_compress_specification() will not throw an error
if the codebase supports ZSTD, yet this specific pg_dump binary will not
support it because ZSTD is not implemented. parse_compress_specification()
is not aware of that and should not be aware of it, should it?
Or maybe add a new function to compression.c to indicate whether a given
algorithm is supported.
I am not certain how this would help, as compression.c is supposed to be
used by multiple binaries while this is a pg_dump specific detail.
That would also indicate which compression library isn't supported.
If anything, I can suggest throwing an error much earlier, i.e. in ReadHead(),
and removing this check altogether. On the other hand, I like the belts
and suspenders approach because there are no more checks after this point.
Other than that, I think 001 is ready.
Thank you.
002/003 use these names, which I think are too similar - initially I
didn't even realize there were two separate functions (each with a
second stub function to handle the case of unsupported compression):

+extern void InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressGzip(CompressFileHandle *CFH, const pg_compress_specification compression_spec);

+extern void InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec);
+extern void InitCompressLZ4(CompressFileHandle *CFH, const pg_compress_specification compression_spec);
Fair enough. Names are now updated.
typo:
s/not build with/not built with/
Thank you.
Should AllocateCompressor() set cs->compression_spec, rather than doing
it in each compressor ?
I think that compression_spec should be owned by each compressor. With that
in mind, it makes more sense to set it within each compressor. This is not
a hill I am willing to die on though.
Please find v25 attached.
Thanks for considering.
--
Justin
Attachments:
v25-0001-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 5cf128bc49d47a1e5a763dd77a2245cb7e5680e9 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 25 Jan 2023 14:43:23 +0000
Subject: [PATCH v25 1/3] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about it and uses it throughout.
---
src/bin/pg_dump/compress_io.c | 66 +++++++++++---
src/bin/pg_dump/compress_io.h | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 132 ++++++++++-----------------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-----
4 files changed, 103 insertions(+), 124 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..ef033914ba 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,10 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
/*----------------------
* Compressor API
*----------------------
@@ -490,16 +494,19 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ cfp *fp = pg_malloc0(sizeof(cfp));
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
@@ -511,15 +518,20 @@ cfopen(const char *path, const char *mode,
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode_compression);
+ else
+ fp->compressedfp = gzopen(path, mode_compression);
}
else
{
/* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode);
+ else
+ fp->compressedfp = gzopen(path, mode);
}
- fp->uncompressedfp = NULL;
if (fp->compressedfp == NULL)
{
free_keep_errno(fp);
@@ -531,10 +543,11 @@ cfopen(const char *path, const char *mode,
}
else
{
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
-#endif
- fp->uncompressedfp = fopen(path, mode);
+ if (fd >= 0)
+ fp->uncompressedfp = fdopen(fd, mode);
+ else
+ fp->uncompressedfp = fopen(path, mode);
+
if (fp->uncompressedfp == NULL)
{
free_keep_errno(fp);
@@ -545,6 +558,33 @@ cfopen(const char *path, const char *mode,
return fp;
}
+/*
+ * Opens file 'path' in 'mode' and compression as defined in
+ * compression_spec. The caller must verify that the compression
+ * is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+/*
+ * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
+ * and compression as defined in compression_spec. The caller must
+ * verify that the compression is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..a6cdf588dd 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -54,6 +54,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index ba5e6acbbb..5f52ea40e8 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -362,8 +353,9 @@ RestoreArchive(Archive *AHX)
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
+ bool supports_compression;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +383,26 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
- AH->PrintTocDataPtr != NULL)
+ supports_compression = false;
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = true;
+
+ if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+#ifndef HAVE_LIBZ
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ supports_compression = false;
+#endif
+ if (supports_compression == false)
+ pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1133,7 +1134,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1508,58 +1509,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1570,33 +1545,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1720,22 +1686,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2224,6 +2185,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2277,8 +2239,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
v25-0002-Introduce-Compress-and-Compressor-API-in-pg_dump.patch (text/x-patch)
From 8aa452d327c184cf54797430e4d912ac956ec187 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 25 Jan 2023 14:45:50 +0000
Subject: [PATCH v25 2/3] Introduce Compress and Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle substitutes the cfp* family of functions with a
struct of function pointers for file manipulation. The implementor of a new
compression method is now able to "simply" add those definitions.
Additionally, custom compressed archives store the compression algorithm in their
header instead of the compression level. The header version number is bumped.
---
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_gzip.c | 401 ++++++++++++++
src/bin/pg_dump/compress_gzip.h | 24 +
src/bin/pg_dump/compress_io.c | 746 +++++---------------------
src/bin/pg_dump/compress_io.h | 166 +++++-
src/bin/pg_dump/compress_none.c | 206 +++++++
src/bin/pg_dump/compress_none.h | 24 +
src/bin/pg_dump/meson.build | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 102 ++--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/include/common/compression.h | 4 +
src/tools/pginclude/cpluspluscheck | 2 +
src/tools/pgindent/typedefs.list | 2 +
16 files changed, 1061 insertions(+), 752 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
create mode 100644 src/bin/pg_dump/compress_none.c
create mode 100644 src/bin/pg_dump/compress_none.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..0013bc080c 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,7 +24,9 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
+ compress_none.o \
dumputils.o \
parallel.o \
pg_backup_archiver.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..24e68fd022
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,401 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at the time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ CFH->private_data = NULL;
+
+ return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..2392c697b4
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * GZIP interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ef033914ba..f47ba5e615 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,54 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims for the resulting
+ * files to be easily manipulated with an external compression utility
+ * program.
+ *
+ * This file also includes the implementation when compression is none for
+ * both APIs.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
+ *
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read all the data by calling readData to read the
+ * whole compressed stream which repeatedly calls the given ReadFunc. ReadFunc
+ * returns the compressed data one chunk at a time. Then readData decompresses
+ * it and passes the decompressed data to ahwrite(), until ReadFunc returns 0
+ * to signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API provides a set of function pointers for
+ * opening, reading, writing, and finally closing files. The implemented
+ * function pointers are documented in the corresponding header file and are
+ * common for all streams. It allows the caller to use the same functions for
+ * both compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix is not recognized, it tries to simply open
+ * the file, and if that fails, it tries to open the same file with the .gz
+ * suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,122 +65,38 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_none.h"
#include "pg_backup_utils.h"
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
/*----------------------
* Compressor API
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
-{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
-
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-
-/* Public interface routines */
-
-/* Allocate a new compressor */
+/*
+ * Allocate a new compressor.
+ */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-}
+ InitCompressorNone(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorGzip(cs, compression_spec);
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -177,233 +105,31 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
+ * Private routines
*/
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -416,324 +142,102 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
-{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
-
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc0(sizeof(cfp));
+ CompressFileHandle *CFH;
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode_compression);
- else
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode);
- else
- fp->compressedfp = gzopen(path, mode);
- }
-
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
- if (fd >= 0)
- fp->uncompressedfp = fdopen(fd, mode);
- else
- fp->uncompressedfp = fopen(path, mode);
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
- }
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ InitCompressFileHandleNone(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressFileHandleGzip(CFH, compression_spec);
- return fp;
+ return CFH;
}
/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
+ * If the file at 'path' has the suffix of a supported compression method
+ * (currently only ".gz"), then that compression is used throughout.
+ * Otherwise the compression is inferred by iteratively trying to open the
+ * file at 'path', first as is, then with known compression suffixes
+ * appended. So if you pass "foo" as 'path', this will open either "foo" or
+ * "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret;
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (size == 0)
- return 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ fname = pg_strdup(path);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
else
-#endif
{
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret;
+ bool exists;
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when built without compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
+ if (!exists)
{
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
- }
- else
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
}
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
-
-int
-cfclose(cfp *fp)
-{
- int result;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
+ free_keep_errno(fname);
- return result;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
+ int ret = 0;
-const char *
-get_cfp_error(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
-}
+ free_keep_errno(CFH);
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ return ret;
}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a6cdf588dd..a7a4fec036 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,50 +21,160 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-/* Prototype for callback function to WriteDataToArchive() */
+/*
+ * Prototype for callback function used in writeData()
+ */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
/*
- * Prototype for callback function to ReadDataFromArchive()
+ * Prototype for callback function used in readData()
*
- * ReadDataFromArchive will call the read function repeatedly, until it
- * returns 0 to signal EOF. ReadDataFromArchive passes a buffer to read the
- * data into in *buf, of length *buflen. If that's not big enough for the
- * callback function, it can free() it and malloc() a new one, returning the
- * new buffer and its size in *buf and *buflen.
+ * readData will call the read function repeatedly, until it returns 0 to signal
+ * EOF. readData passes a buffer to read the data into in *buf, of length
+ * *buflen. If that's not big enough for the callback function, it can free() it
+ * and malloc() a new one, returning the new buffer and its size in *buf and
+ * *buflen.
*
* Returns the number of bytes read into *buf, or 0 on EOF.
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Callback function to read from an already processed input stream
+ */
+ ReadFunc readF;
+
+ /*
+ * Callback function to write an already processed chunk of data.
+ */
+ WriteFunc writeF;
+
+ /*
+ * Compression specification for this state.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ /*
+ * Open a file in mode.
+ *
+ * Pass either 'path' or 'fd' depending on whether a filepath or a file
+ * descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
+ * 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ */
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+
+ /*
+ * Open a file for writing.
+ *
+ * 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
+ * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
-typedef struct cfp cfp;
+ /*
+ * Read 'size' bytes of data from the file and store them into 'ptr'.
+ */
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ /*
+ * Write 'size' bytes of data into the file from 'ptr'.
+ */
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ /*
+ * Read at most size - 1 characters from the compress file handle into
+ * 's'.
+ *
+ * Stop if an EOF or a newline is found first. 's' is always null
+ * terminated and contains the newline if it was found.
+ */
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+
+ /*
+ * Read the next character from the compress file handle as 'unsigned
+ * char' cast into 'int'.
+ */
+ int (*getc_func) (CompressFileHandle *CFH);
+
+ /*
+ * Test if EOF is reached in the compress file handle.
+ */
+ int (*eof_func) (CompressFileHandle *CFH);
+
+ /*
+ * Close an open file handle.
+ */
+ int (*close_func) (CompressFileHandle *CFH);
+
+ /*
+ * Get a pointer to a string that describes an error that occurred during a
+ * compress file handle operation.
+ */
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ /*
+ * Compression specification for this file handle.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
+
+/*
+ * Initialize a compress file handle with the requested compression.
+ */
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+
+/*
+ * Initialize a compress file stream. Infer the compression algorithm
+ * from 'path', either by examining its suffix or by appending the supported
+ * suffixes to 'path'.
+ */
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
new file mode 100644
index 0000000000..ecbcf4b04a
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.c
@@ -0,0 +1,206 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.c
+ * Routines for archivers to read or write an uncompressed stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_none.h"
+#include "pg_backup_utils.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
+
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
+
+ if (size == 0)
+ return 0;
+
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
+}
+
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
+
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
+}
+
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
+
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
+
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static int
+close_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
+
+ CFH->private_data = NULL;
+
+ if (fp)
+ ret = fclose(fp);
+
+ return ret;
+}
+
+static int
+eof_none(CompressFileHandle *CFH)
+{
+ return feof((FILE *) CFH->private_data);
+}
+
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
+
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
+}
diff --git a/src/bin/pg_dump/compress_none.h b/src/bin/pg_dump/compress_none.h
new file mode 100644
index 0000000000..143e599819
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.h
+ * Uncompressed interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_NONE_H_
+#define _COMPRESS_NONE_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_NONE_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..84e9f0defa 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,7 +1,9 @@
# Copyright (c) 2022-2023, PostgreSQL Global Development Group
pg_dump_common_sources = files(
+ 'compress_gzip.c',
'compress_io.c',
+ 'compress_none.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 5f52ea40e8..04cf887526 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -355,7 +355,7 @@ RestoreArchive(Archive *AHX)
bool parallel_mode;
bool supports_compression;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1134,7 +1134,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1150,9 +1150,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1509,6 +1510,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1531,33 +1533,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1696,7 +1697,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2038,6 +2043,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2068,26 +2085,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2185,6 +2188,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2240,7 +2244,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3653,12 +3660,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3729,10 +3731,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
@@ -3744,10 +3747,17 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
+ if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
+ {
+ bool unsupported = false;
+
#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ unsupported = true;
#endif
+ if (unsupported)
+ pg_fatal("archive is compressed, but this installation does not support compression");
+ }
if (AH->version >= K_VERS_1_4)
{
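A note for reviewers on the ReadHead() hunk above: from archive version 1.15 the compression algorithm is stored as its own byte, while for versions 1.2 through 1.14 it is still guessed from the stored level. A minimal standalone sketch of that version-gated dispatch (plain stand-in macros and enum, not the pg_dump types; the real code also distinguishes byte vs. int reads around 1.4, which this sketch glosses over):

```c
#include <assert.h>

/* Hypothetical stand-ins for pg_dump's archive version macros */
#define MAKE_ARCHIVE_VERSION(maj, min, rev) (((maj) * 256 + (min)) * 256 + (rev))
#define K_VERS_1_2  MAKE_ARCHIVE_VERSION(1, 2, 0)
#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0)

enum algo { ALGO_NONE, ALGO_GZIP, ALGO_LZ4 };

/*
 * Decide the compression algorithm for an archive header byte.
 * New-style headers carry the algorithm directly; old-style headers
 * carry a compression level, from which gzip-or-none is inferred.
 */
static enum algo
detect_algorithm(int version, int header_byte)
{
    if (version >= K_VERS_1_15)
        return (enum algo) header_byte; /* algorithm stored directly */
    if (version >= K_VERS_1_2)
        return header_byte != 0 ? ALGO_GZIP : ALGO_NONE; /* byte is a level */
    return ALGO_NONE;           /* pre-1.2 archives were never compressed */
}
```

The point of the format bump is exactly this split: older readers never see the new byte, and newer readers keep the old guessing path for old archives.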
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..a2f88995c0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d92247c915..78454928cc 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 9a471e40aa..d44ebb06cc 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -14,6 +14,10 @@
#ifndef PG_COMPRESSION_H
#define PG_COMPRESSION_H
+/*
+ * These values are stored on disk, for example in files generated by pg_dump.
+ * Create the necessary backwards compatibility layers if their order changes.
+ */
typedef enum pg_compress_algorithm
{
PG_COMPRESSION_NONE,
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..db429474a2 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,7 +150,9 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 51484ca7e2..7e62c6ef3d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Attachment: v25-0003-Add-LZ4-compression-to-pg_dump.patch (text/x-patch)
From a240440a9ec73d0e156aa68c1ec3b7eb476a4f99 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 25 Jan 2023 14:46:07 +0000
Subject: [PATCH v25 3/3] Add LZ4 compression to pg_dump
LZ4 support is implemented through pg_dump's compression streaming and file
APIs and is confined to the newly introduced compress_lz4.{c,h} files.
The streaming API is aimed at inline use cases, so the simple lz4.h calls
can be used directly. The file API generates output, and parses input, that
can also be produced or read with the lz4 command-line utility; it is built
on the LZ4F (frame) API. LZ4F does not implement all the functionality
corresponding to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose();
where functionality was missing from the official API, it has been
implemented locally.
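For reviewers skimming the API shape before the diff: both this patch and 0002 route all archive I/O through a small vtable of function pointers (CompressFileHandle carries open_func, write_func, read_func, gets_func, eof_func, and so on, and each Init*() call fills them in for one algorithm). A minimal illustration of that dispatch pattern, using made-up names rather than the patch's real structs:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the CompressFileHandle vtable pattern */
typedef struct MiniHandle MiniHandle;
struct MiniHandle
{
    /* per-algorithm callback, installed by an init function */
    size_t      (*write_func) (const void *ptr, size_t size, MiniHandle *h);
    char        buf[256];       /* stand-in for the underlying file */
    size_t      len;
};

/* "none" implementation: pass bytes through unchanged */
static size_t
write_none(const void *ptr, size_t size, MiniHandle *h)
{
    memcpy(h->buf + h->len, ptr, size);
    h->len += size;
    return size;
}

/* Each compression method installs its own callbacks into the handle */
static void
init_handle_none(MiniHandle *h)
{
    h->write_func = write_none;
    h->len = 0;
}
```

Callers then write `h->write_func(data, n, h)` without knowing which algorithm is behind the handle, which is why adding LZ4 mostly means supplying one more set of callbacks.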
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 20 +-
src/bin/pg_dump/compress_lz4.c | 626 ++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 24 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 17 +-
src/bin/pg_dump/pg_backup_directory.c | 9 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
12 files changed, 786 insertions(+), 22 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 0013bc080c..eb8f59459a 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -26,6 +27,7 @@ OBJS = \
$(WIN32RES) \
compress_gzip.o \
compress_io.o \
+ compress_lz4.o \
compress_none.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index f47ba5e615..55db63ac7a 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,7 +56,7 @@
 * InitDiscoverCompressFileHandle tries to infer the compression from the
* filename suffix. If the suffix is not yet known, then it tries to simply
* open the file, and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -70,6 +70,7 @@
#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_lz4.h"
#include "compress_none.h"
#include "pg_backup_utils.h"
@@ -95,6 +96,8 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressorLZ4(cs, compression_spec);
return cs;
}
@@ -159,6 +162,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressFileHandleNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressFileHandleGzip(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressFileHandleLZ4(CFH, compression_spec);
return CFH;
}
@@ -172,7 +177,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
 * throughout. Otherwise the compression will be inferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -210,6 +215,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
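The discovery path in the hunk above can be exercised on its own: probe "path", then "path.gz", then "path.lz4" with stat() and keep the first regular file that exists. A rough standalone sketch of that loop (the helper name is made up; the real code also threads the result into a compression spec):

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Try "path", then "path.gz", then "path.lz4"; write the first existing
 * regular file into out. Returns 1 on success, 0 if no candidate exists.
 */
static int
discover_compressed(const char *path, char *out, size_t outlen)
{
    const char *suffixes[] = {"", ".gz", ".lz4"};
    struct stat st;

    for (int i = 0; i < 3; i++)
    {
        snprintf(out, outlen, "%s%s", path, suffixes[i]);
        if (stat(out, &st) == 0 && S_ISREG(st.st_mode))
            return 1;
    }
    return 0;
}
```

Trying the suffixes in a fixed order is what makes the behavior deterministic when both a .gz and a .lz4 file happen to exist for the same base name.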
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..ee74cc8e28
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,626 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to read or write an LZ4 compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/*
+ * Public routines that support LZ4 compressed data I/O
+ */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * newline char if found first when the eol_flag is set. The decompressed
+ * output generated by reading compressed input via the LZ4F API may exceed
+ * 'ptrsize'; any excess decompressed content is stored in an overflow
+ * buffer within LZ4File. When the function is called, it first consumes
+ * any decompressed content already present in the overflow buffer before
+ * decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (!eol_flag) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, at the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, true))
+ return -1;
+
+	while (remaining > 0)
+	{
+		int			chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+		/* compress the next chunk of the input; 'ptr' itself must not move */
+		status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+									 (const char *) ptr + (size - remaining),
+									 chunk, NULL);
+		remaining -= chunk;
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+			return -1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+	int			dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing, it writes any
+ * remaining content and the footer generated by the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+/*
+ * Public routines
+ */
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..40dbe00d46
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * LZ4 interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 84e9f0defa..0da476a4c3 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_gzip.c',
'compress_io.c',
+ 'compress_lz4.c',
'compress_none.c',
'dumputils.c',
'parallel.c',
@@ -18,7 +19,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -86,7 +87,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 04cf887526..52c9a4a634 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -385,7 +385,8 @@ RestoreArchive(Archive *AHX)
*/
supports_compression = false;
if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
- AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ AH->compression_spec.algorithm == PG_COMPRESSION_GZIP ||
+ AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
supports_compression = true;
if (AH->PrintTocDataPtr != NULL)
@@ -397,6 +398,10 @@ RestoreArchive(Archive *AHX)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
supports_compression = false;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ supports_compression = false;
#endif
if (supports_compression == false)
pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
@@ -2081,7 +2086,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2091,6 +2096,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -3754,6 +3763,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
unsupported = true;
+#endif
+#ifndef USE_LZ4
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ unsupported = true;
#endif
if (unsupported)
pg_fatal("archive is compressed, but this installation does not support compression");
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index a2f88995c0..ce2a0838fa 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -779,10 +779,13 @@ _PrepParallelRestore(ArchiveHandle *AH)
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
- else
+ else if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
{
- /* It might be compressed */
- strlcat(fname, ".gz", sizeof(fname));
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ strlcat(fname, ".gz", sizeof(fname));
+ else if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ strlcat(fname, ".lz4", sizeof(fname));
+
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..08105337b1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 78454928cc..72b19ee6cd 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+	if ($pgdump_runs{$run}->{compile_option} &&
+		(($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+		 ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index db429474a2..2c5042eb41 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 7e62c6ef3d..692c5ceb0b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1383,6 +1383,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
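The subtle part of the reader in the patch above is the overflow buffer: LZ4F_decompress may produce more output than the caller asked for, so the excess is parked and must be drained first on the next call, with EOL handling layered on top. A standalone sketch of that drain logic in plain C, with no liblz4 dependency — the `Overflow` struct and the function name are illustrative stand-ins, not names from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the overflow state kept in LZ4File. */
typedef struct
{
	size_t		overflowlen;	/* valid bytes in overflowbuf */
	char		overflowbuf[64];
} Overflow;

/*
 * Copy up to 'size' bytes of previously decompressed data into 'ptr'.
 * With eol_flag set, stop after the first newline.  Unread bytes are
 * shifted to the front of the overflow buffer, mirroring the patch.
 */
static size_t
read_overflow(Overflow *ov, char *ptr, size_t size, bool eol_flag)
{
	size_t		readlen;
	char	   *p;

	if (ov->overflowlen == 0)
		return 0;

	readlen = (ov->overflowlen >= size) ? size : ov->overflowlen;

	if (eol_flag && (p = memchr(ov->overflowbuf, '\n', readlen)) != NULL)
		readlen = (size_t) (p - ov->overflowbuf) + 1;	/* keep the '\n' */

	memcpy(ptr, ov->overflowbuf, readlen);
	ov->overflowlen -= readlen;
	if (ov->overflowlen > 0)
		memmove(ov->overflowbuf, ov->overflowbuf + readlen, ov->overflowlen);

	return readlen;
}
```

The invariant the patch relies on holds here too: after a partial read, the unread tail always sits at offset 0 of the overflow buffer, so a subsequent call can consume it before touching the compressed stream again.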
On 1/25/23 16:37, gkokolatos@pm.me wrote:
------- Original Message -------
On Wednesday, January 25th, 2023 at 2:42 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Jan 24, 2023 at 03:56:20PM +0000, gkokolatos@pm.me wrote:
On Monday, January 23rd, 2023 at 7:00 PM, Justin Pryzby pryzby@telsasoft.com wrote:
On Mon, Jan 23, 2023 at 05:31:55PM +0000, gkokolatos@pm.me wrote:
Please find attached v23 which reintroduces the split.
0001 is reworked to have a smaller footprint than before. Also, in an attempt
to facilitate readability, 0002 splits the APIs and the uncompressed
implementation into separate files.

Thanks for updating the patch. Could you address the review comments I
sent here ?
/messages/by-id/20230108194524.GA27637@telsasoft.com

Please find v24 attached.
Thanks for updating the patch.
In 001, RestoreArchive() does:
-#ifndef HAVE_LIBZ
-	if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
-		AH->PrintTocDataPtr != NULL)
+	supports_compression = false;
+	if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
+		AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+		supports_compression = true;
+
+	if (AH->PrintTocDataPtr != NULL)
 	{
 		for (te = AH->toc->next; te != AH->toc; te = te->next)
 		{
 			if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
-				pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+			{
+#ifndef HAVE_LIBZ
+				if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+					supports_compression = false;
+#endif
+				if (supports_compression == false)
+					pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+			}
 		}
 	}
-#endif

This first checks if
the algorithm is supported by the current build - that confused me for a
bit. It seems unnecessary to check for unimplemented algorithms before
looping. That also requires referencing both GZIP and LZ4 in two
places.

I am not certain that it is unnecessary, at least not in the way that is
described. The idea is that new compression methods can be added, without
changing the archive's version number. It is very possible that it is
requested to restore an archive compressed with a method not implemented
in the current binary. The first check takes care of that and sets
supports_compression only for the supported versions. It is possible to
enter the loop with supports_compression already set to false, for example
because the archive was compressed with ZSTD, triggering the fatal error.

Of course, one can throw the error before entering the loop, yet I think
that it does not help the readability of the code. IMHO it is easier to
follow if the error is thrown once during that check.
Actually, I don't understand why 0001 moves the check into the loop. I
mean, why not check HAVE_LIBZ before the loop?
I think it could be written to avoid the need to change for added
compression algorithms:

+	if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
+	{
+		/* Check if the compression algorithm is supported */
+		pg_compress_specification spec;
+
+		parse_compress_specification(AH->compression_spec.algorithm, NULL, &spec);
+		if (spec->parse_error != NULL)
+			pg_fatal(spec->parse_error);
+	}
I am not certain how that would work in the example with ZSTD above.
If I am not wrong, parse_compress_specification() will not throw an error
if the codebase supports ZSTD, yet this specific pg_dump binary will not
support it because ZSTD is not implemented. parse_compress_specification()
is not aware of that and should not be aware of it, should it?
Not sure. What happens in a similar situation now? That is, when trying
to deal with an archive gzip-compressed in a build without libz?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jan 25, 2023 at 03:37:12PM +0000, gkokolatos@pm.me wrote:
Of course, one can throw the error before entering the loop, yet I think
that it does not help the readability of the code. IMHO it is easier to
follow if the error is thrown once during that check.
If anything, I can suggest to throw an error much earlier, i.e. in ReadHead(),
and remove altogether this check. On the other hand, I like the belts
and suspenders approach because there are no more checks after this point.
While looking at this, I realized that commit 5e73a6048 introduced a
regression:
@@ -3740,19 +3762,24 @@ ReadHead(ArchiveHandle *AH)
- if (AH->compression != 0)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ pg_fatal("archive is compressed, but this installation does not support compression");
Before, it was possible to restore non-data chunks of a dump file, even
if the current build didn't support its compression. But that's now
impossible - and it makes the code we're discussing in RestoreArchive()
unreachable.
I don't think we can currently test for that, since it requires creating a dump
using a build --with compression and then trying to restore using a build
--without compression. The coverage report disagrees with me, though...
https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#3901
I think it could be written to avoid the need to change for added
compression algorithms:
...
I am not certain how that would work in the example with ZSTD above.
If I am not wrong, parse_compress_specification() will not throw an error
if the codebase supports ZSTD, yet this specific pg_dump binary will not
support it because ZSTD is not implemented. parse_compress_specification()
is not aware of that and should not be aware of it, should it?
You're right.
I think the 001 patch should try to remove hardcoded references to
LIBZ/GZIP, such that the later patches don't need to update those same
places for LZ4. For example in ReadHead() and RestoreArchive(), and
maybe other places dealing with file extensions. Maybe that could be
done by adding a function specific to pg_dump indicating whether or not
an algorithm is implemented and supported.
--
Justin
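Justin's closing suggestion — a pg_dump-specific predicate saying whether an algorithm is implemented and supported — could look something like the sketch below. This is standalone C with a stand-in enum and build macros; the names `algorithm_supported` and `COMPR_*` are illustrative assumptions, not from any posted patch. The point is that ReadHead(), RestoreArchive(), and the file-extension code would all consult one predicate instead of repeating per-algorithm #ifdef lists:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for pg_compress_algorithm; values are illustrative only. */
typedef enum
{
	COMPR_NONE,
	COMPR_GZIP,
	COMPR_LZ4,
	COMPR_ZSTD					/* known to common code, not to pg_dump */
} compr_algorithm;

/* Pretend this build was configured with zlib but without lz4. */
#define HAVE_LIBZ 1
/* #define USE_LZ4 1 */

/*
 * Answer both questions in one place: is the algorithm implemented in
 * pg_dump at all, and is it enabled in this build?
 */
static bool
algorithm_supported(compr_algorithm algo)
{
	switch (algo)
	{
		case COMPR_NONE:
			return true;
		case COMPR_GZIP:
#ifdef HAVE_LIBZ
			return true;
#else
			return false;
#endif
		case COMPR_LZ4:
#ifdef USE_LZ4
			return true;
#else
			return false;
#endif
		default:
			return false;		/* not implemented in pg_dump */
	}
}
```

Adding a new algorithm then touches only this switch, rather than every call site that currently hardcodes GZIP or LZ4.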
------- Original Message -------
On Wednesday, January 25th, 2023 at 6:28 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 1/25/23 16:37, gkokolatos@pm.me wrote:
------- Original Message -------
On Wednesday, January 25th, 2023 at 2:42 AM, Justin Pryzby pryzby@telsasoft.com wrote:

On Tue, Jan 24, 2023 at 03:56:20PM +0000, gkokolatos@pm.me wrote:
On Monday, January 23rd, 2023 at 7:00 PM, Justin Pryzby pryzby@telsasoft.com wrote:
On Mon, Jan 23, 2023 at 05:31:55PM +0000, gkokolatos@pm.me wrote:
Please find attached v23 which reintroduces the split.
0001 is reworked to have a smaller footprint than before. Also, in an attempt
to facilitate readability, 0002 splits the APIs and the uncompressed
implementation into separate files.

Thanks for updating the patch. Could you address the review comments I
sent here ?
/messages/by-id/20230108194524.GA27637@telsasoft.com

Please find v24 attached.
Thanks for updating the patch.
In 001, RestoreArchive() does:
-#ifndef HAVE_LIBZ
-	if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
-		AH->PrintTocDataPtr != NULL)
+	supports_compression = false;
+	if (AH->compression_spec.algorithm == PG_COMPRESSION_NONE ||
+		AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+		supports_compression = true;
+
+	if (AH->PrintTocDataPtr != NULL)
 	{
 		for (te = AH->toc->next; te != AH->toc; te = te->next)
 		{
 			if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
-				pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+			{
+#ifndef HAVE_LIBZ
+				if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+					supports_compression = false;
+#endif
+				if (supports_compression == false)
+					pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+			}
 		}
 	}
-#endif

This first checks if
the algorithm is supported by the current build - that confused me for a
bit. It seems unnecessary to check for unimplemented algorithms before
looping. That also requires referencing both GZIP and LZ4 in two
places.

I am not certain that it is unnecessary, at least not in the way that is
described. The idea is that new compression methods can be added, without
changing the archive's version number. It is very possible that it is
requested to restore an archive compressed with a method not implemented
in the current binary. The first check takes care of that and sets
supports_compression only for the supported versions. It is possible to
enter the loop with supports_compression already set to false, for example
because the archive was compressed with ZSTD, triggering the fatal error.

Of course, one can throw the error before entering the loop, yet I think
that it does not help the readability of the code. IMHO it is easier to
follow if the error is thrown once during that check.

Actually, I don't understand why 0001 moves the check into the loop. I
mean, why not check HAVE_LIBZ before the loop?
The intention is to be able to restore archives that don't contain
data. In that case compression becomes irrelevant as only the data in
an archive is compressed.
I think it could be written to avoid the need to change for added
compression algorithms:

+	if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
+	{
+		/* Check if the compression algorithm is supported */
+		pg_compress_specification spec;
+
+		parse_compress_specification(AH->compression_spec.algorithm, NULL, &spec);
+		if (spec->parse_error != NULL)
+			pg_fatal(spec->parse_error);
+	}
I am not certain how that would work in the example with ZSTD above.
If I am not wrong, parse_compress_specification() will not throw an error
if the codebase supports ZSTD, yet this specific pg_dump binary will not
support it because ZSTD is not implemented. parse_compress_specification()
is not aware of that and should not be aware of it, should it?

Not sure. What happens in a similar situation now? That is, when trying
to deal with an archive gzip-compressed in a build without libz?
In case that there are no data chunks, the archive will be restored.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Wednesday, January 25th, 2023 at 7:00 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Wed, Jan 25, 2023 at 03:37:12PM +0000, gkokolatos@pm.me wrote:
While looking at this, I realized that commit 5e73a6048 introduced a
regression:

@@ -3740,19 +3762,24 @@ ReadHead(ArchiveHandle *AH)
-	if (AH->compression != 0)
-		pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+	if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+		pg_fatal("archive is compressed, but this installation does not support compression");
Before, it was possible to restore non-data chunks of a dump file, even
if the current build didn't support its compression. But that's now
impossible - and it makes the code we're discussing in RestoreArchive()
unreachable.
Nice catch!
Cheers,
//Georgios
--
Justin
On Wed, Jan 25, 2023 at 12:00:20PM -0600, Justin Pryzby wrote:
While looking at this, I realized that commit 5e73a6048 introduced a
regression:

@@ -3740,19 +3762,24 @@ ReadHead(ArchiveHandle *AH)
-	if (AH->compression != 0)
-		pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+	if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+		pg_fatal("archive is compressed, but this installation does not support compression");

Before, it was possible to restore non-data chunks of a dump file, even
if the current build didn't support its compression. But that's now
impossible - and it makes the code we're discussing in RestoreArchive()
unreachable.
Right. This impacts the possibility of looking at the header data,
which is useful with pg_restore -l for example. On a dump that's been
compressed, pg_restore <= 15 would always print the TOC entries with
or without compression support. On HEAD, this code prevents the
header lookup. All *nix or BSD platforms should have support for
zlib, I hope.. Still that could be an issue on Windows, and this
would prevent folks to check the contents of the dumps after saving it
on a WIN32 host, so let's undo that.
So, I have been testing the attached with four sets of binaries from
15/HEAD and with[out] zlib support, and this brings HEAD back to the
pre-15 state (header information able to show up, still failure when
attempting to restore the dump's data without zlib).
I don't think we can currently test for that, since it requires creating a dump
using a build --with compression and then trying to restore using a build
--without compression.
Right, the location of the data is in the header, and I don't see how
you would be able to do that without two sets of binaries at hand, but
our tests run under the assumption that you have only one. Well,
that's not entirely true as well, as you could create a TAP test like
pg_upgrade that relies on a environment variable pointing to a second
set of binaries. That's not worth the complication involved, IMO.
The coverage report disagrees with me, though...
https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#3901
Isn't that one of the tests like compression_gzip_plain?
Thoughts?
--
Michael
Attachments:
dump-header-compress.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index ba5e6acbbb..cb4386f871 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -3784,7 +3784,7 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
#endif
if (AH->version >= K_VERS_1_4)
On Thu, Jan 26, 2023 at 02:49:27PM +0900, Michael Paquier wrote:
On Wed, Jan 25, 2023 at 12:00:20PM -0600, Justin Pryzby wrote:
While looking at this, I realized that commit 5e73a6048 introduced a
regression:

@@ -3740,19 +3762,24 @@ ReadHead(ArchiveHandle *AH)
-	if (AH->compression != 0)
-		pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+	if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+		pg_fatal("archive is compressed, but this installation does not support compression");

Before, it was possible to restore non-data chunks of a dump file, even
if the current build didn't support its compression. But that's now
impossible - and it makes the code we're discussing in RestoreArchive()
unreachable.

Right. This impacts the possibility of looking at the header data,
which is useful with pg_restore -l for example.
It's not just header data - it's schema and (I think) everything other
than table data.
The coverage report disagrees with me, though...
https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#3901

Isn't that one of the tests like compression_gzip_plain?
I'm not sure what you mean. Plain dump is restored with psql and not
with pg_restore.
My line number was wrong:
https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#390
What test would hit that code without rebuilding ?
394 : #ifndef HAVE_LIBZ
395 : if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
Thoughts?

#ifndef HAVE_LIBZ
 	if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
-		pg_fatal("archive is compressed, but this installation does not support compression");
+		pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
Your patch is fine for now, but these errors should eventually specify
*which* compression algorithm is unavailable. I think that should be a
part of the 001 patch, ideally in a way that minimizes the number of
places which need to be updated when adding an algorithm.
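Justin's suggestion, a single lookup that names the algorithm, might look roughly like the table-driven sketch below. All names here are hypothetical and simplified (the patch series later does something comparable with supports_compression() in compress_io.c): adding an algorithm means adding one table row, while every message site stays untouched.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm. */
typedef enum
{
	COMPR_NONE,
	COMPR_GZIP,
	COMPR_LZ4
} compr_algorithm;

/* One row per algorithm: its name and whether this build supports it. */
typedef struct
{
	compr_algorithm algo;
	const char *name;
	bool		supported;
} compr_entry;

static const compr_entry compr_table[] = {
	{COMPR_NONE, "none", true},
#ifdef HAVE_LIBZ
	{COMPR_GZIP, "gzip", true},
#else
	{COMPR_GZIP, "gzip", false},
#endif
#ifdef USE_LZ4
	{COMPR_LZ4, "lz4", true},
#else
	{COMPR_LZ4, "lz4", false},
#endif
};

/*
 * Return NULL when the algorithm is usable in this build, otherwise
 * format into 'buf' a message naming the unsupported algorithm.
 */
static const char *
check_compression(compr_algorithm algo, char *buf, size_t buflen)
{
	for (size_t i = 0; i < sizeof(compr_table) / sizeof(compr_table[0]); i++)
	{
		if (compr_table[i].algo != algo)
			continue;
		if (compr_table[i].supported)
			return NULL;
		snprintf(buf, buflen,
				 "this build does not support compression with %s",
				 compr_table[i].name);
		return buf;
	}
	return "unrecognized compression algorithm";
}
```

Callers then report the returned string verbatim, so the "which algorithm" detail comes for free at every error site.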
--
Justin
------- Original Message -------
On Thursday, January 26th, 2023 at 7:28 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Thu, Jan 26, 2023 at 02:49:27PM +0900, Michael Paquier wrote:
On Wed, Jan 25, 2023 at 12:00:20PM -0600, Justin Pryzby wrote:
While looking at this, I realized that commit 5e73a6048 introduced a
regression:

@@ -3740,19 +3762,24 @@ ReadHead(ArchiveHandle *AH)
-	if (AH->compression != 0)
-		pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+	if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+		pg_fatal("archive is compressed, but this installation does not support compression");

Before, it was possible to restore non-data chunks of a dump file, even
if the current build didn't support its compression. But that's now
impossible - and it makes the code we're discussing in RestoreArchive()
unreachable.

Right. This impacts the possibility of looking at the header data,
which is useful with pg_restore -l for example.

It's not just header data - it's schema and (I think) everything other
than table data.

The coverage report disagrees with me, though...
https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#3901

Isn't that one of the tests like compression_gzip_plain?

I'm not sure what you mean. Plain dump is restored with psql and not
with pg_restore.

My line number was wrong:
https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_archiver.c.gcov.html#390

What test would hit that code without rebuilding ?
394 : #ifndef HAVE_LIBZ
395 : if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&

Thoughts?

#ifndef HAVE_LIBZ
 	if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
-		pg_fatal("archive is compressed, but this installation does not support compression");
+		pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");

Your patch is fine for now, but these errors should eventually specify
which compression algorithm is unavailable. I think that should be a
part of the 001 patch, ideally in a way that minimizes the number of
places which need to be updated when adding an algorithm.
I gave this a little bit of thought. I think that ReadHead should not
emit a warning, or at least not this warning, as it is slightly misleading.
It implies that it will automatically turn off data restoration, which is
false. Further ahead, the code will fail with a conflicting error message
stating that the compression is not available.
Instead, it would be cleaner both for the user and the maintainer to
move the check into RestoreArchive and make it solely responsible for
this logic.
Please find v26 attached. 0001 does the above and 0002 addresses Justin's
complaints regarding the code footprint.
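For what it's worth, the RestoreArchive-side check described above boils down to a short TOC walk: fail only when a data entry is actually requested and the build lacks the needed algorithm, so metadata-only operations (pg_restore -l, schema-only restores) still pass. A stand-alone sketch, with hypothetical simplified types that are not the patch's actual structures:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the archiver's TOC entries. */
typedef struct TocEntryStub
{
	bool		hadDumper;	/* entry carries table data */
	bool		reqs_data;	/* the restore actually wants that data */
	struct TocEntryStub *next;
} TocEntryStub;

/*
 * Return an error string when restoring would need decompression this
 * build does not provide, or NULL when the operation can proceed.
 */
static const char *
check_restore_compression(TocEntryStub *toc, bool algo_supported)
{
	if (algo_supported)
		return NULL;

	for (TocEntryStub *te = toc; te != NULL; te = te->next)
	{
		/* Only a requested data entry makes the missing algorithm fatal. */
		if (te->hadDumper && te->reqs_data)
			return "cannot restore data from compressed archive";
	}

	return NULL;		/* no data requested: header/schema are fine */
}
```

The point of centralizing it here is that ReadHead no longer needs to guess whether the caller will ever touch the data.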
//Cheers,
Georgios
Attachments:
v26-0002-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 30ae227823baf3694e5bb9f0bac715082a317f09 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 26 Jan 2023 11:11:14 +0000
Subject: [PATCH v26 2/4] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about it and uses it throughout.
---
src/bin/pg_dump/compress_io.c | 92 +++++++++++++++++---
src/bin/pg_dump/compress_io.h | 4 +
src/bin/pg_dump/pg_backup_archiver.c | 125 +++++++++------------------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-----
4 files changed, 124 insertions(+), 124 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..e1733ce57c 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,36 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+/*----------------------
+ * Generic functions
+ *----------------------
+ */
+
+char *
+supports_compression(const pg_compress_specification compression_spec)
+{
+ const pg_compress_algorithm algorithm = compression_spec.algorithm;
+ bool supported = false;
+
+ if (algorithm == PG_COMPRESSION_NONE)
+ supported = true;
+#ifdef HAVE_LIBZ
+ if (algorithm == PG_COMPRESSION_GZIP)
+ supported = true;
+#endif
+
+ if (!supported)
+ return psprintf("this build does not support compression with %s",
+ get_compress_algorithm_name(algorithm));
+
+ return NULL;
+}
+
+
/*----------------------
* Compressor API
*----------------------
@@ -490,16 +520,19 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ cfp *fp = pg_malloc0(sizeof(cfp));
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
@@ -511,15 +544,20 @@ cfopen(const char *path, const char *mode,
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode_compression);
+ else
+ fp->compressedfp = gzopen(path, mode_compression);
}
else
{
/* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode);
+ else
+ fp->compressedfp = gzopen(path, mode);
}
- fp->uncompressedfp = NULL;
if (fp->compressedfp == NULL)
{
free_keep_errno(fp);
@@ -531,10 +569,11 @@ cfopen(const char *path, const char *mode,
}
else
{
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
-#endif
- fp->uncompressedfp = fopen(path, mode);
+ if (fd >= 0)
+ fp->uncompressedfp = fdopen(fd, mode);
+ else
+ fp->uncompressedfp = fopen(path, mode);
+
if (fp->uncompressedfp == NULL)
{
free_keep_errno(fp);
@@ -545,6 +584,33 @@ cfopen(const char *path, const char *mode,
return fp;
}
+/*
+ * Opens file 'path' in 'mode' and compression as defined in
+ * compression_spec. The caller must verify that the compression
+ * is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+/*
+ * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
+ * and compression as defined in compression_spec. The caller must
+ * verify that the compression is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..8beb1058ec 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,6 +21,8 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
+extern char *supports_compression(const pg_compress_specification compression_spec);
+
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -54,6 +56,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 25b1ea0026..f3b92b01d9 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -363,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +382,20 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
- AH->PrintTocDataPtr != NULL)
+ if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore data from compressed archive (compression not supported in this installation)");
+ {
+ char *errmsg = supports_compression(AH->compression_spec);
+ if (errmsg)
+ pg_fatal("cannot restore data from compressed archive (%s)", errmsg);
+ else
+ break;
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1133,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1508,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1570,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1720,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2224,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2277,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
v26-0001-Address-regression-in-pg_dump-s-ReadHead.patch (text/x-patch)
From 2c2db63ede38c2e61fb655ab38303f4160ec3f6c Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 26 Jan 2023 09:58:10 +0000
Subject: [PATCH v26 1/4] Address regression in pg_dump's ReadHead
Commit 5e73a6048 upgraded the check for supported compression while parsing an
archive's header from a warning to a fatal error. Prior to that commit, it was
possible to restore a compressed archive's schema even when the compression was
not supported by the binary.
Before the abovementioned commit, a warning message would be emitted:
pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
This message is slightly misleading, as it can be interpreted to mean that the
binary will actively switch into schema-only mode, which is not the case.
Instead, a new fatal message will appear further ahead, informing that the
compression is not available.
This commit chooses to remove the check in ReadHead altogether in favour of a
more comprehensive error message when checking the archive for data.
Regression spotted by: Justin Pryzby
---
src/bin/pg_dump/pg_backup_archiver.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index ba5e6acbbb..25b1ea0026 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -398,7 +398,7 @@ RestoreArchive(Archive *AHX)
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ pg_fatal("cannot restore data from compressed archive (compression not supported in this installation)");
}
}
#endif
@@ -3782,11 +3782,6 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
-#endif
-
if (AH->version >= K_VERS_1_4)
{
struct tm crtm;
--
2.34.1
v26-0004-Add-LZ4-compression-to-pg_dump.patch (text/x-patch)
From f05feb5c446cb37b580629fa51e0b5757ad09c61 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 25 Jan 2023 14:46:07 +0000
Subject: [PATCH v26 4/4] Add LZ4 compression to pg_dump
This is mostly done within pg_dump's compression streaming and file APIs,
and is confined to the newly introduced compress_lz4.{c,h} files.
The streaming API is aimed at inlined use cases, so simple
lz4.h calls can be used directly. The file API generates output, or
parses input, that can be read/generated via the lz4 utility.
The LZ4F API does not implement all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(). Where the
functionality was missing from the official API, it has been implemented
locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 24 +-
src/bin/pg_dump/compress_lz4.c | 626 ++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 24 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 9 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
12 files changed, 780 insertions(+), 21 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 0013bc080c..eb8f59459a 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -26,6 +27,7 @@ OBJS = \
$(WIN32RES) \
compress_gzip.o \
compress_io.o \
+ compress_lz4.o \
compress_none.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index e4e008d2e1..6563f67689 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,7 +56,7 @@
* InitDiscoverCompressFileHandle tries to deffer the compression by the
* filename suffix. If the suffix is not yet known, then it tries to simply
* open the file, and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -70,6 +70,7 @@
#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_lz4.h"
#include "compress_none.h"
#include "pg_backup_utils.h"
@@ -90,6 +91,10 @@ supports_compression(const pg_compress_specification compression_spec)
if (algorithm == PG_COMPRESSION_GZIP)
supported = true;
#endif
+#ifdef USE_LZ4
+ if (algorithm == PG_COMPRESSION_LZ4)
+ supported = true;
+#endif
if (!supported)
return psprintf("this build does not support compression with %s",
@@ -121,6 +126,8 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressorLZ4(cs, compression_spec);
return cs;
}
@@ -185,6 +192,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressFileHandleNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressFileHandleGzip(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressFileHandleLZ4(CFH, compression_spec);
return CFH;
}
@@ -198,7 +207,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* throughout. Otherwise the compression will be deferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -236,6 +245,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..ee74cc8e28
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,626 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write a LZ4 compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/*
+ * Public routines that support LZ4 compressed data I/O
+ */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of file
+ * is reached if there is no decompressed output in the
+ * overflow buffer and the end of the file is reached.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr', up
+ * to 'size' bytes if available. If the eol_flag is set, stop at the first
+ * occurrence of the newline char within those 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to its beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * first newline char when the eol_flag is set. The decompressed output
+ * generated by reading any compressed input via the LZ4F API may exceed
+ * 'ptrsize'; any excess decompressed content is stored in an overflow
+ * buffer within LZ4File. When called, the function first consumes any
+ * decompressed content already present in the overflow buffer before
+ * decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * Fill in whatever space is available in ptr. If the eol flag is
+ * set, skip the copy entirely when a newline was already found,
+ * or copy only up to the newline if one is present in the outbuf.
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (!eol_flag) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, at the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && !eol_found);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return -1;
+ }
+
+ /* Advance past the chunk just compressed */
+ ptr = ((const char *) ptr) + chunk;
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+/*
+ * Public routines
+ */
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
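The outbuf sizing in WriteDataToArchiveLZ4() above follows a grow-to-fit pattern: the output buffer is reallocated once to the worst-case compressed size and then reused. That pattern can be sketched without liblz4; `worst_case_bound()` and `ensure_outbuf()` below are hypothetical stand-ins (the real bound comes from LZ4_compressBound()), not code from the patch:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for LZ4_compressBound(): worst-case output
 * size for 'n' input bytes (LZ4's documented bound is n + n/255 + 16).
 */
static size_t
worst_case_bound(size_t n)
{
	return n + n / 255 + 16;
}

struct outbuf
{
	char	   *buf;
	size_t		size;
};

/*
 * Grow 'ob' so it can hold the worst-case compressed output for 'dLen'
 * input bytes, mirroring the realloc dance in WriteDataToArchiveLZ4().
 * Error handling is elided; pg_realloc() would exit on failure.
 */
static void
ensure_outbuf(struct outbuf *ob, size_t dLen)
{
	size_t		required = worst_case_bound(dLen);

	if (required > ob->size)
	{
		ob->buf = realloc(ob->buf, required);
		ob->size = required;
	}
}
```

Note the buffer never shrinks: a smaller follow-up write reuses the existing allocation without another realloc.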
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..40dbe00d46
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * LZ4 interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 84e9f0defa..0da476a4c3 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_gzip.c',
'compress_io.c',
+ 'compress_lz4.c',
'compress_none.c',
'dumputils.c',
'parallel.c',
@@ -18,7 +19,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -86,7 +87,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 1405459723..717002bf92 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -2074,7 +2074,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2084,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index a2f88995c0..ce2a0838fa 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -779,10 +779,13 @@ _PrepParallelRestore(ArchiveHandle *AH)
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
- else
+ else if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
{
- /* It might be compressed */
- strlcat(fname, ".gz", sizeof(fname));
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ strlcat(fname, ".gz", sizeof(fname));
+ else if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ strlcat(fname, ".lz4", sizeof(fname));
+
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..08105337b1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 78454928cc..72b19ee6cd 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index db429474a2..2c5042eb41 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 7e62c6ef3d..692c5ceb0b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1383,6 +1383,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
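The trickiest part of the read path in the patch above is the overflow-buffer bookkeeping in LZ4File_read_overflow(). A liblz4-free sketch of that logic, with hypothetical names rather than the patch's own code, behaves like this: copy out at most 'size' bytes, optionally stop after the first newline, and slide any remainder to the front of the buffer:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical miniature of the LZ4File_read_overflow() logic: copy up
 * to 'size' bytes of previously decompressed data out of 'ovbuf' into
 * 'dst', stopping after the first newline when 'eol' is set, and move
 * any unread remainder to the beginning of the buffer.
 */
static int
read_overflow(char *ovbuf, size_t *ovlen, void *dst, int size, int eol)
{
	char	   *p;
	int			readlen;

	if (*ovlen == 0)
		return 0;

	readlen = (*ovlen >= (size_t) size) ? size : (int) *ovlen;

	if (eol && (p = memchr(ovbuf, '\n', readlen)) != NULL)
		readlen = (int) (p - ovbuf + 1);	/* include the newline */

	memcpy(dst, ovbuf, readlen);
	*ovlen -= readlen;

	/* keep unread content at the front for the next call */
	if (*ovlen > 0)
		memmove(ovbuf, ovbuf + readlen, *ovlen);

	return readlen;
}
```

With "ab\ncd" buffered and eol set, the first call returns the three bytes "ab\n" and leaves "cd" at the front of the buffer for the next call.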
Attachment: v26-0003-Introduce-Compress-and-Compressor-API-in-pg_dump.patch (text/x-patch)
From f523d2a2edee4b9b7804a2539230832b8d1bdf2c Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 25 Jan 2023 14:45:50 +0000
Subject: [PATCH v26 3/4] Introduce Compress and Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for file manipulation. The implementor of a new
compression method now only needs to supply those definitions.
Additionally, custom compressed archives store the compression algorithm in
their header instead of the compression level. The header version number is
bumped.
---
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_gzip.c | 401 ++++++++++++++
src/bin/pg_dump/compress_gzip.h | 24 +
src/bin/pg_dump/compress_io.c | 744 +++++---------------------
src/bin/pg_dump/compress_io.h | 166 +++++-
src/bin/pg_dump/compress_none.c | 206 +++++++
src/bin/pg_dump/compress_none.h | 24 +
src/bin/pg_dump/meson.build | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 91 ++--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/include/common/compression.h | 4 +
src/tools/pginclude/cpluspluscheck | 2 +
src/tools/pgindent/typedefs.list | 2 +
16 files changed, 1051 insertions(+), 749 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
create mode 100644 src/bin/pg_dump/compress_none.c
create mode 100644 src/bin/pg_dump/compress_none.h
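The commit message describes CompressFileHandle as a struct of function pointers for file manipulation. A hypothetical miniature of that dispatch idea, with made-up names (`FileHandle`, `MemState`, `Mem_read`) rather than the patch's actual types, looks like this: each compression method supplies its own implementations, and callers only go through the pointers:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical miniature of the CompressFileHandle idea: file
 * operations go through function pointers, so each compression method
 * only supplies its own implementations. Here a trivial "none" method
 * reads from an in-memory buffer held in private_data.
 */
typedef struct FileHandle
{
	size_t		(*read_func) (void *ptr, size_t size, struct FileHandle *fh);
	void	   *private_data;	/* method-specific state */
} FileHandle;

typedef struct MemState
{
	const char *data;
	size_t		len;
	size_t		pos;
} MemState;

/* "none" implementation: plain memcpy out of the buffered data */
static size_t
Mem_read(void *ptr, size_t size, FileHandle *fh)
{
	MemState   *st = (MemState *) fh->private_data;
	size_t		n = st->len - st->pos;

	if (n > size)
		n = size;
	memcpy(ptr, st->data + st->pos, n);
	st->pos += n;
	return n;
}
```

An LZ4 or gzip method would plug different functions into the same slots, which is what lets callers stay ignorant of the compression in use.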
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..0013bc080c 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,7 +24,9 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
+ compress_none.o \
dumputils.o \
parallel.o \
pg_backup_archiver.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..24e68fd022
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,401 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking, but we
+ * check here as well just to be safe...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at a time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ CFH->private_data = NULL;
+
+ return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..2392c697b4
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * GZIP interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index e1733ce57c..e4e008d2e1 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,54 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims to produce files that
+ * can be easily manipulated with an external compression utility
+ * program.
+ *
+ * This file also includes the no-compression implementation of both
+ * APIs.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then call readData, which reads the whole compressed
+ * stream by repeatedly calling the given ReadFunc. ReadFunc returns the
+ * compressed data one chunk at a time. readData then decompresses each chunk
+ * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
+ * to signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API provides a set of function pointers for
+ * opening, reading, writing, and finally closing files. The implemented
+ * function pointers are documented in the corresponding header file and are
+ * common for all streams. It allows the caller to use the same functions for
+ * both compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix is not known, then it tries to simply
+ * open the file, and if that fails, it tries to open the same file with the
+ * .gz suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,13 +65,14 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_none.h"
#include "pg_backup_utils.h"
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
/*----------------------
* Generic functions
*----------------------
@@ -91,110 +104,25 @@ supports_compression(const pg_compress_specification compression_spec)
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
-{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
-
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-
-/* Public interface routines */
-
-/* Allocate a new compressor */
+/*
+ * Allocate a new compressor.
+ */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-}
+ InitCompressorNone(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorGzip(cs, compression_spec);
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -203,233 +131,31 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
+ * Private routines
*/
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -442,324 +168,102 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
-{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
-
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
-{
- cfp *fp = pg_malloc0(sizeof(cfp));
-
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode_compression);
- else
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode);
- else
- fp->compressedfp = gzopen(path, mode);
- }
+ CompressFileHandle *CFH;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
- if (fd >= 0)
- fp->uncompressedfp = fdopen(fd, mode);
- else
- fp->uncompressedfp = fopen(path, mode);
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
- }
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ InitCompressFileHandleNone(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressFileHandleGzip(CFH, compression_spec);
- return fp;
+ return CFH;
}
/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
+ * If the file at 'path' contains the suffix of a supported compression method,
+ * currently this includes only ".gz", then this compression will be used
+ * throughout. Otherwise the compression will be inferred by iteratively trying
+ * to open the file at 'path', first as is, then by appending known compression
+ * suffixes. So if you pass "foo" as 'path', this will open either "foo" or
+ * "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret;
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (size == 0)
- return 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ fname = pg_strdup(path);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
else
-#endif
{
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret;
+ bool exists;
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when built without compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
+ if (!exists)
{
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
- }
- else
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
}
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
-
-int
-cfclose(cfp *fp)
-{
- int result;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
+ free_keep_errno(fname);
- return result;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
+ int ret = 0;
-const char *
-get_cfp_error(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
-}
+ free_keep_errno(CFH);
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ return ret;
}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 8beb1058ec..8280d7b0dc 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -23,50 +23,160 @@
extern char *supports_compression(const pg_compress_specification compression_spec);
-/* Prototype for callback function to WriteDataToArchive() */
+/*
+ * Prototype for callback function used in writeData()
+ */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
/*
- * Prototype for callback function to ReadDataFromArchive()
+ * Prototype for callback function used in readData()
*
- * ReadDataFromArchive will call the read function repeatedly, until it
- * returns 0 to signal EOF. ReadDataFromArchive passes a buffer to read the
- * data into in *buf, of length *buflen. If that's not big enough for the
- * callback function, it can free() it and malloc() a new one, returning the
- * new buffer and its size in *buf and *buflen.
+ * readData will call the read function repeatedly, until it returns 0 to signal
+ * EOF. readData passes a buffer to read the data into in *buf, of length
+ * *buflen. If that's not big enough for the callback function, it can free() it
+ * and malloc() a new one, returning the new buffer and its size in *buf and
+ * *buflen.
*
* Returns the number of bytes read into *buf, or 0 on EOF.
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Callback function to read from an already processed input stream
+ */
+ ReadFunc readF;
+
+ /*
+ * Callback function to write an already processed chunk of data.
+ */
+ WriteFunc writeF;
+
+ /*
+ * Compression specification for this state.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ /*
+ * Open a file in the given mode.
+ *
+ * Pass either 'path' or 'fd' depending on whether a filepath or a file
+ * descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
+ * 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ */
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+
+ /*
+ * Open a file for writing.
+ *
+ * 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
+ * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
-typedef struct cfp cfp;
+ /*
+ * Read 'size' bytes of data from the file and store them into 'ptr'.
+ */
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ /*
+ * Write 'size' bytes of data into the file from 'ptr'.
+ */
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ /*
+ * Read at most size - 1 characters from the compress file handle into
+ * 's'.
+ *
+ * Stop if an EOF or a newline is found first. 's' is always null
+ * terminated and contains the newline if it was found.
+ */
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+
+ /*
+ * Read the next character from the compress file handle as 'unsigned
+ * char' cast into 'int'.
+ */
+ int (*getc_func) (CompressFileHandle *CFH);
+
+ /*
+ * Test if EOF is reached in the compress file handle.
+ */
+ int (*eof_func) (CompressFileHandle *CFH);
+
+ /*
+ * Close an open file handle.
+ */
+ int (*close_func) (CompressFileHandle *CFH);
+
+ /*
+ * Get a pointer to a string that describes an error that occurred during a
+ * compress file handle operation.
+ */
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ /*
+ * Compression specification for this file handle.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
+
+/*
+ * Initialize a compress file handle with the requested compression.
+ */
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+
+/*
+ * Initialize a compress file stream. Infer the compression algorithm
+ * from 'path', either by examining its suffix or by appending the supported
+ * suffixes to 'path'.
+ */
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
new file mode 100644
index 0000000000..ecbcf4b04a
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.c
@@ -0,0 +1,206 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.c
+ * Routines for archivers to read or write an uncompressed stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_none.h"
+#include "pg_backup_utils.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
+
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
+
+ if (size == 0)
+ return 0;
+
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
+}
+
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
+
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
+}
+
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
+
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
+
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static int
+close_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
+
+ CFH->private_data = NULL;
+
+ if (fp)
+ ret = fclose(fp);
+
+ return ret;
+}
+
+static int
+eof_none(CompressFileHandle *CFH)
+{
+ return feof((FILE *) CFH->private_data);
+}
+
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
+
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
+}
diff --git a/src/bin/pg_dump/compress_none.h b/src/bin/pg_dump/compress_none.h
new file mode 100644
index 0000000000..143e599819
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.h
+ * Uncompressed interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_NONE_H_
+#define _COMPRESS_NONE_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_NONE_H_ */
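The compress_none.{c,h} pair above shows the shape of the new CompressFileHandle abstraction: a struct of callbacks plus private state, dispatched in place of the old cfp helpers. A stripped-down stand-alone sketch of that pattern (hypothetical names, plain stdio, not pg_dump code):

```c
#include <stdio.h>

typedef struct FileHandle FileHandle;
struct FileHandle
{
	/* Callbacks fill the role of write_func/close_func in the patch. */
	size_t		(*write_func) (const void *ptr, size_t size, FileHandle *fh);
	int			(*close_func) (FileHandle *fh);
	void	   *private_data;	/* a FILE * for the uncompressed case */
};

static size_t
write_plain(const void *ptr, size_t size, FileHandle *fh)
{
	return fwrite(ptr, 1, size, (FILE *) fh->private_data);
}

static int
close_plain(FileHandle *fh)
{
	FILE	   *fp = (FILE *) fh->private_data;

	fh->private_data = NULL;
	return fp ? fclose(fp) : 0;
}

/* An "init" hook wires up the uncompressed implementation. */
static void
init_plain(FileHandle *fh)
{
	fh->write_func = write_plain;
	fh->close_func = close_plain;
	fh->private_data = NULL;
}
```

Callers then go through fh->write_func(...) without caring whether the backing stream compresses; a gzip or LZ4 variant only swaps the callbacks.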
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..84e9f0defa 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,7 +1,9 @@
# Copyright (c) 2022-2023, PostgreSQL Global Development Group
pg_dump_common_sources = files(
+ 'compress_gzip.c',
'compress_io.c',
+ 'compress_none.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index f3b92b01d9..1405459723 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -354,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3722,10 +3724,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..a2f88995c0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d92247c915..78454928cc 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 9a471e40aa..d44ebb06cc 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -14,6 +14,10 @@
#ifndef PG_COMPRESSION_H
#define PG_COMPRESSION_H
+/*
+ * These values are stored on disk, for example in files generated by pg_dump.
+ * Create the necessary backwards compatibility layers if their order changes.
+ */
typedef enum pg_compress_algorithm
{
PG_COMPRESSION_NONE,
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..db429474a2 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,7 +150,9 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 51484ca7e2..7e62c6ef3d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
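The WriteHead/ReadHead hunks in this patch bump the archive format to 1.15 and store the compression algorithm as its own header byte, falling back to guessing from the gzip level for older archives. A simplified stand-alone sketch of that decision (illustrative names, not pg_dump's actual code):

```c
/* Archive version numbers, following the scheme in pg_backup_archiver.h. */
#define MAKE_ARCHIVE_VERSION(major, minor, rev) \
	(((major) * 256 + (minor)) * 256 + (rev))
#define K_VERS_1_2	MAKE_ARCHIVE_VERSION(1, 2, 0)
#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0)

enum algo
{
	ALGO_NONE, ALGO_GZIP, ALGO_LZ4
};

/*
 * Decide the compression algorithm from the header fields: archives at or
 * above 1.15 carry an explicit algorithm byte, while older ones only carry
 * a gzip level, so any nonzero level implies gzip.  (Hypothetical helper,
 * not the real ReadHead.)
 */
static enum algo
header_algorithm(int version, int algo_byte, int level)
{
	if (version >= K_VERS_1_15)
		return (enum algo) algo_byte;	/* explicit algorithm byte wins */
	if (version >= K_VERS_1_2)
		return level != 0 ? ALGO_GZIP : ALGO_NONE;	/* guess from level */
	return ALGO_NONE;
}
```

Note that pg_dump's historical default gzip level of -1 counts as "nonzero" here, which is why old default-compressed archives are detected as gzip.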
On Thu, Jan 26, 2023 at 11:24:47AM +0000, gkokolatos@pm.me wrote:
I gave this a little bit of thought. I think that ReadHead should not
emit a warning, or at least not this warning as it is slightly misleading.
It implies that it will automatically turn off data restoration, which is
false. Further ahead, the code will fail with a conflicting error message
stating that the compression is not available. Instead, it would be cleaner
both for the user and the maintainer to move the check into RestoreArchive and
make it solely responsible for this logic.
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ pg_fatal("cannot restore data from compressed archive (compression not supported in this installation)");
Hmm. I don't mind changing this part as you suggest.
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
-#endif
However I think that we'd better keep the warning, as it can offer a
hint when using pg_restore -l not built with compression support if
looking at a dump that has been compressed.
--
Michael
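The behavior Michael argues for can be pictured as follows: keep the hint in ReadHead so pg_restore -l still works on a build without compression support, and fail hard only in RestoreArchive when a data item actually needs decompression. A toy sketch of that control flow (simplified stand-ins, not the real functions):

```c
#include <stdbool.h>
#include <stdio.h>

static bool have_libz = false;	/* pretend this build lacks zlib support */
static int	warnings = 0;

/* ReadHead keeps a hint for pg_restore -l users, but does not fail. */
static void
read_head(bool compressed)
{
	if (compressed && !have_libz)
	{
		warnings++;
		fprintf(stderr, "warning: archive is compressed, but this "
				"installation does not support compression\n");
	}
}

/*
 * RestoreArchive is solely responsible for the hard failure: returns 1
 * (where pg_fatal() would end the program) only if a data entry actually
 * requires decompression.
 */
static int
restore_archive(bool compressed, bool wants_data)
{
	if (compressed && !have_libz && wants_data)
		return 1;				/* "cannot restore data from compressed archive" */
	return 0;					/* schema-only operations still succeed */
}
```

Under this split, listing the TOC of a compressed archive warns but succeeds, while an actual data restore fails with the more specific message.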
------- Original Message -------
On Thursday, January 26th, 2023 at 12:53 PM, Michael Paquier <michael@paquier.xyz> wrote:
Fair enough. Please find v27 attached.
Cheers,
//Georgios
--
Michael
Attachments:
v27-0001-Address-regression-in-pg_dump-s-ReadHead.patch (text/x-patch)
From d68ef7799f2cfb2d94402679157c8bcf6bca5273 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 26 Jan 2023 17:49:06 +0000
Subject: [PATCH v27 1/4] Address regression in pg_dump's ReadHead
Commit 5e73a6048 upgraded the check for supported compression while parsing an
archive's header from a warning to a fatal error. Prior to this commit, it was possible
to restore a compressed archive's schema even when the compression was not
supported by the binary.
Before the abovementioned commit, a warning message would be emitted:
pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
This message is slightly misleading, as it can be interpreted to mean that the binary
will actively switch into schema-only mode, which is not the case. Instead, a new
fatal message will appear informing that the compression is not available.
This commit chooses to remove the check in ReadHead altogether in favour of a
more comprehensive error message when checking the archive for data.
Regression spotted by: Justin Pryzby
---
src/bin/pg_dump/pg_backup_archiver.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index ba5e6acbbb..4d9114cdd8 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -398,7 +398,7 @@ RestoreArchive(Archive *AHX)
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ pg_fatal("cannot restore data from compressed archive (compression not supported in this installation)");
}
}
#endif
@@ -3784,9 +3784,10 @@ ReadHead(ArchiveHandle *AH)
#ifndef HAVE_LIBZ
if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
+ pg_log_warning("archive is compressed, but this installation does not support compression");
#endif
+
if (AH->version >= K_VERS_1_4)
{
struct tm crtm;
--
2.34.1
v27-0004-Add-LZ4-compression-to-pg_dump.patch (text/x-patch)
From a6673dadf3f4e2f0ac08631181fecf98a8aa4948 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 26 Jan 2023 18:00:23 +0000
Subject: [PATCH v27 4/4] Add LZ4 compression to pg_dump
This is mostly done within pg_dump's compression streaming and file APIs.
It is confined within the newly introduced compress_lz4.{c,h} files.
The first one is aimed at inlined use cases, where simple lz4.h calls can be
used directly. The second one generates output, or parses input, that can be
read/generated via the lz4 utility. However, the LZ4F API does not implement
all the functionality corresponding to fread(), fwrite(), fgets(), fgetc(),
feof(), and fclose(). Where the functionality was missing from the official
API, it has been implemented locally.
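For instance, the fgets()-style behavior mentioned above can be layered on top of a decompressing read callback roughly like this (a sketch under assumed names, not the patch's actual code; a memory-backed reader stands in for the LZ4 stream):

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the decompressing read callback of a file handle. */
typedef size_t (*read_func_t) (char *buf, size_t len, void *state);

/*
 * fgets()-like: read up to size - 1 bytes, stop after a newline, and
 * NUL-terminate.  Returns ptr, or NULL if nothing could be read (EOF).
 * Byte-at-a-time for clarity; a real implementation would buffer.
 */
static char *
gets_via_read(char *ptr, int size, read_func_t read_func, void *state)
{
	int			i = 0;

	while (i < size - 1)
	{
		char		c;

		if (read_func(&c, 1, state) != 1)
			break;				/* EOF (errors handled by the callback) */
		ptr[i++] = c;
		if (c == '\n')
			break;
	}
	if (i == 0)
		return NULL;
	ptr[i] = '\0';
	return ptr;
}

/* A memory-backed reader, standing in for a decompression stream. */
struct mem_stream
{
	const char *data;
	size_t		len;
	size_t		pos;
};

static size_t
mem_read(char *buf, size_t len, void *state)
{
	struct mem_stream *ms = state;
	size_t		n = ms->len - ms->pos < len ? ms->len - ms->pos : len;

	memcpy(buf, ms->data + ms->pos, n);
	ms->pos += n;
	return n;
}
```

A gets_func of this shape is what _LoadLOs() relies on to consume blobs.toc one "oid filename" line at a time.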
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 24 +-
src/bin/pg_dump/compress_lz4.c | 626 ++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 24 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 9 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
12 files changed, 780 insertions(+), 21 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 0013bc080c..eb8f59459a 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -26,6 +27,7 @@ OBJS = \
$(WIN32RES) \
compress_gzip.o \
compress_io.o \
+ compress_lz4.o \
compress_none.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index e4e008d2e1..6563f67689 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,7 +56,7 @@
* InitDiscoverCompressFileHandle tries to infer the compression by the
* filename suffix. If the suffix is not yet known, then it tries to simply
* open the file, and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -70,6 +70,7 @@
#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_lz4.h"
#include "compress_none.h"
#include "pg_backup_utils.h"
@@ -90,6 +91,10 @@ supports_compression(const pg_compress_specification compression_spec)
if (algorithm == PG_COMPRESSION_GZIP)
supported = true;
#endif
+#ifdef USE_LZ4
+ if (algorithm == PG_COMPRESSION_LZ4)
+ supported = true;
+#endif
if (!supported)
return psprintf("this build does not support compression with %s",
@@ -121,6 +126,8 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressorLZ4(cs, compression_spec);
return cs;
}
@@ -185,6 +192,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressFileHandleNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressFileHandleGzip(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressFileHandleLZ4(CFH, compression_spec);
return CFH;
}
@@ -198,7 +207,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* throughout. Otherwise the compression will be inferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -236,6 +245,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
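For reference, the probe order the hunk above implements can be sketched as follows (a hypothetical `exists()` predicate stands in for the `stat()` call; `probe_suffix` and the fixed-size buffer are illustration only, not part of the patch):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Try "path" as-is, then "path.gz", then "path.lz4"; first hit wins. */
static const char *
probe_suffix(const char *path, int (*exists) (const char *),
			 char *buf, size_t buflen)
{
	static const char *const suffixes[] = {"", ".gz", ".lz4"};
	size_t		i;

	for (i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++)
	{
		snprintf(buf, buflen, "%s%s", path, suffixes[i]);
		if (exists(buf))
			return suffixes[i];	/* first existing candidate wins */
	}
	return NULL;				/* no candidate exists */
}
```

The ordering matters: an uncompressed `toc.dat` always shadows a stale `toc.dat.gz` or `toc.dat.lz4` in the same directory.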
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..ee74cc8e28
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,626 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write an LZ4 compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Define it for installations with an older version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/*
+ * Public routines that support LZ4 compressed data I/O
+ */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Output buffer is allocated lazily on first write */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent of feof() or gzeof(). The end of file is reached only
+ * when there is no decompressed output left in the overflow buffer and
+ * the underlying file has reached its end.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
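The buffer discipline of LZ4File_read_overflow can be shown in isolation. A simplified, hypothetical sketch (fixed-size struct instead of the dynamically grown one): consume up to `size` bytes, or up to and including the first newline when `eol` is set, then slide the remainder to the front.

```c
#include <assert.h>
#include <string.h>

typedef struct
{
	char		buf[64];
	size_t		len;
} Overflow;

static size_t
consume(Overflow *o, char *out, size_t size, int eol)
{
	size_t		readlen = o->len < size ? o->len : size;
	char	   *p;

	if (readlen == 0)
		return 0;
	if (eol && (p = memchr(o->buf, '\n', readlen)))
		readlen = (size_t) (p - o->buf) + 1;	/* keep the newline */
	memcpy(out, o->buf, readlen);
	o->len -= readlen;
	memmove(o->buf, o->buf + readlen, o->len);	/* compact leftovers */
	return readlen;
}
```

Keeping the terminating newline in the returned span is what lets the fgets()-style caller stop at line boundaries.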
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * first newline character when the eol_flag is set. A single read of
+ * compressed input via the LZ4F API may generate more decompressed output
+ * than 'ptrsize'; any excess is stored in an overflow buffer within
+ * LZ4File. On entry, the function first consumes any decompressed content
+ * already present in the overflow buffer before decompressing new input.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
if (rsize < size && !feof(fs->fp))
{
pg_free(readbuf);
return -1;
}
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
if (LZ4F_isError(status))
{
fs->errcode = status;
pg_free(readbuf);
return -1;
}
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (!eol_flag) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, at the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
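The overflow buffer's growth rule above (double the allocation until the pending bytes fit) is easy to get wrong off-by-one; a minimal sketch of just that loop (the function name is illustrative, and it assumes a non-zero starting allocation, as the real code guarantees):

```c
#include <assert.h>
#include <stddef.h>

/* Mirror of: while (overflowlen + outlen > overflowalloclen)
 *                overflowalloclen *= 2;                       */
static size_t
grow_to_fit(size_t alloc, size_t used, size_t incoming)
{
	while (used + incoming > alloc)
		alloc *= 2;				/* geometric growth keeps realloc calls O(log n) */
	return alloc;
}
```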
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
return -1;
+ }
+ }
+
+ return size;
+}
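The write path slices its input into LZ4_IN_SIZE chunks. A self-contained sketch of that chunking loop (the `consume` callback is a hypothetical stand-in for LZ4F_compressUpdate-plus-fwrite): note that the cursor must advance with each slice, or every iteration would compress the same leading bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4					/* stand-in for LZ4_IN_SIZE */

static int
write_chunked(const char *ptr, size_t size,
			  void (*consume) (const char *, size_t))
{
	size_t		remaining = size;

	while (remaining > 0)
	{
		size_t		chunk = remaining < CHUNK ? remaining : CHUNK;

		consume(ptr, chunk);	/* hand one slice to the compressor */
		ptr += chunk;			/* advance past what was consumed */
		remaining -= chunk;
	}
	return 0;
}
```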
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+/*
+ * Public routines
+ */
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..40dbe00d46
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * LZ4 interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 84e9f0defa..0da476a4c3 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_gzip.c',
'compress_io.c',
+ 'compress_lz4.c',
'compress_none.c',
'dumputils.c',
'parallel.c',
@@ -18,7 +19,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -86,7 +87,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 02178bdc53..d368cacabb 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -2074,7 +2074,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2084,6 +2084,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index a2f88995c0..ce2a0838fa 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -779,10 +779,13 @@ _PrepParallelRestore(ArchiveHandle *AH)
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
- else
+ else if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
{
- /* It might be compressed */
- strlcat(fname, ".gz", sizeof(fname));
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ strlcat(fname, ".gz", sizeof(fname));
+ else if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ strlcat(fname, ".lz4", sizeof(fname));
+
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
}
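The suffix selection in this _PrepParallelRestore hunk reduces to a small algorithm-to-extension mapping; a hypothetical sketch (local enum and function names, not the real pg_compress_algorithm):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum
{
	COMPR_NONE,
	COMPR_GZIP,
	COMPR_LZ4
} compr_alg;

/* Map the archive's compression algorithm to the on-disk filename suffix;
 * NULL means uncompressed, so no second stat() attempt is needed. */
static const char *
compressed_suffix(compr_alg alg)
{
	switch (alg)
	{
		case COMPR_GZIP:
			return ".gz";
		case COMPR_LZ4:
			return ".lz4";
		default:
			return NULL;
	}
}
```

Skipping the second stat() for PG_COMPRESSION_NONE is the behavioral change here: the old code blindly appended ".gz" whenever the plain file was missing.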
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..08105337b1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 78454928cc..72b19ee6cd 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blobs.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if (defined($pgdump_runs{$run}->{compile_option}) &&
(($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index db429474a2..2c5042eb41 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 7e62c6ef3d..692c5ceb0b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1383,6 +1383,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
Attachment: v27-0002-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 31142304229200fcb189f6852dad28a827160443 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 26 Jan 2023 17:58:54 +0000
Subject: [PATCH v27 2/4] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c about it and uses it throughout.
---
src/bin/pg_dump/compress_io.c | 92 +++++++++++++++---
src/bin/pg_dump/compress_io.h | 4 +
src/bin/pg_dump/pg_backup_archiver.c | 138 ++++++++++-----------------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-----
4 files changed, 132 insertions(+), 129 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..e1733ce57c 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,36 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+/*----------------------
+ * Generic functions
+ *----------------------
+ */
+
+char *
+supports_compression(const pg_compress_specification compression_spec)
+{
+ const pg_compress_algorithm algorithm = compression_spec.algorithm;
+ bool supported = false;
+
+ if (algorithm == PG_COMPRESSION_NONE)
+ supported = true;
+#ifdef HAVE_LIBZ
+ if (algorithm == PG_COMPRESSION_GZIP)
+ supported = true;
+#endif
+
+ if (!supported)
+ return psprintf("this build does not support compression with %s",
+ get_compress_algorithm_name(algorithm));
+
+ return NULL;
+}
+
+
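The contract of supports_compression() added above (NULL means supported, otherwise an error message) can be illustrated with a compile-independent sketch; the `have_libz` flag and names here are hypothetical stand-ins for the HAVE_LIBZ preprocessor check and the psprintf'ed message:

```c
#include <assert.h>
#include <stddef.h>

typedef enum
{
	ALG_NONE,
	ALG_GZIP
} alg;

/* Return NULL when the algorithm is usable in this build, otherwise a
 * message suitable for embedding in an error report. */
static const char *
check_support(alg a, int have_libz)
{
	if (a == ALG_NONE)
		return NULL;			/* uncompressed is always supported */
	if (a == ALG_GZIP && have_libz)
		return NULL;
	return "this build does not support compression with this method";
}
```

Returning the message rather than calling pg_fatal() directly lets each caller prepend its own context, as the RestoreArchive hunk below does.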
/*----------------------
* Compressor API
*----------------------
@@ -490,16 +520,19 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() and cfdopen(). It opens the file at
+ * 'path', or associates a stream with the descriptor 'fd' when 'fd' is
+ * valid, in 'mode'. The descriptor is not dup'ed; dup'ing it beforehand
+ * is the caller's responsibility if needed.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ cfp *fp = pg_malloc0(sizeof(cfp));
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
@@ -511,15 +544,20 @@ cfopen(const char *path, const char *mode,
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode_compression);
+ else
+ fp->compressedfp = gzopen(path, mode_compression);
}
else
{
/* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode);
+ else
+ fp->compressedfp = gzopen(path, mode);
}
- fp->uncompressedfp = NULL;
if (fp->compressedfp == NULL)
{
free_keep_errno(fp);
@@ -531,10 +569,11 @@ cfopen(const char *path, const char *mode,
}
else
{
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
-#endif
- fp->uncompressedfp = fopen(path, mode);
+ if (fd >= 0)
+ fp->uncompressedfp = fdopen(fd, mode);
+ else
+ fp->uncompressedfp = fopen(path, mode);
+
if (fp->uncompressedfp == NULL)
{
free_keep_errno(fp);
@@ -545,6 +584,33 @@ cfopen(const char *path, const char *mode,
return fp;
}
+/*
+ * Opens file 'path' in 'mode' and compression as defined in
+ * compression_spec. The caller must verify that the compression
+ * is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+/*
+ * Associates a stream with the descriptor 'fd' in 'mode', using the
+ * compression defined in compression_spec. The caller must
+ * verify that the compression is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..8beb1058ec 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,6 +21,8 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
+extern char *supports_compression(const pg_compress_specification compression_spec);
+
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -54,6 +56,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 4d9114cdd8..a4e5df0817 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -363,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +382,20 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
- AH->PrintTocDataPtr != NULL)
+ if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore data from compressed archive (compression not supported in this installation)");
+ {
+ char *errmsg = supports_compression(AH->compression_spec);
+ if (errmsg)
+ pg_fatal("cannot restore data from compressed archive (%s)", errmsg);
+ else
+ break;
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1133,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1508,58 +1502,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1570,33 +1538,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1720,22 +1679,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2224,6 +2178,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2277,8 +2232,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3713,6 +3668,7 @@ WriteHead(ArchiveHandle *AH)
void
ReadHead(ArchiveHandle *AH)
{
+ char *errmsg;
char vmaj,
vmin,
vrev;
@@ -3782,11 +3738,13 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_log_warning("archive is compressed, but this installation does not support compression");
-#endif
-
+ errmsg = supports_compression(AH->compression_spec);
+ if (errmsg)
+ {
+ pg_log_warning("archive is compressed but data cannot be restored (%s)",
+ errmsg);
+ pg_free(errmsg);
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
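Patch 3 below replaces the gz-aware probing in cfopen_read with a small hasSuffix() helper used to discover compression from the filename (e.g. "toc.dat.gz"). A minimal standalone sketch of that check (illustrative re-implementation, not the patch's code verbatim):

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if "filename" ends with "suffix", else 0.
 * Mirrors the suffix test used for compression discovery.
 */
int
has_suffix(const char *filename, const char *suffix)
{
	size_t		filenamelen = strlen(filename);
	size_t		suffixlen = strlen(suffix);

	if (filenamelen < suffixlen)
		return 0;

	return memcmp(filename + filenamelen - suffixlen,
				  suffix,
				  suffixlen) == 0;
}
```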
Attachment: v27-0003-Introduce-Compress-and-Compressor-API-in-pg_dump.patch (text/x-patch)
From 06c6385d338d6b30390f4d125b1059148d4badd3 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 26 Jan 2023 17:59:35 +0000
Subject: [PATCH v27 3/4] Introduce Compress and Compressor API in pg_dump
The purpose of this API is to allow for easier addition of new compression
methods. CompressFileHandle replaces the cfp* family of functions with a
struct of function pointers for file manipulation. The implementor of a new
compression method is now able to "simply" add those definitions.
Additionally, custom compressed archives now store the compression algorithm
in their header instead of the compression level. The header version number
is bumped.
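As an illustrative sketch of the function-pointer pattern the commit message describes (a handle carries its own callbacks so callers never branch on the algorithm) — names here are hypothetical, not pg_dump's actual API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handle: I/O goes through callbacks set at init time. */
typedef struct FileHandle FileHandle;
struct FileHandle
{
	size_t		(*write_func) (const void *ptr, size_t size, FileHandle *fh);
	void	   *private_data;
};

/* A trivial "none" implementation that copies bytes into a buffer. */
typedef struct
{
	char		buf[256];
	size_t		len;
} NoneState;

size_t
none_write(const void *ptr, size_t size, FileHandle *fh)
{
	NoneState  *st = (NoneState *) fh->private_data;

	memcpy(st->buf + st->len, ptr, size);
	st->len += size;
	return size;
}

void
init_none_handle(FileHandle *fh, NoneState *st)
{
	st->len = 0;
	fh->write_func = none_write;
	fh->private_data = st;
}
```

Adding a new method then means supplying another init function that fills in the same slots, which is the point of the refactoring.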
---
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_gzip.c | 401 ++++++++++++++
src/bin/pg_dump/compress_gzip.h | 24 +
src/bin/pg_dump/compress_io.c | 744 +++++---------------------
src/bin/pg_dump/compress_io.h | 166 +++++-
src/bin/pg_dump/compress_none.c | 206 +++++++
src/bin/pg_dump/compress_none.h | 24 +
src/bin/pg_dump/meson.build | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 91 ++--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 23 +-
src/bin/pg_dump/pg_backup_directory.c | 94 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/include/common/compression.h | 4 +
src/tools/pginclude/cpluspluscheck | 2 +
src/tools/pgindent/typedefs.list | 2 +
16 files changed, 1051 insertions(+), 749 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
create mode 100644 src/bin/pg_dump/compress_none.c
create mode 100644 src/bin/pg_dump/compress_none.h
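One detail worth noting in the gzip file-handle code below: Gzip_open appends an explicit compression level to the fopen-style mode string before handing it to gzopen()/gzdopen() (e.g. "wb" plus level 5 becomes "wb5"). A minimal sketch of that mode-string construction, with DEFAULT_LEVEL standing in for zlib's Z_DEFAULT_COMPRESSION:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEFAULT_LEVEL (-1)		/* stands in for Z_DEFAULT_COMPRESSION */

/*
 * Build the mode string passed to gzopen(): append the level only
 * when the user requested one explicitly.
 */
void
build_gz_mode(char *out, size_t outsz, const char *mode, int level)
{
	if (level != DEFAULT_LEVEL)
		snprintf(out, outsz, "%s%d", mode, level);
	else
		snprintf(out, outsz, "%s", mode);
}
```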
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..0013bc080c 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,7 +24,9 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
+ compress_none.o \
dumputils.o \
parallel.o \
pg_backup_archiver.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..24e68fd022
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,401 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at a time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ CFH->private_data = NULL;
+
+ return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..2392c697b4
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * GZIP interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index e1733ce57c..e4e008d2e1 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,54 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims to make the resulting
+ * files easy to manipulate with an external compression utility
+ * program.
+ *
+ * This file also includes the uncompressed ("none") implementation of
+ * both APIs.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read all the data by calling readData to read the
+ * whole compressed stream which repeatedly calls the given ReadFunc. ReadFunc
+ * returns the compressed data one chunk at a time. Then readData decompresses
+ * it and passes the decompressed data to ahwrite(), until ReadFunc returns 0
+ * to signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API provides a set of function pointers for
+ * opening, reading, writing, and finally closing files. The implemented
+ * function pointers are documented in the corresponding header file and are
+ * common for all streams. It allows the caller to use the same functions for
+ * both compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix is not yet known, then it tries to simply
+ * open the file, and if that fails, it tries to open the same file with
+ * the .gz suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,13 +65,14 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_none.h"
#include "pg_backup_utils.h"
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
/*----------------------
* Generic functions
*----------------------
@@ -91,110 +104,25 @@ supports_compression(const pg_compress_specification compression_spec)
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
-{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
-
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-
-/* Public interface routines */
-
-/* Allocate a new compressor */
+/*
+ * Allocate a new compressor.
+ */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-}
+ InitCompressorNone(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorGzip(cs, compression_spec);
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -203,233 +131,31 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
+ * Private routines
*/
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -442,324 +168,102 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
-{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
-
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
-{
- cfp *fp = pg_malloc0(sizeof(cfp));
-
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode_compression);
- else
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode);
- else
- fp->compressedfp = gzopen(path, mode);
- }
+ CompressFileHandle *CFH;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
- if (fd >= 0)
- fp->uncompressedfp = fdopen(fd, mode);
- else
- fp->uncompressedfp = fopen(path, mode);
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
- }
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ InitCompressFileHandleNone(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressFileHandleGzip(CFH, compression_spec);
- return fp;
+ return CFH;
}
/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
+ * If the file at 'path' contains the suffix of a supported compression method,
+ * currently this includes only ".gz", then this compression will be used
+ * throughout. Otherwise the compression will be inferred by iteratively trying
+ * to open the file at 'path', first as is, then by appending known compression
+ * suffixes. So if you pass "foo" as 'path', this will open either "foo" or
+ * "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret;
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (size == 0)
- return 0;
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ fname = strdup(path);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
else
-#endif
{
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret;
+ bool exists;
+ exists = (stat(path, &st) == 0);
+ /* avoid unused warning if it is not built with compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
+ if (!exists)
{
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
- }
- else
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
}
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
-
-int
-cfclose(cfp *fp)
-{
- int result;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
+ free_keep_errno(fname);
- return result;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
+ int ret = 0;
-const char *
-get_cfp_error(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
-}
+ free_keep_errno(CFH);
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ return ret;
}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 8beb1058ec..8280d7b0dc 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -23,50 +23,160 @@
extern char *supports_compression(const pg_compress_specification compression_spec);
-/* Prototype for callback function to WriteDataToArchive() */
+/*
+ * Prototype for callback function used in writeData()
+ */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
/*
- * Prototype for callback function to ReadDataFromArchive()
+ * Prototype for callback function used in readData()
*
- * ReadDataFromArchive will call the read function repeatedly, until it
- * returns 0 to signal EOF. ReadDataFromArchive passes a buffer to read the
- * data into in *buf, of length *buflen. If that's not big enough for the
- * callback function, it can free() it and malloc() a new one, returning the
- * new buffer and its size in *buf and *buflen.
+ * readData will call the read function repeatedly, until it returns 0 to signal
+ * EOF. readData passes a buffer to read the data into in *buf, of length
+ * *buflen. If that's not big enough for the callback function, it can free() it
+ * and malloc() a new one, returning the new buffer and its size in *buf and
+ * *buflen.
*
* Returns the number of bytes read into *buf, or 0 on EOF.
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Callback function to read from an already processed input stream
+ */
+ ReadFunc readF;
+
+ /*
+ * Callback function to write an already processed chunk of data.
+ */
+ WriteFunc writeF;
+
+ /*
+ * Compression specification for this state.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ /*
+ * Open a file in mode.
+ *
+ * Pass either 'path' or 'fd' depending on whether a filepath or a file
+ * descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
+ * 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ */
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+
+ /*
+ * Open a file for writing.
+ *
+ * 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
+ * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
-typedef struct cfp cfp;
+ /*
+ * Read 'size' bytes of data from the file and store them into 'ptr'.
+ */
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ /*
+ * Write 'size' bytes of data into the file from 'ptr'.
+ */
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ /*
+ * Read at most size - 1 characters from the compress file handle into
+ * 's'.
+ *
+ * Stop if an EOF or a newline is found first. 's' is always null
+ * terminated and contains the newline if it was found.
+ */
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+
+ /*
+ * Read the next character from the compress file handle as 'unsigned
+ * char' cast into 'int'.
+ */
+ int (*getc_func) (CompressFileHandle *CFH);
+
+ /*
+ * Test if EOF is reached in the compress file handle.
+ */
+ int (*eof_func) (CompressFileHandle *CFH);
+
+ /*
+ * Close an open file handle.
+ */
+ int (*close_func) (CompressFileHandle *CFH);
+
+ /*
+ * Get a pointer to a string that describes an error that occurred during a
+ * compress file handle operation.
+ */
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ /*
+ * Compression specification for this file handle.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
+
+/*
+ * Initialize a compress file handle with the requested compression.
+ */
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+
+/*
+ * Initialize a compress file stream. Infer the compression algorithm
+ * from 'path', either by examining its suffix or by appending the supported
+ * suffixes in 'path'.
+ */
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
new file mode 100644
index 0000000000..ecbcf4b04a
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.c
@@ -0,0 +1,206 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.c
+ * Routines for archivers to read or write an uncompressed stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_none.h"
+#include "pg_backup_utils.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
+
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
+
+ if (size == 0)
+ return 0;
+
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
+}
+
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
+
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
+}
+
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
+
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
+
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static int
+close_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
+
+ CFH->private_data = NULL;
+
+ if (fp)
+ ret = fclose(fp);
+
+ return ret;
+}
+
+static int
+eof_none(CompressFileHandle *CFH)
+{
+ return feof((FILE *) CFH->private_data);
+}
+
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
+
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
+}
diff --git a/src/bin/pg_dump/compress_none.h b/src/bin/pg_dump/compress_none.h
new file mode 100644
index 0000000000..143e599819
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.h
+ * Uncompressed interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_NONE_H_
+#define _COMPRESS_NONE_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_NONE_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..84e9f0defa 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,7 +1,9 @@
# Copyright (c) 2022-2023, PostgreSQL Global Development Group
pg_dump_common_sources = files(
+ 'compress_gzip.c',
'compress_io.c',
+ 'compress_none.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index a4e5df0817..02178bdc53 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -354,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1127,7 +1127,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1143,9 +1143,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1502,6 +1503,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1524,33 +1526,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1689,7 +1690,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2031,6 +2036,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2061,26 +2078,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2178,6 +2181,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2233,7 +2237,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3723,10 +3725,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index d1e54644a9..512ab043af 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..a2f88995c0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d92247c915..78454928cc 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 9a471e40aa..d44ebb06cc 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -14,6 +14,10 @@
#ifndef PG_COMPRESSION_H
#define PG_COMPRESSION_H
+/*
+ * These values are stored on disk, for example in files generated by pg_dump.
+ * Create the necessary backwards compatibility layers if their order changes.
+ */
typedef enum pg_compress_algorithm
{
PG_COMPRESSION_NONE,
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..db429474a2 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,7 +150,9 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 51484ca7e2..7e62c6ef3d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
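The diff above replaces the concrete cfp/cfopen family with a struct of function pointers, so gzip, LZ4, and uncompressed I/O can share one call site. A minimal standalone sketch of that pattern (the struct shape and the write_func signature mirror the patch; the in-memory sink and InitMemFileHandle constructor are illustrative stand-ins, not part of pg_dump):

```c
#include <stdlib.h>
#include <string.h>

/*
 * Vtable-style handle in the spirit of CompressFileHandle: callers go
 * through function pointers, so the Init* constructors can swap in a
 * gzip, LZ4, or plain implementation without touching call sites.
 */
typedef struct CompressFileHandle CompressFileHandle;
struct CompressFileHandle
{
	size_t		(*write_func) (const void *ptr, size_t size,
							   CompressFileHandle *CFH);
	void	   *private_data;
};

/* Illustrative backend: an in-memory sink instead of a real file */
typedef struct
{
	char		buf[256];
	size_t		len;
} MemSink;

static size_t
mem_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
	MemSink    *s = (MemSink *) CFH->private_data;

	if (s->len + size > sizeof(s->buf))
		return 0;				/* short write, caller reports the error */
	memcpy(s->buf + s->len, ptr, size);
	s->len += size;
	return size;
}

static CompressFileHandle *
InitMemFileHandle(MemSink *sink)
{
	CompressFileHandle *CFH = calloc(1, sizeof(*CFH));

	CFH->write_func = mem_write;
	CFH->private_data = sink;
	return CFH;
}
```

Call sites then look exactly like the converted ones above, e.g. `CFH->write_func(data, dLen, CFH) != dLen` to detect a failed write.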
On Wed, Jan 25, 2023 at 07:57:18PM +0000, gkokolatos@pm.me wrote:
On Wednesday, January 25th, 2023 at 7:00 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
While looking at this, I realized that commit 5e73a6048 introduced a
regression:

@@ -3740,19 +3762,24 @@ ReadHead(ArchiveHandle *AH)
- if (AH->compression != 0)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ pg_fatal("archive is compressed, but this installation does not support compression");
Before, it was possible to restore non-data chunks of a dump file, even
if the current build didn't support its compression. But that's now
impossible - and it makes the code we're discussing in RestoreArchive()
unreachable.
On Thu, Jan 26, 2023 at 08:53:28PM +0900, Michael Paquier wrote:
On Thu, Jan 26, 2023 at 11:24:47AM +0000, gkokolatos@pm.me wrote:
I gave this a little bit of thought. I think that ReadHead should not
emit a warning, or at least not this warning as it is slightly misleading.
It implies that it will automatically turn off data restoration, which is
false. Further ahead, the code will fail with a conflicting error message
stating that the compression is not available.

Instead, it would be cleaner both for the user and the maintainer to
move the check in RestoreArchive and make it the sole responsible for
this logic.

- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ pg_fatal("cannot restore data from compressed archive (compression not supported in this installation)");

Hmm. I don't mind changing this part as you suggest.

-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("archive is compressed, but this installation does not support compression");
-#endif
However I think that we'd better keep the warning, as it can offer a
hint when using pg_restore -l not built with compression support if
looking at a dump that has been compressed.
Yeah. But the original log_warning text was better, and should be
restored:
- if (AH->compression != 0)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
That commit also added this to pg-dump.c:
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;
In 002, that could be simplified by re-using the supports_compression()
function. (And maybe the same in WriteDataToArchive()?)
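To illustrate the suggested simplification: instead of hard-coding one pg_fatal() per not-yet-supported algorithm, the switch could funnel through the patch's supports_compression(), which returns an error message for unsupported algorithms. A standalone sketch with mocked-up types (which algorithms count as "built in" really depends on configure flags; here only none/gzip do):

```c
#include <stddef.h>

typedef enum
{
	PG_COMPRESSION_NONE,
	PG_COMPRESSION_GZIP,
	PG_COMPRESSION_LZ4,
	PG_COMPRESSION_ZSTD
} pg_compress_algorithm;

typedef struct
{
	pg_compress_algorithm algorithm;
} pg_compress_specification;

/*
 * Mirrors the shape of the patch's supports_compression(): returns NULL
 * when the algorithm is usable, otherwise an error message.  In this
 * sketch only "none" and gzip are treated as built in.
 */
static const char *
supports_compression(const pg_compress_specification compression_spec)
{
	switch (compression_spec.algorithm)
	{
		case PG_COMPRESSION_NONE:
		case PG_COMPRESSION_GZIP:
			return NULL;
		case PG_COMPRESSION_LZ4:
			return "this build does not support compression with LZ4";
		case PG_COMPRESSION_ZSTD:
			return "this build does not support compression with ZSTD";
	}
	return "unsupported compression algorithm";
}
```

With a helper like this, the per-algorithm cases collapse into a single `if ((msg = supports_compression(spec)) != NULL) pg_fatal("%s", msg);` at each call site.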
--
Justin
On Thu, Jan 26, 2023 at 12:22:45PM -0600, Justin Pryzby wrote:
Yeah. But the original log_warning text was better, and should be
restored:

- if (AH->compression != 0)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
Yeah, this one's on me. So I have gone with the simplest solution and
applied a fix to restore the original behavior, with the same warning
showing up.
--
Michael
On Mon, Jan 16, 2023 at 11:54:46AM +0900, Michael Paquier wrote:
On Sun, Jan 15, 2023 at 07:56:25PM -0600, Justin Pryzby wrote:
On Mon, Jan 16, 2023 at 10:28:50AM +0900, Michael Paquier wrote:
The functions changed by 0001 are cfopen[_write](),
AllocateCompressor() and ReadDataFromArchive(). Why is it a good idea
to change these interfaces which basically exist to handle inputs?

I changed to pass pg_compress_specification as a pointer, since that's
the usual convention for structs, as followed by the existing uses of
pg_compress_specification.

Okay, but what do we gain here? It seems to me that this introduces
the risk that a careless change in one of the internal routines could
slightly change compress_spec, hence impacting any of their
callers? Or is that fixing an actual bug (except if I am missing your
point, that does not seem to be the case)?
To circle back to this: I was not saying there's any bug. The proposed
change was only to follow normal and existing normal conventions for
passing structs. It could also be a pointer to const. It's fine with
me if you say that it's intentional how it's written already.
Is there some benefit in changing compression_spec within the
internals of these routines before going back one layer down to their
callers? Changing the compression_spec on-the-fly in these internal
paths could be risky, actually, no?

I think what you're saying is that if the spec is passed as a pointer,
then the called functions shouldn't set spec->algorithm=something.

Yes. HEAD makes sure of that, 0001 would not prevent that. So I am a
bit confused in seeing how this is a benefit.
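The trade-off in this exchange can be made concrete: with pass-by-value (HEAD's convention), anything the callee does to the spec stays local; a pointer-to-const gets the conventional struct-passing style without the mutation risk a plain pointer would introduce. A small sketch with mocked-up types (the field set here is reduced for illustration):

```c
typedef enum
{
	PG_COMPRESSION_NONE,
	PG_COMPRESSION_GZIP
} pg_compress_algorithm;

typedef struct
{
	pg_compress_algorithm algorithm;
	int			level;
} pg_compress_specification;

/*
 * Pass-by-value: the callee mutates only its own copy, so a careless
 * assignment inside an internal routine cannot affect the caller.
 */
static pg_compress_algorithm
probe_byvalue(pg_compress_specification spec)
{
	spec.algorithm = PG_COMPRESSION_NONE;	/* harmless: changes a copy */
	return spec.algorithm;
}

/*
 * Pointer-to-const: shares the struct without copying, and the compiler
 * rejects assignments through the pointer.
 */
static int
level_byptr(const pg_compress_specification *spec)
{
	return spec->level;
}
```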
On Thu, Jan 26, 2023 at 12:22:45PM -0600, Justin Pryzby wrote:
That commit also added this to pg-dump.c:
+ case PG_COMPRESSION_ZSTD:
+ pg_fatal("compression with %s is not yet supported", "ZSTD");
+ break;
+ case PG_COMPRESSION_LZ4:
+ pg_fatal("compression with %s is not yet supported", "LZ4");
+ break;

In 002, that could be simplified by re-using the supports_compression()
function. (And maybe the same in WriteDataToArchive()?)
The first patch aims to minimize references to ".gz" and "GZIP" and
ZLIB. pg_backup_directory.c comments still refer to ".gz". I think
the patch should ideally change to refer to "the compressed file
extension" (similar to compress_io.c), avoiding the need to update it
later.
I think the file extension stuff could be generalized, so it doesn't
need to be updated in multiple places (pg_backup_directory.c and
compress_io.c). Maybe it's useful to add a function to return the
extension of a given compression method. It could go in compression.c,
and be useful in basebackup.
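A helper of the kind suggested here might look like the following sketch. The function name, its placement, and the ".zst" entry are assumptions for illustration; nothing like this exists in the posted patches:

```c
typedef enum
{
	PG_COMPRESSION_NONE,
	PG_COMPRESSION_GZIP,
	PG_COMPRESSION_LZ4,
	PG_COMPRESSION_ZSTD
} pg_compress_algorithm;

/*
 * Hypothetical single source of truth for compressed-file suffixes,
 * usable by pg_backup_directory.c, compress_io.c, and potentially
 * basebackup, instead of repeating ".gz"/".lz4" at each site.
 */
static const char *
compress_file_extension(pg_compress_algorithm algorithm)
{
	switch (algorithm)
	{
		case PG_COMPRESSION_GZIP:
			return ".gz";
		case PG_COMPRESSION_LZ4:
			return ".lz4";
		case PG_COMPRESSION_ZSTD:
			return ".zst";
		case PG_COMPRESSION_NONE:
			return "";
	}
	return "";					/* keep compilers quiet */
}
```

InitDiscoverCompressFileHandle could then loop over the supported algorithms appending `compress_file_extension()` rather than open-coding each suffix.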
For the 2nd patch:
I might be in the minority, but I still think some references to "gzip"
should say "zlib":
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
In my mind, three things here are misleading, because it doesn't use
gzip headers:
| GzipCompressorState, DeflateCompressorGzip, "gzip compressed".
This comment is about exactly that:
* underlying stream. The second API is a wrapper around fopen/gzopen and
* friends, providing an interface similar to those, but abstracts away
* the possible compression. Both APIs use libz for the compression, but
* the second API uses gzip headers, so the resulting files can be easily
* manipulated with the gzip utility.
AIUI, Michael says that it's fine that the user-facing command-line
options use "-Z gzip" (even though the "custom" format doesn't use gzip
headers). I'm okay with that, as long as that's discussed/understood.
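The distinction under discussion is visible in the first bytes of the output: a gzip-wrapped stream (RFC 1952) begins with the magic bytes 0x1f 0x8b, while the bare zlib/deflate stream that the custom format actually writes begins with a zlib CMF byte (commonly 0x78, per RFC 1950). A tiny illustrative check, not taken from the patch:

```c
#include <stddef.h>

/* Returns 1 if buf begins with the gzip magic bytes (RFC 1952), else 0. */
static int
has_gzip_header(const unsigned char *buf, size_t len)
{
	return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}
```

Running `pg_dump -Fc -Z gzip` output through such a check would return 0, which is exactly why the "Gzip" naming of the deflate-based internals reads as misleading.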
--
Justin
------- Original Message -------
On Friday, January 27th, 2023 at 6:23 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
Thank you for the input Justin. I am currently waiting for input from a
third person to reach some conclusion. I thought that should be stated
before my inactivity is taken for indifference, which it is not.
Cheers,
//Georgios
--
Justin
On Tue, Jan 31, 2023 at 09:00:56AM +0000, gkokolatos@pm.me wrote:
In my mind, three things here are misleading, because it doesn't use
gzip headers:

| GzipCompressorState, DeflateCompressorGzip, "gzip compressed".
This comment is about exactly that:
* underlying stream. The second API is a wrapper around fopen/gzopen and
* friends, providing an interface similar to those, but abstracts away
* the possible compression. Both APIs use libz for the compression, but
* the second API uses gzip headers, so the resulting files can be easily
* manipulated with the gzip utility.

AIUI, Michael says that it's fine that the user-facing command-line
options use "-Z gzip" (even though the "custom" format doesn't use gzip
headers). I'm okay with that, as long as that's discussed/understood.

Thank you for the input Justin. I am currently waiting for input from a
third person to reach some conclusion. I thought that should be stated
before my inactivity is taken for indifference, which it is not.
I'm not sure what there is to lose by making the names more accurate -
especially since they're private/internal-only.
Tomas marked himself as a committer, so maybe could comment.
It'd be nice to also come to some conclusion about whether -Fc -Z gzip
is confusing (due to not actually using gzip).
BTW, do you intend to merge this for v16 ? I verified in earlier patch
versions that tests all pass with lz4 as the default compression method.
And checked that gzip output is compatible with before, and that old
dumps restore correctly, and there's no memory leaks or other errors.
--
Justin
On Fri, Jan 27, 2023 2:04 AM gkokolatos@pm.me <gkokolatos@pm.me> wrote:
------- Original Message -------
On Thursday, January 26th, 2023 at 12:53 PM, Michael Paquier
<michael@paquier.xyz> wrote:

On Thu, Jan 26, 2023 at 11:24:47AM +0000, gkokolatos@pm.me wrote:
I gave this a little bit of thought. I think that ReadHead should not
emit a warning, or at least not this warning as it is slightly misleading.
It implies that it will automatically turn off data restoration, which is
false. Further ahead, the code will fail with a conflicting error message
stating that the compression is not available.Instead, it would be cleaner both for the user and the maintainer to
move the check in RestoreArchive and make it the sole responsible for
this logic.- pg_fatal("cannot restore from compressed archive (compression not
supported in this installation)");
+ pg_fatal("cannot restore data from compressed archive (compression not
supported in this installation)");
Hmm. I don't mind changing this part as you suggest.
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)- pg_fatal("archive is compressed, but this installation does not support
compression");
-#endif
However I think that we'd better keep the warning, as it can offer a
hint when using pg_restore -l not built with compression support if
looking at a dump that has been compressed.Fair enough. Please find v27 attached.
Hi,
I am interested in this feature and tried the patch. While reading the comments,
I noticed some minor things that could possibly be improved (in v27-0003 patch).
1.
+ /*
+ * Open a file for writing.
+ *
+ * 'mode' can be one of ''w', 'wb', 'a', and 'ab'. Requrires an already
+ * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
There is a redundant single quote in front of 'w'.
2.
/*
* Callback function for WriteDataToArchive. Writes one block of (compressed)
* data to the archive.
*/
/*
* Callback function for ReadDataFromArchive. To keep things simple, we
* always read one compressed block at a time.
*/
Should the function names in the comments be updated?
WriteDataToArchive
->
writeData
ReadDataFromArchive
->
readData
3.
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);
Could we use PG_BINARY_R instead of "r" and "rb" here?
Regards,
Shi Yu
------- Original Message -------
On Wednesday, February 15th, 2023 at 2:51 PM, shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote:
Hi,
I am interested in this feature and tried the patch. While reading the comments,
I noticed some minor things that could possibly be improved (in v27-0003 patch).
Thank you very much for the interest. Please find a rebased v28 attached. Due to
the rebase, 0001 of v27 is no longer relevant and has been removed. Your comments
are applied on v28-0002.
1.
+ /*
+ * Open a file for writing.
+ *
+ * 'mode' can be one of ''w', 'wb', 'a', and 'ab'. Requrires an already
+ * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);

There is a redundant single quote in front of 'w'.
Fixed.
2.
/*
* Callback function for WriteDataToArchive. Writes one block of (compressed)
* data to the archive.
*/
/*
* Callback function for ReadDataFromArchive. To keep things simple, we
* always read one compressed block at a time.
*/

Should the function names in the comments be updated?
Agreed. Fixed.
3.
+ Assert(strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0);

Could we use PG_BINARY_R instead of "r" and "rb" here?
We could and we should. Using PG_BINARY_R has the added benefit
of needing only one strcmp() call. Fixed.
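For reference, PG_BINARY_R expands to "rb" on Windows and "r" elsewhere, so the assertion only has to match the platform's one binary-read mode string — a single strcmp() instead of two. A minimal sketch (the macro definition here imitates PostgreSQL's c.h; is_binary_read_mode is an illustrative name, not pg_dump code):

```c
#include <string.h>

/* Imitates the platform-dependent definition in PostgreSQL's c.h */
#ifdef WIN32
#define PG_BINARY_R "rb"
#else
#define PG_BINARY_R "r"
#endif

/*
 * One comparison replaces
 *     strcmp(mode, "r") == 0 || strcmp(mode, "rb") == 0
 * since only the platform's own spelling is ever passed in.
 */
static int
is_binary_read_mode(const char *mode)
{
	return strcmp(mode, PG_BINARY_R) == 0;
}
```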
Cheers,
//Georgios
Regards,
Shi Yu
Attachments:
v28-0003-Add-LZ4-compression-to-pg_dump.patch (text/x-patch)
From d5a8e11d78e8ca9b592f9abb5f66396df4547727 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 16 Feb 2023 11:00:35 +0000
Subject: [PATCH v28 3/3] Add LZ4 compression to pg_dump
This is mostly done within pg_dump's compression streaming and file APIs.
It is confined within the newly introduced compress_lz4.{c,h} files.
The first API is aimed at inlined use cases, so simple
lz4.h calls can be used directly. The second API generates output, or
parses input, that can be read/generated via the lz4 utility.
The LZ4F API does not implement all of the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(). Where
functionality was missing from the official API, it has been implemented
locally.
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 24 +-
src/bin/pg_dump/compress_lz4.c | 626 ++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 24 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 9 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
12 files changed, 780 insertions(+), 21 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 0013bc080c..eb8f59459a 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -26,6 +27,7 @@ OBJS = \
$(WIN32RES) \
compress_gzip.o \
compress_io.o \
+ compress_lz4.o \
compress_none.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index c507eeb7b3..929b38902b 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,7 +56,7 @@
* InitDiscoverCompressFileHandle tries to infer the compression by the
* filename suffix. If the suffix is not yet known, then it tries to simply
* open the file, and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -70,6 +70,7 @@
#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_lz4.h"
#include "compress_none.h"
#include "pg_backup_utils.h"
@@ -90,6 +91,10 @@ supports_compression(const pg_compress_specification compression_spec)
if (algorithm == PG_COMPRESSION_GZIP)
supported = true;
#endif
+#ifdef USE_LZ4
+ if (algorithm == PG_COMPRESSION_LZ4)
+ supported = true;
+#endif
if (!supported)
return psprintf("this build does not support compression with %s",
@@ -121,6 +126,8 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressorLZ4(cs, compression_spec);
return cs;
}
@@ -185,6 +192,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressFileHandleNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressFileHandleGzip(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressFileHandleLZ4(CFH, compression_spec);
return CFH;
}
@@ -198,7 +207,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* throughout. Otherwise the compression will be inferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -236,6 +245,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..ee74cc8e28
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,626 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write a LZ4 compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+		int			decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+															buf, decbuf,
+															cnt, buflen);
+
+		if (decBytes < 0)
+			pg_fatal("failed to LZ4 decompress data");
+
+		ahwrite(decbuf, 1, decBytes, AH);
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+	int			compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/*
+ * Public routines that support LZ4 compressed data I/O
+ */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof().  Returns true when there is no
+ * decompressed output left in the overflow buffer and the underlying
+ * file has reached end of file.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already allocated LZ4File struct for subsequent calls.
+ *
+ * Creates the necessary contexts for the requested operation.  When
+ * compressing, it additionally writes the LZ4 frame header to the output
+ * stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr', up
+ * to 'size' bytes if available.  If eol_flag is set, stop at the first
+ * newline character found within those 'size' bytes.
+ *
+ * Any unread content left in the overflow buffer is moved to its beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * Reads up to 'ptrsize' bytes of decompressed content, or up to the first
+ * newline character when eol_flag is set.  Decompressing a chunk of input
+ * via the LZ4F API may generate more output than fits in 'ptrsize'; any
+ * excess is stored in an overflow buffer within LZ4File.  Each call first
+ * consumes any decompressed content already present in that buffer before
+ * decompressing new input.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+		rsize = fread(readbuf, 1, size, fs->fp);
+		if (rsize < size && !feof(fs->fp))
+		{
+			pg_free(readbuf);
+			return -1;
+		}
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+			if (LZ4F_isError(status))
+			{
+				fs->errcode = status;
+				pg_free(readbuf);
+				return -1;
+			}
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (!eol_flag) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, at the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+	size_t		remaining = size;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, true))
+ return -1;
+
+	while (remaining > 0)
+	{
+		int			chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+		status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+									 ptr, chunk, NULL);
+		if (LZ4F_isError(status))
+		{
+			fs->errcode = status;
+			return -1;
+		}
+
+		if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+		{
+			errno = (errno) ? errno : ENOSPC;
+			return -1;
+		}
+
+		/* advance the input pointer past the chunk just consumed */
+		ptr = (const char *) ptr + chunk;
+		remaining -= chunk;
+	}
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+	int			dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream.  When compressing, write out any
+ * remaining buffered content and the stream footer generated by the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+/*
+ * Public routines
+ */
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..40dbe00d46
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * LZ4 interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 84e9f0defa..0da476a4c3 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_gzip.c',
'compress_io.c',
+ 'compress_lz4.c',
'compress_none.c',
'dumputils.c',
'parallel.c',
@@ -18,7 +19,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -86,7 +87,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 2bc5648ed6..027ded4bae 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -2075,7 +2075,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2085,6 +2085,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index a2f88995c0..ce2a0838fa 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -779,10 +779,13 @@ _PrepParallelRestore(ArchiveHandle *AH)
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
- else
+ else if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
{
- /* It might be compressed */
- strlcat(fname, ".gz", sizeof(fname));
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ strlcat(fname, ".gz", sizeof(fname));
+ else if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ strlcat(fname, ".lz4", sizeof(fname));
+
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..08105337b1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 78454928cc..72b19ee6cd 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+	if (defined($pgdump_runs{$run}->{compile_option}) &&
+		(($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+		 ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index db429474a2..2c5042eb41 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1d59b82003..126b33af66 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1386,6 +1386,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
Attachment: v28-0002-Introduce-Compress-and-Compressor-API-in-pg_dump.patch (text/x-patch)
From 6094277728b21df5d9e48d958d2a46b66e95f103 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 16 Feb 2023 11:00:25 +0000
Subject: [PATCH v28 2/3] Introduce Compress and Compressor API in pg_dump
The purpose of this API is to make it easier to add new compression methods.
CompressFileHandle replaces the cfp* family of functions with a struct of
function pointers for file manipulation; the implementor of a new compression
method now only needs to supply those definitions.
Additionally, custom compressed archives store the compression algorithm in
their header instead of the compression level. The header version number is
bumped.
---
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_gzip.c | 401 ++++++++++++++
src/bin/pg_dump/compress_gzip.h | 24 +
src/bin/pg_dump/compress_io.c | 744 +++++---------------------
src/bin/pg_dump/compress_io.h | 166 +++++-
src/bin/pg_dump/compress_none.c | 206 +++++++
src/bin/pg_dump/compress_none.h | 24 +
src/bin/pg_dump/meson.build | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 91 ++--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 27 +-
src/bin/pg_dump/pg_backup_directory.c | 94 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/include/common/compression.h | 4 +
src/tools/pginclude/cpluspluscheck | 2 +
src/tools/pgindent/typedefs.list | 2 +
16 files changed, 1053 insertions(+), 751 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
create mode 100644 src/bin/pg_dump/compress_none.c
create mode 100644 src/bin/pg_dump/compress_none.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..0013bc080c 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,7 +24,9 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
+ compress_none.o \
dumputils.o \
parallel.o \
pg_backup_archiver.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..24e68fd022
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,401 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+		if ((flush && (zp->avail_out < gzipcs->outsize))
+			|| (zp->avail_out == 0)
+			|| (zp->avail_in != 0))
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+				 * Any write function should do its own error checking, but
+				 * do a check here as well to be safe...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+		/*
+		 * A level of zero simply copies the input one block at a time.  This
+		 * is probably not what the user wanted when calling this interface.
+		 */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ CFH->private_data = NULL;
+
+ return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..2392c697b4
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * GZIP interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index e1733ce57c..c507eeb7b3 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,54 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API aims to produce files that
+ * can be easily manipulated with an external compression utility
+ * program.
+ *
+ * This file also includes the implementations of both APIs for the case
+ * of no compression.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then call readData, which reads the whole compressed
+ * stream by repeatedly calling the given ReadFunc. ReadFunc returns the
+ * compressed data one chunk at a time, and readData decompresses
+ * it and passes the decompressed data to ahwrite(), until ReadFunc returns 0
+ * to signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API provides a set of function pointers for
+ * opening, reading, writing, and finally closing files. The implemented
+ * function pointers are documented in the corresponding header file and are
+ * common for all streams. It allows the caller to use the same functions for
+ * both compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix is not recognized, then it tries to simply
+ * open the file and, if that fails, it tries to open the same file with the
+ * .gz suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,13 +65,14 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_none.h"
#include "pg_backup_utils.h"
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
/*----------------------
* Generic functions
*----------------------
@@ -91,110 +104,25 @@ supports_compression(const pg_compress_specification compression_spec)
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
-{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
-
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-
-/* Public interface routines */
-
-/* Allocate a new compressor */
+/*
+ * Allocate a new compressor.
+ */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-}
+ InitCompressorNone(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorGzip(cs, compression_spec);
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -203,233 +131,31 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
+ * Private routines
*/
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -442,324 +168,102 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
-{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
-
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
-{
- cfp *fp = pg_malloc0(sizeof(cfp));
-
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode_compression);
- else
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode);
- else
- fp->compressedfp = gzopen(path, mode);
- }
+ CompressFileHandle *CFH;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
- if (fd >= 0)
- fp->uncompressedfp = fdopen(fd, mode);
- else
- fp->uncompressedfp = fopen(path, mode);
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
- }
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ InitCompressFileHandleNone(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressFileHandleGzip(CFH, compression_spec);
- return fp;
+ return CFH;
}
/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
+ * If the file at 'path' contains the suffix of a supported compression method,
+ * currently this includes only ".gz", then this compression will be used
+ * throughout. Otherwise the compression will be inferred by iteratively trying
+ * to open the file at 'path', first as is, then by appending known compression
+ * suffixes. So if you pass "foo" as 'path', this will open either "foo" or
+ * "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret;
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (size == 0)
- return 0;
+ Assert(strcmp(mode, PG_BINARY_R) == 0);
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ fname = pg_strdup(path);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
else
-#endif
{
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret;
+ bool exists;
+ exists = (stat(path, &st) == 0);
+ /* avoid unused warning if it is not built with compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
+ if (!exists)
{
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
- }
- else
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
}
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
-
-int
-cfclose(cfp *fp)
-{
- int result;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
+ free_keep_errno(fname);
- return result;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
+ int ret = 0;
-const char *
-get_cfp_error(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
-}
+ free_keep_errno(CFH);
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ return ret;
}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 8beb1058ec..d41c495025 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -23,50 +23,160 @@
extern char *supports_compression(const pg_compress_specification compression_spec);
-/* Prototype for callback function to WriteDataToArchive() */
+/*
+ * Prototype for callback function used in writeData()
+ */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
/*
- * Prototype for callback function to ReadDataFromArchive()
+ * Prototype for callback function used in readData()
*
- * ReadDataFromArchive will call the read function repeatedly, until it
- * returns 0 to signal EOF. ReadDataFromArchive passes a buffer to read the
- * data into in *buf, of length *buflen. If that's not big enough for the
- * callback function, it can free() it and malloc() a new one, returning the
- * new buffer and its size in *buf and *buflen.
+ * readData will call the read function repeatedly, until it returns 0 to signal
+ * EOF. readData passes a buffer to read the data into in *buf, of length
+ * *buflen. If that's not big enough for the callback function, it can free() it
+ * and malloc() a new one, returning the new buffer and its size in *buf and
+ * *buflen.
*
* Returns the number of bytes read into *buf, or 0 on EOF.
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Callback function to read from an already processed input stream
+ */
+ ReadFunc readF;
+
+ /*
+ * Callback function to write an already processed chunk of data.
+ */
+ WriteFunc writeF;
+
+ /*
+ * Compression specification for this state.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ /*
+ * Open a file in mode.
+ *
+ * Pass either 'path' or 'fd' depending on whether a filepath or a file
+ * descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
+ * 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ */
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+
+ /*
+ * Open a file for writing.
+ *
+ * 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
+ * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
-typedef struct cfp cfp;
+ /*
+ * Read 'size' bytes of data from the file and store them into 'ptr'.
+ */
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ /*
+ * Write 'size' bytes of data into the file from 'ptr'.
+ */
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ /*
+ * Read at most size - 1 characters from the compress file handle into
+ * 's'.
+ *
+ * Stop if an EOF or a newline is found first. 's' is always null
+ * terminated and contains the newline if it was found.
+ */
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+
+ /*
+ * Read the next character from the compress file handle as 'unsigned
+ * char' cast into 'int'.
+ */
+ int (*getc_func) (CompressFileHandle *CFH);
+
+ /*
+ * Test if EOF is reached in the compress file handle.
+ */
+ int (*eof_func) (CompressFileHandle *CFH);
+
+ /*
+ * Close an open file handle.
+ */
+ int (*close_func) (CompressFileHandle *CFH);
+
+ /*
+ * Get a pointer to a string that describes an error that occurred during a
+ * compress file handle operation.
+ */
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ /*
+ * Compression specification for this file handle.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
+
+/*
+ * Initialize a compress file handle with the requested compression.
+ */
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+
+/*
+ * Initialize a compress file stream. Infer the compression algorithm
+ * from 'path', either by examining its suffix or by appending the
+ * supported suffixes to 'path'.
+ */
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
new file mode 100644
index 0000000000..ecbcf4b04a
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.c
@@ -0,0 +1,206 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.c
+ * Routines for archivers to read or write an uncompressed stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_none.h"
+#include "pg_backup_utils.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
+
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
+
+ if (size == 0)
+ return 0;
+
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
+}
+
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
+
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
+}
+
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
+
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
+
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static int
+close_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
+
+ CFH->private_data = NULL;
+
+ if (fp)
+ ret = fclose(fp);
+
+ return ret;
+}
+
+static int
+eof_none(CompressFileHandle *CFH)
+{
+ return feof((FILE *) CFH->private_data);
+}
+
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
+
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
+}
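The pattern used by InitCompressFileHandleNone above — fill a struct of function pointers so callers dispatch through `*_func` members without naming a concrete implementation — can be shown with a minimal sketch. The struct and names here are simplified stand-ins, not the real CompressFileHandle API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct handle;
typedef size_t (*write_fn) (const void *ptr, size_t size, struct handle *h);

/* A cut-down analogue of CompressFileHandle: one operation plus
 * implementation-owned state in private_data. */
struct handle
{
    write_fn    write_func;
    void       *private_data;
};

/* A "none"-style implementation that just copies into a memory buffer. */
static size_t
write_to_buffer(const void *ptr, size_t size, struct handle *h)
{
    memcpy(h->private_data, ptr, size);
    return size;
}

/* Analogous to InitCompressFileHandleNone(): wire up the vtable. */
static void
init_none(struct handle *h, char *buf)
{
    h->write_func = write_to_buffer;
    h->private_data = buf;
}
```

A gzip implementation would install different function pointers in the same struct, which is why pg_backup_archiver.c can stay compression-agnostic.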
diff --git a/src/bin/pg_dump/compress_none.h b/src/bin/pg_dump/compress_none.h
new file mode 100644
index 0000000000..143e599819
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.h
+ * Uncompressed interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_NONE_H_
+#define _COMPRESS_NONE_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_NONE_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..84e9f0defa 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,7 +1,9 @@
# Copyright (c) 2022-2023, PostgreSQL Global Development Group
pg_dump_common_sources = files(
+ 'compress_gzip.c',
'compress_io.c',
+ 'compress_none.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index e02ce22db2..2bc5648ed6 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -354,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1128,7 +1128,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1144,9 +1144,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1503,6 +1504,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1525,33 +1527,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1690,7 +1691,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2032,6 +2037,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2062,26 +2079,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
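The new _fileExistsInDirectory helper relies on a standard C idiom worth spelling out: snprintf returns the length it *wanted* to write, so a return value greater than or equal to the buffer size detects truncation before the path is used. A minimal sketch of that check, with a deliberately tiny buffer size instead of the real MAXPGPATH:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TINY_PATH_MAX 16        /* small on purpose; MAXPGPATH is much larger */

/* Join dir and file into buf; report failure (instead of a fatal error,
 * as _fileExistsInDirectory does) when the result would not fit. */
static bool
join_path(char *buf, size_t buflen, const char *dir, const char *file)
{
    return snprintf(buf, buflen, "%s/%s", dir, file) < (int) buflen;
}
```

Without this check, a silently truncated buffer would make the code stat a wrong, shorter path.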
@@ -2179,6 +2182,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2234,7 +2238,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3723,10 +3725,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
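The version bump above works because MAKE_ARCHIVE_VERSION packs major/minor/rev into a single integer, so feature gates like the one in ReadHead() reduce to one comparison. A hedged sketch of that packing — the actual macro in pg_backup_archiver.h may differ in its exact layout:

```c
#include <assert.h>

/* Illustrative packing: two 8-bit fields below the major number. */
#define SKETCH_VERSION(major, minor, rev) \
    ((((major) * 256) + (minor)) * 256 + (rev))

#define SKETCH_1_14 SKETCH_VERSION(1, 14, 0)
#define SKETCH_1_15 SKETCH_VERSION(1, 15, 0)

/* Mimics the gate in ReadHead(): only archives at version 1.15 or
 * later carry the compression algorithm byte in the header. */
static int
header_has_algorithm_byte(int version)
{
    return version >= SKETCH_1_15;
}
```

Older readers see an unknown minor version and refuse the archive, while newer readers still fall back to the level-based guess for pre-1.15 archives.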
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 7529367a7b..b576b29924 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
@@ -977,7 +988,7 @@ _readBlockHeader(ArchiveHandle *AH, int *type, int *id)
}
/*
- * Callback function for WriteDataToArchive. Writes one block of (compressed)
+ * Callback function for writeData. Writes one block of (compressed)
* data to the archive.
*/
static void
@@ -992,7 +1003,7 @@ _CustomWriteFunc(ArchiveHandle *AH, const char *buf, size_t len)
}
/*
- * Callback function for ReadDataFromArchive. To keep things simple, we
+ * Callback function for readData. To keep things simple, we
* always read one compressed block at a time.
*/
static size_t
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..a2f88995c0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
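The blobs.toc file that _EndLO writes and _LoadLOs reads back is plain text, one line per large object in the form `<oid> blob_<oid>.dat`. A minimal round-trip of that line format, using hypothetical helper names (the real code formats with snprintf in _EndLO and parses with sscanf in _LoadLOs):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format one blobs.toc entry, as _EndLO does. */
static int
format_lo_entry(char *buf, size_t buflen, unsigned oid)
{
    return snprintf(buf, buflen, "%u blob_%u.dat\n", oid, oid);
}

/* Parse one blobs.toc line back into its OID and filename, as
 * _LoadLOs does.  Caller supplies at least 32 bytes for fname. */
static int
parse_lo_entry(const char *line, unsigned *oid, char *fname)
{
    return sscanf(line, "%u %31s", oid, fname) == 2;
}
```

Because the TOC is uncompressed text read with gets_func(), the per-LO data files can still be compressed independently of the TOC itself.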
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d92247c915..78454928cc 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 9a471e40aa..d44ebb06cc 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -14,6 +14,10 @@
#ifndef PG_COMPRESSION_H
#define PG_COMPRESSION_H
+/*
+ * These values are stored on disk, for example in files generated by pg_dump.
+ * Create the necessary backwards-compatibility layers if their order changes.
+ */
typedef enum pg_compress_algorithm
{
PG_COMPRESSION_NONE,
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..db429474a2 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,7 +150,9 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 45fc5759ce..1d59b82003 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -428,6 +428,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1034,6 +1035,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Attachment: v28-0001-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From 1dc258e7d995145894b288d30ebd819de694ddfa Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 16 Feb 2023 10:47:59 +0000
Subject: [PATCH v28 1/3] Prepare pg_dump internals for additional compression
methods.
Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of
unifying compression-related code and allowing for the introduction of
additional archive formats. However, pg_backup_archiver.c was not using that
API. This commit teaches pg_backup_archiver.c to use it throughout.
---
src/bin/pg_dump/compress_io.c | 92 +++++++++++++++---
src/bin/pg_dump/compress_io.h | 4 +
src/bin/pg_dump/pg_backup_archiver.c | 138 ++++++++++-----------------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-----
4 files changed, 133 insertions(+), 128 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..e1733ce57c 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,36 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+/*----------------------
+ * Generic functions
+ *----------------------
+ */
+
+char *
+supports_compression(const pg_compress_specification compression_spec)
+{
+ const pg_compress_algorithm algorithm = compression_spec.algorithm;
+ bool supported = false;
+
+ if (algorithm == PG_COMPRESSION_NONE)
+ supported = true;
+#ifdef HAVE_LIBZ
+ if (algorithm == PG_COMPRESSION_GZIP)
+ supported = true;
+#endif
+
+ if (!supported)
+ return psprintf("this build does not support compression with %s",
+ get_compress_algorithm_name(algorithm));
+
+ return NULL;
+}
+
+
/*----------------------
* Compressor API
*----------------------
@@ -490,16 +520,19 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ cfp *fp = pg_malloc0(sizeof(cfp));
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
@@ -511,15 +544,20 @@ cfopen(const char *path, const char *mode,
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode_compression);
+ else
+ fp->compressedfp = gzopen(path, mode_compression);
}
else
{
/* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode);
+ else
+ fp->compressedfp = gzopen(path, mode);
}
- fp->uncompressedfp = NULL;
if (fp->compressedfp == NULL)
{
free_keep_errno(fp);
@@ -531,10 +569,11 @@ cfopen(const char *path, const char *mode,
}
else
{
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
-#endif
- fp->uncompressedfp = fopen(path, mode);
+ if (fd >= 0)
+ fp->uncompressedfp = fdopen(fd, mode);
+ else
+ fp->uncompressedfp = fopen(path, mode);
+
if (fp->uncompressedfp == NULL)
{
free_keep_errno(fp);
@@ -545,6 +584,33 @@ cfopen(const char *path, const char *mode,
return fp;
}
+/*
+ * Opens file 'path' in 'mode' and compression as defined in
+ * compression_spec. The caller must verify that the compression
+ * is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+/*
+ * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
+ * and compression as defined in compression_spec. The caller must
+ * verify that the compression is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..8beb1058ec 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,6 +21,8 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
+extern char *supports_compression(const pg_compress_specification compression_spec);
+
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -54,6 +56,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 269bfce019..e02ce22db2 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -363,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +382,21 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
- AH->PrintTocDataPtr != NULL)
+ if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+ char *errmsg = supports_compression(AH->compression_spec);
+ if (errmsg)
+ pg_fatal("cannot restore data from compressed archive (%s)",
+ errmsg);
+ else
+ break;
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1133,7 +1128,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1508,58 +1503,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1570,33 +1539,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1720,22 +1680,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2224,6 +2179,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2277,8 +2233,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3712,6 +3668,7 @@ WriteHead(ArchiveHandle *AH)
void
ReadHead(ArchiveHandle *AH)
{
+ char *errmsg;
char vmaj,
vmin,
vrev;
@@ -3781,10 +3738,13 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
-#endif
+ errmsg = supports_compression(AH->compression_spec);
+ if (errmsg)
+ {
+ pg_log_warning("archive is compressed, but this installation does not support compression (%s) -- no data will be available",
+ errmsg);
+ pg_free(errmsg);
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
Hi Georgios,
I spent some time looking at the patch again, and IMO it's RFC. But I
need some help with the commit messages - I updated 0001 and 0002, but I
wasn't quite sure what some of it was meant to say, and/or whether it
came from an earlier patch version and is now obsolete.
Could you go over them and check if I got it right? Also feel free to
update the list of reviewers (I compiled that from substantial reviews
on the thread).
The 0003 commit message seems somewhat confusing - I added some XXX
lines asking about unclear stuff.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
0001-Prepare-pg_dump-internals-for-additional-compres-v29.patch (text/x-patch)
From 1c0fdd6b294411368d4abd352c46bd31c58ba121 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 16 Feb 2023 10:47:59 +0000
Subject: [PATCH 1/3] Prepare pg_dump internals for additional compression
methods
Commit bf9aa490db introduced a generic compression API in
compress_io.{c,h} to make reuse easier and allow adding more compression
algorithms. However, pg_backup_archiver.c was not switched to this API
and continued to call the compression routines directly.
This commit teaches pg_backup_archiver.c about the compression API, so
that it can benefit from bf9aa490db (simpler code, easier addition of
new compression methods).
Most of the code was written by Georgios Kokolatos. Rachel Heaton provided
invaluable help with expanding the test coverage, testing on
different platforms and providing debug information on those, as well as
native-speaker wording.
Author: Georgios Kokolatos, Rachel Heaton
Reviewed-by: Michael Paquier, Justin Pryzby, Daniel Gustafsson, Tomas
Vondra
Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
---
src/bin/pg_dump/compress_io.c | 92 +++++++++++++++---
src/bin/pg_dump/compress_io.h | 4 +
src/bin/pg_dump/pg_backup_archiver.c | 138 ++++++++++-----------------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-----
4 files changed, 133 insertions(+), 128 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4c..e1733ce57cf 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,36 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+/*----------------------
+ * Generic functions
+ *----------------------
+ */
+
+char *
+supports_compression(const pg_compress_specification compression_spec)
+{
+ const pg_compress_algorithm algorithm = compression_spec.algorithm;
+ bool supported = false;
+
+ if (algorithm == PG_COMPRESSION_NONE)
+ supported = true;
+#ifdef HAVE_LIBZ
+ if (algorithm == PG_COMPRESSION_GZIP)
+ supported = true;
+#endif
+
+ if (!supported)
+ return psprintf("this build does not support compression with %s",
+ get_compress_algorithm_name(algorithm));
+
+ return NULL;
+}
+
+
/*----------------------
* Compressor API
*----------------------
@@ -490,16 +520,19 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ cfp *fp = pg_malloc0(sizeof(cfp));
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
@@ -511,15 +544,20 @@ cfopen(const char *path, const char *mode,
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode_compression);
+ else
+ fp->compressedfp = gzopen(path, mode_compression);
}
else
{
/* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode);
+ else
+ fp->compressedfp = gzopen(path, mode);
}
- fp->uncompressedfp = NULL;
if (fp->compressedfp == NULL)
{
free_keep_errno(fp);
@@ -531,10 +569,11 @@ cfopen(const char *path, const char *mode,
}
else
{
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
-#endif
- fp->uncompressedfp = fopen(path, mode);
+ if (fd >= 0)
+ fp->uncompressedfp = fdopen(fd, mode);
+ else
+ fp->uncompressedfp = fopen(path, mode);
+
if (fp->uncompressedfp == NULL)
{
free_keep_errno(fp);
@@ -545,6 +584,33 @@ cfopen(const char *path, const char *mode,
return fp;
}
+/*
+ * Opens file 'path' in 'mode' and compression as defined in
+ * compression_spec. The caller must verify that the compression
+ * is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+/*
+ * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
+ * and compression as defined in compression_spec. The caller must
+ * verify that the compression is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789d..8beb1058ec2 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,6 +21,8 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
+extern char *supports_compression(const pg_compress_specification compression_spec);
+
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -54,6 +56,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 269bfce019b..e02ce22db2e 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -363,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +382,21 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
- AH->PrintTocDataPtr != NULL)
+ if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+ char *errmsg = supports_compression(AH->compression_spec);
+ if (errmsg)
+ pg_fatal("cannot restore data from compressed archive (%s)",
+ errmsg);
+ else
+ break;
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1133,7 +1128,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1508,58 +1503,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1570,33 +1539,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1720,22 +1680,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2224,6 +2179,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2277,8 +2233,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3712,6 +3668,7 @@ WriteHead(ArchiveHandle *AH)
void
ReadHead(ArchiveHandle *AH)
{
+ char *errmsg;
char vmaj,
vmin,
vrev;
@@ -3781,10 +3738,13 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
-#endif
+ errmsg = supports_compression(AH->compression_spec);
+ if (errmsg)
+ {
+ pg_log_warning("archive is compressed, but this installation does not support compression (%s) -- no data will be available",
+ errmsg);
+ pg_free(errmsg);
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b4..4725e49747b 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.39.1
0002-Introduce-a-generic-pg_dump-compression-API-v29.patch (text/x-patch)
From f5002e563b48a0196143910748ce0d6b92a387c9 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 16 Feb 2023 11:00:25 +0000
Subject: [PATCH 2/3] Introduce a generic pg_dump compression API
Switch pg_dump to use the Compression API, implemented by bf9aa490db.
The CompressFileHandle replaces the cfp* family of functions with a
struct of callbacks for accessing (compressed) files. This allows adding
new compression methods simply by introducing a new struct instance with
appropriate implementation of the callbacks.
Archives compressed using custom compression methods store an identifier
of the compression algorithm in their header instead of the compression
level. The header version is bumped.
Most of the code was written by Georgios Kokolatos. Rachel Heaton provided
invaluable help with expanding the test coverage, testing on
different platforms and providing debug information on those, as well as
native-speaker wording.
Author: Georgios Kokolatos, Rachel Heaton
Reviewed-by: Michael Paquier, Justin Pryzby, Daniel Gustafsson, Tomas
Vondra
Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
---
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_gzip.c | 401 ++++++++++++++
src/bin/pg_dump/compress_gzip.h | 24 +
src/bin/pg_dump/compress_io.c | 744 +++++---------------------
src/bin/pg_dump/compress_io.h | 166 +++++-
src/bin/pg_dump/compress_none.c | 206 +++++++
src/bin/pg_dump/compress_none.h | 24 +
src/bin/pg_dump/meson.build | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 91 ++--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 27 +-
src/bin/pg_dump/pg_backup_directory.c | 94 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/include/common/compression.h | 4 +
src/tools/pginclude/cpluspluscheck | 2 +
src/tools/pgindent/typedefs.list | 2 +
16 files changed, 1053 insertions(+), 751 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
create mode 100644 src/bin/pg_dump/compress_none.c
create mode 100644 src/bin/pg_dump/compress_none.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e51..0013bc080cf 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,7 +24,9 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
+ compress_none.o \
dumputils.o \
parallel.o \
pg_backup_archiver.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 00000000000..24e68fd0221
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,401 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at the time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ CFH->private_data = NULL;
+
+ return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
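The callback-dispatch pattern that InitCompressFileHandleGzip relies on can be sketched in isolation. The following standalone C sketch (hypothetical names; not part of the patch) mimics the CompressFileHandle idea with a "none" backend over stdio, showing how a caller round-trips data through the handle without knowing which backend is wired in:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for CompressFileHandle: a vtable of I/O callbacks. */
typedef struct FileHandle FileHandle;
struct FileHandle
{
	size_t		(*read_func) (void *ptr, size_t size, FileHandle *fh);
	size_t		(*write_func) (const void *ptr, size_t size, FileHandle *fh);
	int			(*close_func) (FileHandle *fh);
	void	   *private_data;	/* backend-specific state (here: FILE *) */
};

static size_t
none_read(void *ptr, size_t size, FileHandle *fh)
{
	return fread(ptr, 1, size, (FILE *) fh->private_data);
}

static size_t
none_write(const void *ptr, size_t size, FileHandle *fh)
{
	return fwrite(ptr, 1, size, (FILE *) fh->private_data);
}

static int
none_close(FileHandle *fh)
{
	int			ret = fclose((FILE *) fh->private_data);

	fh->private_data = NULL;
	return ret;
}

/* Analogue of an Init*None() routine: wire up the callbacks. */
static void
init_handle_none(FileHandle *fh, FILE *fp)
{
	fh->read_func = none_read;
	fh->write_func = none_write;
	fh->close_func = none_close;
	fh->private_data = fp;
}

/* Round-trip "hello" through one handle; returns bytes read back. */
size_t
demo_roundtrip(char *out, size_t outsz)
{
	FILE	   *fp = tmpfile();
	FileHandle	fh;
	const char *msg = "hello";
	size_t		n;

	if (fp == NULL)
		return 0;
	init_handle_none(&fh, fp);
	fh.write_func(msg, strlen(msg), &fh);
	rewind(fp);
	n = fh.read_func(out, outsz, &fh);
	fh.close_func(&fh);
	return n;
}
```

A gzip backend would swap fread/fwrite/fclose for gzread/gzwrite/gzclose behind the same pointers, which is exactly the substitution the patch performs.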
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 00000000000..2392c697b4c
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * GZIP interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index e1733ce57cf..c507eeb7b3d 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,54 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API is designed so that the
+ * resulting files can be easily manipulated with an external compression
+ * utility program.
+ *
+ * This file also includes the implementation of both APIs when no
+ * compression is used.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then read all the data by calling readData to read the
+ * whole compressed stream which repeatedly calls the given ReadFunc. ReadFunc
+ * returns the compressed data one chunk at a time. Then readData decompresses
+ * it and passes the decompressed data to ahwrite(), until ReadFunc returns 0
+ * to signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API provides a set of function pointers for
+ * opening, reading, writing, and finally closing files. The implemented
+ * function pointers are documented in the corresponding header file and are
+ * common for all streams. It allows the caller to use the same functions for
+ * both compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix is not known, then it tries to simply
+ * open the file and, if that fails, it tries to open the same file with
+ * the .gz suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,13 +65,14 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_none.h"
#include "pg_backup_utils.h"
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
/*----------------------
* Generic functions
*----------------------
@@ -91,110 +104,25 @@ supports_compression(const pg_compress_specification compression_spec)
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
-{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
-
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-
-/* Public interface routines */
-
-/* Allocate a new compressor */
+/*
+ * Allocate a new compressor.
+ */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-}
+ InitCompressorNone(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorGzip(cs, compression_spec);
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -203,233 +131,31 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
-}
-
-
/*----------------------
* Compressed stream API
*----------------------
*/
/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
+ * Private routines
*/
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -442,324 +168,102 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-cfp *
-cfopen_read(const char *path, const char *mode)
-{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
-
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
-{
- cfp *fp = pg_malloc0(sizeof(cfp));
-
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode_compression);
- else
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode);
- else
- fp->compressedfp = gzopen(path, mode);
- }
+ CompressFileHandle *CFH;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
- if (fd >= 0)
- fp->uncompressedfp = fdopen(fd, mode);
- else
- fp->uncompressedfp = fopen(path, mode);
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
- }
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ InitCompressFileHandleNone(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressFileHandleGzip(CFH, compression_spec);
- return fp;
+ return CFH;
}
/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
+ * If the file at 'path' contains the suffix of a supported compression method,
+ * currently this includes only ".gz", then this compression will be used
+ * throughout. Otherwise the compression will be inferred by iteratively trying
+ * to open the file at 'path', first as is, then by appending known compression
+ * suffixes. So if you pass "foo" as 'path', this will open either "foo" or
+ * "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret;
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (size == 0)
- return 0;
+ Assert(strcmp(mode, PG_BINARY_R) == 0);
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ fname = pg_strdup(path);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
else
-#endif
{
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret;
+ bool exists;
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when built without compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
+ if (!exists)
{
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
- }
- else
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
}
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
-
-int
-cfclose(cfp *fp)
-{
- int result;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
+ free_keep_errno(fname);
- return result;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
+ int ret = 0;
-const char *
-get_cfp_error(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
-}
+ free_keep_errno(CFH);
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ return ret;
}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 8beb1058ec2..d41c4950257 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -23,50 +23,160 @@
extern char *supports_compression(const pg_compress_specification compression_spec);
-/* Prototype for callback function to WriteDataToArchive() */
+/*
+ * Prototype for callback function used in writeData()
+ */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
/*
- * Prototype for callback function to ReadDataFromArchive()
+ * Prototype for callback function used in readData()
*
- * ReadDataFromArchive will call the read function repeatedly, until it
- * returns 0 to signal EOF. ReadDataFromArchive passes a buffer to read the
- * data into in *buf, of length *buflen. If that's not big enough for the
- * callback function, it can free() it and malloc() a new one, returning the
- * new buffer and its size in *buf and *buflen.
+ * readData will call the read function repeatedly, until it returns 0 to signal
+ * EOF. readData passes a buffer to read the data into in *buf, of length
+ * *buflen. If that's not big enough for the callback function, it can free() it
+ * and malloc() a new one, returning the new buffer and its size in *buf and
+ * *buflen.
*
* Returns the number of bytes read into *buf, or 0 on EOF.
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Callback function to read from an already processed input stream
+ */
+ ReadFunc readF;
+
+ /*
+ * Callback function to write an already processed chunk of data.
+ */
+ WriteFunc writeF;
+
+ /*
+ * Compression specification for this state.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ /*
+ * Open a file in mode.
+ *
+ * Pass either 'path' or 'fd' depending on whether a filepath or a file
+ * descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
+ * 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ */
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+
+ /*
+ * Open a file for writing.
+ *
+ * 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
+ * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
-typedef struct cfp cfp;
+ /*
+ * Read 'size' bytes of data from the file and store them into 'ptr'.
+ */
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ /*
+ * Write 'size' bytes of data into the file from 'ptr'.
+ */
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ /*
+ * Read at most size - 1 characters from the compress file handle into
+ * 's'.
+ *
+ * Stop if an EOF or a newline is found first. 's' is always null
+ * terminated and contains the newline if it was found.
+ */
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+
+ /*
+ * Read the next character from the compress file handle as 'unsigned
+ * char' cast into 'int'.
+ */
+ int (*getc_func) (CompressFileHandle *CFH);
+
+ /*
+ * Test if EOF is reached in the compress file handle.
+ */
+ int (*eof_func) (CompressFileHandle *CFH);
+
+ /*
+ * Close an open file handle.
+ */
+ int (*close_func) (CompressFileHandle *CFH);
+
+ /*
+ * Get a pointer to a string that describes an error that occurred during a
+ * compress file handle operation.
+ */
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ /*
+ * Compression specification for this file handle.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
+
+/*
+ * Initialize a compress file handle with the requested compression.
+ */
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+
+/*
+ * Initialize a compress file stream. Infer the compression algorithm
+ * from 'path', either by examining its suffix or by appending the supported
+ * suffixes to 'path'.
+ */
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
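The header above replaces the opaque `cfp` API with a table of function pointers. As a rough standalone illustration of that pattern (not pg_dump's actual code — the names and the stdio-only backend here are invented for the sketch), a "none" backend can forward each callback straight to stdio:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Minimal sketch of the function-pointer style used by CompressFileHandle:
 * the handle carries callbacks plus backend-private state.  All names here
 * are illustrative; the real API also has read/gets/getc/eof/error hooks.
 */
typedef struct DemoFileHandle
{
	int			(*open_write) (const char *path, const char *mode,
							   struct DemoFileHandle *h);
	size_t		(*write) (const void *ptr, size_t size,
						  struct DemoFileHandle *h);
	int			(*close) (struct DemoFileHandle *h);
	void	   *private_data;	/* backend state, here a plain FILE * */
} DemoFileHandle;

static int
open_write_none(const char *path, const char *mode, DemoFileHandle *h)
{
	h->private_data = fopen(path, mode);
	return (h->private_data == NULL) ? 1 : 0;	/* nonzero on failure */
}

static size_t
write_none(const void *ptr, size_t size, DemoFileHandle *h)
{
	return fwrite(ptr, 1, size, (FILE *) h->private_data);
}

static int
close_none(DemoFileHandle *h)
{
	FILE	   *fp = (FILE *) h->private_data;

	h->private_data = NULL;
	return fp ? fclose(fp) : 0;
}

/* Round-trip: write through the handle, read back with plain stdio. */
int
demo_roundtrip(void)
{
	DemoFileHandle h = {open_write_none, write_none, close_none, NULL};
	const char *msg = "hello";
	char		buf[16] = {0};
	FILE	   *fp;

	if (h.open_write("demo_none.tmp", "wb", &h) != 0)
		return 1;
	if (h.write(msg, strlen(msg), &h) != strlen(msg))
		return 1;
	if (h.close(&h) != 0)
		return 1;

	fp = fopen("demo_none.tmp", "rb");
	if (fp == NULL)
		return 1;
	if (fread(buf, 1, sizeof(buf) - 1, fp) != strlen(msg))
	{
		fclose(fp);
		return 1;
	}
	fclose(fp);
	remove("demo_none.tmp");
	return strcmp(buf, msg) == 0 ? 0 : 1;
}
```

In the patch itself, a gzip or lz4 backend only has to fill in the same slots, and callers such as `EndCompressFileHandle()` stay backend-agnostic.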
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
new file mode 100644
index 00000000000..ecbcf4b04a7
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.c
@@ -0,0 +1,206 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.c
+ * Routines for archivers to read or write an uncompressed stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_none.h"
+#include "pg_backup_utils.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
+
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
+
+ if (size == 0)
+ return 0;
+
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
+}
+
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
+
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
+}
+
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
+
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
+
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static int
+close_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
+
+ CFH->private_data = NULL;
+
+ if (fp)
+ ret = fclose(fp);
+
+ return ret;
+}
+
+static int
+eof_none(CompressFileHandle *CFH)
+{
+ return feof((FILE *) CFH->private_data);
+}
+
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
+
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
+}
diff --git a/src/bin/pg_dump/compress_none.h b/src/bin/pg_dump/compress_none.h
new file mode 100644
index 00000000000..143e599819d
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.h
+ * Uncompressed interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_NONE_H_
+#define _COMPRESS_NONE_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_NONE_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a3740..84e9f0defa4 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,7 +1,9 @@
# Copyright (c) 2022-2023, PostgreSQL Global Development Group
pg_dump_common_sources = files(
+ 'compress_gzip.c',
'compress_io.c',
+ 'compress_none.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index e02ce22db2e..2bc5648ed64 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -354,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1128,7 +1128,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1144,9 +1144,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1503,6 +1504,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1525,33 +1527,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1690,7 +1691,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2032,6 +2037,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
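The same build-path/check-truncation/stat() pattern as a standalone sketch — `DEMO_MAXPATH` and the soft-failure return are assumptions for the sketch; pg_dump uses `MAXPGPATH` and aborts via `pg_fatal()` on truncation:

```c
#include <stdio.h>
#include <sys/stat.h>

#define DEMO_MAXPATH 64			/* stand-in for MAXPGPATH */

/*
 * Sketch of the _fileExistsInDirectory() pattern: build "dir/filename",
 * refuse truncated paths, then stat().  Returns 1 if the path names a
 * regular file, 0 otherwise (including the too-long case, where pg_dump
 * would instead call pg_fatal()).
 */
int
file_exists_in_directory(const char *dir, const char *filename)
{
	struct stat st;
	char		buf[DEMO_MAXPATH];

	if (snprintf(buf, sizeof(buf), "%s/%s", dir, filename) >= (int) sizeof(buf))
		return 0;				/* truncated: treat as "not found" here */

	return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
}
```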
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2062,26 +2079,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2179,6 +2182,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2234,7 +2238,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3723,10 +3725,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747b..18b38c17abc 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
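The new `K_VERS_1_15` constant gates reading the compression-algorithm byte in `ReadHead()`. `MAKE_ARCHIVE_VERSION`'s definition is not in this hunk, so the one-byte-per-component packing below is an assumption made for the sketch:

```c
/*
 * Assumed packing for MAKE_ARCHIVE_VERSION (one byte each for major,
 * minor, and revision); the real definition lives elsewhere in
 * pg_backup_archiver.h.
 */
#define MAKE_ARCHIVE_VERSION(major, minor, rev) \
	(((major) * 256 + (minor)) * 256 + (rev))

#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0)
#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0)

/* ReadHead() can then gate the new compression_algorithm byte: */
int
has_algorithm_byte(int archive_version)
{
	return archive_version >= K_VERS_1_15;
}
```

Packing the components this way keeps plain integer comparison valid across version bumps, which is all the `AH->version >= K_VERS_1_15` check in the patch relies on.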
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 7529367a7b9..b576b299240 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
@@ -977,7 +988,7 @@ _readBlockHeader(ArchiveHandle *AH, int *type, int *id)
}
/*
- * Callback function for WriteDataToArchive. Writes one block of (compressed)
+ * Callback function for writeData. Writes one block of (compressed)
* data to the archive.
*/
static void
@@ -992,7 +1003,7 @@ _CustomWriteFunc(ArchiveHandle *AH, const char *buf, size_t len)
}
/*
- * Callback function for ReadDataFromArchive. To keep things simple, we
+ * Callback function for readData. To keep things simple, we
* always read one compressed block at a time.
*/
static size_t
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3cceef..a2f88995c00 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d92247c915c..78454928cca 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 9a471e40aab..d44ebb06cc2 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -14,6 +14,10 @@
#ifndef PG_COMPRESSION_H
#define PG_COMPRESSION_H
+/*
+ * These values are stored on disk, for example in files generated by pg_dump.
+ * Create the necessary backwards compatibility layers if their order changes.
+ */
typedef enum pg_compress_algorithm
{
PG_COMPRESSION_NONE,
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f5099..db429474a25 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,7 +150,9 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 22ea42c16b5..d4bb7442bec 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -429,6 +429,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1035,6 +1036,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.39.1
Attachment: 0003-Add-LZ4-compression-to-pg_dump-v29.patch (text/x-patch)
From 3a67132fa394b659c728f2748cd3aad33ff5cdd3 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 16 Feb 2023 11:00:35 +0000
Subject: [PATCH 3/3] Add LZ4 compression to pg_dump
This is mostly done within pg_dump's compression streaming and file APIs.
It is confined within the newly introduced compress_lz4.{c,h} files.
The first one is aimed at inlined use cases, and thus simple
lz4.h calls can be used directly. The second one generates output, or
parses input, which can be read or generated via the lz4 utility.
XXX The first what? File or streaming API?
XXX How is this related to lz4 utility?
The LZ4F API does not implement all the functionality corresponding
to fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(). Where the
functionality was missing from the official API, it has been implemented
locally.
XXX What is LZ4F? Does this rely on another library, not on plain lz4?
Most of the code was written by Georgios Kokolatos. Rachel Heaton provided
invaluable help with expanding the testing coverage, testing on
different platforms and providing debug information on those, as well as
native speaker wording.
Author: Georgios Kokolatos, Rachel Heaton
Reviewed-by: Michael Paquier, Justin Pryzby, Daniel Gustafsson, Tomas
Vondra
Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 24 +-
src/bin/pg_dump/compress_lz4.c | 626 ++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 24 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 9 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
12 files changed, 780 insertions(+), 21 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e14..49d218905fb 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tool.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 0013bc080cf..eb8f59459a1 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -26,6 +27,7 @@ OBJS = \
$(WIN32RES) \
compress_gzip.o \
compress_io.o \
+ compress_lz4.o \
compress_none.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index c507eeb7b3d..929b38902b2 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,7 +56,7 @@
* InitDiscoverCompressFileHandle tries to deffer the compression by the
* filename suffix. If the suffix is not yet known, then it tries to simply
* open the file, and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -70,6 +70,7 @@
#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_lz4.h"
#include "compress_none.h"
#include "pg_backup_utils.h"
@@ -90,6 +91,10 @@ supports_compression(const pg_compress_specification compression_spec)
if (algorithm == PG_COMPRESSION_GZIP)
supported = true;
#endif
+#ifdef USE_LZ4
+ if (algorithm == PG_COMPRESSION_LZ4)
+ supported = true;
+#endif
if (!supported)
return psprintf("this build does not support compression with %s",
@@ -121,6 +126,8 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressorLZ4(cs, compression_spec);
return cs;
}
@@ -185,6 +192,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressFileHandleNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressFileHandleGzip(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressFileHandleLZ4(CFH, compression_spec);
return CFH;
}
@@ -198,7 +207,7 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* throughout. Otherwise the compression will be deferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -236,6 +245,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 00000000000..ee74cc8e28a
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,626 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write a LZ4 compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/*
+ * Public routines that support LZ4 compressed data I/O
+ */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). End of file is reached when
+ * the overflow buffer holds no more decompressed output and the
+ * underlying file has reached EOF.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
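The overflow-drain logic above is easy to get subtly wrong (newline handling, shifting the remainder), so here is a pure-Python sketch of the same behaviour for reference; it is an illustration of the technique, not a translation of the committed code:

```python
# Drain up to `size` already-decompressed bytes from an overflow buffer,
# stopping after the first newline when eol_flag is set, and shift the
# unread remainder to the front (as LZ4File_read_overflow does).
def read_overflow(state: dict, size: int, eol_flag: bool = False) -> bytes:
    buf = state["overflow"]
    if not buf:
        return b""
    readlen = min(size, len(buf))
    nl = buf.find(b"\n", 0, readlen) if eol_flag else -1
    if nl != -1:
        readlen = nl + 1  # include the line-terminating char
    state["overflow"] = buf[readlen:]
    return buf[:readlen]

s = {"overflow": b"abc\ndef"}
print(read_overflow(s, 16, eol_flag=True))  # b'abc\n'
print(s["overflow"])                        # b'def'
```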
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' decompressed content, or up to the new line char
+ * if found first when the eol_flag is set. It is possible that the decompressed
+ * output generated by reading any compressed input via the LZ4F API, exceeds
+ * 'ptrsize'. Any exceeding decompressed content is stored at an overflow
+ * buffer within LZ4File. Of course, when the function is called, it will first
+ * try to consume any decompressed content already present in the overflow
+ * buffer, before decompressing new content.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * fill in what space is available in ptr if the eol flag is set,
+ * either skip if one already found or fill up to EOL if present
+ * in the outbuf
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (!eol_flag) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, at the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+/*
+ * Public routines
+ */
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 00000000000..40dbe00d461
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * LZ4 interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 84e9f0defa4..0da476a4c34 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_gzip.c',
'compress_io.c',
+ 'compress_lz4.c',
'compress_none.c',
'dumputils.c',
'parallel.c',
@@ -18,7 +19,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -86,7 +87,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 2bc5648ed64..027ded4baea 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -2075,7 +2075,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2085,6 +2085,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index a2f88995c00..ce2a0838fa7 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -779,10 +779,13 @@ _PrepParallelRestore(ArchiveHandle *AH)
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
- else
+ else if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
{
- /* It might be compressed */
- strlcat(fname, ".gz", sizeof(fname));
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ strlcat(fname, ".gz", sizeof(fname));
+ else if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ strlcat(fname, ".lz4", sizeof(fname));
+
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab2..08105337b15 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 78454928cca..72b19ee6cde 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if ($pgdump_runs{$run}->{compile_option} &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index db429474a25..2c5042eb417 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d4bb7442bec..d53cff94fe7 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1387,6 +1387,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.39.1
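As an aside: since the file-handle path writes LZ4 frame format (LZ4F_compressBegin emits the frame header), a quick way to sanity-check the `.lz4` files produced by the tests above is to look at the first four bytes for the LZ4 frame magic number. A small illustrative helper (not part of the patch):

```python
import struct

LZ4_FRAME_MAGIC = 0x184D2204  # stored little-endian on disk: 04 22 4D 18

def looks_like_lz4_frame(header: bytes) -> bool:
    # True when the buffer starts with the LZ4 frame magic number.
    if len(header) < 4:
        return False
    (magic,) = struct.unpack("<I", header[:4])
    return magic == LZ4_FRAME_MAGIC

print(looks_like_lz4_frame(b"\x04\x22\x4d\x18"))  # True
print(looks_like_lz4_frame(b"PGDMP"))             # False
```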
Some little updates since I last checked:
+ * This file also includes the implementation when compression is none for
+ * both API's.
=> this comment is obsolete.
s/deffer/infer/ ?
or determine ?
This typo occurs multiple times.
currently this includes only ".gz"
=> remove this phrase from the 002 patch (or at least update it in 003).
deferred by iteratively
=> inferred?
s/Requrires/Requires/
twice.
s/occured/occurred/
s/disc/disk/ ?
Probably unimportant, but "disc" isn't used anywhere else.
"compress file handle"
=> maybe these should say "compressed"
supports_compression():
Since this is an exported function, it should probably be called
pgdump_supports_compression.
------- Original Message -------
On Sunday, February 19th, 2023 at 6:10 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
Hi Georgios,
I spent some time looking at the patch again, and IMO it's RFC. But I
need some help with the commit messages - I updated 0001 and 0002 but I
wasn't quite sure what some of the stuff meant to say and/or it seemed
maybe coming from an earlier patch version and obsolete.
Thank you very much Tomas! Indeed I have not been paying any attention
to the commit messages.
Could you go over them and check if I got it right? Also feel free to
update the list of reviewers (I compiled that from substantial reviews
on the thread).
Done. Rachel has been correctly identified as author in the relevant parts
up to commit 98fe74218d. After that, she had offered review comments and I
have taken the liberty to add her as a reviewer throughout.
Also I think that Shi Yu should be credited as a reviewer of 0003.
The 0003 commit message seems somewhat confusing - I added some XXX
lines asking about unclear stuff.
Please find in the attached v30 an updated message, as well as an amended
reviewer list. Also v30 addresses the final comments raised by Justin.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v30-0001-Prepare-pg_dump-internals-for-additional-compres.patch (text/x-patch)
From d34fb1856387c645dd0f172d636fd73ab8837ce4 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 20 Feb 2023 12:13:04 +0000
Subject: [PATCH v30 1/3] Prepare pg_dump internals for additional compression
methods
Commit bf9aa490db introduced a generic compression API in compress_io.{c,h} to
make reuse easier, and allow adding more compression algorithms. However,
pg_backup_archiver.c was not switched to this API and continued to call
the compression directly.
This commit teaches pg_backup_archiver.c about the compression API, so
that it can benefit from bf9aa490db (simpler code, easier addition of
new compression methods).
Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Tomas Vondra
Discussion:
https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
---
src/bin/pg_dump/compress_io.c | 98 ++++++++++++++++---
src/bin/pg_dump/compress_io.h | 4 +
src/bin/pg_dump/pg_backup_archiver.c | 138 ++++++++++-----------------
src/bin/pg_dump/pg_backup_archiver.h | 27 +-----
4 files changed, 139 insertions(+), 128 deletions(-)
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 7a2c80bbc4..4074cc031c 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -56,6 +56,42 @@
#include "compress_io.h"
#include "pg_backup_utils.h"
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
+/*----------------------
+ * Generic functions
+ *----------------------
+ */
+
+/*
+ * Checks whether a compression algorithm is supported.
+ *
+ * On success returns NULL, otherwise returns a malloc'ed string which can be
+ * used by the caller in an error message.
+ */
+char *
+supports_compression(const pg_compress_specification compression_spec)
+{
+ const pg_compress_algorithm algorithm = compression_spec.algorithm;
+ bool supported = false;
+
+ if (algorithm == PG_COMPRESSION_NONE)
+ supported = true;
+#ifdef HAVE_LIBZ
+ if (algorithm == PG_COMPRESSION_GZIP)
+ supported = true;
+#endif
+
+ if (!supported)
+ return psprintf("this build does not support compression with %s",
+ get_compress_algorithm_name(algorithm));
+
+ return NULL;
+}
+
+
/*----------------------
* Compressor API
*----------------------
@@ -490,16 +526,19 @@ cfopen_write(const char *path, const char *mode,
}
/*
- * Opens file 'path' in 'mode'. If compression is GZIP, the file
- * is opened with libz gzopen(), otherwise with plain fopen().
+ * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
+ * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
+ * descriptor is not dup'ed and it is the caller's responsibility to do so.
+ * The caller must verify that the 'compress_algorithm' is supported by the
+ * current build.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static cfp *
+cfopen_internal(const char *path, int fd, const char *mode,
+ pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc(sizeof(cfp));
+ cfp *fp = pg_malloc0(sizeof(cfp));
if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
{
@@ -511,15 +550,20 @@ cfopen(const char *path, const char *mode,
snprintf(mode_compression, sizeof(mode_compression), "%s%d",
mode, compression_spec.level);
- fp->compressedfp = gzopen(path, mode_compression);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode_compression);
+ else
+ fp->compressedfp = gzopen(path, mode_compression);
}
else
{
/* don't specify a level, just use the zlib default */
- fp->compressedfp = gzopen(path, mode);
+ if (fd >= 0)
+ fp->compressedfp = gzdopen(fd, mode);
+ else
+ fp->compressedfp = gzopen(path, mode);
}
- fp->uncompressedfp = NULL;
if (fp->compressedfp == NULL)
{
free_keep_errno(fp);
@@ -531,10 +575,11 @@ cfopen(const char *path, const char *mode,
}
else
{
-#ifdef HAVE_LIBZ
- fp->compressedfp = NULL;
-#endif
- fp->uncompressedfp = fopen(path, mode);
+ if (fd >= 0)
+ fp->uncompressedfp = fdopen(fd, mode);
+ else
+ fp->uncompressedfp = fopen(path, mode);
+
if (fp->uncompressedfp == NULL)
{
free_keep_errno(fp);
@@ -545,6 +590,33 @@ cfopen(const char *path, const char *mode,
return fp;
}
+/*
+ * Opens file 'path' in 'mode' and compression as defined in
+ * compression_spec. The caller must verify that the compression
+ * is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfopen(const char *path, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(path, -1, mode, compression_spec);
+}
+
+/*
+ * Associates a stream with the descriptor 'fd' in 'mode', with
+ * compression as defined in compression_spec. The caller must
+ * verify that the compression is supported by the current build.
+ *
+ * On failure, return NULL with an error code in errno.
+ */
+cfp *
+cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec)
+{
+ return cfopen_internal(NULL, fd, mode, compression_spec);
+}
int
cfread(void *ptr, int size, cfp *fp)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index a429dc4789..8beb1058ec 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,6 +21,8 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
+extern char *supports_compression(const pg_compress_specification compression_spec);
+
/* Prototype for callback function to WriteDataToArchive() */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
@@ -54,6 +56,8 @@ typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode,
const pg_compress_specification compression_spec);
+extern cfp *cfdopen(int fd, const char *mode,
+ const pg_compress_specification compression_spec);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode,
const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 269bfce019..e02ce22db2 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -31,6 +31,7 @@
#endif
#include "common/string.h"
+#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
@@ -43,13 +44,6 @@
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
-/* state needed to save/restore an archive's output target */
-typedef struct _outputContext
-{
- void *OF;
- int gzOut;
-} OutputContext;
-
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
@@ -101,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static OutputContext SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, OutputContext savedContext);
+static cfp *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -277,11 +271,8 @@ CloseArchive(Archive *AHX)
AH->ClosePtr(AH);
/* Close the output */
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else if (AH->OF != stdout)
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -363,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- OutputContext sav;
+ cfp *sav;
AH->stage = STAGE_INITIALIZING;
@@ -391,17 +382,21 @@ RestoreArchive(Archive *AHX)
/*
* Make sure we won't need (de)compression we haven't got
*/
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP &&
- AH->PrintTocDataPtr != NULL)
+ if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
- pg_fatal("cannot restore from compressed archive (compression not supported in this installation)");
+ {
+ char *errmsg = supports_compression(AH->compression_spec);
+ if (errmsg)
+ pg_fatal("cannot restore data from compressed archive (%s)",
+ errmsg);
+ else
+ break;
+ }
}
}
-#endif
/*
* Prepare index arrays, so we can assume we have them throughout restore.
@@ -1133,7 +1128,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- OutputContext sav;
+ cfp *sav;
const char *fmtName;
char stamp_str[64];
@@ -1508,58 +1503,32 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
- int fn;
+ const char *mode;
+ int fn = -1;
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
- else
- fn = -1;
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
- fn = -1;
filename = AH->fSpec;
}
else
fn = fileno(stdout);
- /* If compression explicitly requested, use gzopen */
-#ifdef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
- char fmode[14];
+ if (AH->mode == archModeAppend)
+ mode = PG_BINARY_A;
+ else
+ mode = PG_BINARY_W;
- /* Don't use PG_BINARY_x since this is zlib */
- sprintf(fmode, "wb%d", compression_spec.level);
- if (fn >= 0)
- AH->OF = gzdopen(dup(fn), fmode);
- else
- AH->OF = gzopen(filename, fmode);
- AH->gzOut = 1;
- }
+ if (fn >= 0)
+ AH->OF = cfdopen(dup(fn), mode, compression_spec);
else
-#endif
- { /* Use fopen */
- if (AH->mode == archModeAppend)
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_A);
- else
- AH->OF = fopen(filename, PG_BINARY_A);
- }
- else
- {
- if (fn >= 0)
- AH->OF = fdopen(dup(fn), PG_BINARY_W);
- else
- AH->OF = fopen(filename, PG_BINARY_W);
- }
- AH->gzOut = 0;
- }
+ AH->OF = cfopen(filename, mode, compression_spec);
if (!AH->OF)
{
@@ -1570,33 +1539,24 @@ SetOutput(ArchiveHandle *AH, const char *filename,
}
}
-static OutputContext
+static cfp *
SaveOutput(ArchiveHandle *AH)
{
- OutputContext sav;
-
- sav.OF = AH->OF;
- sav.gzOut = AH->gzOut;
-
- return sav;
+ return (cfp *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, OutputContext savedContext)
+RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
{
int res;
- errno = 0; /* in case gzclose() doesn't set it */
- if (AH->gzOut)
- res = GZCLOSE(AH->OF);
- else
- res = fclose(AH->OF);
+ errno = 0;
+ res = cfclose(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
- AH->gzOut = savedContext.gzOut;
- AH->OF = savedContext.OF;
+ AH->OF = savedOutput;
}
@@ -1720,22 +1680,17 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
bytes_written = size * nmemb;
}
- else if (AH->gzOut)
- bytes_written = GZWRITE(ptr, size, nmemb, AH->OF);
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
+ /*
+ * If we're doing a restore, and it's direct to DB, and we're connected
+ * then send it to the DB.
+ */
+ else if (RestoringToDB(AH))
+ bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- {
- /*
- * If we're doing a restore, and it's direct to DB, and we're
- * connected then send it to the DB.
- */
- if (RestoringToDB(AH))
- bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
- else
- bytes_written = fwrite(ptr, size, nmemb, AH->OF) * size;
- }
+ bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2224,6 +2179,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
@@ -2277,8 +2233,8 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
- AH->gzOut = 0;
- AH->OF = stdout;
+ out_compress_spec.algorithm = PG_COMPRESSION_NONE;
+ AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3712,6 +3668,7 @@ WriteHead(ArchiveHandle *AH)
void
ReadHead(ArchiveHandle *AH)
{
+ char *errmsg;
char vmaj,
vmin,
vrev;
@@ -3781,10 +3738,13 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
-#ifndef HAVE_LIBZ
- if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_log_warning("archive is compressed, but this installation does not support compression -- no data will be available");
-#endif
+ errmsg = supports_compression(AH->compression_spec);
+ if (errmsg)
+ {
+ pg_log_warning("archive is compressed, but this installation does not support compression (%s) -- no data will be available",
+ errmsg);
+ pg_free(errmsg);
+ }
if (AH->version >= K_VERS_1_4)
{
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index f72446ed5b..4725e49747 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -32,30 +32,6 @@
#define LOBBUFSIZE 16384
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#define GZCLOSE(fh) gzclose(fh)
-#define GZWRITE(p, s, n, fh) gzwrite(fh, p, (n) * (s))
-#define GZREAD(p, s, n, fh) gzread(fh, p, (n) * (s))
-#define GZEOF(fh) gzeof(fh)
-#else
-#define GZCLOSE(fh) fclose(fh)
-#define GZWRITE(p, s, n, fh) (fwrite(p, s, n, fh) * (s))
-#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
-#define GZEOF(fh) feof(fh)
-/* this is just the redefinition of a libz constant */
-#define Z_DEFAULT_COMPRESSION (-1)
-
-typedef struct _z_stream
-{
- void *next_in;
- void *next_out;
- size_t avail_in;
- size_t avail_out;
-} z_stream;
-typedef z_stream *z_streamp;
-#endif
-
/* Data block types */
#define BLK_DATA 1
#define BLK_BLOBS 3
@@ -319,8 +295,7 @@ struct _archiveHandle
char *fSpec; /* Archive File Spec */
FILE *FH; /* General purpose file handle */
- void *OF;
- int gzOut; /* Output file */
+ void *OF; /* Output file */
struct _tocEntry *toc; /* Header of circular list of TOC entries */
int tocCount; /* Number of TOC entries */
--
2.34.1
Attachment: v30-0003-Add-LZ4-compression-to-pg_dump.patch (text/x-patch)
From ad6743d75c3c467f981a6800e7152cf14982e0f4 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 20 Feb 2023 12:19:15 +0000
Subject: [PATCH v30 3/3] Add LZ4 compression to pg_dump
This commit expands pg_dump's compression streaming and file APIs to support
PG_COMPRESSION_LZ4. It is implemented in the newly added compress_lz4.{c,h}
files which cover all the functionality of the aforementioned APIs. Minor
changes were also necessary in individual pg_backup_* files, where code for the
'lz4' file suffix has been added, as well as in pg_dump's compression option
parsing.
Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Shi Yu, Tomas Vondra
Discussion:
https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
---
doc/src/sgml/ref/pg_dump.sgml | 13 +-
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_io.c | 26 +-
src/bin/pg_dump/compress_lz4.c | 626 ++++++++++++++++++++++++++
src/bin/pg_dump/compress_lz4.h | 24 +
src/bin/pg_dump/meson.build | 8 +-
src/bin/pg_dump/pg_backup_archiver.c | 6 +-
src/bin/pg_dump/pg_backup_directory.c | 9 +-
src/bin/pg_dump/pg_dump.c | 5 +-
src/bin/pg_dump/t/002_pg_dump.pl | 82 +++-
src/tools/pginclude/cpluspluscheck | 1 +
src/tools/pgindent/typedefs.list | 1 +
12 files changed, 781 insertions(+), 22 deletions(-)
create mode 100644 src/bin/pg_dump/compress_lz4.c
create mode 100644 src/bin/pg_dump/compress_lz4.h
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2c938cd7e1..49d218905f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -330,9 +330,10 @@ PostgreSQL documentation
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
- can be compressed with the <application>gzip</application> tool.
- This format is compressed by default and also supports parallel
- dumps.
+ can be compressed with the <application>gzip</application> or
+ <application>lz4</application> tools.
+ This format is compressed by default using <literal>gzip</literal>
+ and also supports parallel dumps.
</para>
</listitem>
</varlistentry>
@@ -654,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>none</literal> for no compression.
+ <literal>lz4</literal> or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
@@ -675,8 +676,8 @@ PostgreSQL documentation
individual table-data segments, and the default is to compress using
<literal>gzip</literal> at a moderate level. For plain text output,
setting a nonzero compression level causes the entire output file to be compressed,
- as though it had been fed through <application>gzip</application>; but the default
- is not to compress.
+ as though it had been fed through <application>gzip</application> or
+ <application>lz4</application>; but the default is not to compress.
</para>
<para>
The tar archive format currently does not support compression at all.
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index 0013bc080c..eb8f59459a 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -17,6 +17,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
export GZIP_PROGRAM=$(GZIP)
+export LZ4
export with_icu
override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
@@ -26,6 +27,7 @@ OBJS = \
$(WIN32RES) \
compress_gzip.o \
compress_io.o \
+ compress_lz4.o \
compress_none.o \
dumputils.o \
parallel.o \
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index dae4dc01d5..1ff4aca650 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -53,7 +53,7 @@
* InitDiscoverCompressFileHandle tries to infer the compression by the
* filename suffix. If the suffix is not yet known then it tries to simply
* open the file and if it fails, it tries to open the same file with the .gz
- * suffix.
+ * suffix, and then again with the .lz4 suffix.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -67,6 +67,7 @@
#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_lz4.h"
#include "compress_none.h"
#include "pg_backup_utils.h"
@@ -93,6 +94,10 @@ supports_compression(const pg_compress_specification compression_spec)
if (algorithm == PG_COMPRESSION_GZIP)
supported = true;
#endif
+#ifdef USE_LZ4
+ if (algorithm == PG_COMPRESSION_LZ4)
+ supported = true;
+#endif
if (!supported)
return psprintf("this build does not support compression with %s",
@@ -124,6 +129,8 @@ AllocateCompressor(const pg_compress_specification compression_spec,
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressorLZ4(cs, compression_spec);
return cs;
}
@@ -188,6 +195,8 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
InitCompressFileHandleNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressFileHandleGzip(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ InitCompressFileHandleLZ4(CFH, compression_spec);
return CFH;
}
@@ -197,11 +206,11 @@ InitCompressFileHandle(const pg_compress_specification compression_spec)
* be either "r" or "rb".
*
* If the file at 'path' contains the suffix of a supported compression method,
- * currently this includes only ".gz", then this compression will be used
+ * currently this includes ".gz" and ".lz4", then this compression will be used
* throughout. Otherwise the compression will be inferred by iteratively trying
* to open the file at 'path', first as is, then by appending known compression
* suffixes. So if you pass "foo" as 'path', this will open either "foo" or
- * "foo.gz", trying in that order.
+ * "foo.gz" or "foo.lz4", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
@@ -239,6 +248,17 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
if (exists)
compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
+#endif
+#ifdef USE_LZ4
+ if (!exists)
+ {
+ free_keep_errno(fname);
+ fname = psprintf("%s.lz4", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_LZ4;
+ }
#endif
}
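[The suffix probing described in the compress_io.c comments above (plain file first, then ".gz", then ".lz4") reduces to a small decision rule. A minimal standalone sketch in plain C, with a hypothetical helper name, where the booleans stand in for the stat() checks in the real code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Model of InitDiscoverCompressFileHandle's suffix probing: try the
 * path as given first, then with ".gz", then with ".lz4", and report
 * which suffix (if any) should be appended before opening.
 */
static const char *
choose_suffix(int exists_plain, int exists_gz, int exists_lz4)
{
	if (exists_plain)
		return "";				/* open the file as-is */
	if (exists_gz)
		return ".gz";			/* next, try the gzip-compressed file */
	if (exists_lz4)
		return ".lz4";			/* finally, the LZ4-compressed file */
	return NULL;				/* nothing found; caller reports ENOENT */
}
```

Note how the probing order makes the uncompressed name authoritative when several variants exist side by side. ]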
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
new file mode 100644
index 0000000000..ee74cc8e28
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -0,0 +1,626 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.c
+ * Routines for archivers to write a LZ4 compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include "pg_backup_utils.h"
+
+#include "compress_lz4.h"
+
+#ifdef USE_LZ4
+#include <lz4.h>
+#include <lz4frame.h>
+
+#define LZ4_OUT_SIZE (4 * 1024)
+#define LZ4_IN_SIZE (16 * 1024)
+
+/*
+ * LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
+ * Redefine it for installations with a lesser version.
+ */
+#ifndef LZ4F_HEADER_SIZE_MAX
+#define LZ4F_HEADER_SIZE_MAX 32
+#endif
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+typedef struct LZ4CompressorState
+{
+ char *outbuf;
+ size_t outsize;
+} LZ4CompressorState;
+
+/* Private routines that support LZ4 compressed data I/O */
+static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
+
+static void
+ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4_streamDecode_t lz4StreamDecode;
+ char *buf;
+ char *decbuf;
+ size_t buflen;
+ size_t cnt;
+
+ buflen = LZ4_IN_SIZE;
+ buf = pg_malloc(buflen);
+ decbuf = pg_malloc(buflen);
+
+ LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
+ buf, decbuf,
+ cnt, buflen);
+
+ ahwrite(decbuf, 1, decBytes, AH);
+ }
+
+ pg_free(buf);
+ pg_free(decbuf);
+}
+
+static void
+WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
+ size_t compressed;
+ size_t requiredsize = LZ4_compressBound(dLen);
+
+ if (requiredsize > LZ4cs->outsize)
+ {
+ LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
+ LZ4cs->outsize = requiredsize;
+ }
+
+ compressed = LZ4_compress_default(data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize);
+
+ if (compressed <= 0)
+ pg_fatal("failed to LZ4 compress data");
+
+ cs->writeF(AH, LZ4cs->outbuf, compressed);
+}
+
+static void
+EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
+{
+ LZ4CompressorState *LZ4cs;
+
+ LZ4cs = (LZ4CompressorState *) cs->private_data;
+ if (LZ4cs)
+ {
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
+ cs->private_data = NULL;
+ }
+}
+
+
+/*
+ * Public routines that support LZ4 compressed data I/O
+ */
+void
+InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveLZ4;
+ cs->writeData = WriteDataToArchiveLZ4;
+ cs->end = EndCompressorLZ4;
+
+ cs->compression_spec = compression_spec;
+
+ /* Will be lazy init'd */
+ cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+}
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * State needed for LZ4 (de)compression using the CompressFileHandle API.
+ */
+typedef struct LZ4File
+{
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ bool inited;
+ bool compressing;
+
+ size_t buflen;
+ char *buffer;
+
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ size_t errcode;
+} LZ4File;
+
+/*
+ * LZ4 equivalent to feof() or gzeof(). The end of the stream is
+ * reached when no decompressed output remains in the overflow
+ * buffer and the underlying file has reached EOF.
+ */
+static int
+LZ4File_eof(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+
+ return fs->overflowlen == 0 && feof(fs->fp);
+}
+
+static const char *
+LZ4File_get_error(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ const char *errmsg;
+
+ if (LZ4F_isError(fs->errcode))
+ errmsg = LZ4F_getErrorName(fs->errcode);
+ else
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+/*
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ *
+ * It creates the necessary contexts for the operations. When compressing,
+ * it additionally writes the LZ4 header in the output stream.
+ */
+static int
+LZ4File_init(LZ4File * fs, int size, bool compressing)
+{
+ size_t status;
+
+ if (fs->inited)
+ return 0;
+
+ fs->compressing = compressing;
+ fs->inited = true;
+
+ if (fs->compressing)
+ {
+ fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
+ fs->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buffer = pg_malloc(fs->buflen);
+ status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
+ &fs->prefs);
+
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return 1;
+ }
+ }
+ else
+ {
+ status = LZ4F_createDecompressionContext(&fs->dtx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return 1;
+ }
+
+ fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buffer = pg_malloc(fs->buflen);
+
+ fs->overflowalloclen = fs->buflen;
+ fs->overflowbuf = pg_malloc(fs->overflowalloclen);
+ fs->overflowlen = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Read already decompressed content from the overflow buffer into 'ptr' up to
+ * 'size' bytes, if available. If the eol_flag is set, then stop at the first
+ * occurrence of the new line char prior to 'size' bytes.
+ *
+ * Any unread content in the overflow buffer is moved to the beginning.
+ */
+static int
+LZ4File_read_overflow(LZ4File * fs, void *ptr, int size, bool eol_flag)
+{
+ char *p;
+ int readlen = 0;
+
+ if (fs->overflowlen == 0)
+ return 0;
+
+ if (fs->overflowlen >= size)
+ readlen = size;
+ else
+ readlen = fs->overflowlen;
+
+ if (eol_flag && (p = memchr(fs->overflowbuf, '\n', readlen)))
+ /* Include the line terminating char */
+ readlen = p - fs->overflowbuf + 1;
+
+ memcpy(ptr, fs->overflowbuf, readlen);
+ fs->overflowlen -= readlen;
+
+ if (fs->overflowlen > 0)
+ memmove(fs->overflowbuf, fs->overflowbuf + readlen, fs->overflowlen);
+
+ return readlen;
+}
+
+/*
+ * The workhorse for reading decompressed content out of an LZ4 compressed
+ * stream.
+ *
+ * It will read up to 'ptrsize' bytes of decompressed content, or up to the
+ * first newline char when the eol_flag is set. The decompressed output
+ * generated by feeding compressed input through the LZ4F API may exceed
+ * 'ptrsize'; any excess decompressed content is stored in an overflow buffer
+ * within LZ4File. When called, the function first consumes any decompressed
+ * content already present in the overflow buffer before decompressing new
+ * input.
+ */
+static int
+LZ4File_read_internal(LZ4File * fs, void *ptr, int ptrsize, bool eol_flag)
+{
+ size_t dsize = 0;
+ size_t rsize;
+ size_t size = ptrsize;
+ bool eol_found = false;
+
+ void *readbuf;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, false /* decompressing */ ))
+ return -1;
+
+ /* Verify that there is enough space in the outbuf */
+ if (size > fs->buflen)
+ {
+ fs->buflen = size;
+ fs->buffer = pg_realloc(fs->buffer, size);
+ }
+
+ /* use already decompressed content if available */
+ dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
+ return dsize;
+
+ readbuf = pg_malloc(size);
+
+ do
+ {
+ char *rp;
+ char *rend;
+
+ rsize = fread(readbuf, 1, size, fs->fp);
+ if (rsize < size && !feof(fs->fp))
+ return -1;
+
+ rp = (char *) readbuf;
+ rend = (char *) readbuf + rsize;
+
+ while (rp < rend)
+ {
+ size_t status;
+ size_t outlen = fs->buflen;
+ size_t read_remain = rend - rp;
+
+ memset(fs->buffer, 0, outlen);
+ status = LZ4F_decompress(fs->dtx, fs->buffer, &outlen,
+ rp, &read_remain, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ rp += read_remain;
+
+ /*
+ * Copy the decompressed output into 'ptr', up to the space
+ * available. If the eol flag is set, copy only up to and
+ * including the first newline, and skip the copy entirely once
+ * a newline has already been found.
+ */
+ if (outlen > 0 && dsize < size && eol_found == false)
+ {
+ char *p;
+ size_t lib = (!eol_flag) ? size - dsize : size - 1 - dsize;
+ size_t len = outlen < lib ? outlen : lib;
+
+ if (eol_flag &&
+ (p = memchr(fs->buffer, '\n', outlen)) &&
+ (size_t) (p - fs->buffer + 1) <= len)
+ {
+ len = p - fs->buffer + 1;
+ eol_found = true;
+ }
+
+ memcpy((char *) ptr + dsize, fs->buffer, len);
+ dsize += len;
+
+ /* move what did not fit, if any, to the beginning of the buf */
+ if (len < outlen)
+ memmove(fs->buffer, fs->buffer + len, outlen - len);
+ outlen -= len;
+ }
+
+ /* if there is available output, save it */
+ if (outlen > 0)
+ {
+ while (fs->overflowlen + outlen > fs->overflowalloclen)
+ {
+ fs->overflowalloclen *= 2;
+ fs->overflowbuf = pg_realloc(fs->overflowbuf,
+ fs->overflowalloclen);
+ }
+
+ memcpy(fs->overflowbuf + fs->overflowlen, fs->buffer, outlen);
+ fs->overflowlen += outlen;
+ }
+ }
+ } while (rsize == size && dsize < size && eol_found == 0);
+
+ pg_free(readbuf);
+
+ return (int) dsize;
+}
+
+/*
+ * Compress size bytes from ptr and write them to the stream.
+ */
+static size_t
+LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int remaining = size;
+
+ /* Lazy init */
+ if (LZ4File_init(fs, size, true))
+ return -1;
+
+ while (remaining > 0)
+ {
+ int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+
+ remaining -= chunk;
+
+ status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
+ ptr, chunk, NULL);
+ if (LZ4F_isError(status))
+ {
+ fs->errcode = status;
+ return -1;
+ }
+
+ if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ return -1;
+ }
+
+ /* advance past the input just consumed */
+ ptr = ((const char *) ptr) + chunk;
+ }
+
+ return size;
+}
+
+/*
+ * fread() equivalent implementation for LZ4 compressed files.
+ */
+static size_t
+LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int ret;
+
+ ret = LZ4File_read_internal(fs, ptr, size, false);
+ if (ret != size && !LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ return ret;
+}
+
+/*
+ * fgetc() equivalent implementation for LZ4 compressed files.
+ */
+static int
+LZ4File_getc(CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ unsigned char c;
+
+ if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ {
+ if (!LZ4File_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return c;
+}
+
+/*
+ * fgets() equivalent implementation for LZ4 compressed files.
+ */
+static char *
+LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ int dsize;
+
+ dsize = LZ4File_read_internal(fs, ptr, size, true);
+ if (dsize < 0)
+ pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+
+ /* Done reading */
+ if (dsize == 0)
+ return NULL;
+
+ return ptr;
+}
+
+/*
+ * Finalize (de)compression of a stream. When compressing it will write any
+ * remaining content and/or generated footer from the LZ4 API.
+ */
+static int
+LZ4File_close(CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *fs = (LZ4File *) CFH->private_data;
+ size_t status;
+ int ret;
+
+ fp = fs->fp;
+ if (fs->inited)
+ {
+ if (fs->compressing)
+ {
+ status = LZ4F_compressEnd(fs->ctx, fs->buffer, fs->buflen, NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ {
+ errno = (errno) ? errno : ENOSPC;
+ WRITE_ERROR_EXIT;
+ }
+
+ status = LZ4F_freeCompressionContext(fs->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+ }
+ else
+ {
+ status = LZ4F_freeDecompressionContext(fs->dtx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end decompression: %s",
+ LZ4F_getErrorName(status));
+ pg_free(fs->overflowbuf);
+ }
+
+ pg_free(fs->buffer);
+ }
+
+ pg_free(fs);
+
+ return fclose(fp);
+}
+
+static int
+LZ4File_open(const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH)
+{
+ FILE *fp;
+ LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+
+ if (fd >= 0)
+ fp = fdopen(fd, mode);
+ else
+ fp = fopen(path, mode);
+ if (fp == NULL)
+ {
+ lz4fp->errcode = errno;
+ return 1;
+ }
+
+ lz4fp->fp = fp;
+
+ return 0;
+}
+
+static int
+LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+
+ fname = psprintf("%s.lz4", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+ pg_free(fname);
+
+ return ret;
+}
+
+/*
+ * Public routines
+ */
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ LZ4File *lz4fp;
+
+ CFH->open_func = LZ4File_open;
+ CFH->open_write_func = LZ4File_open_write;
+ CFH->read_func = LZ4File_read;
+ CFH->write_func = LZ4File_write;
+ CFH->gets_func = LZ4File_gets;
+ CFH->getc_func = LZ4File_getc;
+ CFH->eof_func = LZ4File_eof;
+ CFH->close_func = LZ4File_close;
+ CFH->get_error_func = LZ4File_get_error;
+
+ CFH->compression_spec = compression_spec;
+ lz4fp = pg_malloc0(sizeof(*lz4fp));
+ if (CFH->compression_spec.level >= 0)
+ lz4fp->prefs.compressionLevel = CFH->compression_spec.level;
+
+ CFH->private_data = lz4fp;
+}
+#else /* USE_LZ4 */
+void
+InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+
+void
+InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "LZ4");
+}
+#endif /* USE_LZ4 */
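[The overflow handling in LZ4File_read_overflow() above follows a common consume-and-shift buffer pattern. A standalone sketch in plain C mirroring that copy/shift logic — hypothetical names, no liblz4 required:

```c
#include <assert.h>
#include <string.h>

/*
 * Minimal model of the overflow buffer: consume up to 'size' bytes
 * (stopping after the first newline when eol_flag is set), then shift
 * any remaining bytes to the front so the buffer stays contiguous.
 */
typedef struct
{
	char		buf[64];
	size_t		len;			/* bytes currently held */
} Overflow;

static size_t
overflow_read(Overflow *o, char *ptr, size_t size, int eol_flag)
{
	size_t		readlen = (o->len < size) ? o->len : size;
	char	   *p;

	if (readlen == 0)
		return 0;

	if (eol_flag && (p = memchr(o->buf, '\n', readlen)) != NULL)
		readlen = (size_t) (p - o->buf) + 1;	/* include the newline */

	memcpy(ptr, o->buf, readlen);
	o->len -= readlen;
	if (o->len > 0)
		memmove(o->buf, o->buf + readlen, o->len);	/* shift leftovers */

	return readlen;
}
```

The memmove keeps unread bytes at offset zero, which is what lets LZ4File_read_internal() treat the overflow buffer as "already decompressed input" on the next call. ]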
diff --git a/src/bin/pg_dump/compress_lz4.h b/src/bin/pg_dump/compress_lz4.h
new file mode 100644
index 0000000000..40dbe00d46
--- /dev/null
+++ b/src/bin/pg_dump/compress_lz4.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_lz4.h
+ * LZ4 interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_lz4.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_LZ4_H_
+#define _COMPRESS_LZ4_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorLZ4(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleLZ4(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_LZ4_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index 84e9f0defa..0da476a4c3 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -3,6 +3,7 @@
pg_dump_common_sources = files(
'compress_gzip.c',
'compress_io.c',
+ 'compress_lz4.c',
'compress_none.c',
'dumputils.c',
'parallel.c',
@@ -18,7 +19,7 @@ pg_dump_common_sources = files(
pg_dump_common = static_library('libpgdump_common',
pg_dump_common_sources,
c_pch: pch_postgres_fe_h,
- dependencies: [frontend_code, libpq, zlib],
+ dependencies: [frontend_code, libpq, lz4, zlib],
kwargs: internal_lib_args,
)
@@ -86,7 +87,10 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
- 'env': {'GZIP_PROGRAM': gzip.path()},
+ 'env': {
+ 'GZIP_PROGRAM': gzip.path(),
+ 'LZ4': program_lz4.found() ? program_lz4.path() : '',
+ },
'tests': [
't/001_basic.pl',
't/002_pg_dump.pl',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 2bc5648ed6..027ded4bae 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -2075,7 +2075,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
/*
* Check if the specified archive is a directory. If so, check if
- * there's a "toc.dat" (or "toc.dat.gz") file in it.
+ * there's a "toc.dat" (or "toc.dat.{gz,lz4}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
@@ -2085,6 +2085,10 @@ _discoverArchiveFormat(ArchiveHandle *AH)
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
+#endif
+#ifdef USE_LZ4
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
+ return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index a2f88995c0..ce2a0838fa 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -779,10 +779,13 @@ _PrepParallelRestore(ArchiveHandle *AH)
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
- else
+ else if (AH->compression_spec.algorithm != PG_COMPRESSION_NONE)
{
- /* It might be compressed */
- strlcat(fname, ".gz", sizeof(fname));
+ if (AH->compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ strlcat(fname, ".gz", sizeof(fname));
+ else if (AH->compression_spec.algorithm == PG_COMPRESSION_LZ4)
+ strlcat(fname, ".lz4", sizeof(fname));
+
if (stat(fname, &st) == 0)
te->dataLength = st.st_size;
}
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..08105337b1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -715,13 +715,12 @@ main(int argc, char **argv)
case PG_COMPRESSION_NONE:
/* fallthrough */
case PG_COMPRESSION_GZIP:
+ /* fallthrough */
+ case PG_COMPRESSION_LZ4:
break;
case PG_COMPRESSION_ZSTD:
pg_fatal("compression with %s is not yet supported", "ZSTD");
break;
- case PG_COMPRESSION_LZ4:
- pg_fatal("compression with %s is not yet supported", "LZ4");
- break;
}
/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 78454928cc..72b19ee6cd 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -139,6 +139,80 @@ my %pgdump_runs = (
args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
},
},
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_custom => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=custom',
+ '--compress=lz4', "--file=$tempdir/compression_lz4_custom.dump",
+ 'postgres',
+ ],
+ restore_cmd => [
+ 'pg_restore',
+ "--file=$tempdir/compression_lz4_custom.sql",
+ "$tempdir/compression_lz4_custom.dump",
+ ],
+ command_like => {
+ command => [
+ 'pg_restore',
+ '-l', "$tempdir/compression_lz4_custom.dump",
+ ],
+ expected => qr/Compression: lz4/,
+ name => 'data content is lz4 compressed'
+ },
+ },
+
+ # Do not use --no-sync to give test coverage for data sync.
+ compression_lz4_dir => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--jobs=2',
+ '--format=directory', '--compress=lz4:1',
+ "--file=$tempdir/compression_lz4_dir", 'postgres',
+ ],
+ # Give coverage for manually compressed blob.toc files during
+ # restore.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-z', '-f', '--rm',
+ "$tempdir/compression_lz4_dir/blobs.toc",
+ "$tempdir/compression_lz4_dir/blobs.toc.lz4",
+ ],
+ },
+ # Verify that data files were compressed
+ glob_patterns => [
+ "$tempdir/compression_lz4_dir/toc.dat",
+ "$tempdir/compression_lz4_dir/*.dat.lz4",
+ ],
+ restore_cmd => [
+ 'pg_restore', '--jobs=2',
+ "--file=$tempdir/compression_lz4_dir.sql",
+ "$tempdir/compression_lz4_dir",
+ ],
+ },
+
+ compression_lz4_plain => {
+ test_key => 'compression',
+ compile_option => 'lz4',
+ dump_cmd => [
+ 'pg_dump', '--format=plain', '--compress=lz4',
+ "--file=$tempdir/compression_lz4_plain.sql.lz4", 'postgres',
+ ],
+ # Decompress the generated file to run through the tests.
+ compress_cmd => {
+ program => $ENV{'LZ4'},
+ args => [
+ '-d', '-f',
+ "$tempdir/compression_lz4_plain.sql.lz4",
+ "$tempdir/compression_lz4_plain.sql",
+ ],
+ },
+ },
+
clean => {
dump_cmd => [
'pg_dump',
@@ -4175,11 +4249,11 @@ foreach my $run (sort keys %pgdump_runs)
my $run_db = 'postgres';
# Skip command-level tests for gzip if there is no support for it.
- if ( defined($pgdump_runs{$run}->{compile_option})
- && $pgdump_runs{$run}->{compile_option} eq 'gzip'
- && !$supports_gzip)
+ if (defined($pgdump_runs{$run}->{compile_option}) &&
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
- note "$run: skipped due to no gzip support";
+ note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
}
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index db429474a2..2c5042eb41 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -152,6 +152,7 @@ do
# as field names, which is unfortunate but we won't change it now.
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_lz4.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d4bb7442be..d53cff94fe 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1387,6 +1387,7 @@ LWLock
LWLockHandle
LWLockMode
LWLockPadded
+LZ4CompressorState
LZ4F_compressionContext_t
LZ4F_decompressOptions_t
LZ4F_decompressionContext_t
--
2.34.1
Attachment: v30-0002-Introduce-a-generic-pg_dump-compression-API.patch (text/x-patch)
From 30cd2d5f6f672252215b1e93c68580815cbc8b4c Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 20 Feb 2023 12:14:56 +0000
Subject: [PATCH v30 2/3] Introduce a generic pg_dump compression API
Switch pg_dump to use the Compression API, implemented by bf9aa490db.
The CompressFileHandle replaces the cfp* family of functions with a
struct of callbacks for accessing (compressed) files. This allows adding
new compression methods simply by introducing a new struct instance with
an appropriate implementation of the callbacks.
Archives compressed using custom compression methods store an identifier
of the compression algorithm in their header instead of the compression
level. The header version is bumped.
Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Tomas Vondra
Discussion:
https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
---
src/bin/pg_dump/Makefile | 2 +
src/bin/pg_dump/compress_gzip.c | 401 ++++++++++++++
src/bin/pg_dump/compress_gzip.h | 24 +
src/bin/pg_dump/compress_io.c | 741 +++++---------------------
src/bin/pg_dump/compress_io.h | 166 +++++-
src/bin/pg_dump/compress_none.c | 206 +++++++
src/bin/pg_dump/compress_none.h | 24 +
src/bin/pg_dump/meson.build | 2 +
src/bin/pg_dump/pg_backup_archiver.c | 91 ++--
src/bin/pg_dump/pg_backup_archiver.h | 5 +-
src/bin/pg_dump/pg_backup_custom.c | 27 +-
src/bin/pg_dump/pg_backup_directory.c | 94 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 10 +-
src/include/common/compression.h | 4 +
src/tools/pginclude/cpluspluscheck | 2 +
src/tools/pgindent/typedefs.list | 2 +
16 files changed, 1050 insertions(+), 751 deletions(-)
create mode 100644 src/bin/pg_dump/compress_gzip.c
create mode 100644 src/bin/pg_dump/compress_gzip.h
create mode 100644 src/bin/pg_dump/compress_none.c
create mode 100644 src/bin/pg_dump/compress_none.h
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
index ef1ed0f3e5..0013bc080c 100644
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -24,7 +24,9 @@ LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
OBJS = \
$(WIN32RES) \
+ compress_gzip.o \
compress_io.o \
+ compress_none.o \
dumputils.o \
parallel.o \
pg_backup_archiver.o \
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
new file mode 100644
index 0000000000..24e68fd022
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -0,0 +1,401 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_gzip.h"
+#include "pg_backup_utils.h"
+
+#ifdef HAVE_LIBZ
+#include "zlib.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+typedef struct GzipCompressorState
+{
+ z_streamp zp;
+
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
+
+/* Private routines that support gzip compressed data I/O */
+static void
+DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
+ int res = Z_OK;
+
+ while (gzipcs->zp->avail_in != 0 || flush)
+ {
+ res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
+ if (res == Z_STREAM_ERROR)
+ pg_fatal("could not compress data: %s", zp->msg);
+ if ((flush && (zp->avail_out < gzipcs->outsize))
+ || (zp->avail_out == 0)
+ || (zp->avail_in != 0)
+ )
+ {
+ /*
+ * Extra paranoia: avoid zero-length chunks, since a zero length
+ * chunk is the EOF marker in the custom format. This should never
+ * happen but...
+ */
+ if (zp->avail_out < gzipcs->outsize)
+ {
+ /*
+ * Any write function should do its own error checking but to
+ * make sure we do a check here as well...
+ */
+ size_t len = gzipcs->outsize - zp->avail_out;
+
+ cs->writeF(AH, (char *) out, len);
+ }
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ if (res == Z_STREAM_END)
+ break;
+ }
+}
+
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorGzip(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (!gzipcs->zp)
+ {
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at a time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorGzip(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ z_streamp zp;
+ char *out;
+ int res = Z_OK;
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ buf = pg_malloc(ZLIB_IN_SIZE);
+ buflen = ZLIB_IN_SIZE;
+
+ out = pg_malloc(ZLIB_OUT_SIZE + 1);
+
+ if (inflateInit(zp) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* no minimal chunk size for zlib */
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ zp->next_in = (void *) buf;
+ zp->avail_in = cnt;
+
+ while (zp->avail_in > 0)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+ }
+
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+ while (res != Z_STREAM_END)
+ {
+ zp->next_out = (void *) out;
+ zp->avail_out = ZLIB_OUT_SIZE;
+ res = inflate(zp, 0);
+ if (res != Z_OK && res != Z_STREAM_END)
+ pg_fatal("could not uncompress data: %s", zp->msg);
+
+ out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ }
+
+ if (inflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression library: %s", zp->msg);
+
+ free(buf);
+ free(out);
+ free(zp);
+}
+
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ GzipCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
+
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
+ {
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
+
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
+ }
+
+ return ret;
+}
+
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzwrite(gzfp, ptr, size);
+}
+
+static int
+Gzip_getc(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
+
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
+ {
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzgets(gzfp, ptr, size);
+}
+
+static int
+Gzip_close(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ CFH->private_data = NULL;
+
+ return gzclose(gzfp);
+}
+
+static int
+Gzip_eof(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+
+ return gzeof(gzfp);
+}
+
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
+{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
+ int errnum;
+
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
+
+ return errmsg;
+}
+
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ gzFile gzfp;
+ char mode_compression[32];
+
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
+ {
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
+ }
+ else
+ strcpy(mode_compression, mode);
+
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
+ else
+ gzfp = gzopen(path, mode_compression);
+
+ if (gzfp == NULL)
+ return 1;
+
+ CFH->private_data = gzfp;
+
+ return 0;
+}
+
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ char *fname;
+ int ret;
+ int save_errno;
+
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
+
+ CFH->compression_spec = compression_spec;
+
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
new file mode 100644
index 0000000000..2392c697b4
--- /dev/null
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_gzip.h
+ * GZIP interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_gzip.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_GZIP_H_
+#define _COMPRESS_GZIP_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 4074cc031c..dae4dc01d5 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -9,42 +9,51 @@
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
+ * underlying stream. The second API is a wrapper around fopen and
* friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
+ * the possible compression. The second API is designed so that the
+ * resulting files can easily be manipulated with an external
+ * compression utility program.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
+ * AllocateCompressor, writeData, and EndCompressor. First you call
+ * AllocateCompressor, then write all the data by calling writeData as many
+ * times as needed, and finally EndCompressor. writeData will call the
+ * WriteFunc that was provided to AllocateCompressor for each chunk of
+ * compressed data.
*
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
+ * The interface for reading an archive consists of the same three functions:
+ * AllocateCompressor, readData, and EndCompressor. First you call
+ * AllocateCompressor, then call readData, which reads the whole compressed
+ * stream by repeatedly calling the given ReadFunc. ReadFunc returns the
+ * compressed data one chunk at a time, readData decompresses each chunk and
+ * passes the decompressed data to ahwrite(), until ReadFunc returns 0 to
+ * signal EOF. The interface is the same for compressed and uncompressed
+ * streams.
*
* Compressed stream API
* ----------------------
*
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
+ * The compressed stream API provides a set of function pointers for
+ * opening, reading, writing, and closing files. The function pointers
+ * are documented in the corresponding header file and are common to all
+ * streams, which allows the caller to use the same interface for both
+ * compressed and uncompressed streams.
+ *
+ * The interface consists of three functions, InitCompressFileHandle,
+ * InitDiscoverCompressFileHandle, and EndCompressFileHandle. If the
+ * compression is known, then start by calling InitCompressFileHandle,
+ * otherwise discover it by using InitDiscoverCompressFileHandle. Then call
+ * the function pointers as required for the read/write operations. Finally
+ * call EndCompressFileHandle to end the stream.
+ *
+ * InitDiscoverCompressFileHandle tries to infer the compression from the
+ * filename suffix. If the suffix is not recognized, it first tries to
+ * open the file as is and, if that fails, tries the same filename with
+ * the .gz suffix appended.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
@@ -53,13 +62,14 @@
*/
#include "postgres_fe.h"
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "compress_gzip.h"
#include "compress_io.h"
+#include "compress_none.h"
#include "pg_backup_utils.h"
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
/*----------------------
* Generic functions
*----------------------
@@ -97,110 +107,25 @@ supports_compression(const pg_compress_specification compression_spec)
*----------------------
*/
-/* typedef appears in compress_io.h */
-struct CompressorState
-{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
-
-#ifdef HAVE_LIBZ
- z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-
-/* Public interface routines */
-
-/* Allocate a new compressor */
+/*
+ * Allocate a new compressor.
+ */
CompressorState *
AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
+ ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
+ cs->readF = readF;
cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-}
+ InitCompressorNone(cs, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressorGzip(cs, compression_spec);
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
+ return cs;
}
/*
@@ -209,233 +134,31 @@ WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
-}
-
-/* Private routines, specific to each compression method. */
-
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
-
-static void
-InitCompressorZlib(CompressorState *cs, int level)
-{
- z_streamp zp;
-
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
- */
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
-}
-
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- z_streamp zp = cs->zp;
-
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- free(cs->zlibOut);
- free(cs->zp);
-}
-
-static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
-{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
- int res = Z_OK;
-
- while (cs->zp->avail_in != 0 || flush)
- {
- res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
- if (res == Z_STREAM_ERROR)
- pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
- || (zp->avail_out == 0)
- || (zp->avail_in != 0)
- )
- {
- /*
- * Extra paranoia: avoid zero-length chunks, since a zero length
- * chunk is the EOF marker in the custom format. This should never
- * happen but...
- */
- if (zp->avail_out < cs->zlibOutSize)
- {
- /*
- * Any write function should do its own error checking but to
- * make sure we do a check here as well...
- */
- size_t len = cs->zlibOutSize - zp->avail_out;
-
- cs->writeF(AH, out, len);
- }
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
- }
-
- if (res == Z_STREAM_END)
- break;
- }
-}
-
-static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
-}
-
-static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
-{
- z_streamp zp;
- char *out;
- int res = Z_OK;
- size_t cnt;
- char *buf;
- size_t buflen;
-
- zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
-
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
-
- if (inflateInit(zp) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
-
- /* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- zp->next_in = (void *) buf;
- zp->avail_in = cnt;
-
- while (zp->avail_in > 0)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
-
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
- }
-
- zp->next_in = NULL;
- zp->avail_in = 0;
- while (res != Z_STREAM_END)
- {
- zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
- res = inflate(zp, 0);
- if (res != Z_OK && res != Z_STREAM_END)
- pg_fatal("could not uncompress data: %s", zp->msg);
-
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
- }
-
- if (inflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression library: %s", zp->msg);
-
- free(buf);
- free(out);
- free(zp);
-}
-#endif /* HAVE_LIBZ */
-
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
- {
- ahwrite(buf, 1, cnt, AH);
- }
-
- free(buf);
-}
-
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
+ cs->end(AH, cs);
+ pg_free(cs);
}
-
/*----------------------
* Compressed stream API
*----------------------
*/
/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
+ * Private routines
*/
-struct cfp
+static int
+hasSuffix(const char *filename, const char *suffix)
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
+ int filenamelen = strlen(filename);
+ int suffixlen = strlen(suffix);
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ if (filenamelen < suffixlen)
+ return 0;
+
+ return memcmp(&filename[filenamelen - suffixlen],
+ suffix,
+ suffixlen) == 0;
+}
/* free() without changing errno; useful in several places below */
static void
@@ -448,324 +171,102 @@ free_keep_errno(void *p)
}
/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_read(const char *path, const char *mode)
-{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
-
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
- }
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
-
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
-}
-
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
+ * Public interface
*/
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
-
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
- }
- return fp;
-}
/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
+ * Initialize a compress file handle for the specified compression algorithm.
*/
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
+CompressFileHandle *
+InitCompressFileHandle(const pg_compress_specification compression_spec)
{
- cfp *fp = pg_malloc0(sizeof(cfp));
-
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
- {
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
-
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode_compression);
- else
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
- {
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode);
- else
- fp->compressedfp = gzopen(path, mode);
- }
+ CompressFileHandle *CFH;
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
- else
- {
- if (fd >= 0)
- fp->uncompressedfp = fdopen(fd, mode);
- else
- fp->uncompressedfp = fopen(path, mode);
+ CFH = pg_malloc0(sizeof(CompressFileHandle));
- if (fp->uncompressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
- }
+ if (compression_spec.algorithm == PG_COMPRESSION_NONE)
+ InitCompressFileHandleNone(CFH, compression_spec);
+ else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
+ InitCompressFileHandleGzip(CFH, compression_spec);
- return fp;
+ return CFH;
}
/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
+ * Open a file for reading. 'path' is the file to open, and 'mode' should
+ * be either "r" or "rb".
*
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
-
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
+ * If the file at 'path' ends with the suffix of a supported compression
+ * method (currently only ".gz"), that compression is used throughout.
+ * Otherwise the compression is inferred by iteratively trying to open the
+ * file at 'path', first as is, then with known compression suffixes
+ * appended. So if you pass "foo" as 'path', this will open either "foo"
+ * or "foo.gz", trying in that order.
*
* On failure, return NULL with an error code in errno.
*/
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
+CompressFileHandle *
+InitDiscoverCompressFileHandle(const char *path, const char *mode)
{
- return cfopen_internal(NULL, fd, mode, compression_spec);
-}
+ CompressFileHandle *CFH = NULL;
+ struct stat st;
+ char *fname;
+ pg_compress_specification compression_spec = {0};
-int
-cfread(void *ptr, int size, cfp *fp)
-{
- int ret;
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (size == 0)
- return 0;
+ Assert(strcmp(mode, PG_BINARY_R) == 0);
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ fname = pg_strdup(path);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
+ if (hasSuffix(fname, ".gz"))
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
else
-#endif
{
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
- return ret;
-}
-
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
-}
-
-int
-cfgetc(cfp *fp)
-{
- int ret;
+ bool exists;
+ exists = (stat(path, &st) == 0);
+ /* avoid an unused-variable warning when built without compression */
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_NONE;
#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
+ if (!exists)
{
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
+ free_keep_errno(fname);
+ fname = psprintf("%s.gz", path);
+ exists = (stat(fname, &st) == 0);
+
+ if (exists)
+ compression_spec.algorithm = PG_COMPRESSION_GZIP;
}
- }
- else
#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
}
- return ret;
-}
-
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
- else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
-
-int
-cfclose(cfp *fp)
-{
- int result;
-
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
+ CFH = InitCompressFileHandle(compression_spec);
+ if (CFH->open_func(fname, -1, mode, CFH))
{
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
+ free_keep_errno(CFH);
+ CFH = NULL;
}
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
+ free_keep_errno(fname);
- return result;
+ return CFH;
}
+/*
+ * Close an open file handle and release its memory.
+ *
+ * On failure, returns an error value and sets errno appropriately.
+ */
int
-cfeof(cfp *fp)
+EndCompressFileHandle(CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
+ int ret = 0;
-const char *
-get_cfp_error(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ if (CFH->private_data)
+ ret = CFH->close_func(CFH);
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
-}
+ free_keep_errno(CFH);
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
-{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
-
- if (filenamelen < suffixlen)
- return 0;
-
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ return ret;
}
-
-#endif
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 8beb1058ec..74ba5dda64 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -23,50 +23,160 @@
extern char *supports_compression(const pg_compress_specification compression_spec);
-/* Prototype for callback function to WriteDataToArchive() */
+/*
+ * Prototype for callback function used in writeData()
+ */
typedef void (*WriteFunc) (ArchiveHandle *AH, const char *buf, size_t len);
/*
- * Prototype for callback function to ReadDataFromArchive()
+ * Prototype for callback function used in readData()
*
- * ReadDataFromArchive will call the read function repeatedly, until it
- * returns 0 to signal EOF. ReadDataFromArchive passes a buffer to read the
- * data into in *buf, of length *buflen. If that's not big enough for the
- * callback function, it can free() it and malloc() a new one, returning the
- * new buffer and its size in *buf and *buflen.
+ * readData will call the read function repeatedly, until it returns 0 to signal
+ * EOF. readData passes a buffer to read the data into in *buf, of length
+ * *buflen. If that's not big enough for the callback function, it can free() it
+ * and malloc() a new one, returning the new buffer and its size in *buf and
+ * *buflen.
*
* Returns the number of bytes read into *buf, or 0 on EOF.
*/
typedef size_t (*ReadFunc) (ArchiveHandle *AH, char **buf, size_t *buflen);
-/* struct definition appears in compress_io.c */
typedef struct CompressorState CompressorState;
+struct CompressorState
+{
+ /*
+ * Read all compressed data from the input stream (via readF) and print it
+ * out with ahwrite().
+ */
+ void (*readData) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Compress and write data to the output stream (via writeF).
+ */
+ void (*writeData) (ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+
+ /*
+ * End compression and flush internal buffers if any.
+ */
+ void (*end) (ArchiveHandle *AH, CompressorState *cs);
+
+ /*
+ * Callback function to read from an already processed input stream
+ */
+ ReadFunc readF;
+
+ /*
+ * Callback function to write an already processed chunk of data.
+ */
+ WriteFunc writeF;
+
+ /*
+ * Compression specification for this state.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
extern CompressorState *AllocateCompressor(const pg_compress_specification compression_spec,
+ ReadFunc readF,
WriteFunc writeF);
-extern void ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF);
-extern void WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
+/*
+ * Compress File Handle
+ */
+typedef struct CompressFileHandle CompressFileHandle;
+
+struct CompressFileHandle
+{
+ /*
+ * Open a file in mode.
+ *
+ * Pass either 'path' or 'fd' depending on whether a filepath or a file
+ * descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
+ * 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ */
+ int (*open_func) (const char *path, int fd, const char *mode,
+ CompressFileHandle *CFH);
+
+ /*
+ * Open a file for writing.
+ *
+ * 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
+ * initialized CompressFileHandle.
+ */
+ int (*open_write_func) (const char *path, const char *mode,
+ CompressFileHandle *CFH);
-typedef struct cfp cfp;
+ /*
+ * Read 'size' bytes of data from the file and store them into 'ptr'.
+ */
+ size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
-extern cfp *cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec);
-extern cfp *cfopen_read(const char *path, const char *mode);
-extern cfp *cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec);
-extern int cfread(void *ptr, int size, cfp *fp);
-extern int cfwrite(const void *ptr, int size, cfp *fp);
-extern int cfgetc(cfp *fp);
-extern char *cfgets(cfp *fp, char *buf, int len);
-extern int cfclose(cfp *fp);
-extern int cfeof(cfp *fp);
-extern const char *get_cfp_error(cfp *fp);
+ /*
+ * Write 'size' bytes of data into the file from 'ptr'.
+ */
+ size_t (*write_func) (const void *ptr, size_t size,
+ struct CompressFileHandle *CFH);
+ /*
+ * Read at most size - 1 characters from the compress file handle into
+ * 's'.
+ *
+ * Stop if an EOF or a newline is found first. 's' is always null
+ * terminated and contains the newline if it was found.
+ */
+ char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
+
+ /*
+ * Read the next character from the compress file handle as 'unsigned
+ * char' cast into 'int'.
+ */
+ int (*getc_func) (CompressFileHandle *CFH);
+
+ /*
+ * Test if EOF is reached in the compress file handle.
+ */
+ int (*eof_func) (CompressFileHandle *CFH);
+
+ /*
+ * Close an open file handle.
+ */
+ int (*close_func) (CompressFileHandle *CFH);
+
+ /*
+ * Get a pointer to a string that describes an error that occurred during a
+ * compress file handle operation.
+ */
+ const char *(*get_error_func) (CompressFileHandle *CFH);
+
+ /*
+ * Compression specification for this file handle.
+ */
+ pg_compress_specification compression_spec;
+
+ /*
+ * Private data to be used by the compressor.
+ */
+ void *private_data;
+};
+
+/*
+ * Initialize a compress file handle with the requested compression.
+ */
+extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
+
+/*
+ * Initialize a compress file stream. Infer the compression algorithm
+ * from 'path', either by examining its suffix or by appending the
+ * supported suffixes to 'path'.
+ */
+extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
+ const char *mode);
+extern int EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
new file mode 100644
index 0000000000..ecbcf4b04a
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.c
@@ -0,0 +1,206 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.c
+ * Routines for archivers to read or write an uncompressed stream.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+#include <unistd.h>
+
+#include "compress_none.h"
+#include "pg_backup_utils.h"
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static void
+ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ size_t cnt;
+ char *buf;
+ size_t buflen;
+
+ buf = pg_malloc(ZLIB_OUT_SIZE);
+ buflen = ZLIB_OUT_SIZE;
+
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
+ {
+ ahwrite(buf, 1, cnt, AH);
+ }
+
+ free(buf);
+}
+
+
+static void
+WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ cs->writeF(AH, data, dLen);
+}
+
+static void
+EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* no op */
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ cs->readData = ReadDataFromArchiveNone;
+ cs->writeData = WriteDataToArchiveNone;
+ cs->end = EndCompressorNone;
+
+ cs->compression_spec = compression_spec;
+}
+
+
+/*----------------------
+ * Compress File API
+ *----------------------
+ */
+
+/*
+ * Private routines
+ */
+
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ size_t ret;
+
+ if (size == 0)
+ return 0;
+
+ ret = fread(ptr, 1, size, fp);
+ if (ret != size && !feof(fp))
+ pg_fatal("could not read from input file: %s",
+ strerror(errno));
+
+ return ret;
+}
+
+static size_t
+write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
+{
+ return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+}
+
+static const char *
+get_error_none(CompressFileHandle *CFH)
+{
+ return strerror(errno);
+}
+
+static char *
+gets_none(char *ptr, int size, CompressFileHandle *CFH)
+{
+ return fgets(ptr, size, (FILE *) CFH->private_data);
+}
+
+static int
+getc_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret;
+
+ ret = fgetc(fp);
+ if (ret == EOF)
+ {
+ if (!feof(fp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
+ else
+ pg_fatal("could not read from input file: end of file");
+ }
+
+ return ret;
+}
+
+static int
+close_none(CompressFileHandle *CFH)
+{
+ FILE *fp = (FILE *) CFH->private_data;
+ int ret = 0;
+
+ CFH->private_data = NULL;
+
+ if (fp)
+ ret = fclose(fp);
+
+ return ret;
+}
+
+static int
+eof_none(CompressFileHandle *CFH)
+{
+ return feof((FILE *) CFH->private_data);
+}
+
+static int
+open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ if (fd >= 0)
+ CFH->private_data = fdopen(dup(fd), mode);
+ else
+ CFH->private_data = fopen(path, mode);
+
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+static int
+open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
+{
+ Assert(CFH->private_data == NULL);
+
+ CFH->private_data = fopen(path, mode);
+ if (CFH->private_data == NULL)
+ return 1;
+
+ return 0;
+}
+
+/*
+ * Public interface
+ */
+
+void
+InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ CFH->open_func = open_none;
+ CFH->open_write_func = open_write_none;
+ CFH->read_func = read_none;
+ CFH->write_func = write_none;
+ CFH->gets_func = gets_none;
+ CFH->getc_func = getc_none;
+ CFH->close_func = close_none;
+ CFH->eof_func = eof_none;
+ CFH->get_error_func = get_error_none;
+
+ CFH->private_data = NULL;
+}
diff --git a/src/bin/pg_dump/compress_none.h b/src/bin/pg_dump/compress_none.h
new file mode 100644
index 0000000000..143e599819
--- /dev/null
+++ b/src/bin/pg_dump/compress_none.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * compress_none.h
+ * Uncompressed interface to compress_io.c routines
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/bin/pg_dump/compress_none.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef _COMPRESS_NONE_H_
+#define _COMPRESS_NONE_H_
+
+#include "compress_io.h"
+
+extern void InitCompressorNone(CompressorState *cs,
+ const pg_compress_specification compression_spec);
+extern void InitCompressFileHandleNone(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec);
+
+#endif /* _COMPRESS_NONE_H_ */
diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index ca62f9a374..84e9f0defa 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -1,7 +1,9 @@
# Copyright (c) 2022-2023, PostgreSQL Global Development Group
pg_dump_common_sources = files(
+ 'compress_gzip.c',
'compress_io.c',
+ 'compress_none.c',
'dumputils.c',
'parallel.c',
'pg_backup_archiver.c',
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index e02ce22db2..2bc5648ed6 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -95,8 +95,8 @@ static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
-static cfp *SaveOutput(ArchiveHandle *AH);
-static void RestoreOutput(ArchiveHandle *AH, cfp *savedOutput);
+static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
+static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
@@ -272,7 +272,7 @@ CloseArchive(Archive *AHX)
/* Close the output */
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -354,7 +354,7 @@ RestoreArchive(Archive *AHX)
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
- cfp *sav;
+ CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
@@ -1128,7 +1128,7 @@ PrintTOCSummary(Archive *AHX)
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
- cfp *sav;
+ CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
@@ -1144,9 +1144,10 @@ PrintTOCSummary(Archive *AHX)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
- ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
+ ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
- AH->tocCount, AH->compression_spec.level);
+ AH->tocCount,
+ get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
@@ -1503,6 +1504,7 @@ static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
+ CompressFileHandle *CFH;
const char *mode;
int fn = -1;
@@ -1525,33 +1527,32 @@ SetOutput(ArchiveHandle *AH, const char *filename,
else
mode = PG_BINARY_W;
- if (fn >= 0)
- AH->OF = cfdopen(dup(fn), mode, compression_spec);
- else
- AH->OF = cfopen(filename, mode, compression_spec);
+ CFH = InitCompressFileHandle(compression_spec);
- if (!AH->OF)
+ if (CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
+
+ AH->OF = CFH;
}
-static cfp *
+static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
- return (cfp *) AH->OF;
+ return (CompressFileHandle *) AH->OF;
}
static void
-RestoreOutput(ArchiveHandle *AH, cfp *savedOutput)
+RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
int res;
errno = 0;
- res = cfclose(AH->OF);
+ res = EndCompressFileHandle(AH->OF);
if (res != 0)
pg_fatal("could not close output file: %m");
@@ -1690,7 +1691,11 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
- bytes_written = cfwrite(ptr, size * nmemb, AH->OF);
+ {
+ CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
+
+ bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ }
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
@@ -2032,6 +2037,18 @@ ReadStr(ArchiveHandle *AH)
return buf;
}
+static bool
+_fileExistsInDirectory(const char *dir, const char *filename)
+{
+ struct stat st;
+ char buf[MAXPGPATH];
+
+ if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
+ pg_fatal("directory name too long: \"%s\"", dir);
+
+ return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
+}
+
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
@@ -2062,26 +2079,12 @@ _discoverArchiveFormat(ArchiveHandle *AH)
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
- char buf[MAXPGPATH];
-
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
- }
-
#ifdef HAVE_LIBZ
- if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
- pg_fatal("directory name too long: \"%s\"",
- AH->fSpec);
- if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
- {
- AH->format = archDirectory;
+ if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
- }
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
@@ -2179,6 +2182,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
+ CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
@@ -2234,7 +2238,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
- AH->OF = cfdopen(dup(fileno(stdout)), PG_BINARY_A, out_compress_spec);
+ CFH = InitCompressFileHandle(out_compress_spec);
+ if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ pg_fatal("could not open stdout for appending: %m");
+ AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
@@ -3646,12 +3653,7 @@ WriteHead(ArchiveHandle *AH)
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
- /*
- * For now the compression type is implied by the level. This will need
- * to change once support for more compression algorithms is added,
- * requiring a format bump.
- */
- WriteInt(AH, AH->compression_spec.level);
+ AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
@@ -3723,10 +3725,11 @@ ReadHead(ArchiveHandle *AH)
pg_fatal("expected format (%d) differs from format found in file (%d)",
AH->format, fmt);
- /* Guess the compression method based on the level */
- AH->compression_spec.algorithm = PG_COMPRESSION_NONE;
- if (AH->version >= K_VERS_1_2)
+ if (AH->version >= K_VERS_1_15)
+ AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
+ else if (AH->version >= K_VERS_1_2)
{
+ /* Guess the compression method based on the level */
if (AH->version < K_VERS_1_4)
AH->compression_spec.level = AH->ReadBytePtr(AH);
else
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 4725e49747..18b38c17ab 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -65,10 +65,13 @@
#define K_VERS_1_13 MAKE_ARCHIVE_VERSION(1, 13, 0) /* change search_path
* behavior */
#define K_VERS_1_14 MAKE_ARCHIVE_VERSION(1, 14, 0) /* add tableam */
+#define K_VERS_1_15 MAKE_ARCHIVE_VERSION(1, 15, 0) /* add
+ * compression_algorithm
+ * in header */
/* Current archive version number (the format we can output) */
#define K_VERS_MAJOR 1
-#define K_VERS_MINOR 14
+#define K_VERS_MINOR 15
#define K_VERS_REV 0
#define K_VERS_SELF MAKE_ARCHIVE_VERSION(K_VERS_MAJOR, K_VERS_MINOR, K_VERS_REV)
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index 7529367a7b..b576b29924 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -298,7 +298,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
_WriteByte(AH, BLK_DATA); /* Block type */
WriteInt(AH, te->dumpId); /* For sanity check */
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -317,15 +319,15 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressorState *cs = ctx->cs;
if (dLen > 0)
- /* WriteDataToArchive() internally throws write errors */
- WriteDataToArchive(AH, cs, data, dLen);
+ /* writeData() internally throws write errors */
+ cs->writeData(AH, cs, data, dLen);
}
/*
* Called by the archiver when a dumper's 'DataDumper' routine has
* finished.
*
- * Optional.
+ * Mandatory.
*/
static void
_EndData(ArchiveHandle *AH, TocEntry *te)
@@ -333,6 +335,8 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
EndCompressor(AH, ctx->cs);
+ ctx->cs = NULL;
+
/* Send the end marker */
WriteInt(AH, 0);
}
@@ -377,7 +381,9 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
WriteInt(AH, oid);
- ctx->cs = AllocateCompressor(AH->compression_spec, _CustomWriteFunc);
+ ctx->cs = AllocateCompressor(AH->compression_spec,
+ NULL,
+ _CustomWriteFunc);
}
/*
@@ -566,7 +572,12 @@ _PrintTocData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintData(ArchiveHandle *AH)
{
- ReadDataFromArchive(AH, AH->compression_spec, _CustomReadFunc);
+ CompressorState *cs;
+
+ cs = AllocateCompressor(AH->compression_spec,
+ _CustomReadFunc, NULL);
+ cs->readData(AH, cs);
+ EndCompressor(AH, cs);
}
static void
@@ -977,7 +988,7 @@ _readBlockHeader(ArchiveHandle *AH, int *type, int *id)
}
/*
- * Callback function for WriteDataToArchive. Writes one block of (compressed)
+ * Callback function for writeData. Writes one block of (compressed)
* data to the archive.
*/
static void
@@ -992,7 +1003,7 @@ _CustomWriteFunc(ArchiveHandle *AH, const char *buf, size_t len)
}
/*
- * Callback function for ReadDataFromArchive. To keep things simple, we
+ * Callback function for readData. To keep things simple, we
* always read one compressed block at a time.
*/
static size_t
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 6800c3ccee..a2f88995c0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -50,9 +50,8 @@ typedef struct
*/
char *directory;
- cfp *dataFH; /* currently open data file */
-
- cfp *LOsTocFH; /* file handle for blobs.toc */
+ CompressFileHandle *dataFH; /* currently open data file */
+ CompressFileHandle *LOsTocFH; /* file handle for blobs.toc */
ParallelState *pstate; /* for parallel backup / restore */
} lclContext;
@@ -198,11 +197,11 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
else
{ /* Read Mode */
char fname[MAXPGPATH];
- cfp *tocFH;
+ CompressFileHandle *tocFH;
setFilePath(AH, fname, "toc.dat");
- tocFH = cfopen_read(fname, PG_BINARY_R);
+ tocFH = InitDiscoverCompressFileHandle(fname, PG_BINARY_R);
if (tocFH == NULL)
pg_fatal("could not open input file \"%s\": %m", fname);
@@ -218,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -327,9 +326,9 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
setFilePath(AH, fname, tctx->filename);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W,
- AH->compression_spec);
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -346,15 +345,16 @@ static void
_WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && cfwrite(data, dLen, ctx->dataFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (cfclose(ctx->dataFH) != 0)
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -385,26 +385,25 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
size_t cnt;
char *buf;
size_t buflen;
- cfp *cfp;
+ CompressFileHandle *CFH;
if (!filename)
return;
- cfp = cfopen_read(filename, PG_BINARY_R);
-
- if (!cfp)
+ CFH = InitDiscoverCompressFileHandle(filename, PG_BINARY_R);
+ if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = cfread(buf, buflen, cfp)))
+ while ((cnt = CFH->read_func(buf, buflen, CFH)))
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (cfclose(cfp) != 0)
+ if (EndCompressFileHandle(CFH) != 0)
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -435,6 +434,7 @@ _LoadLOs(ArchiveHandle *AH)
{
Oid oid;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH;
char tocfname[MAXPGPATH];
char line[MAXPGPATH];
@@ -442,14 +442,14 @@ _LoadLOs(ArchiveHandle *AH)
setFilePath(AH, tocfname, "blobs.toc");
- ctx->LOsTocFH = cfopen_read(tocfname, PG_BINARY_R);
+ CFH = ctx->LOsTocFH = InitDiscoverCompressFileHandle(tocfname, PG_BINARY_R);
if (ctx->LOsTocFH == NULL)
pg_fatal("could not open large object TOC file \"%s\" for input: %m",
tocfname);
/* Read the LOs TOC file line-by-line, and process each LO */
- while ((cfgets(ctx->LOsTocFH, line, MAXPGPATH)) != NULL)
+ while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
char lofname[MAXPGPATH + 1];
char path[MAXPGPATH];
@@ -464,11 +464,11 @@ _LoadLOs(ArchiveHandle *AH)
_PrintFileData(AH, path);
EndRestoreLO(AH, oid);
}
- if (!cfeof(ctx->LOsTocFH))
+ if (!CFH->eof_func(CFH))
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (cfclose(ctx->LOsTocFH) != 0)
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -488,15 +488,16 @@ _WriteByte(ArchiveHandle *AH, const int i)
{
unsigned char c = (unsigned char) i;
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(&c, 1, ctx->dataFH) != 1)
+ if (CFH->write_func(&c, 1, CFH) != 1)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
return 1;
@@ -512,8 +513,9 @@ static int
_ReadByte(ArchiveHandle *AH)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
- return cfgetc(ctx->dataFH);
+ return CFH->getc_func(CFH);
}
/*
@@ -524,15 +526,16 @@ static void
_WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (cfwrite(buf, len, ctx->dataFH) != len)
+ if (CFH->write_func(buf, len, CFH) != len)
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
errno = ENOSPC;
pg_fatal("could not write to output file: %s",
- get_cfp_error(ctx->dataFH));
+ CFH->get_error_func(CFH));
}
}
@@ -545,12 +548,13 @@ static void
_ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->dataFH;
/*
- * If there was an I/O error, we already exited in cfread(), so here we
+ * If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (cfread(buf, len, ctx->dataFH) != len)
+ if (CFH->read_func(buf, len, CFH) != len)
pg_fatal("could not read from input file: end of file");
}
@@ -573,7 +577,7 @@ _CloseArchive(ArchiveHandle *AH)
if (AH->mode == archModeWrite)
{
- cfp *tocFH;
+ CompressFileHandle *tocFH;
pg_compress_specification compression_spec = {0};
char fname[MAXPGPATH];
@@ -584,8 +588,8 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- tocFH = cfopen_write(fname, PG_BINARY_W, compression_spec);
- if (tocFH == NULL)
+ tocFH = InitCompressFileHandle(compression_spec);
+ if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -598,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (cfclose(tocFH) != 0)
+ if (EndCompressFileHandle(tocFH) != 0)
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -649,8 +653,8 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
- ctx->LOsTocFH = cfopen_write(fname, "ab", compression_spec);
- if (ctx->LOsTocFH == NULL)
+ ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
+ if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -667,9 +671,8 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
- ctx->dataFH = cfopen_write(fname, PG_BINARY_W, AH->compression_spec);
-
- if (ctx->dataFH == NULL)
+ ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
+ if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -682,18 +685,19 @@ static void
_EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
{
lclContext *ctx = (lclContext *) AH->formatData;
+ CompressFileHandle *CFH = ctx->LOsTocFH;
char buf[50];
int len;
- /* Close the LO data file itself */
- if (cfclose(ctx->dataFH) != 0)
- pg_fatal("could not close LO data file: %m");
+ /* Close the BLOB data file itself */
+ if (EndCompressFileHandle(ctx->dataFH) != 0)
+ pg_fatal("could not close blob data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (cfwrite(buf, len, ctx->LOsTocFH) != len)
- pg_fatal("could not write to LOs TOC file");
+ if (CFH->write_func(buf, len, CFH) != len)
+ pg_fatal("could not write to blobs TOC file");
}
/*
@@ -706,8 +710,8 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (cfclose(ctx->LOsTocFH) != 0)
- pg_fatal("could not close LOs TOC file: %m");
+ if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ pg_fatal("could not close blobs TOC file: %m");
ctx->LOsTocFH = NULL;
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index d92247c915..78454928cc 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -94,7 +94,7 @@ my %pgdump_runs = (
command => [
'pg_restore', '-l', "$tempdir/compression_gzip_custom.dump",
],
- expected => qr/Compression: 1/,
+ expected => qr/Compression: gzip/,
name => 'data content is gzip-compressed'
},
},
@@ -239,8 +239,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_custom_format.dump", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default if available',
},
},
@@ -264,8 +264,8 @@ my %pgdump_runs = (
command =>
[ 'pg_restore', '-l', "$tempdir/defaults_dir_format", ],
expected => $supports_gzip ?
- qr/Compression: -1/ :
- qr/Compression: 0/,
+ qr/Compression: gzip/ :
+ qr/Compression: none/,
name => 'data content is gzip-compressed by default',
},
glob_patterns => [
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 9a471e40aa..b48c173022 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -14,6 +14,10 @@
#ifndef PG_COMPRESSION_H
#define PG_COMPRESSION_H
+/*
+ * These values are stored on disk, for example in files generated by pg_dump.
+ * Create the necessary backwards compatibility layers if their order changes.
+ */
typedef enum pg_compress_algorithm
{
PG_COMPRESSION_NONE,
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index e52fe9f509..db429474a2 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -150,7 +150,9 @@ do
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
+ test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
+ test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
test "$f" = src/bin/pg_dump/pg_dump.h && continue
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 22ea42c16b..d4bb7442be 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -429,6 +429,7 @@ CompiledExprState
CompositeIOData
CompositeTypeStmt
CompoundAffixFlag
+CompressFileHandle
CompressionLocation
CompressorState
ComputeXidHorizonsResult
@@ -1035,6 +1036,7 @@ GucStack
GucStackState
GucStringAssignHook
GucStringCheckHook
+GzipCompressorState
HANDLE
HASHACTION
HASHBUCKET
--
2.34.1
Thanks for v30 with the updated commit messages. I've pushed 0001 after
fixing a comment typo and removing (I think) an unnecessary change in an
error message.
I'll give the buildfarm a bit of time before pushing 0002 and 0003.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2/23/23 16:26, Tomas Vondra wrote:
Thanks for v30 with the updated commit messages. I've pushed 0001 after
fixing a comment typo and removing (I think) an unnecessary change in an
error message.

I'll give the buildfarm a bit of time before pushing 0002 and 0003.
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!
I wonder how difficult it would be to add zstd compression, so that
we don't have the annoying "unsupported" cases.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Feb 23, 2023 at 09:24:46PM +0100, Tomas Vondra wrote:
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!
A big thanks from me to everyone involved.
I wonder how difficult it would be to add zstd compression, so that
we don't have the annoying "unsupported" cases.
I'll send a patch soon. I first submitted patches for that 2 years ago
(before PGDG was ready to add zstd).
https://commitfest.postgresql.org/31/2888/
--
Justin
On Thu, Feb 23, 2023 at 07:51:16PM -0600, Justin Pryzby wrote:
On Thu, Feb 23, 2023 at 09:24:46PM +0100, Tomas Vondra wrote:
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!

A big thanks from me to everyone involved.
Wow, nice! The APIs are clear to follow.
I'll send a patch soon. I first submitted patches for that 2 years ago
(before PGDG was ready to add zstd).
https://commitfest.postgresql.org/31/2888/
Thanks. It should be straight-forward to see that in 16, I guess.
--
Michael
------- Original Message -------
On Friday, February 24th, 2023 at 5:35 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Feb 23, 2023 at 07:51:16PM -0600, Justin Pryzby wrote:
On Thu, Feb 23, 2023 at 09:24:46PM +0100, Tomas Vondra wrote:
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!

A big thanks from me to everyone involved.
Wow, nice! The APIs are clear to follow.
I am out of words, thank you all so very much. I learned a lot.
I'll send a patch soon. I first submitted patches for that 2 years ago
(before PGDG was ready to add zstd).
https://commitfest.postgresql.org/31/2888/

Thanks. It should be straight-forward to see that in 16, I guess.
--
Michael
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.
- I'm unclear about get_error_func(). That's called in three places
from pg_backup_directory.c, after failures from write_func(), to
supply a compression-specific error message to pg_fatal(). But it's
not being used outside of directory format, nor for errors for other
function pointers, or even for all errors in write_func(). Is there
some reason why each compression method's write_func() shouldn't call
pg_fatal() directly, with its compression-specific message ?
- I still think supports_compression() should be renamed, or made into a
static function in the necessary file. The main reason is that it's
more clear what it indicates - whether compression is "implemented by
pgdump" and not whether compression is "supported by this postgres
build". It also seems possible that we'd want to add a function
called something like supports_compression(), indicating whether the
algorithm is supported by the current build. It'd be better if pgdump
didn't subjugate that name.
- Finally, the "Nothing to do in the default case" comment comes from
Michael's commit 5e73a6048:
+ /*
+ * Custom and directory formats are compressed by default with gzip when
+ * available, not the others.
+ */
+ if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+ !user_compression_defined)
{
#ifdef HAVE_LIBZ
- if (archiveFormat == archCustom || archiveFormat == archDirectory)
- compressLevel = Z_DEFAULT_COMPRESSION;
- else
+ parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+ &compression_spec);
+#else
+ /* Nothing to do in the default case */
#endif
- compressLevel = 0;
}
As the comment says: for -Fc and -Fd, the compression is set to zlib, if
enabled, and when not otherwise specified by the user.
Before 5e73a6048, this set compressLevel=0 for -Fp and -Ft, *and* when
zlib was unavailable.
But I'm not sure why there's now an empty "#else". I also don't know
what "the default case" refers to.
Maybe the best thing here is to move the preprocessor #if, since it's no
longer in the middle of a runtime conditional:
#ifdef HAVE_LIBZ
+ if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+ !user_compression_defined)
+ parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+ &compression_spec);
#endif
...but that elicits a warning about "variable set but not used"...
--
Justin
Attachments:
0001-f-fixes-for-LZ4.patch (text/x-diff; charset=us-ascii)
From e31901414a8509317297972d1033c2e08629d903 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Fri, 24 Feb 2023 14:07:09 -0600
Subject: [PATCH] f!fixes for LZ4
---
doc/src/sgml/ref/pg_dump.sgml | 4 ++--
src/bin/pg_dump/compress_io.c | 2 +-
src/bin/pg_dump/compress_io.h | 4 ++--
src/bin/pg_dump/compress_lz4.c | 4 ++--
src/bin/pg_dump/pg_backup_archiver.c | 4 ++--
src/bin/pg_dump/pg_dump.c | 2 --
src/bin/pg_dump/t/002_pg_dump.pl | 6 +++---
7 files changed, 12 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 49d218905fb..6fbe49f7ede 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -331,7 +331,7 @@ PostgreSQL documentation
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
can be compressed with the <application>gzip</application> or
- <application>lz4</application>tool.
+ <application>lz4</application> tools.
This format is compressed by default using <literal>gzip</literal>
and also supports parallel dumps.
</para>
@@ -655,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>lz4</literal> or <literal>none</literal> for no compression.
+ <literal>lz4</literal>, or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ce06f1eac9c..9239dbb2755 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -83,7 +83,7 @@
* used by the caller in an error message.
*/
char *
-supports_compression(const pg_compress_specification compression_spec)
+pgdump_supports_compression(const pg_compress_specification compression_spec)
{
const pg_compress_algorithm algorithm = compression_spec.algorithm;
bool supported = false;
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index bbde2693915..46815fa2ebe 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,7 +21,7 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-extern char *supports_compression(const pg_compress_specification compression_spec);
+extern char *pgdump_supports_compression(const pg_compress_specification compression_spec);
/*
* Prototype for callback function used in writeData()
@@ -172,7 +172,7 @@ struct CompressFileHandle
extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
/*
- * Initialize a compress file stream. Deffer the compression algorithm
+ * Initialize a compress file stream. Infer the compression algorithm
* from 'path', either by examining its suffix or by appending the supported
* suffixes in 'path'.
*/
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index fe1014e6e77..63e794cdc68 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -161,8 +161,8 @@ typedef struct LZ4File
} LZ4File;
/*
- * LZ4 equivalent to feof() or gzeof(). The end of file is reached if there
- * is no decompressed output in the overflow buffer and the end of the file
+ * LZ4 equivalent to feof() or gzeof(). Return true iff there is no
+ * decompressed output in the overflow buffer and the end of the backing file
* is reached.
*/
static int
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 61ebb8fe85d..2063d6f239d 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -388,7 +388,7 @@ RestoreArchive(Archive *AHX)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
{
- char *errmsg = supports_compression(AH->compression_spec);
+ char *errmsg = pgdump_supports_compression(AH->compression_spec);
if (errmsg)
pg_fatal("cannot restore from compressed archive (%s)",
errmsg);
@@ -3745,7 +3745,7 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
- errmsg = supports_compression(AH->compression_spec);
+ errmsg = pgdump_supports_compression(AH->compression_spec);
if (errmsg)
{
pg_log_warning("archive is compressed, but this installation does not support compression (%s) -- no data will be available",
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 24ba936332d..ce2242195f3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -733,8 +733,6 @@ main(int argc, char **argv)
#ifdef HAVE_LIBZ
parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
&compression_spec);
-#else
- /* Nothing to do in the default case */
#endif
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 72b19ee6cde..ad7bc5c194b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -4248,10 +4248,10 @@ foreach my $run (sort keys %pgdump_runs)
my $test_key = $run;
my $run_db = 'postgres';
- # Skip command-level tests for gzip if there is no support for it.
+ # Skip command-level tests for gzip/lz4 if they're not supported.
if ($pgdump_runs{$run}->{compile_option} &&
- ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
- ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
--
2.34.1
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.
One more - WriteDataToArchiveGzip() says:
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
That was added at e9960732a.
But if you specify gzip:0, the compression level is already enforced by
validate_compress_specification(), before hitting gzip.c:
| pg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)
5e73a6048 intended that to work as before, and you *can* specify -Z0:
The change is backward-compatible, hence specifying only an integer
leads to no compression for a level of 0 and gzip compression when the
level is greater than 0.
$ time ./src/bin/pg_dump/pg_dump -h /tmp regression -t int8_tbl -Fp --compress 0 |file -
/dev/stdin: ASCII text
Right now, I think that pg_fatal in gzip.c is dead code - that was first
added in the patch version sent on 21 Dec 2022.
--
Justin
Attachments:
0001-f-fixes-for-LZ4.patch (text/x-diff; charset=us-ascii)
From 07b446803ec89ccd0954583640f314fa7f77048f Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Fri, 24 Feb 2023 14:07:09 -0600
Subject: [PATCH] f!fixes for LZ4
---
doc/src/sgml/ref/pg_dump.sgml | 4 ++--
src/bin/pg_dump/compress_gzip.c | 7 -------
src/bin/pg_dump/compress_io.c | 2 +-
src/bin/pg_dump/compress_io.h | 4 ++--
src/bin/pg_dump/compress_lz4.c | 4 ++--
src/bin/pg_dump/pg_backup_archiver.c | 4 ++--
src/bin/pg_dump/pg_dump.c | 2 --
src/bin/pg_dump/t/002_pg_dump.pl | 6 +++---
8 files changed, 12 insertions(+), 21 deletions(-)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 49d218905fb..6fbe49f7ede 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -331,7 +331,7 @@ PostgreSQL documentation
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
can be compressed with the <application>gzip</application> or
- <application>lz4</application>tool.
+ <application>lz4</application> tools.
This format is compressed by default using <literal>gzip</literal>
and also supports parallel dumps.
</para>
@@ -655,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>lz4</literal> or <literal>none</literal> for no compression.
+ <literal>lz4</literal>, or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 0af65afeb4e..af68fd27a12 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -123,13 +123,6 @@ WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
gzipcs->outsize = ZLIB_OUT_SIZE;
- /*
- * A level of zero simply copies the input one block at the time. This
- * is probably not what the user wanted when calling this interface.
- */
- if (cs->compression_spec.level == 0)
- pg_fatal("requested to compress the archive yet no level was specified");
-
if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
pg_fatal("could not initialize compression library: %s", zp->msg);
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ce06f1eac9c..9239dbb2755 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -83,7 +83,7 @@
* used by the caller in an error message.
*/
char *
-supports_compression(const pg_compress_specification compression_spec)
+pgdump_supports_compression(const pg_compress_specification compression_spec)
{
const pg_compress_algorithm algorithm = compression_spec.algorithm;
bool supported = false;
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index bbde2693915..46815fa2ebe 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -21,7 +21,7 @@
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
-extern char *supports_compression(const pg_compress_specification compression_spec);
+extern char *pgdump_supports_compression(const pg_compress_specification compression_spec);
/*
* Prototype for callback function used in writeData()
@@ -172,7 +172,7 @@ struct CompressFileHandle
extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
/*
- * Initialize a compress file stream. Deffer the compression algorithm
+ * Initialize a compress file stream. Infer the compression algorithm
* from 'path', either by examining its suffix or by appending the supported
* suffixes in 'path'.
*/
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index fe1014e6e77..63e794cdc68 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -161,8 +161,8 @@ typedef struct LZ4File
} LZ4File;
/*
- * LZ4 equivalent to feof() or gzeof(). The end of file is reached if there
- * is no decompressed output in the overflow buffer and the end of the file
+ * LZ4 equivalent to feof() or gzeof(). Return true iff there is no
+ * decompressed output in the overflow buffer and the end of the backing file
* is reached.
*/
static int
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 61ebb8fe85d..2063d6f239d 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -388,7 +388,7 @@ RestoreArchive(Archive *AHX)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
{
- char *errmsg = supports_compression(AH->compression_spec);
+ char *errmsg = pgdump_supports_compression(AH->compression_spec);
if (errmsg)
pg_fatal("cannot restore from compressed archive (%s)",
errmsg);
@@ -3745,7 +3745,7 @@ ReadHead(ArchiveHandle *AH)
else
AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
- errmsg = supports_compression(AH->compression_spec);
+ errmsg = pgdump_supports_compression(AH->compression_spec);
if (errmsg)
{
pg_log_warning("archive is compressed, but this installation does not support compression (%s) -- no data will be available",
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 24ba936332d..ce2242195f3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -733,8 +733,6 @@ main(int argc, char **argv)
#ifdef HAVE_LIBZ
parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
&compression_spec);
-#else
- /* Nothing to do in the default case */
#endif
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 72b19ee6cde..ad7bc5c194b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -4248,10 +4248,10 @@ foreach my $run (sort keys %pgdump_runs)
my $test_key = $run;
my $run_db = 'postgres';
- # Skip command-level tests for gzip if there is no support for it.
+ # Skip command-level tests for gzip/lz4 if they're not supported.
if ($pgdump_runs{$run}->{compile_option} &&
- ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
- ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
--
2.34.1
On 2/25/23 06:02, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.

- I'm unclear about get_error_func(). That's called in three places
from pg_backup_directory.c, after failures from write_func(), to
supply an compression-specific error message to pg_fatal(). But it's
not being used outside of directory format, nor for errors for other
function pointers, or even for all errors in write_func(). Is there
some reason why each compression method's write_func() shouldn't call
pg_fatal() directly, with its compression-specific message ?
I think there are a couple more places that might/should call
get_error_func(). For example ahwrite() in pg_backup_archiver.c now
simply does
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
but perhaps it should call get_error_func() too. There are probably
other places that call write_func() and should use get_error_func().
- I still think supports_compression() should be renamed, or made into a
static function in the necessary file. The main reason is that it's
more clear what it indicates - whether compression is "implemented by
pgdump" and not whether compression is "supported by this postgres
build". It also seems possible that we'd want to add a function
called something like supports_compression(), indicating whether the
algorithm is supported by the current build. It'd be better if pgdump
didn't subjugate that name.
If we choose to rename this to have pgdump_ prefix, fine with me. But I
don't think there's a realistic chance of conflict, as it's restricted
to pgdump header etc. And it's not part of an API, so I guess we could
rename that in the future if needed.
- Finally, the "Nothing to do in the default case" comment comes from
Michael's commit 5e73a6048:

+   /*
+    * Custom and directory formats are compressed by default with gzip when
+    * available, not the others.
+    */
+   if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+       !user_compression_defined)
    {
#ifdef HAVE_LIBZ
-       if (archiveFormat == archCustom || archiveFormat == archDirectory)
-           compressLevel = Z_DEFAULT_COMPRESSION;
-       else
+       parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+                                    &compression_spec);
+#else
+       /* Nothing to do in the default case */
#endif
-       compressLevel = 0;
    }

As the comment says: for -Fc and -Fd, the compression is set to zlib, if
enabled, and when not otherwise specified by the user.

Before 5e73a6048, this set compressLevel=0 for -Fp and -Ft, *and* when
zlib was unavailable.

But I'm not sure why there's now an empty "#else". I also don't know
what "the default case" refers to.

Maybe the best thing here is to move the preprocessor #if, since it's no
longer in the middle of a runtime conditional:

#ifdef HAVE_LIBZ
+   if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+       !user_compression_defined)
+       parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+                                    &compression_spec);
#endif

...but that elicits a warning about "variable set but not used"...
Not sure, I need to think about this a bit.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Feb 25, 2023 at 08:05:53AM -0600, Justin Pryzby wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.

One more - WriteDataToArchiveGzip() says:
One more again.
The LZ4 path is using non-streaming mode, which compresses each block
without persistent state, giving poor compression for -Fc compared with
-Fp. If the data is highly compressible, the difference can be orders
of magnitude.
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fp |wc -c
12351763
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fc |wc -c
21890708
That's not true for gzip:
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z gzip -Fc |wc -c
2118869
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z gzip -Fp |wc -c
2115832
The function ought to at least use streaming mode, so each block/row
isn't compressed in isolation. 003 is a simple patch to use
streaming mode, which improves the -Fc case:
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fc |wc -c
15178283
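The gain from streaming mode comes from carrying compressor state across rows, so later rows can reference patterns seen in earlier ones. The effect can be sketched with zlib from the Python standard library (an analogy only -- pg_dump's code path uses liblz4, and the row data here is made up):

```python
import zlib

# Stand-in for rows of a narrow, repetitive table (hypothetical data).
rows = [("%d,some repetitive value\n" % i).encode() for i in range(1000)]

# Per-block mode: fresh compressor state per row, so no row can
# reference earlier rows (analogous to LZ4_compress_default per block).
per_block = sum(len(zlib.compress(r)) for r in rows)

# Streaming mode: one compressor keeps its history window across rows
# (analogous to LZ4_compress_fast_continue).
c = zlib.compressobj()
streamed = sum(len(c.compress(r)) for r in rows) + len(c.flush())

print(per_block, streamed)
```

For repetitive rows the streamed total is a small fraction of the per-block total, mirroring the -Fp vs. -Fc gap above.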
However, that still flushes the compression buffer, writing a block
header, for every row. With a single-column table, pg_dump -Fc -Z lz4
still outputs ~10% *more* data than with no compression at all. And
that's for compressible data.
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Fc -Z lz4 |wc -c
12890296
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Fc -Z none |wc -c
11890296
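That ~10% inflation is pure per-row flush overhead: every flush emits a block header even when almost no data follows it. The same effect shows up with zlib sync flushes (again just a Python/zlib analogy, not the lz4 block format):

```python
import zlib

# Stand-in for rows of a single-column table (hypothetical data).
rows = [("%d,x\n" % i).encode() for i in range(1000)]

# Flush after every row: each flush emits a block boundary plus padding,
# analogous to lz4 writing a block header per row.
c1 = zlib.compressobj()
flushed = sum(len(c1.compress(r)) + len(c1.flush(zlib.Z_SYNC_FLUSH))
              for r in rows) + len(c1.flush())

# Buffer rows and flush once at the end.
c2 = zlib.compressobj()
buffered = sum(len(c2.compress(r)) for r in rows) + len(c2.flush())

print(flushed, buffered)
```

With tiny rows, the per-row flush overhead dominates and can exceed the size of the data itself.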
I think this should use the LZ4F API with frames, which are buffered to
avoid outputting a header for every single row. The LZ4F format isn't
compatible with the LZ4 format, so (unlike changing to the streaming
API) that's not something we can change in a bugfix release. I consider
this an Open Item.
With the LZ4F API in 004, -Fp and -Fc are essentially the same size
(like gzip). (Oh, and the output is three times smaller, too.)
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z lz4 -Fp |wc -c
4155448
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z lz4 -Fc |wc -c
4156548
--
Justin
Attachments:
0001-f-fixes-for-LZ4.patch (text/x-diff; charset=us-ascii)
From 3a980f956bf314fb161fbf0a76f62ed0c2c35bfe Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Fri, 24 Feb 2023 14:07:09 -0600
Subject: [PATCH 1/4] f!fixes for LZ4
---
src/bin/pg_dump/compress_gzip.c | 8 --------
src/bin/pg_dump/compress_io.h | 2 +-
src/bin/pg_dump/compress_lz4.c | 12 ++++--------
src/bin/pg_dump/pg_dump.c | 2 --
4 files changed, 5 insertions(+), 19 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 0af65afeb4e..52f41c2e58c 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -123,17 +123,9 @@ WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
gzipcs->outsize = ZLIB_OUT_SIZE;
- /*
- * A level of zero simply copies the input one block at the time. This
- * is probably not what the user wanted when calling this interface.
- */
- if (cs->compression_spec.level == 0)
- pg_fatal("requested to compress the archive yet no level was specified");
-
if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
pg_fatal("could not initialize compression library: %s", zp->msg);
- /* Just be paranoid - maybe End is called after Start, with no Write */
zp->next_out = gzipcs->outbuf;
zp->avail_out = gzipcs->outsize;
}
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index bbde2693915..cdb15951ea9 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -172,7 +172,7 @@ struct CompressFileHandle
extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specification compression_spec);
/*
- * Initialize a compress file stream. Deffer the compression algorithm
+ * Initialize a compress file stream. Infer the compression algorithm
* from 'path', either by examining its suffix or by appending the supported
* suffixes in 'path'.
*/
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index fe1014e6e77..7dacfeae469 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -105,12 +105,8 @@ EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
LZ4CompressorState *LZ4cs;
LZ4cs = (LZ4CompressorState *) cs->private_data;
- if (LZ4cs)
- {
- pg_free(LZ4cs->outbuf);
- pg_free(LZ4cs);
- cs->private_data = NULL;
- }
+ pg_free(LZ4cs->outbuf);
+ pg_free(LZ4cs);
}
@@ -161,8 +157,8 @@ typedef struct LZ4File
} LZ4File;
/*
- * LZ4 equivalent to feof() or gzeof(). The end of file is reached if there
- * is no decompressed output in the overflow buffer and the end of the file
+ * LZ4 equivalent to feof() or gzeof(). Return true iff there is no
+ * decompressed output in the overflow buffer and the end of the backing file
* is reached.
*/
static int
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 24ba936332d..ce2242195f3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -733,8 +733,6 @@ main(int argc, char **argv)
#ifdef HAVE_LIBZ
parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
&compression_spec);
-#else
- /* Nothing to do in the default case */
#endif
}
--
2.34.1
0002-f-fixes-for-LZ4-which-also-conflict-with-the-ZSTD-pa.patch (text/x-diff; charset=us-ascii)
From 8e846443a7a3e5e2df37fabc917ae3964d6d1500 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sun, 26 Feb 2023 17:47:09 -0600
Subject: [PATCH 2/4] f!fixes for LZ4 which also conflict with the ZSTD patch
---
doc/src/sgml/ref/pg_dump.sgml | 4 ++--
src/bin/pg_dump/t/002_pg_dump.pl | 6 +++---
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 49d218905fb..6fbe49f7ede 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -331,7 +331,7 @@ PostgreSQL documentation
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
can be compressed with the <application>gzip</application> or
- <application>lz4</application>tool.
+ <application>lz4</application> tools.
This format is compressed by default using <literal>gzip</literal>
and also supports parallel dumps.
</para>
@@ -655,7 +655,7 @@ PostgreSQL documentation
<para>
Specify the compression method and/or the compression level to use.
The compression method can be set to <literal>gzip</literal> or
- <literal>lz4</literal> or <literal>none</literal> for no compression.
+ <literal>lz4</literal>, or <literal>none</literal> for no compression.
A compression detail string can optionally be specified. If the
detail string is an integer, it specifies the compression level.
Otherwise, it should be a comma-separated list of items, each of the
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 72b19ee6cde..ad7bc5c194b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -4248,10 +4248,10 @@ foreach my $run (sort keys %pgdump_runs)
my $test_key = $run;
my $run_db = 'postgres';
- # Skip command-level tests for gzip if there is no support for it.
+ # Skip command-level tests for gzip/lz4 if they're not supported.
if ($pgdump_runs{$run}->{compile_option} &&
- ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
- ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
--
2.34.1
0003-pg_dump-lz4-use-lz4-streaming-compression.patch (text/x-diff; charset=us-ascii)
From 083f1092aaf6e32a99649f02ad986df66a1d0d82 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sun, 26 Feb 2023 11:32:34 -0600
Subject: [PATCH 3/4] pg_dump/lz4: use lz4 streaming compression..
Since 0da243fed, each row was independently compressed, but that gives
poor compression (especially for tables with few columns) due to not
sharing compression state.
Also, respect the user's requested compression level.
---
src/bin/pg_dump/compress_lz4.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 7dacfeae469..32a8f668907 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -40,6 +40,7 @@ typedef struct LZ4CompressorState
{
char *outbuf;
size_t outsize;
+ LZ4_stream_t *stream;
} LZ4CompressorState;
/* Private routines that support LZ4 compressed data I/O */
@@ -90,8 +91,12 @@ WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
LZ4cs->outsize = requiredsize;
}
- compressed = LZ4_compress_default(data, LZ4cs->outbuf,
- dLen, LZ4cs->outsize);
+ if (LZ4cs->stream == NULL)
+ LZ4cs->stream = LZ4_createStream();
+
+ compressed = LZ4_compress_fast_continue(LZ4cs->stream, data, LZ4cs->outbuf,
+ dLen, LZ4cs->outsize,
+ AH->compression_spec.level);
if (compressed <= 0)
pg_fatal("failed to LZ4 compress data");
@@ -105,6 +110,10 @@ EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
LZ4CompressorState *LZ4cs;
LZ4cs = (LZ4CompressorState *) cs->private_data;
+
+ if (LZ4cs->stream != NULL)
+ LZ4_freeStream(LZ4cs->stream);
+
pg_free(LZ4cs->outbuf);
pg_free(LZ4cs);
}
--
2.34.1
0004-WIP-change-to-use-LZ4-frame-API.patch (text/x-diff; charset=us-ascii)
From 9c4fc70af1bba238b639639a55df8f26d44e36fe Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sun, 26 Feb 2023 11:41:47 -0600
Subject: [PATCH 4/4] WIP: +change to use LZ4 frame API
This uses buffering to avoid flushing and writing a block header for
each row, for an additional improvement in compressed size.
XXX: update archive version since this changes the meaning of "lz4" in
the header
---
src/bin/pg_dump/compress_lz4.c | 102 +++++++++++++++++++++++++--------
1 file changed, 79 insertions(+), 23 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 32a8f668907..88a819c553b 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -40,7 +40,7 @@ typedef struct LZ4CompressorState
{
char *outbuf;
size_t outsize;
- LZ4_stream_t *stream;
+ LZ4F_compressionContext_t ctx;
} LZ4CompressorState;
/* Private routines that support LZ4 compressed data I/O */
@@ -52,29 +52,59 @@ static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
static void
ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
{
- LZ4_streamDecode_t lz4StreamDecode;
+ size_t res;
char *buf;
char *decbuf;
size_t buflen;
+ size_t decbuflen;
size_t cnt;
+ LZ4F_decompressOptions_t opts = {.stableDst = 0}; /* not stable */
+ LZ4F_decompressionContext_t dtx;
+
+ res = LZ4F_createDecompressionContext(&dtx, LZ4F_VERSION);
+ if (LZ4F_isError(res))
+ pg_fatal("failed to LZ4F_createDecompressionContext: %s",
+ LZ4F_getErrorName(res));
+
buflen = LZ4_IN_SIZE;
buf = pg_malloc(buflen);
- decbuf = pg_malloc(buflen);
- LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+ decbuflen = LZ4_IN_SIZE;
+ decbuf = pg_malloc(LZ4_IN_SIZE);
while ((cnt = cs->readF(AH, &buf, &buflen)))
{
- int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
- buf, decbuf,
- cnt, buflen);
+ char *end = buf + cnt;
- ahwrite(decbuf, 1, decBytes, AH);
+ for (char *ptr = buf; ptr != end;)
+ {
+ size_t decBytes = decbuflen;
+ size_t srcBytes = cnt;
+
+ res = LZ4F_decompress(dtx,
+ decbuf, &decBytes,
+ ptr, &srcBytes, &opts);
+ if (LZ4F_isError(res))
+ pg_fatal("failed to LZ4F_decompress: %s", LZ4F_getErrorName(res));
+
+ ptr += srcBytes;
+ cnt -= srcBytes;
+ if (decBytes > 0)
+ ahwrite(decbuf, 1, decBytes, AH);
+
+ if (decbuflen < res)
+ {
+ /* resize the buffer to the expected size */
+ decbuf = pg_realloc(decbuf, res);
+ decbuflen = res;
+ }
+ }
}
pg_free(buf);
pg_free(decbuf);
+ LZ4F_freeDecompressionContext(dtx);
}
static void
@@ -82,40 +112,66 @@ WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
- size_t compressed;
- size_t requiredsize = LZ4_compressBound(dLen);
+ size_t requiredsize = LZ4F_compressBound(dLen, NULL);
+ size_t res;
+ char *oldout = LZ4cs->outbuf;
if (requiredsize > LZ4cs->outsize)
{
LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
LZ4cs->outsize = requiredsize;
}
- if (LZ4cs->stream == NULL)
- LZ4cs->stream = LZ4_createStream();
-
- compressed = LZ4_compress_fast_continue(LZ4cs->stream, data, LZ4cs->outbuf,
- dLen, LZ4cs->outsize,
- AH->compression_spec.level);
+ if (oldout == NULL)
+ {
+ LZ4F_preferences_t prefs = {
+ .compressionLevel = cs->compression_spec.level
+ };
+
+ res = LZ4F_createCompressionContext(&LZ4cs->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(res))
+ pg_fatal("failed to LZ4F_createCompressionContext: %s", LZ4F_getErrorName(res));
+
+ res = LZ4F_compressBegin(LZ4cs->ctx, LZ4cs->outbuf, LZ4cs->outsize, &prefs);
+ if (LZ4F_isError(res))
+ pg_fatal("failed to LZ4F_compressBegin: %s", LZ4F_getErrorName(res));
+ cs->writeF(AH, LZ4cs->outbuf, res);
+ }
- if (compressed <= 0)
- pg_fatal("failed to LZ4 compress data");
+ res = LZ4F_compressUpdate(LZ4cs->ctx, LZ4cs->outbuf, LZ4cs->outsize,
+ data, dLen, NULL);
+ if (LZ4F_isError(res))
+ pg_fatal("failed to LZ4F_compressUpdate: %s", LZ4F_getErrorName(res));
- cs->writeF(AH, LZ4cs->outbuf, compressed);
+ if (res > 0)
+ cs->writeF(AH, LZ4cs->outbuf, res);
}
static void
EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
{
LZ4CompressorState *LZ4cs;
+ size_t res;
LZ4cs = (LZ4CompressorState *) cs->private_data;
- if (LZ4cs->stream != NULL)
- LZ4_freeStream(LZ4cs->stream);
+ if (LZ4cs->outbuf != NULL)
+ {
+ res = LZ4F_compressEnd(LZ4cs->ctx, LZ4cs->outbuf, LZ4cs->outsize, NULL);
+ if (LZ4F_isError(res))
+ pg_fatal("failed to LZ4F_compressEnd: %s", LZ4F_getErrorName(res));
+
+ if (res > 0)
+ cs->writeF(AH, LZ4cs->outbuf, res);
- pg_free(LZ4cs->outbuf);
- pg_free(LZ4cs);
+ LZ4F_freeCompressionContext(LZ4cs->ctx);
+ if (LZ4F_isError(res))
+ pg_fatal("failed to LZ4F_freeCompressionContext: %s", LZ4F_getErrorName(res));
+
+ pg_free(LZ4cs->outbuf);
+ LZ4cs->outbuf = NULL;
+ pg_free(LZ4cs);
+ }
}
--
2.34.1
------- Original Message -------
On Sunday, February 26th, 2023 at 3:59 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 2/25/23 06:02, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.

- I'm unclear about get_error_func(). That's called in three places
from pg_backup_directory.c, after failures from write_func(), to
supply a compression-specific error message to pg_fatal(). But it's
not being used outside of directory format, nor for errors for other
function pointers, or even for all errors in write_func(). Is there
some reason why each compression method's write_func() shouldn't call
pg_fatal() directly, with its compression-specific message ?

I think there are a couple more places that might/should call
get_error_func(). For example ahwrite() in pg_backup_archiver.c now
simply does

    if (bytes_written != size * nmemb)
        WRITE_ERROR_EXIT;

but perhaps it should call get_error_func() too. There are probably
other places that call write_func() and should use get_error_func().
Agreed, calling get_error_func() would be preferable to a fatal error. It
should be the caller of the API who decides how to proceed.
- I still think supports_compression() should be renamed, or made into a
static function in the necessary file. The main reason is that it's
more clear what it indicates - whether compression is "implemented by
pgdump" and not whether compression is "supported by this postgres
build". It also seems possible that we'd want to add a function
called something like supports_compression(), indicating whether the
algorithm is supported by the current build. It'd be better if pgdump
didn't subjugate that name.

If we choose to rename this to have pgdump_ prefix, fine with me. But I
don't think there's a realistic chance of conflict, as it's restricted
to pgdump header etc. And it's not part of an API, so I guess we could
rename that in the future if needed.
Agreed, it is very unrealistic that one will include that header file anywhere
but within pg_dump. Also, I think that adding a prefix, "pgdump", "pg_dump",
or similar does not add value and subtracts readability.
- Finally, the "Nothing to do in the default case" comment comes from
Michael's commit 5e73a6048:

+   /*
+    * Custom and directory formats are compressed by default with gzip when
+    * available, not the others.
+    */
+   if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+       !user_compression_defined)
    {
#ifdef HAVE_LIBZ
-       if (archiveFormat == archCustom || archiveFormat == archDirectory)
-           compressLevel = Z_DEFAULT_COMPRESSION;
-       else
+       parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+                                    &compression_spec);
+#else
+       /* Nothing to do in the default case */
#endif
-       compressLevel = 0;
    }

As the comment says: for -Fc and -Fd, the compression is set to zlib, if
enabled, and when not otherwise specified by the user.

Before 5e73a6048, this set compressLevel=0 for -Fp and -Ft, *and* when
zlib was unavailable.

But I'm not sure why there's now an empty "#else". I also don't know
what "the default case" refers to.

Maybe the best thing here is to move the preprocessor #if, since it's no
longer in the middle of a runtime conditional:

#ifdef HAVE_LIBZ
+   if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+       !user_compression_defined)
+       parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
+                                    &compression_spec);
#endif

...but that elicits a warning about "variable set but not used"...
Not sure, I need to think about this a bit.
Not having warnings is preferable, isn't it? I can understand the confusion
on the message though. Maybe a phrasing like:
/* Nothing to do for the default case when LIBZ is not available */
is easier to understand.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Saturday, February 25th, 2023 at 3:05 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.
Please find some comments on the rest of the fixes patch that Tomas has not
commented on.
can be compressed with the <application>gzip</application> or
- <application>lz4</application>tool.
+ <application>lz4</application> tools.
+1
The compression method can be set to <literal>gzip</literal> or
- <literal>lz4</literal> or <literal>none</literal> for no compression.
+ <literal>lz4</literal>, or <literal>none</literal> for no compression.
I am not a native English speaker. Yet I think that if one adds commas
in one of the options, then one should add commas to all the options.
Namely, the above is missing a comma between gzip and lz4. However, I
think that not having any commas still works grammatically and
syntactically.
- /*
- * A level of zero simply copies the input one block at the time. This
- * is probably not what the user wanted when calling this interface.
- */
- if (cs->compression_spec.level == 0)
- pg_fatal("requested to compress the archive yet no level was specified");
I disagree with this change. WriteDataToArchiveGzip() is far away from
whatever the code in pg_dump.c is doing. Any invalid value for
level will emit an error when the proper gzip/zlib code is
called. A zero value, however, will not emit such an error. Having the
extra check there is a future-proofing guarantee at very low cost.
Furthermore, it quickly informs the reader of the code about that
specific value, helping with readability and comprehension.
If any change is required, something for which I vote strongly
against, I would at least recommend replacing it with an
assertion.
- * Initialize a compress file stream. Deffer the compression algorithm
+ * Initialize a compress file stream. Infer the compression algorithm
:+1:
- # Skip command-level tests for gzip if there is no support for it.
+ # Skip command-level tests for gzip/lz4 if they're not supported.
We will be back at that again soon. Maybe change to:

    Skip command-level tests for unsupported compression methods

to include everything.
- ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
- ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
Good catch, :+1:
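For the record, the pitfall fixed here is operator precedence: && binds tighter than ||, so the unparenthesized condition groups as (guard && gzip-clause) || lz4-clause, leaving the lz4 clause outside the compile_option guard. The grouping difference in miniature (Python, where and/or bind the same way; the values are illustrative, not the test script's):

```python
guard = False  # stands in for a run without a compile_option

# Unparenthesized: groups as (guard and A) or B, so B escapes the guard.
buggy = guard and True or True
# Parenthesized as intended: the guard short-circuits the whole test.
fixed = guard and (True or True)

print(buggy, fixed)  # True False
```

Only the parenthesized form lets the guard actually guard both clauses.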
Cheers,
//Georgios
--
Justin
On Thu, Feb 23, 2023 at 09:24:46PM +0100, Tomas Vondra wrote:
On 2/23/23 16:26, Tomas Vondra wrote:
Thanks for v30 with the updated commit messages. I've pushed 0001 after
fixing a comment typo and removing (I think) an unnecessary change in an
error message.

I'll give the buildfarm a bit of time before pushing 0002 and 0003.
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!
I found that e9960732a broke writing of empty gzip-compressed data,
specifically LOs. pg_dump succeeds, but then the restore fails:
postgres=# SELECT lo_create(1234);
lo_create | 1234
$ time ./src/bin/pg_dump/pg_dump -h /tmp -d postgres -Fc |./src/bin/pg_dump/pg_restore -f /dev/null -v
pg_restore: implied data-only restore
pg_restore: executing BLOB 1234
pg_restore: processing BLOBS
pg_restore: restoring large object with OID 1234
pg_restore: error: could not uncompress data: (null)
The inline patch below fixes it, but you won't be able to apply it
directly, as it's on top of other patches which rename the functions
back to "Zlib" and rearranges the functions to their original order, to
allow running:
git diff --diff-algorithm=minimal -w e9960732a~:./src/bin/pg_dump/compress_io.c ./src/bin/pg_dump/compress_gzip.c
The current function order avoids 3 lines of declarations, but it's
obviously pretty useful to be able to run that diff command. I already
argued for not calling the functions "Gzip" on the grounds that the name
was inaccurate.
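The inaccuracy is visible in the stream headers themselves: raw zlib output (RFC 1950, what deflateInit() produces) begins with 0x78, while real gzip output (RFC 1952) begins with the magic bytes 0x1f 0x8b. For example, with Python's standard library:

```python
import gzip
import zlib

data = b"hello world"

z = zlib.compress(data)   # zlib wrapper, like the deflateInit() path here
g = gzip.compress(data)   # gzip wrapper, with the 10-byte gzip header

print(z[:2].hex(), g[:2].hex())

assert z[0] == 0x78           # zlib CMF byte (deflate, 32K window)
assert g[:2] == b"\x1f\x8b"   # gzip magic number
```

So a stream written by these "Gzip" functions is not something gunzip can read, which is what makes the name misleading.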
I'd want to create an empty large object in src/test/sql/largeobject.sql
so this case is exercised during pg_upgrade testing. But unfortunately that
doesn't use -Fc, so this isn't hit. Empty input is an important enough
test case to justify a tap test, if there's no better way.
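The underlying invariant is that even zero bytes of input must still yield a complete compressed stream (header, end-of-stream marker, checksum); deferring the deflateInit() to the first write skipped all of that. A Python/zlib sketch of the same invariant (an analogy, not the C fix below):

```python
import zlib

# The "empty large object" case: a stream is opened and closed with
# no data ever written.
c = zlib.compressobj()
out = c.flush()  # finalize with zero compress() calls

print(len(out), zlib.decompress(out))

# A truly empty byte string, by contrast, is not a valid stream at all,
# which is the shape of the "could not uncompress data" failure.
try:
    zlib.decompress(b"")
    raise AssertionError("should not decompress")
except zlib.error:
    pass
```

The fix accordingly moves stream initialization into InitCompressorZlib(), so End always has a valid stream to finalize even when Write was never called.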
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index f3f5e87c9a8..68f3111b2fe 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -55,6 +55,32 @@ InitCompressorZlib(CompressorState *cs,
gzipcs = (ZlibCompressorState *) pg_malloc0(sizeof(ZlibCompressorState));
cs->private_data = gzipcs;
+
+ if (cs->writeF)
+ {
+ z_streamp zp;
+ zp = gzipcs->zp = (z_streamp) pg_malloc0(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to append a
+ * trailing zero byte to the zlib output.
+ */
+
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ if (deflateInit(gzipcs->zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
}
static void
@@ -63,7 +89,7 @@ EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
z_streamp zp;
- if (gzipcs->zp)
+ if (cs->writeF != NULL)
{
zp = gzipcs->zp;
zp->next_in = NULL;
@@ -131,29 +157,6 @@ WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
- z_streamp zp;
-
- if (!gzipcs->zp)
- {
- zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * outsize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to
- * append a trailing zero byte to the zlib output.
- */
- gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
- gzipcs->outsize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
- pg_fatal("could not initialize compression library: %s", zp->msg);
-
- zp->next_out = gzipcs->outbuf;
- zp->avail_out = gzipcs->outsize;
- }
gzipcs->zp->next_in = (void *) unconstify(void *, data);
gzipcs->zp->avail_in = dLen;
Attachments:
0001-Rename-functions-structures-comments-which-don-t-use.patch (text/x-diff; charset=us-ascii)
From 1c707279596f3cffde9c97b450dcbef0b6ddbd94 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Sun, 26 Feb 2023 17:12:03 -0600
Subject: [PATCH 1/3] Rename functions/structures/comments which don't use
"gzip"
zlib's "deflate" functions use a header other than gzip's, so it's
misleading to use names that say gzip.
https://www.postgresql.org/message-id/20221217232615.GS1153%40telsasoft.com
https://www.postgresql.org/message-id/20230127172320.GZ22427%40telsasoft.com
---
src/bin/pg_dump/compress_gzip.c | 42 ++++++++++++++++-----------------
src/bin/pg_dump/compress_gzip.h | 2 +-
src/bin/pg_dump/compress_io.c | 2 +-
3 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 52f41c2e58c..dd769750c8f 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -1,50 +1,50 @@
/*-------------------------------------------------------------------------
*
* compress_gzip.c
- * Routines for archivers to read or write a gzip compressed data stream.
+ * Routines for archivers to read or write a zlib/gzip compressed data streams.
*
* Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* src/bin/pg_dump/compress_gzip.c
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
#include <unistd.h>
#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
#include "zlib.h"
/*----------------------
* Compressor API
*----------------------
*/
-typedef struct GzipCompressorState
+typedef struct ZlibCompressorState
{
z_streamp zp;
void *outbuf;
size_t outsize;
-} GzipCompressorState;
+} ZlibCompressorState;
-/* Private routines that support gzip compressed data I/O */
+/* Private routines that support zlib compressed data I/O */
static void
-DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
{
- GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
z_streamp zp = gzipcs->zp;
void *out = gzipcs->outbuf;
int res = Z_OK;
while (gzipcs->zp->avail_in != 0 || flush)
{
res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
if (res == Z_STREAM_ERROR)
pg_fatal("could not compress data: %s", zp->msg);
if ((flush && (zp->avail_out < gzipcs->outsize))
|| (zp->avail_out == 0)
@@ -68,52 +68,52 @@ DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
}
zp->next_out = out;
zp->avail_out = gzipcs->outsize;
}
if (res == Z_STREAM_END)
break;
}
}
static void
-EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
{
- GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
z_streamp zp;
if (gzipcs->zp)
{
zp = gzipcs->zp;
zp->next_in = NULL;
zp->avail_in = 0;
/* Flush any remaining data from zlib buffer */
- DeflateCompressorGzip(AH, cs, true);
+ DeflateCompressorZlib(AH, cs, true);
if (deflateEnd(zp) != Z_OK)
pg_fatal("could not close compression stream: %s", zp->msg);
pg_free(gzipcs->outbuf);
pg_free(gzipcs->zp);
}
pg_free(gzipcs);
cs->private_data = NULL;
}
static void
-WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
- GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
z_streamp zp;
if (!gzipcs->zp)
{
zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
zp->zalloc = Z_NULL;
zp->zfree = Z_NULL;
zp->opaque = Z_NULL;
/*
* outsize is the buffer size we tell zlib it can output to. We
@@ -124,27 +124,27 @@ WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
gzipcs->outsize = ZLIB_OUT_SIZE;
if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
pg_fatal("could not initialize compression library: %s", zp->msg);
zp->next_out = gzipcs->outbuf;
zp->avail_out = gzipcs->outsize;
}
gzipcs->zp->next_in = (void *) unconstify(void *, data);
gzipcs->zp->avail_in = dLen;
- DeflateCompressorGzip(AH, cs, false);
+ DeflateCompressorZlib(AH, cs, false);
}
static void
-ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
+ReadDataFromArchiveZlib(ArchiveHandle *AH, CompressorState *cs)
{
z_streamp zp;
char *out;
int res = Z_OK;
size_t cnt;
char *buf;
size_t buflen;
zp = (z_streamp) pg_malloc(sizeof(z_stream));
zp->zalloc = Z_NULL;
zp->zfree = Z_NULL;
@@ -193,36 +193,36 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
}
if (inflateEnd(zp) != Z_OK)
pg_fatal("could not close compression library: %s", zp->msg);
free(buf);
free(out);
free(zp);
}
-/* Public routines that support gzip compressed data I/O */
+/* Public routines that support zlib compressed data I/O */
void
-InitCompressorGzip(CompressorState *cs,
+InitCompressorZlib(CompressorState *cs,
const pg_compress_specification compression_spec)
{
- GzipCompressorState *gzipcs;
+ ZlibCompressorState *gzipcs;
- cs->readData = ReadDataFromArchiveGzip;
- cs->writeData = WriteDataToArchiveGzip;
- cs->end = EndCompressorGzip;
+ cs->readData = ReadDataFromArchiveZlib;
+ cs->writeData = WriteDataToArchiveZlib;
+ cs->end = EndCompressorZlib;
cs->compression_spec = compression_spec;
- gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ gzipcs = (ZlibCompressorState *) pg_malloc0(sizeof(ZlibCompressorState));
cs->private_data = gzipcs;
}
/*----------------------
* Compress File API
*----------------------
*/
static size_t
@@ -370,23 +370,23 @@ InitCompressFileHandleGzip(CompressFileHandle *CFH,
CFH->getc_func = Gzip_getc;
CFH->close_func = Gzip_close;
CFH->eof_func = Gzip_eof;
CFH->get_error_func = Gzip_get_error;
CFH->compression_spec = compression_spec;
CFH->private_data = NULL;
}
#else /* HAVE_LIBZ */
void
-InitCompressorGzip(CompressorState *cs,
+InitCompressorZlib(CompressorState *cs,
const pg_compress_specification compression_spec)
{
pg_fatal("this build does not support compression with %s", "gzip");
}
void
InitCompressFileHandleGzip(CompressFileHandle *CFH,
const pg_compress_specification compression_spec)
{
pg_fatal("this build does not support compression with %s", "gzip");
}
diff --git a/src/bin/pg_dump/compress_gzip.h b/src/bin/pg_dump/compress_gzip.h
index 2392c697b4c..784a45edaae 100644
--- a/src/bin/pg_dump/compress_gzip.h
+++ b/src/bin/pg_dump/compress_gzip.h
@@ -8,17 +8,17 @@
*
* IDENTIFICATION
* src/bin/pg_dump/compress_gzip.h
*
*-------------------------------------------------------------------------
*/
#ifndef _COMPRESS_GZIP_H_
#define _COMPRESS_GZIP_H_
#include "compress_io.h"
-extern void InitCompressorGzip(CompressorState *cs,
+extern void InitCompressorZlib(CompressorState *cs,
const pg_compress_specification compression_spec);
extern void InitCompressFileHandleGzip(CompressFileHandle *CFH,
const pg_compress_specification compression_spec);
#endif /* _COMPRESS_GZIP_H_ */
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ce06f1eac9c..e8277a1e13a 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -119,23 +119,23 @@ AllocateCompressor(const pg_compress_specification compression_spec,
ReadFunc readF, WriteFunc writeF)
{
CompressorState *cs;
cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
cs->readF = readF;
cs->writeF = writeF;
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorGzip(cs, compression_spec);
+ InitCompressorZlib(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
InitCompressorLZ4(cs, compression_spec);
return cs;
}
/*
* Terminate compression library context and flush its buffers.
*/
void
EndCompressor(ArchiveHandle *AH, CompressorState *cs)
--
2.34.1
Attachment: 0002-also-rearrange-the-functions-to-their-original-order.patch (text/x-diff)
From f8529ab684ab2957a775b6add1e6fc94a4a12476 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 28 Feb 2023 13:34:06 -0600
Subject: [PATCH 2/3] +also rearrange the functions to their original order..
This allows comparing like:
git diff --diff-algorithm=minimal -w e9960732a~:../src/bin/pg_dump/compress_io.c ../src/bin/pg_dump/compress_gzip.c
---
src/bin/pg_dump/compress_gzip.c | 95 ++++++++++++++++++---------------
1 file changed, 51 insertions(+), 44 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index dd769750c8f..f3f5e87c9a8 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -24,22 +24,73 @@
* Compressor API
*----------------------
*/
typedef struct ZlibCompressorState
{
z_streamp zp;
void *outbuf;
size_t outsize;
} ZlibCompressorState;
+static void ReadDataFromArchiveZlib(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
+static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
+ bool flush);
+
+/* Public routines that support zlib compressed data I/O */
+void
+InitCompressorZlib(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ ZlibCompressorState *gzipcs;
+
+ cs->readData = ReadDataFromArchiveZlib;
+ cs->writeData = WriteDataToArchiveZlib;
+ cs->end = EndCompressorZlib;
+
+ cs->compression_spec = compression_spec;
+
+ gzipcs = (ZlibCompressorState *) pg_malloc0(sizeof(ZlibCompressorState));
+
+ cs->private_data = gzipcs;
+}
+
+static void
+EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+{
+ ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ if (gzipcs->zp)
+ {
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorZlib(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ }
+
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
/* Private routines that support zlib compressed data I/O */
static void
DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
{
ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
z_streamp zp = gzipcs->zp;
void *out = gzipcs->outbuf;
int res = Z_OK;
while (gzipcs->zp->avail_in != 0 || flush)
{
@@ -67,48 +118,22 @@ DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
cs->writeF(AH, (char *) out, len);
}
zp->next_out = out;
zp->avail_out = gzipcs->outsize;
}
if (res == Z_STREAM_END)
break;
}
}
-static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
-{
- ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
- z_streamp zp;
-
- if (gzipcs->zp)
- {
- zp = gzipcs->zp;
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- pg_free(gzipcs->outbuf);
- pg_free(gzipcs->zp);
- }
-
- pg_free(gzipcs);
- cs->private_data = NULL;
-}
-
static void
WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
z_streamp zp;
if (!gzipcs->zp)
{
zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
zp->zalloc = Z_NULL;
@@ -193,40 +218,22 @@ ReadDataFromArchiveZlib(ArchiveHandle *AH, CompressorState *cs)
ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
}
if (inflateEnd(zp) != Z_OK)
pg_fatal("could not close compression library: %s", zp->msg);
free(buf);
free(out);
free(zp);
}
-/* Public routines that support zlib compressed data I/O */
-void
-InitCompressorZlib(CompressorState *cs,
- const pg_compress_specification compression_spec)
-{
- ZlibCompressorState *gzipcs;
-
- cs->readData = ReadDataFromArchiveZlib;
- cs->writeData = WriteDataToArchiveZlib;
- cs->end = EndCompressorZlib;
-
- cs->compression_spec = compression_spec;
-
- gzipcs = (ZlibCompressorState *) pg_malloc0(sizeof(ZlibCompressorState));
-
- cs->private_data = gzipcs;
-}
-
/*----------------------
* Compress File API
*----------------------
*/
static size_t
Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
size_t ret;
--
2.34.1
Attachment: 0003-pg_dump-call-deflateInit-in-Init-rather-than-in-Writ.patch (text/x-diff)
From d27585f527365bdb32f032e789ed924d6bb2dca5 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 28 Feb 2023 17:06:59 -0600
Subject: [PATCH 3/3] pg_dump: call deflateInit() in Init() rather than in
Write()..
This resolves an issue in e9960732a causing empty LOs to be dumped
incorrectly, resulting in errors during pg_restore.
---
src/bin/pg_dump/compress_gzip.c | 51 +++++++++++++++++----------------
1 file changed, 27 insertions(+), 24 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index f3f5e87c9a8..68f3111b2fe 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -47,31 +47,57 @@ InitCompressorZlib(CompressorState *cs,
ZlibCompressorState *gzipcs;
cs->readData = ReadDataFromArchiveZlib;
cs->writeData = WriteDataToArchiveZlib;
cs->end = EndCompressorZlib;
cs->compression_spec = compression_spec;
gzipcs = (ZlibCompressorState *) pg_malloc0(sizeof(ZlibCompressorState));
cs->private_data = gzipcs;
+
+ if (cs->writeF)
+ {
+ z_streamp zp;
+ zp = gzipcs->zp = (z_streamp) pg_malloc0(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to append a
+ * trailing zero byte to the zlib output.
+ */
+
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ if (deflateInit(gzipcs->zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s",
+ zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+ }
}
static void
EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
{
ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
z_streamp zp;
- if (gzipcs->zp)
+ if (cs->writeF != NULL)
{
zp = gzipcs->zp;
zp->next_in = NULL;
zp->avail_in = 0;
/* Flush any remaining data from zlib buffer */
DeflateCompressorZlib(AH, cs, true);
if (deflateEnd(zp) != Z_OK)
pg_fatal("could not close compression stream: %s", zp->msg);
@@ -123,45 +149,22 @@ DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
if (res == Z_STREAM_END)
break;
}
}
static void
WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
ZlibCompressorState *gzipcs = (ZlibCompressorState *) cs->private_data;
- z_streamp zp;
-
- if (!gzipcs->zp)
- {
- zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * outsize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to
- * append a trailing zero byte to the zlib output.
- */
- gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
- gzipcs->outsize = ZLIB_OUT_SIZE;
-
- if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
- pg_fatal("could not initialize compression library: %s", zp->msg);
-
- zp->next_out = gzipcs->outbuf;
- zp->avail_out = gzipcs->outsize;
- }
gzipcs->zp->next_in = (void *) unconstify(void *, data);
gzipcs->zp->avail_in = dLen;
DeflateCompressorZlib(AH, cs, false);
}
static void
ReadDataFromArchiveZlib(ArchiveHandle *AH, CompressorState *cs)
{
z_streamp zp;
char *out;
--
2.34.1
On Tue, Feb 28, 2023 at 05:58:34PM -0600, Justin Pryzby wrote:
I found that e9960732a broke writing of empty gzip-compressed data,
specifically LOs. pg_dump succeeds, but then the restore fails:
The number of issues you have been reporting here begins to worry
me.. How many of them have you found? Is it right to assume that all
of them have basically 03d02f5 as oldest origin point?
--
Michael
------- Original Message -------
On Wednesday, March 1st, 2023 at 12:58 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
I found that e9960732a broke writing of empty gzip-compressed data,
specifically LOs. pg_dump succeeds, but then the restore fails:
postgres=# SELECT lo_create(1234);
lo_create | 1234
$ time ./src/bin/pg_dump/pg_dump -h /tmp -d postgres -Fc |./src/bin/pg_dump/pg_restore -f /dev/null -v
pg_restore: implied data-only restore
pg_restore: executing BLOB 1234
pg_restore: processing BLOBS
pg_restore: restoring large object with OID 1234
pg_restore: error: could not uncompress data: (null)
Thank you for looking. This was an untested case.
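The underlying zlib behavior is easy to demonstrate outside pg_dump: a deflate stream that is initialized and finalized without any input still produces a small, valid stream, while writing nothing at all leaves bytes that inflate cannot parse. A minimal sketch using Python's stdlib zlib (standing in for the C API pg_dump uses):

```python
import zlib

# Initializing and finalizing a deflate stream with zero input still
# yields a valid, non-empty stream: header + empty final block + checksum.
c = zlib.compressobj(9)
out = c.compress(b"") + c.flush()
assert len(out) > 0
assert zlib.decompress(out) == b""

# Skipping initialization/finalization entirely leaves nothing that
# inflate can parse -- analogous to the failed restore of an empty LO.
try:
    zlib.decompress(b"")
except zlib.error as e:
    print("decompress failed:", e)
```

This is why deferring deflateInit() to the first write call breaks the empty-LO case: End never finalizes a stream that was never started, so the archive contains no valid deflate stream at all.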
The inline patch below fixes it, but you won't be able to apply it
directly, as it's on top of other patches which rename the functions
back to "Zlib" and rearranges the functions to their original order, to
allow running:
git diff --diff-algorithm=minimal -w e9960732a~:./src/bin/pg_dump/compress_io.c ./src/bin/pg_dump/compress_gzip.c
Please find a patch attached that can be applied directly.
The current function order avoids 3 lines of declarations, but it's
obviously pretty useful to be able to run that diff command. I already
argued for not calling the functions "Gzip" on the grounds that the name
was inaccurate.
I have no idea why we are back on the naming issue. I stand by the name
because, in my humble opinion, it helps the code reader. There is a certain
uniformity when the compression_spec.algorithm and the compressor
functions match as the following code sample shows.
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
InitCompressorLZ4(cs, compression_spec);
When the reader wants to see what happens when the PG_COMPRESSION_XXX
is set, they simply have to search for the XXX part. I think that this is
justification enough for the use of the names.
I'd want to create an empty large object in src/test/sql/largeobject.sql
to exercise this during pg_upgrade. But unfortunately that
doesn't use -Fc, so this isn't hit. Empty input is an important enough
test case to justify a TAP test, if there's no better way.
Please find in the attached a test case that exercises this codepath.
Cheers,
//Georgios
Attachments:
Attachment: 0001-Properly-gzip-compress-when-no-data-is-available.patch (text/x-patch)
From 95450f0e7e90f0a1a3cdfc12c760a9520bd2995f Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 1 Mar 2023 12:42:32 +0000
Subject: [PATCH vX] Properly gzip compress when no data is available
When creating dumps with the Compressor API, it is possible to only call the
Allocate and End compressor functions without ever writing any data. The gzip
implementation wrongly assumed that the write function would be called and
deferred the initialization of the internal compression system to the first
write call. The End call would only finalize the internal compression system if
that was ever initialized.
The problem with that approach is that it violated the expectations of the
internal compression system during decompression.
This commit initializes the internal compression system during the Allocate
call, under the condition that a write function was provided by the caller.
Given that decompression does not need to keep track of any state, the
compressor's private_data member is now populated only during compression.
Tests are added to cover this scenario.
Initial patch by Justin Pryzby.
Reported-by: Justin Pryzby
---
src/bin/pg_dump/compress_gzip.c | 118 +++++++++++++++++--------------
src/bin/pg_dump/t/002_pg_dump.pl | 27 ++++++-
2 files changed, 91 insertions(+), 54 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 0af65afeb4..f5d32cf059 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -32,9 +32,48 @@ typedef struct GzipCompressorState
size_t outsize;
} GzipCompressorState;
+
/* Private routines that support gzip compressed data I/O */
static void
-DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+DeflateCompressorInit(CompressorState *cs)
+{
+ GzipCompressorState *gzipcs;
+ z_streamp zp;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We
+ * actually allocate one extra byte because some routines want to
+ * append a trailing zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /*
+ * A level of zero simply copies the input one block at the time. This
+ * is probably not what the user wanted when calling this interface.
+ */
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+
+ /* Keep track of gzipcs */
+ cs->private_data = gzipcs;
+}
+
+static void
+DeflateCompressorData(ArchiveHandle *AH, CompressorState *cs, bool flush)
{
GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
z_streamp zp = gzipcs->zp;
@@ -76,71 +115,44 @@ DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
}
static void
-EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+DeflateCompressorEnd(ArchiveHandle *AH, CompressorState *cs)
{
GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
z_streamp zp;
- if (gzipcs->zp)
- {
- zp = gzipcs->zp;
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorGzip(AH, cs, true);
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorData(AH, cs, true);
- pg_free(gzipcs->outbuf);
- pg_free(gzipcs->zp);
- }
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
pg_free(gzipcs);
cs->private_data = NULL;
}
+static void
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
+{
+ /* If deflation was initialized, finalize it */
+ if (cs->private_data)
+ DeflateCompressorEnd(AH, cs);
+}
+
static void
WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
- z_streamp zp;
-
- if (!gzipcs->zp)
- {
- zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * outsize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to
- * append a trailing zero byte to the zlib output.
- */
- gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
- gzipcs->outsize = ZLIB_OUT_SIZE;
-
- /*
- * A level of zero simply copies the input one block at the time. This
- * is probably not what the user wanted when calling this interface.
- */
- if (cs->compression_spec.level == 0)
- pg_fatal("requested to compress the archive yet no level was specified");
-
- if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
- pg_fatal("could not initialize compression library: %s", zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = gzipcs->outbuf;
- zp->avail_out = gzipcs->outsize;
- }
gzipcs->zp->next_in = (void *) unconstify(void *, data);
gzipcs->zp->avail_in = dLen;
- DeflateCompressorGzip(AH, cs, false);
+ DeflateCompressorData(AH, cs, false);
}
static void
@@ -214,17 +226,19 @@ void
InitCompressorGzip(CompressorState *cs,
const pg_compress_specification compression_spec)
{
- GzipCompressorState *gzipcs;
-
cs->readData = ReadDataFromArchiveGzip;
cs->writeData = WriteDataToArchiveGzip;
cs->end = EndCompressorGzip;
cs->compression_spec = compression_spec;
- gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
-
- cs->private_data = gzipcs;
+ /*
+ * If the caller has defined a write function, prepare the necessary state.
+ * Avoid initializing during the first write call, because End may be called
+ * without ever writing any data.
+ */
+ if (cs->writeF)
+ DeflateCompressorInit(cs);
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 72b19ee6cd..7b5a6e190c 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -1194,6 +1194,29 @@ my %tests = (
},
},
+ 'LO create (with no data)' => {
+ create_sql =>
+ 'SELECT pg_catalog.lo_create(0);',
+ regexp => qr/^
+ \QSELECT pg_catalog.lo_open\E \('\d+',\ \d+\);\n
+ \QSELECT pg_catalog.lo_close(0);\E
+ /xm,
+ like => {
+ %full_runs,
+ column_inserts => 1,
+ data_only => 1,
+ inserts => 1,
+ section_data => 1,
+ test_schema_plus_large_objects => 1,
+ },
+ unlike => {
+ binary_upgrade => 1,
+ no_large_objects => 1,
+ schema_only => 1,
+ section_pre_data => 1,
+ },
+ },
+
'COMMENT ON DATABASE postgres' => {
regexp => qr/^COMMENT ON DATABASE postgres IS .+;/m,
@@ -4250,8 +4273,8 @@ foreach my $run (sort keys %pgdump_runs)
# Skip command-level tests for gzip if there is no support for it.
if ($pgdump_runs{$run}->{compile_option} &&
- ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
- ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
{
note "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
next;
--
2.34.1
On 3/1/23 08:24, Michael Paquier wrote:
On Tue, Feb 28, 2023 at 05:58:34PM -0600, Justin Pryzby wrote:
I found that e9960732a broke writing of empty gzip-compressed data,
specifically LOs. pg_dump succeeds, but then the restore fails:
The number of issues you have been reporting here begins to worry
me.. How many of them have you found? Is it right to assume that all
of them have basically 03d02f5 as oldest origin point?
AFAICS a lot of the issues are more a discussion about wording in a
couple places, whether it's nicer to do A or B, name the functions
differently or what.
I'm aware of three genuine issues that I intend to fix shortly:
1) incorrect "if" condition in a TAP test
2) failure when compressing empty LO (which we had no test for)
3) change in handling "compression level = 0" (which I believe should be
made to behave like before)
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 3/1/23 14:39, gkokolatos@pm.me wrote:
------- Original Message -------
On Wednesday, March 1st, 2023 at 12:58 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
I found that e9960732a broke writing of empty gzip-compressed data,
specifically LOs. pg_dump succeeds, but then the restore fails:
postgres=# SELECT lo_create(1234);
lo_create | 1234
$ time ./src/bin/pg_dump/pg_dump -h /tmp -d postgres -Fc |./src/bin/pg_dump/pg_restore -f /dev/null -v
pg_restore: implied data-only restore
pg_restore: executing BLOB 1234
pg_restore: processing BLOBS
pg_restore: restoring large object with OID 1234
pg_restore: error: could not uncompress data: (null)
Thank you for looking. This was an untested case.
Yeah :-(
The inline patch below fixes it, but you won't be able to apply it
directly, as it's on top of other patches which rename the functions
back to "Zlib" and rearranges the functions to their original order, to
allow running:
git diff --diff-algorithm=minimal -w e9960732a~:./src/bin/pg_dump/compress_io.c ./src/bin/pg_dump/compress_gzip.c
Please find a patch attached that can be applied directly.
The current function order avoids 3 lines of declarations, but it's
obviously pretty useful to be able to run that diff command. I already
argued for not calling the functions "Gzip" on the grounds that the name
was inaccurate.
I have no idea why we are back on the naming issue. I stand by the name
because, in my humble opinion, it helps the code reader. There is a certain
uniformity when the compression_spec.algorithm and the compressor
functions match as the following code sample shows.if (compression_spec.algorithm == PG_COMPRESSION_NONE)
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
InitCompressorLZ4(cs, compression_spec);
When the reader wants to see what happens when the PG_COMPRESSION_XXX
is set, they simply have to search for the XXX part. I think that this is
justification enough for the use of the names.
I don't recall the previous discussion about the naming, but I'm not
sure why it would be inaccurate. We call it 'gzip' pretty much
everywhere, and I agree with Georgios that it helps to make this
consistent with the PG_COMPRESSION_ stuff.
The one thing that concerned me while reviewing it earlier was that it
might make the backpatching harder. But that's mostly irrelevant due to
all the other changes I think.
I'd want to create an empty large object in src/test/sql/largeobject.sql
to exercise this during pg_upgrade. But unfortunately that
doesn't use -Fc, so this isn't hit. Empty input is an important enough
test case to justify a TAP test, if there's no better way.
Please find in the attached a test case that exercises this codepath.
Thanks. That seems correct to me, but I find it somewhat confusing,
because we now have
DeflateCompressorInit vs. InitCompressorGzip
DeflateCompressorEnd vs. EndCompressorGzip
DeflateCompressorData - The name doesn't really say what it does (would
be better to have a verb in there, I think).
I wonder if we can make this somehow clearer?
Also, InitCompressorGzip says this:
/*
* If the caller has defined a write function, prepare the necessary
* state. Avoid initializing during the first write call, because End
* may be called without ever writing any data.
*/
if (cs->writeF)
DeflateCompressorInit(cs);
Does it actually make sense to not have writeF defined in some cases?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2/27/23 15:56, gkokolatos@pm.me wrote:
------- Original Message -------
On Saturday, February 25th, 2023 at 3:05 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.
Please find some comments on the rest of the fixes patch that Tomas has not
commented on.
can be compressed with the <application>gzip</application> or
- <application>lz4</application>tool.
+ <application>lz4</application> tools.
+1
The compression method can be set to <literal>gzip</literal> or
- <literal>lz4</literal> or <literal>none</literal> for no compression.
+ <literal>lz4</literal>, or <literal>none</literal> for no compression.
I am not a native English speaker. Yet I think that if one adds commas
in one of the options, then one should add commas to all the options.
Namely, the above is missing a comma between gzip and lz4. However I
think that not having any commas still works grammatically and
syntactically.
I pushed a fix with most of these wording changes. As for this comma, I
believe the correct style is
a, b, or c
At least that's what the other places in the pg_dump.sgml file do.
- ($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
- ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4))
+ (($pgdump_runs{$run}->{compile_option} eq 'gzip' && !$supports_gzip) ||
+ ($pgdump_runs{$run}->{compile_option} eq 'lz4' && !$supports_lz4)))
Pushed a fix for this too.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2/25/23 15:05, Justin Pryzby wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.
One more - WriteDataToArchiveGzip() says:
+ if (cs->compression_spec.level == 0)
+ pg_fatal("requested to compress the archive yet no level was specified");
That was added at e9960732a.
But if you specify gzip:0, the compression level is already enforced by
validate_compress_specification(), before hitting gzip.c:
| pg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)
5e73a6048 intended that to work as before, and you *can* specify -Z0:
The change is backward-compatible, hence specifying only an integer
leads to no compression for a level of 0 and gzip compression when the
level is greater than 0.
$ time ./src/bin/pg_dump/pg_dump -h /tmp regression -t int8_tbl -Fp --compress 0 |file -
/dev/stdin: ASCII text
FWIW I agree we should make this backwards-compatible - accept "0" and
treat it as no compression.
Georgios, can you prepare a patch doing that?
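For reference, the pre-15 behavior being restored can be sketched as a mapping from the -Z value to an algorithm/level pair. This is an illustrative sketch only; the function name and return shape are hypothetical, not pg_dump's actual option-parsing code:

```python
def parse_compress_option(value: str):
    """Hypothetical sketch: a bare integer selects gzip compression,
    except 0, which historically means no compression at all."""
    if value.isdigit():
        level = int(value)
        return ("none", 0) if level == 0 else ("gzip", level)
    # Otherwise the value names an algorithm, optionally with ":level";
    # -1 stands for "algorithm default".
    algo, _, lvl = value.partition(":")
    return (algo, int(lvl) if lvl else -1)

assert parse_compress_option("0") == ("none", 0)     # backward-compatible
assert parse_compress_option("5") == ("gzip", 5)
assert parse_compress_option("lz4:3") == ("lz4", 3)
```

The point of contention is only the first branch: "-Z 0" must keep meaning "no compression" rather than being rejected as an invalid gzip level.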
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2/27/23 05:49, Justin Pryzby wrote:
On Sat, Feb 25, 2023 at 08:05:53AM -0600, Justin Pryzby wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.One more - WriteDataToArchiveGzip() says:
One more again.
The LZ4 path is using non-streaming mode, which compresses each block
without persistent state, giving poor compression for -Fc compared with
-Fp. If the data is highly compressible, the difference can be orders
of magnitude.
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fp |wc -c
12351763
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fc |wc -c
21890708
That's not true for gzip:
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z gzip -Fc |wc -c
2118869
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z gzip -Fp |wc -c
2115832The function ought to at least use streaming mode, so each block/row
isn't compressed in isolation. 003 is a simple patch to use
streaming mode, which improves the -Fc case:
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fc |wc -c
15178283
However, that still flushes the compression buffer, writing a block
header, for every row. With a single-column table, pg_dump -Fc -Z lz4
still outputs ~10% *more* data than with no compression at all. And
that's for compressible data.
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Fc -Z lz4 |wc -c
12890296
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Fc -Z none |wc -c
11890296
I think this should use the LZ4F API with frames, which are buffered to
avoid outputting a header for every single row. The LZ4F format isn't
compatible with the LZ4 format, so (unlike changing to the streaming
API) that's not something we can change in a bugfix release. I consider
this an Opened Item.With the LZ4F API in 004, -Fp and -Fc are essentially the same size
(like gzip). (Oh, and the output is three times smaller, too.)$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z lz4 -Fp |wc -c
4155448
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z lz4 -Fc |wc -c
4156548
Thanks. Those are definitely interesting improvements/optimizations!
I suggest we track them as a separate patch series - please add them to
the CF app (I guess you'll have to add them to 2023-07 at this point,
but we can get them in, I think).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Mar 01, 2023 at 01:39:14PM +0000, gkokolatos@pm.me wrote:
On Wednesday, March 1st, 2023 at 12:58 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
The current function order avoids 3 lines of declarations, but it's
obviously pretty useful to be able to run that diff command. I already
argued for not calling the functions "Gzip" on the grounds that the name
was inaccurate.

I have no idea why we are back on the naming issue. I stand by the name
because in my humble opinion it helps the code reader. There is a certain
uniformity when the compression_spec.algorithm and the compressor
functions match, as the following code sample shows.
I mentioned that it's because this allows usefully running "diff"
against the previous commits.
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
InitCompressorNone(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressorGzip(cs, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
InitCompressorLZ4(cs, compression_spec);

When the reader wants to see what happens when PG_COMPRESSION_XXX
is set, they simply have to search for the XXX part. I think that this is
justification enough for the use of the names.
You're right about that.
But (with the exception of InitCompressorGzip), I'm referring to the
naming of internal functions, static to gzip.c, so renaming can't be
said to cause a loss of clarity.
I'd want to create an empty large object in src/test/sql/largeobject.sql
so this is exercised during pg_upgrade. But unfortunately that
doesn't use -Fc, so this isn't hit. Empty input is an important enough
test case to justify a tap test, if there's no better way.

Please find in the attached a test case that exercises this codepath.
Thanks for writing it.
This patch could be an opportunity to improve the "diff" output, without
renaming anything.
The old order of functions was:
-InitCompressorZlib
-EndCompressorZlib
-DeflateCompressorZlib
-WriteDataToArchiveZlib
-ReadDataFromArchiveZlib
If you put DeflateCompressorEnd immediately after DeflateCompressorInit,
diff works nicely. You'll have to add at least one declaration, which
seems very worth it.
--
Justin
------- Original Message -------
On Wednesday, March 1st, 2023 at 5:20 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 2/25/23 15:05, Justin Pryzby wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.

One more - WriteDataToArchiveGzip() says:

+ if (cs->compression_spec.level == 0)
+     pg_fatal("requested to compress the archive yet no level was specified");

That was added at e9960732a.
But if you specify gzip:0, the compression level is already enforced by
validate_compress_specification(), before hitting gzip.c:

| pg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)

5e73a6048 intended that to work as before, and you can specify -Z0:

The change is backward-compatible, hence specifying only an integer
leads to no compression for a level of 0 and gzip compression when the
level is greater than 0.

$ time ./src/bin/pg_dump/pg_dump -h /tmp regression -t int8_tbl -Fp --compress 0 |file -
/dev/stdin: ASCII text

FWIW I agree we should make this backwards-compatible - accept "0" and
treat it as no compression.

Georgios, can you prepare a patch doing that?
Please find a patch attached. However I am a bit at a loss: the backwards
compatible behaviour has not changed. Passing -Z0/--compress=0 does produce
non-compressed output. So I am not really certain what broke and
needs fixing.
What commit 5e73a6048 did fail to do is test the backwards compatible
behaviour. The attached amends it.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
0001-Add-test-for-backwards-compatible-Z0-option-in-pg.patch (text/x-patch)
From 99c2da94ecbeacf997270dd26fc5c0a63ffcedd4 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 2 Mar 2023 16:10:03 +0000
Subject: [PATCH vX] Add test for backwards compatible -Z0 option in pg_dump
Commit 5e73a6048 expanded pg_dump with the ability to use compression
specifications: a commonly shared piece of code which lets the user
control the method, level, and other details of the desired compression
in an extended way.
Prior to this commit, pg_dump could only accept an integer for the
-Z/--compress option. An integer value of 0 had the special meaning of
no compression. Commit 5e73a6048 respected and maintained this behaviour
for backwards compatibility.
However no tests covered this scenario. The current commit adds coverage for
this case.
---
src/bin/pg_dump/t/002_pg_dump.pl | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 7b5a6e190c..ec7aaab884 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -76,6 +76,19 @@ my %pgdump_runs = (
],
},
+ # Verify that the backwards compatible option -Z0 produces
+ # non compressed output
+ compression_none_plain => {
+ test_key => 'compression',
+ # Enforce this test when compile option is available
+ compile_option => 'gzip',
+ dump_cmd => [
+ 'pg_dump', '--format=plain',
+ '-Z0', "--file=$tempdir/compression_none_plain.sql",
+ 'postgres',
+ ],
+ },
+
# Do not use --no-sync to give test coverage for data sync.
compression_gzip_custom => {
test_key => 'compression',
--
2.34.1
On Wed, Mar 01, 2023 at 05:20:05PM +0100, Tomas Vondra wrote:
On 2/25/23 15:05, Justin Pryzby wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.

One more - WriteDataToArchiveGzip() says:

+ if (cs->compression_spec.level == 0)
+     pg_fatal("requested to compress the archive yet no level was specified");

That was added at e9960732a.
But if you specify gzip:0, the compression level is already enforced by
validate_compress_specification(), before hitting gzip.c:

| pg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)

5e73a6048 intended that to work as before, and you *can* specify -Z0:

The change is backward-compatible, hence specifying only an integer
leads to no compression for a level of 0 and gzip compression when the
level is greater than 0.

$ time ./src/bin/pg_dump/pg_dump -h /tmp regression -t int8_tbl -Fp --compress 0 |file -
/dev/stdin: ASCII text

FWIW I agree we should make this backwards-compatible - accept "0" and
treat it as no compression.

Georgios, can you prepare a patch doing that?
I think maybe Tomas misunderstood. What I was trying to say is that -Z
0 *is* accepted to mean no compression. This part wasn't quoted, but I
said:
Right now, I think that pg_fatal in gzip.c is dead code - that was first
added in the patch version sent on 21 Dec 2022.
If you run the diff command that I've been talking about, you'll see
that InitCompressorZlib was almost unchanged - e9960732 is essentially a
refactoring. I don't think it's desirable to add a pg_fatal() in a
function that's otherwise nearly-unchanged. The fact that it's
nearly-unchanged is a good thing: it simplifies reading of what changed.
If someone wants to add a pg_fatal() in that code path, it'd be better
done in its own commit, with a separate message explaining the change.
If you insist on changing anything here, you might add an assertion (as
you said earlier) along with a comment like
/* -Z 0 uses the "None" compressor rather than zlib with no compression */
--
Justin
On 3/2/23 18:18, Justin Pryzby wrote:
On Wed, Mar 01, 2023 at 05:20:05PM +0100, Tomas Vondra wrote:
On 2/25/23 15:05, Justin Pryzby wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.

One more - WriteDataToArchiveGzip() says:

+ if (cs->compression_spec.level == 0)
+     pg_fatal("requested to compress the archive yet no level was specified");

That was added at e9960732a.
But if you specify gzip:0, the compression level is already enforced by
validate_compress_specification(), before hitting gzip.c:

| pg_dump: error: invalid compression specification: compression algorithm "gzip" expects a compression level between 1 and 9 (default at -1)

5e73a6048 intended that to work as before, and you *can* specify -Z0:

The change is backward-compatible, hence specifying only an integer
leads to no compression for a level of 0 and gzip compression when the
level is greater than 0.

$ time ./src/bin/pg_dump/pg_dump -h /tmp regression -t int8_tbl -Fp --compress 0 |file -
/dev/stdin: ASCII text

FWIW I agree we should make this backwards-compatible - accept "0" and
treat it as no compression.

Georgios, can you prepare a patch doing that?
I think maybe Tomas misunderstood. What I was trying to say is that -Z
0 *is* accepted to mean no compression. This part wasn't quoted, but I
said:
Ah, I see. Well, I also tried, but with "-Z gzip:0" (and not -Z 0), and
that does fail:
error: invalid compression specification: compression algorithm "gzip"
expects a compression level between 1 and 9 (default at -1)
It's a bit weird these two cases behave differently, when both translate
to the same default compression method (gzip).
Right now, I think that pg_fatal in gzip.c is dead code - that was first
added in the patch version sent on 21 Dec 2022.

If you run the diff command that I've been talking about, you'll see
that InitCompressorZlib was almost unchanged - e9960732 is essentially a
refactoring. I don't think it's desirable to add a pg_fatal() in a
function that's otherwise nearly-unchanged. The fact that it's
nearly-unchanged is a good thing: it simplifies reading of what changed.
If someone wants to add a pg_fatal() in that code path, it'd be better
done in its own commit, with a separate message explaining the change.

If you insist on changing anything here, you might add an assertion (as
you said earlier) along with a comment like

/* -Z 0 uses the "None" compressor rather than zlib with no compression */
Yeah, a comment would be helpful.
Also, after thinking about it a bit more, maybe having the unreachable
pg_fatal() is not a good thing, as it will just confuse people (I'd
certainly assume having such a check means there's some way it might
trigger). Maybe an assert would be better?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Mar 01, 2023 at 04:52:49PM +0100, Tomas Vondra wrote:
Thanks. That seems correct to me, but I find it somewhat confusing,
because we now have

DeflateCompressorInit vs. InitCompressorGzip
DeflateCompressorEnd vs. EndCompressorGzip
DeflateCompressorData - The name doesn't really say what it does (would
be better to have a verb in there, I think).

I wonder if we can make this somehow clearer?
To move things along, I updated Georgios' patch:
Rename DeflateCompressorData() to DeflateCompressorCommon();
Rearrange functions to their original order allowing a cleaner diff to the prior code;
Change pg_fatal() to an assertion+comment;
Update the commit message and fix a few typos;
Also, InitCompressorGzip says this:
/*
* If the caller has defined a write function, prepare the necessary
* state. Avoid initializing during the first write call, because End
* may be called without ever writing any data.
*/
if (cs->writeF)
    DeflateCompressorInit(cs);

Does it actually make sense to not have writeF defined in some cases?
InitCompressor is being called for either reading or writing, either of
which could be null:
src/bin/pg_dump/pg_backup_custom.c: ctx->cs = AllocateCompressor(AH->compression_spec,
src/bin/pg_dump/pg_backup_custom.c- NULL,
src/bin/pg_dump/pg_backup_custom.c- _CustomWriteFunc);
--
src/bin/pg_dump/pg_backup_custom.c: cs = AllocateCompressor(AH->compression_spec,
src/bin/pg_dump/pg_backup_custom.c- _CustomReadFunc, NULL);
It's confusing that the comment says "Avoid initializing...". What it
really means is "Initialize eagerly...". But that makes more sense in
the context of the commit message for this bugfix than in a comment. So
I changed that too.
+ /* If deflation was initialized, finalize it */
+ if (cs->private_data)
+ DeflateCompressorEnd(AH, cs);
Maybe it'd be more clear if this used "if (cs->writeF)", like in the
init function ?
--
Justin
Attachments:
0001-pg_dump-fix-gzip-compression-of-empty-data.patch (text/x-diff; charset=us-ascii)
From 5c027aa86e8591db140093c48a58aafba7a6c28c Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Wed, 1 Mar 2023 12:42:32 +0000
Subject: [PATCH] pg_dump: fix gzip compression of empty data
When creating dumps with the Compressor API, it is possible to only call
the Allocate and End compressor functions without ever writing any data.
Since e9960732a, the gzip implementation wrongly assumed that the write
function would always be called and deferred the initialization of the
internal compression system until the first write call.
The problem with that approach is that the End call would not finalize
the internal compression system if it hadn't been initialized.
This commit initializes the internal compression system during the
Allocate call, whenever a write function was provided by the caller.
Given that decompression does not need to keep track of any state, the
compressor's private_data member is now populated only during
compression.
In passing, rearrange the functions to their original order, to allow
usefully comparing with the previous code, like:
git diff --diff-algorithm=minimal -w e9960732a~:src/bin/pg_dump/compress_io.c src/bin/pg_dump/compress_gzip.c
Also replace an unreachable pg_fatal() with an assert+comment. I
(Justin) argued that the new fatal shouldn't have been introduced in a
refactoring commit, so this is a compromise.
Report and initial patch by Justin Pryzby, test case by Georgios
Kokolatos.
https://www.postgresql.org/message-id/20230228235834.GC30529%40telsasoft.com
---
src/bin/pg_dump/compress_gzip.c | 137 ++++++++++++++++++-------------
src/bin/pg_dump/t/002_pg_dump.pl | 23 ++++++
2 files changed, 101 insertions(+), 59 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 0af65afeb4e..3c9ac55c266 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -32,9 +32,75 @@ typedef struct GzipCompressorState
size_t outsize;
} GzipCompressorState;
+
/* Private routines that support gzip compressed data I/O */
+static void DeflateCompressorInit(CompressorState *cs);
+static void DeflateCompressorEnd(ArchiveHandle *AH, CompressorState *cs);
+static void DeflateCompressorCommon(ArchiveHandle *AH, CompressorState *cs,
+ bool flush);
+static void EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs);
+
static void
-DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
+DeflateCompressorInit(CompressorState *cs)
+{
+ GzipCompressorState *gzipcs;
+ z_streamp zp;
+
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ zp->zalloc = Z_NULL;
+ zp->zfree = Z_NULL;
+ zp->opaque = Z_NULL;
+
+ /*
+ * outsize is the buffer size we tell zlib it can output to. We actually
+ * allocate one extra byte because some routines want to append a trailing
+ * zero byte to the zlib output.
+ */
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
+
+ /* -Z 0 uses the "None" compressor -- not zlib with no compression */
+ Assert(cs->compression_spec.level != 0);
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
+
+ /* Just be paranoid - maybe End is called after Start, with no Write */
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+
+ /* Keep track of gzipcs */
+ cs->private_data = gzipcs;
+}
+
+static void
+DeflateCompressorEnd(ArchiveHandle *AH, CompressorState *cs)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+
+ zp = gzipcs->zp;
+ zp->next_in = NULL;
+ zp->avail_in = 0;
+
+ /* Flush any remaining data from zlib buffer */
+ DeflateCompressorCommon(AH, cs, true);
+
+ if (deflateEnd(zp) != Z_OK)
+ pg_fatal("could not close compression stream: %s", zp->msg);
+
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ pg_free(gzipcs);
+ cs->private_data = NULL;
+}
+
+static void
+DeflateCompressorCommon(ArchiveHandle *AH, CompressorState *cs, bool flush)
{
GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
z_streamp zp = gzipcs->zp;
@@ -78,27 +144,9 @@ DeflateCompressorGzip(ArchiveHandle *AH, CompressorState *cs, bool flush)
static void
EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
{
- GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
- z_streamp zp;
-
- if (gzipcs->zp)
- {
- zp = gzipcs->zp;
- zp->next_in = NULL;
- zp->avail_in = 0;
-
- /* Flush any remaining data from zlib buffer */
- DeflateCompressorGzip(AH, cs, true);
-
- if (deflateEnd(zp) != Z_OK)
- pg_fatal("could not close compression stream: %s", zp->msg);
-
- pg_free(gzipcs->outbuf);
- pg_free(gzipcs->zp);
- }
-
- pg_free(gzipcs);
- cs->private_data = NULL;
+ /* If deflation was initialized, finalize it */
+ if (cs->private_data)
+ DeflateCompressorEnd(AH, cs);
}
static void
@@ -106,41 +154,10 @@ WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
- z_streamp zp;
-
- if (!gzipcs->zp)
- {
- zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
- zp->zalloc = Z_NULL;
- zp->zfree = Z_NULL;
- zp->opaque = Z_NULL;
-
- /*
- * outsize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to
- * append a trailing zero byte to the zlib output.
- */
- gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
- gzipcs->outsize = ZLIB_OUT_SIZE;
-
- /*
- * A level of zero simply copies the input one block at the time. This
- * is probably not what the user wanted when calling this interface.
- */
- if (cs->compression_spec.level == 0)
- pg_fatal("requested to compress the archive yet no level was specified");
-
- if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
- pg_fatal("could not initialize compression library: %s", zp->msg);
-
- /* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = gzipcs->outbuf;
- zp->avail_out = gzipcs->outsize;
- }
gzipcs->zp->next_in = (void *) unconstify(void *, data);
gzipcs->zp->avail_in = dLen;
- DeflateCompressorGzip(AH, cs, false);
+ DeflateCompressorCommon(AH, cs, false);
}
static void
@@ -214,17 +231,19 @@ void
InitCompressorGzip(CompressorState *cs,
const pg_compress_specification compression_spec)
{
- GzipCompressorState *gzipcs;
-
cs->readData = ReadDataFromArchiveGzip;
cs->writeData = WriteDataToArchiveGzip;
cs->end = EndCompressorGzip;
cs->compression_spec = compression_spec;
- gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
-
- cs->private_data = gzipcs;
+ /*
+ * If the caller has defined a write function, prepare the necessary
+ * state. Note that if the data is empty, End may be called immediately
+ * after Init, without ever calling Write.
+ */
+ if (cs->writeF)
+ DeflateCompressorInit(cs);
}
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 187e4b8d07d..14cd0d2d503 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -1194,6 +1194,29 @@ my %tests = (
},
},
+ 'LO create (with no data)' => {
+ create_sql =>
+ 'SELECT pg_catalog.lo_create(0);',
+ regexp => qr/^
+ \QSELECT pg_catalog.lo_open\E \('\d+',\ \d+\);\n
+ \QSELECT pg_catalog.lo_close(0);\E
+ /xm,
+ like => {
+ %full_runs,
+ column_inserts => 1,
+ data_only => 1,
+ inserts => 1,
+ section_data => 1,
+ test_schema_plus_large_objects => 1,
+ },
+ unlike => {
+ binary_upgrade => 1,
+ no_large_objects => 1,
+ schema_only => 1,
+ section_pre_data => 1,
+ },
+ },
+
'COMMENT ON DATABASE postgres' => {
regexp => qr/^COMMENT ON DATABASE postgres IS .+;/m,
--
2.34.1
On Wed, Mar 01, 2023 at 05:39:54PM +0100, Tomas Vondra wrote:
On 2/27/23 05:49, Justin Pryzby wrote:
On Sat, Feb 25, 2023 at 08:05:53AM -0600, Justin Pryzby wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.

One more - WriteDataToArchiveGzip() says:

One more again.

The LZ4 path is using non-streaming mode, which compresses each block
without persistent state, giving poor compression for -Fc compared with
-Fp. If the data is highly compressible, the difference can be orders
of magnitude.

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fp |wc -c
12351763
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fc |wc -c
21890708

That's not true for gzip:

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z gzip -Fc |wc -c
2118869
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z gzip -Fp |wc -c
2115832

The function ought to at least use streaming mode, so each block/row
isn't compressed in isolation. 003 is a simple patch to use
streaming mode, which improves the -Fc case:

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fc |wc -c
15178283

However, that still flushes the compression buffer, writing a block
header, for every row. With a single-column table, pg_dump -Fc -Z lz4
still outputs ~10% *more* data than with no compression at all. And
that's for compressible data.

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Fc -Z lz4 |wc -c
12890296
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Fc -Z none |wc -c
11890296

I think this should use the LZ4F API with frames, which are buffered to
avoid outputting a header for every single row. The LZ4F format isn't
compatible with the LZ4 format, so (unlike changing to the streaming
API) that's not something we can change in a bugfix release. I consider
this an Open Item.

With the LZ4F API in 004, -Fp and -Fc are essentially the same size
(like gzip). (Oh, and the output is three times smaller, too.)

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z lz4 -Fp |wc -c
4155448
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z lz4 -Fc |wc -c
4156548

Thanks. Those are definitely interesting improvements/optimizations!
I suggest we track them as a separate patch series - please add them to
the CF app (I guess you'll have to add them to 2023-07 at this point,
but we can get them in, I think).
Thanks for looking. I'm not sure if I'm the best person to write/submit
the patch to implement that for LZ4. Georgios, would you want to take
on this change?

I think that needs to be changed for v16, since 1) LZ4F works so much
better like this, and 2) we can't change it later without breaking
compatibility of the dumpfiles by changing the header with another name
other than "lz4". Also, I imagine we'd want to continue supporting the
ability to *restore* a dumpfile using the old (current) format, which
would be untestable code unless we also preserved the ability to write
it somehow (like -Z lz4-old).
One issue is that LZ4F_createCompressionContext() and
LZ4F_compressBegin() ought to be called in InitCompressorLZ4(). It
seems like it might *need* to be called there to avoid exactly the kind
of issue that I reported with empty LOs with gzip. But
InitCompressorLZ4() isn't currently passed the ArchiveHandle, so can't
write the header. And LZ4CompressorState has a simple char *buf, and
not a more elaborate data structure like zlib's. You could work around
that by also storing the "len" of the existing buffer, and flushing
it in EndCompressorLZ4(), but that adds needless complexity to the Write
and End functions. Maybe the Init function should be passed the AH.
--
Justin
On 3/9/23 17:15, Justin Pryzby wrote:
On Wed, Mar 01, 2023 at 05:39:54PM +0100, Tomas Vondra wrote:
On 2/27/23 05:49, Justin Pryzby wrote:
On Sat, Feb 25, 2023 at 08:05:53AM -0600, Justin Pryzby wrote:
On Fri, Feb 24, 2023 at 11:02:14PM -0600, Justin Pryzby wrote:
I have some fixes (attached) and questions while polishing the patch for
zstd compression. The fixes are small and could be integrated with the
patch for zstd, but could be applied independently.

One more - WriteDataToArchiveGzip() says:

One more again.

The LZ4 path is using non-streaming mode, which compresses each block
without persistent state, giving poor compression for -Fc compared with
-Fp. If the data is highly compressible, the difference can be orders
of magnitude.

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fp |wc -c
12351763
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fc |wc -c
21890708

That's not true for gzip:

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z gzip -Fc |wc -c
2118869
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z gzip -Fp |wc -c
2115832

The function ought to at least use streaming mode, so each block/row
isn't compressed in isolation. 003 is a simple patch to use
streaming mode, which improves the -Fc case:

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -Z lz4 -Fc |wc -c
15178283

However, that still flushes the compression buffer, writing a block
header, for every row. With a single-column table, pg_dump -Fc -Z lz4
still outputs ~10% *more* data than with no compression at all. And
that's for compressible data.

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Fc -Z lz4 |wc -c
12890296
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Fc -Z none |wc -c
11890296

I think this should use the LZ4F API with frames, which are buffered to
avoid outputting a header for every single row. The LZ4F format isn't
compatible with the LZ4 format, so (unlike changing to the streaming
API) that's not something we can change in a bugfix release. I consider
this an Open Item.

With the LZ4F API in 004, -Fp and -Fc are essentially the same size
(like gzip). (Oh, and the output is three times smaller, too.)

$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z lz4 -Fp |wc -c
4155448
$ ./src/bin/pg_dump/pg_dump -h /tmp postgres -t t1 -Z lz4 -Fc |wc -c
4156548

Thanks. Those are definitely interesting improvements/optimizations!
I suggest we track them as a separate patch series - please add them to
the CF app (I guess you'll have to add them to 2023-07 at this point,
but we can get them in, I think).

Thanks for looking. I'm not sure if I'm the best person to write/submit
the patch to implement that for LZ4. Georgios, would you want to take
on this change?

I think that needs to be changed for v16, since 1) LZ4F works so much
better like this, and 2) we can't change it later without breaking
compatibility of the dumpfiles by changing the header with another name
other than "lz4". Also, I imagine we'd want to continue supporting the
ability to *restore* a dumpfile using the old (current) format, which
would be untestable code unless we also preserved the ability to write
it somehow (like -Z lz4-old).
I'm a bit confused about the lz4 vs. lz4f stuff, TBH. If we switch to
lz4f, doesn't that mean it (e.g. restore) won't work on systems that
only have an older lz4 version? What would/should happen if we take a backup
compressed with lz4f, and then try restoring it on an older system where
lz4 does not support lz4f?
Maybe if lz4f format is incompatible with regular lz4, we should treat
it as a separate compression method 'lz4f'?
I'm mostly afk until the end of the week, but I tried searching for lz4f
info - the results are not particularly enlightening, unfortunately.
AFAICS this only applies to lz4f stuff. Or would the streaming mode be a
breaking change too?
One issue is that LZ4F_createCompressionContext() and
LZ4F_compressBegin() ought to be called in InitCompressorLZ4(). It
seems like it might *need* to be called there to avoid exactly the kind
of issue that I reported with empty LOs with gzip. But
InitCompressorLZ4() isn't currently passed the ArchiveHandle, so can't
write the header. And LZ4CompressorState has a simple char *buf, and
not a more elaborate data structure like zlib's. You could work around
that by also storing the "len" of the existing buffer, and flushing
it in EndCompressorLZ4(), but that adds needless complexity to the Write
and End functions. Maybe the Init function should be passed the AH.
Not sure, but looking at GzipCompressorState I see the only extra thing
it has (compared to LZ4CompressorState) is "z_streamp". I can't
experiment with this until the end of this week, so perhaps that's not
workable, but wouldn't it be better to add a similar field into
LZ4CompressorState? Passing AH to the init function seems like a
violation of abstraction.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Mar 09, 2023 at 06:58:20PM +0100, Tomas Vondra wrote:
I'm a bit confused about the lz4 vs. lz4f stuff, TBH. If we switch to
lz4f, doesn't that mean it (e.g. restore) won't work on systems that
only have an older lz4 version? What would/should happen if we take a backup
compressed with lz4f, and then try restoring it on an older system where
lz4 does not support lz4f?
You seem to be thinking about LZ4F as a weird, new innovation I'm
experimenting with, but compress_lz4.c already uses LZ4F for its "file"
API. LZ4F is also what's written by the lz4 CLI tool, and I found that
LZ4F has been included in the library for ~8 years:
https://github.com/lz4/lz4/releases?page=2
r126 Dec 24, 2014
New : lz4frame API is now integrated into liblz4
Maybe if lz4f format is incompatible with regular lz4, we should treat
it as a separate compression method 'lz4f'?

I'm mostly afk until the end of the week, but I tried searching for lz4f
info - the results are not particularly enlightening, unfortunately.

AFAICS this only applies to lz4f stuff. Or would the streaming mode be a
breaking change too?
Streaming mode outputs the same format as the existing code, but gives
better compression. We could (theoretically) change it in a bugfix
release, and old output would still be restorable (I think new output
would even be restorable with the old versions of pg_restore).
But that's not true for LZ4F. The benefit there is that it avoids
outputting a separate block for each row. That's essential for narrow
tables, for which the block header currently being written has an
overhead several times larger than the data.
--
Justin
On Fri, Mar 10, 2023 at 07:05:49AM -0600, Justin Pryzby wrote:
On Thu, Mar 09, 2023 at 06:58:20PM +0100, Tomas Vondra wrote:
I'm a bit confused about the lz4 vs. lz4f stuff, TBH. If we switch to
lz4f, doesn't that mean it (e.g. restore) won't work on systems that
only have older lz4 version? What would/should happen if we take backup
compressed with lz4f, and then try restoring it on an older system where
lz4 does not support lz4f?
You seem to be thinking about LZ4F as a weird, new innovation I'm
experimenting with, but compress_lz4.c already uses LZ4F for its "file"
API.
Note: we already use lz4 frames in pg_receivewal (for WAL) and
pg_basebackup (bbstreamer).
--
Michael
Hello,
23.02.2023 23:24, Tomas Vondra wrote:
On 2/23/23 16:26, Tomas Vondra wrote:
Thanks for v30 with the updated commit messages. I've pushed 0001 after
fixing a comment typo and removing (I think) an unnecessary change in an
error message.
I'll give the buildfarm a bit of time before pushing 0002 and 0003.
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!
I wonder how difficult would it be to add the zstd compression, so that
we don't have the annoying "unsupported" cases.
With the patch 0003 committed, a single warning -Wtype-limits appeared in the
master branch:
$ CPPFLAGS="-Og -Wtype-limits" ./configure --with-lz4 -q && make -s -j8
compress_lz4.c: In function ‘LZ4File_gets’:
compress_lz4.c:492:19: warning: comparison of unsigned expression in ‘< 0’ is
always false [-Wtype-limits]
492 | if (dsize < 0)
|
(I wonder, is it accidental that there are no other places that trigger
the warning, or some buildfarm animals had this check enabled before?)
It is not a false positive as can be proved by the 002_pg_dump.pl modified as
follows:
- program => $ENV{'LZ4'},
+ program => 'mv',
args => [
- '-z', '-f', '--rm',
"$tempdir/compression_lz4_dir/blobs.toc",
"$tempdir/compression_lz4_dir/blobs.toc.lz4",
],
},
Diagnostic logging added shows:
LZ4File_gets() after LZ4File_read_internal; dsize: 18446744073709551615
and pg_restore fails with:
error: invalid line in large object TOC file
".../src/bin/pg_dump/tmp_check/tmp_test_22ri/compression_lz4_dir/blobs.toc": "????"
Best regards,
Alexander
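The logged dsize is exactly what an int -1 becomes when stored in a 64-bit size_t. A minimal standalone sketch of the conversion follows; the stub and function names are mine, not the pg_dump code, and the stub only mimics LZ4File_read_internal()'s "-1 on error" convention:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for LZ4File_read_internal(): declared to return int, reports
 * failure as -1. (The stub is mine; the real function decompresses data.) */
static int read_internal_stub(void)
{
    return -1;
}

/* Buggy shape: the int -1 stored into a size_t wraps to SIZE_MAX, so the
 * "< 0" check is dead code and the error is never detected. */
static int error_detected_buggy(void)
{
    size_t dsize = read_internal_stub();

    return dsize < 0;           /* always 0: an unsigned value is never < 0 */
}

/* Fixed shape, as in the proposed patch: keep the signed return type. */
static int error_detected_fixed(void)
{
    int ret = read_internal_stub();

    return ret < 0;             /* the failure is caught */
}

/* The logged dsize is exactly (size_t) -1 on a 64-bit build. */
static size_t wrapped_minus_one(void)
{
    return (size_t) -1;         /* 18446744073709551615 when size_t is 64 bits */
}
```

This is also why -Wtype-limits flags the comparison: the compiler can prove the unsigned "< 0" test is always false.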
------- Original Message -------
On Saturday, March 11th, 2023 at 7:00 AM, Alexander Lakhin <exclusion@gmail.com> wrote:
Hello,
23.02.2023 23:24, Tomas Vondra wrote:
On 2/23/23 16:26, Tomas Vondra wrote:
Thanks for v30 with the updated commit messages. I've pushed 0001 after
fixing a comment typo and removing (I think) an unnecessary change in an
error message.
I'll give the buildfarm a bit of time before pushing 0002 and 0003.
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!
I wonder how difficult would it be to add the zstd compression, so that
we don't have the annoying "unsupported" cases.
With the patch 0003 committed, a single warning -Wtype-limits appeared in the
master branch:
$ CPPFLAGS="-Og -Wtype-limits" ./configure --with-lz4 -q && make -s -j8
compress_lz4.c: In function ‘LZ4File_gets’:
compress_lz4.c:492:19: warning: comparison of unsigned expression in ‘< 0’ is
always false [-Wtype-limits]
492 | if (dsize < 0)
|
Thank you Alexander. Please find attached an attempt to address it.
(I wonder, is it accidental that there are no other places that trigger
the warning, or some buildfarm animals had this check enabled before?)
I cannot answer about the buildfarms. Do you think that adding an explicit
check for this warning in meson would help? I am a bit uncertain, as I think
that -Wtype-limits is included in -Wextra.
@@ -1748,6 +1748,7 @@ common_warning_flags = [
'-Wshadow=compatible-local',
# This was included in -Wall/-Wformat in older GCC versions
'-Wformat-security',
+ '-Wtype-limits',
]
It is not a false positive as can be proved by the 002_pg_dump.pl modified as
follows:
- program => $ENV{'LZ4'},
+ program => 'mv',
args => [
- '-z', '-f', '--rm',
"$tempdir/compression_lz4_dir/blobs.toc",
"$tempdir/compression_lz4_dir/blobs.toc.lz4",
],
},
Correct, it is not a false positive. The existing testing framework provides
limited support for exercising error branches. Especially so when those are
dependent on generated output.
Diagnostic logging added shows:
LZ4File_gets() after LZ4File_read_internal; dsize: 18446744073709551615
and pg_restore fails with:
error: invalid line in large object TOC file
".../src/bin/pg_dump/tmp_check/tmp_test_22ri/compression_lz4_dir/blobs.toc": "????"
It is a good thing that the restore fails with bad input. Yet it should
have failed earlier. The attached makes certain it does fail earlier.
Cheers,
//Georgios
Best regards,
Alexander
Attachments:
v1-0001-Respect-return-type-of-LZ4File_read_internal.patch (text/x-patch)
From b80bb52ef6546aee8c8431d7cc126fa4a76b543c Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Sat, 11 Mar 2023 09:54:40 +0000
Subject: [PATCH v1] Respect return type of LZ4File_read_internal
The function LZ4File_gets() was storing the return value of
LZ4File_read_internal() in a variable of the wrong type, disregarding signedness.
As a consequence, LZ4File_gets() would not take the error path when it should.
In an attempt to improve readability, spell out the significance of a negative
return value of LZ4File_read_internal() in LZ4File_read().
Reported-by: Alexander Lakhin <exclusion@gmail.com>
---
src/bin/pg_dump/compress_lz4.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 63e794cdc6..cc039f0b47 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -453,7 +453,7 @@ LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
int ret;
ret = LZ4File_read_internal(fs, ptr, size, false);
- if (ret != size && !LZ4File_eof(CFH))
+ if (ret < 0 || (ret != size && !LZ4File_eof(CFH)))
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
return ret;
@@ -486,14 +486,14 @@ static char *
LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
- size_t dsize;
+ int ret;
- dsize = LZ4File_read_internal(fs, ptr, size, true);
- if (dsize < 0)
+ ret = LZ4File_read_internal(fs, ptr, size, true);
+ if (ret < 0)
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
/* Done reading */
- if (dsize == 0)
+ if (ret == 0)
return NULL;
return ptr;
--
2.34.1
Hi Georgios,
11.03.2023 13:50, gkokolatos@pm.me wrote:
I cannot answer about the buildfarms. Do you think that adding an explicit
check for this warning in meson would help? I am a bit uncertain, as I think
that -Wtype-limits is included in -Wextra.
@@ -1748,6 +1748,7 @@ common_warning_flags = [
'-Wshadow=compatible-local',
# This was included in -Wall/-Wformat in older GCC versions
'-Wformat-security',
+ '-Wtype-limits',
]
I'm not sure that I can promote additional checks (or determine where
to put them), but if some patch introduces a warning of a type that wasn't
present before, I think it's worth eliminating the warning (if that is
sensible) to keep the source code check baseline at the same level,
or even raise it gradually.
I've also found that the same commit introduced a single instance of
the analyzer-possible-null-argument warning:
CPPFLAGS="-Og -fanalyzer -Wno-analyzer-malloc-leak -Wno-analyzer-file-leak
-Wno-analyzer-null-dereference -Wno-analyzer-shift-count-overflow
-Wno-analyzer-free-of-non-heap -Wno-analyzer-null-argument
-Wno-analyzer-double-free -Wanalyzer-possible-null-argument" ./configure
--with-lz4 -q && make -s -j8
compress_io.c: In function ‘hasSuffix’:
compress_io.c:158:47: warning: use of possibly-NULL ‘filename’ where non-null
expected [CWE-690] [-Wanalyzer-possible-null-argument]
158 | int filenamelen = strlen(filename);
| ^~~~~~~~~~~~~~~~
‘InitDiscoverCompressFileHandle’: events 1-3
...
(I use gcc-11.3.)
As far as I can see, many existing uses of strdup() are followed by a check
for a null result, so it seems to be common practice, and a similar check
should be added in InitDiscoverCompressFileHandle().
(There are also a couple of other warnings introduced with the lz4 compression
patches, but those are not unique, so maybe they aren't worth fixing.)
It is a good thing that the restore fails with bad input. Yet it should
have failed earlier. The attached makes certain it does fail earlier.
Thanks! Your patch definitely fixes the issue.
Best regards,
Alexander
On 11.03.23 07:00, Alexander Lakhin wrote:
Hello,
23.02.2023 23:24, Tomas Vondra wrote:
On 2/23/23 16:26, Tomas Vondra wrote:
Thanks for v30 with the updated commit messages. I've pushed 0001 after
fixing a comment typo and removing (I think) an unnecessary change in an
error message.
I'll give the buildfarm a bit of time before pushing 0002 and 0003.
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!
I wonder how difficult would it be to add the zstd compression, so that
we don't have the annoying "unsupported" cases.
With the patch 0003 committed, a single warning -Wtype-limits appeared in the
master branch:
$ CPPFLAGS="-Og -Wtype-limits" ./configure --with-lz4 -q && make -s -j8
compress_lz4.c: In function ‘LZ4File_gets’:
compress_lz4.c:492:19: warning: comparison of unsigned expression in ‘<
0’ is always false [-Wtype-limits]
492 | if (dsize < 0)
|
(I wonder, is it accidental that there are no other places that trigger
the warning, or some buildfarm animals had this check enabled before?)
I think there is an underlying problem in this code that it dances back
and forth between size_t and int in an unprincipled way.
In the code that triggers the warning, dsize is size_t. dsize is the
return from LZ4File_read_internal(), which is declared to return int.
The variable that LZ4File_read_internal() returns in the success case is
size_t, but in case of an error it returns -1. (So the code that
triggers the warning means to catch this error case, but it won't ever work.)
Further below LZ4File_read_internal() calls LZ4File_read_overflow(),
which is declared to return int, but in some cases it returns
fs->overflowlen, which is size_t.
This should be cleaned up.
AFAICT, the upstream API in lz4.h uses int for size values, but
lz4frame.h uses size_t, so I don't know what the correct approach is.
On 3/12/23 11:07, Peter Eisentraut wrote:
On 11.03.23 07:00, Alexander Lakhin wrote:
Hello,
23.02.2023 23:24, Tomas Vondra wrote:
On 2/23/23 16:26, Tomas Vondra wrote:
Thanks for v30 with the updated commit messages. I've pushed 0001 after
fixing a comment typo and removing (I think) an unnecessary change
in an
error message.
I'll give the buildfarm a bit of time before pushing 0002 and 0003.
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!
I wonder how difficult would it be to add the zstd compression, so that
we don't have the annoying "unsupported" cases.
With the patch 0003 committed, a single warning -Wtype-limits appeared in the
master branch:
$ CPPFLAGS="-Og -Wtype-limits" ./configure --with-lz4 -q && make -s -j8
compress_lz4.c: In function ‘LZ4File_gets’:
compress_lz4.c:492:19: warning: comparison of unsigned expression in
‘< 0’ is always false [-Wtype-limits]
492 | if (dsize < 0)
|
(I wonder, is it accidental that there are no other places that trigger
the warning, or some buildfarm animals had this check enabled before?)
I think there is an underlying problem in this code that it dances back
and forth between size_t and int in an unprincipled way.
In the code that triggers the warning, dsize is size_t. dsize is the
return from LZ4File_read_internal(), which is declared to return int.
The variable that LZ4File_read_internal() returns in the success case is
size_t, but in case of an error it returns -1. (So the code that
triggers the warning means to catch this error case, but it won't ever work.)
Further below LZ4File_read_internal() calls LZ4File_read_overflow(),
which is declared to return int, but in some cases it returns
fs->overflowlen, which is size_t.
I agree. I just got home so I looked at this only very briefly, but I
think it's clearly wrong to assign the LZ4File_read_internal() result to
a size_t variable (and it seems to me LZ4File_gets does the same mistake
with LZ4File_read_internal() result).
I'll get this fixed early next week, I'm too tired to do that now
without likely causing further issues.
This should be cleaned up.
AFAICT, the upstream API in lz4.h uses int for size values, but
lz4frame.h uses size_t, so I don't know what the correct approach is.
Yeah, that's a good point. I think Justin is right we should be using
the LZ4F stuff, so ultimately we'll probably switch to size_t. But IMO
it's definitely better to correct the current code first, and only then
switch to LZ4F (from one correct state to another).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 3/11/23 11:50, gkokolatos@pm.me wrote:
------- Original Message -------
On Saturday, March 11th, 2023 at 7:00 AM, Alexander Lakhin <exclusion@gmail.com> wrote:
Hello,
23.02.2023 23:24, Tomas Vondra wrote:
On 2/23/23 16:26, Tomas Vondra wrote:
Thanks for v30 with the updated commit messages. I've pushed 0001 after
fixing a comment typo and removing (I think) an unnecessary change in an
error message.
I'll give the buildfarm a bit of time before pushing 0002 and 0003.
I've now pushed 0002 and 0003, after minor tweaks (a couple typos etc.),
and marked the CF entry as committed. Thanks for the patch!
I wonder how difficult would it be to add the zstd compression, so that
we don't have the annoying "unsupported" cases.
With the patch 0003 committed, a single warning -Wtype-limits appeared in the
master branch:
$ CPPFLAGS="-Og -Wtype-limits" ./configure --with-lz4 -q && make -s -j8
compress_lz4.c: In function ‘LZ4File_gets’:
compress_lz4.c:492:19: warning: comparison of unsigned expression in ‘< 0’ is
always false [-Wtype-limits]
492 | if (dsize < 0)
|
Thank you Alexander. Please find attached an attempt to address it.
(I wonder, is it accidental that there are no other places that trigger
the warning, or some buildfarm animals had this check enabled before?)
I cannot answer about the buildfarms. Do you think that adding an explicit
check for this warning in meson would help? I am a bit uncertain, as I think
that -Wtype-limits is included in -Wextra.
@@ -1748,6 +1748,7 @@ common_warning_flags = [
'-Wshadow=compatible-local',
# This was included in -Wall/-Wformat in older GCC versions
'-Wformat-security',
+ '-Wtype-limits',
]
It is not a false positive as can be proved by the 002_pg_dump.pl modified as
follows:
- program => $ENV{'LZ4'},
+ program => 'mv',
args => [
- '-z', '-f', '--rm',
"$tempdir/compression_lz4_dir/blobs.toc",
"$tempdir/compression_lz4_dir/blobs.toc.lz4",
],
},
Correct, it is not a false positive. The existing testing framework provides
limited support for exercising error branches. Especially so when those are
dependent on generated output.
Diagnostic logging added shows:
LZ4File_gets() after LZ4File_read_internal; dsize: 18446744073709551615
and pg_restore fails with:
error: invalid line in large object TOC file
".../src/bin/pg_dump/tmp_check/tmp_test_22ri/compression_lz4_dir/blobs.toc": "????"
It is a good thing that the restore fails with bad input. Yet it should
have failed earlier. The attached makes certain it does fail earlier.
Thanks for the patch.
I did look if there are other places that might have the same issue, and
I think there are - see attached 0002. For example LZ4File_write is
declared to return size_t, but then it also does
if (LZ4F_isError(status))
{
fs->errcode = status;
return -1;
}
That won't work :-(
And these issues may not be restricted to lz4 code - Gzip_write is
declared to return size_t, but it does
return gzwrite(gzfp, ptr, size);
and gzwrite returns int. Although, maybe that's correct, because
gzwrite() is "0 on error" so maybe this is fine ...
However, Gzip_read assigns gzread() to size_t, and that does not seem
great. It probably will still trigger the following pg_fatal() because
it'd be very lucky to match the expected 'size', but it's confusing.
I wonder whether CompressorState should use int or size_t for the
read_func/write_func callbacks. I guess no option is perfect, i.e. no
data type will work for all compression libraries we might use (lz4 uses
int while lz4f uses size_t, so there's that).
It's a bit weird the "open" functions return int and the read/write
size_t. Maybe we should stick to int, which is what the old functions
(cfwrite etc.) did.
But I think the actual problem here is that the API does not clearly
define how errors are communicated. I mean, it's nice to return the
value returned by the library function without "mangling" it by
conversion to size_t, but what if the libraries communicate errors in
different way? Some may return "0" while others may return "-1".
I think the right approach is to handle all library errors and not just
let them through. So Gzip_write() needs to check the return value, and
either call pg_fatal() or translate it to an error defined by the API.
For example we might say "returns 0 on error" and then translate all
library-specific errors to that.
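The convention proposed here can be sketched in a few lines of standalone C. The typedef and backend names are mine, standing in for the library calls (zlib's gzwrite reports failure as 0, plain lz4 calls use a negative int); the point is that the write_func itself inspects the library-specific result and maps every failure to one API-defined value:

```c
#include <stddef.h>

/* Hypothetical stand-in for a compression-library write call: returns the
 * number of bytes written, or a library-specific error indication. */
typedef int (*lib_write_fn)(const void *ptr, int size);

/* Normalize all library error conventions to a single one: "0 on error".
 * Callers then need only one test, instead of knowing each library's rules. */
static size_t normalized_write(lib_write_fn lib_write,
                               const void *ptr, size_t size)
{
    int written = lib_write(ptr, (int) size);

    if (written <= 0 || (size_t) written != size)
        return 0;               /* single API-defined error value */
    return (size_t) written;
}

/* Toy backends for demonstration only. */
static int ok_backend(const void *ptr, int size)
{
    (void) ptr;
    return size;                /* everything written */
}

static int err_backend(const void *ptr, int size)
{
    (void) ptr;
    (void) size;
    return -1;                  /* lz4-style negative error code */
}
```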
While looking at the code I realized a couple function comments don't
say what's returned in case of error, etc. So 0004 adds those.
0003 is a couple minor assorted comments/questions:
- Should we move ZLIB_OUT_SIZE/ZLIB_IN_SIZE to compress_gzip.c?
- Why are LZ4 buffer sizes different (ZLIB has both 4kB)?
- I wonder if we actually need LZ4F_HEADER_SIZE_MAX? Is it even possible
for LZ4F_compressBound to return value this small (especially for 16kB
input buffer)?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
0004-comment-improvements.patch (text/x-patch)
From 72cb710c08c4617fa491fe4824c4f99d3d3402fb Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Mon, 13 Mar 2023 20:42:33 +0100
Subject: [PATCH 4/4] comment improvements
---
src/bin/pg_dump/compress_lz4.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 94f28d6806..19a9cf2df2 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -190,10 +190,15 @@ LZ4File_get_error(CompressFileHandle *CFH)
}
/*
- * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls (either
+ * compression or decompression).
*
- * It creates the necessary contexts for the operations. When compressing,
- * it additionally writes the LZ4 header in the output stream.
+ * It creates the necessary contexts for the operations. When compressing data
+ * (indicated by compressing=true), it additionally writes the LZ4 header in the
+ * output stream.
+ *
+ * Returns 0 on success. In case of a failure returns 1, and stores the error
+ * code in fs->errcode.
*/
static int
LZ4File_init(LZ4File *fs, int size, bool compressing)
@@ -206,6 +211,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
fs->compressing = compressing;
fs->inited = true;
+ /* When compressing, write LZ4 header to the output stream. */
if (fs->compressing)
{
fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
@@ -248,7 +254,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
return 1;
}
- fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buflen = (size > LZ4_OUT_SIZE) ? size : LZ4_OUT_SIZE;
fs->buffer = pg_malloc(fs->buflen);
fs->overflowalloclen = fs->buflen;
@@ -262,7 +268,10 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
/*
* Read already decompressed content from the overflow buffer into 'ptr' up to
* 'size' bytes, if available. If the eol_flag is set, then stop at the first
- * occurrence of the new line char prior to 'size' bytes.
+ * occurrence of the newline char prior to 'size' bytes.
+ *
+ * Returns the number of bytes read from the overflow buffer (and copied into
+ * the 'ptr' buffer), or 0 if the overflow buffer is empty.
*
* Any unread content in the overflow buffer is moved to the beginning.
*/
@@ -304,6 +313,9 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
* at an overflow buffer within LZ4File. Of course, when the function is
* called, it will first try to consume any decompressed content already
* present in the overflow buffer, before decompressing new content.
+ *
+ * Returns the number of bytes of decompressed data copied into the ptr
+ * buffer, or -1 in case of error.
*/
static int
LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
--
2.39.2
0003-questions.patch (text/x-patch)
From e9d1e4dcbf17200f34cdb857c7961fb0df1e8435 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Mon, 13 Mar 2023 20:42:21 +0100
Subject: [PATCH 3/4] questions
---
src/bin/pg_dump/compress_io.h | 1 +
src/bin/pg_dump/compress_lz4.c | 6 ++++++
2 files changed, 7 insertions(+)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index cdb15951ea..ae32a4de1c 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -18,6 +18,7 @@
#include "pg_backup_archiver.h"
/* Initial buffer sizes used in zlib compression. */
+/* XXX shouldn't this be moved to compress_gzip.c? */
#define ZLIB_OUT_SIZE 4096
#define ZLIB_IN_SIZE 4096
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 9ab57ceff3..94f28d6806 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -20,6 +20,8 @@
#include <lz4.h>
#include <lz4frame.h>
+/* Initial buffer sizes used in zlib compression. */
+/* XXX Why is this different from GZIP values? That uses 4kB for both. */
#define LZ4_OUT_SIZE (4 * 1024)
#define LZ4_IN_SIZE (16 * 1024)
@@ -207,6 +209,10 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (fs->compressing)
{
fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ /*
+ * XXX Does this actually do something meaningful? With LZ4_IN_SIZE=16kB
+ * I get buflen=143600 (roughly), so can it ever be smaller than 22?
+ */
if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
fs->buflen = LZ4F_HEADER_SIZE_MAX;
--
2.39.2
0002-more-size_t-places.patch (text/x-patch)
From cd361f4bc631a33eb7374bf8b292976aaf07799b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Mon, 13 Mar 2023 20:41:38 +0100
Subject: [PATCH 2/4] more size_t places
---
src/bin/pg_dump/compress_gzip.c | 6 ++++--
src/bin/pg_dump/compress_lz4.c | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 0af65afeb4..6b042cab5b 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -233,12 +233,13 @@ InitCompressorGzip(CompressorState *cs,
*----------------------
*/
-static size_t
+static size_t /* XXX issue size_t vs. int? */
Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
size_t ret;
+ /* XXX this is probably wrong because gzread is "-1 for error" but this breaks that. */
ret = gzread(gzfp, ptr, size);
if (ret != size && !gzeof(gzfp))
{
@@ -252,11 +253,12 @@ Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
return ret;
}
-static size_t
+static size_t /* XXX issue size_t vs. int? */
Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
+ /* XXX this is probably OK, because gzwrite is "or 0 in case of error" per https://zlib.net/manual.html#Basic */
return gzwrite(gzfp, ptr, size);
}
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index cc039f0b47..9ab57ceff3 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -430,7 +430,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return -1;
+ return -1; /* FIXME size_t */
}
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
--
2.39.2
0001-Respect-return-type-of-LZ4File_read_internal.patch (text/x-patch)
From 79860a9600f4e677f10be39db507f38c711812cf Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Mon, 13 Mar 2023 20:39:37 +0100
Subject: [PATCH 1/4] Respect return type of LZ4File_read_internal
The function LZ4File_gets() was storing the return value of
LZ4File_read_internal() in a variable of the wrong type, disregarding signedness.
As a consequence, LZ4File_gets() would not take the error path when it should.
In an attempt to improve readability, spell out the significance of a negative
return value of LZ4File_read_internal() in LZ4File_read().
Reported-by: Alexander Lakhin <exclusion@gmail.com>
---
src/bin/pg_dump/compress_lz4.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 63e794cdc6..cc039f0b47 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -453,7 +453,7 @@ LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
int ret;
ret = LZ4File_read_internal(fs, ptr, size, false);
- if (ret != size && !LZ4File_eof(CFH))
+ if (ret < 0 || (ret != size && !LZ4File_eof(CFH)))
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
return ret;
@@ -486,14 +486,14 @@ static char *
LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
- size_t dsize;
+ int ret;
- dsize = LZ4File_read_internal(fs, ptr, size, true);
- if (dsize < 0)
+ ret = LZ4File_read_internal(fs, ptr, size, true);
+ if (ret < 0)
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
/* Done reading */
- if (dsize == 0)
+ if (ret == 0)
return NULL;
return ptr;
--
2.39.2
Hi Justin,
Thanks for the patch.
On 3/8/23 02:45, Justin Pryzby wrote:
On Wed, Mar 01, 2023 at 04:52:49PM +0100, Tomas Vondra wrote:
Thanks. That seems correct to me, but I find it somewhat confusing,
because we now have
DeflateCompressorInit vs. InitCompressorGzip
DeflateCompressorEnd vs. EndCompressorGzip
DeflateCompressorData - The name doesn't really say what it does (would
be better to have a verb in there, I think).
I wonder if we can make this somehow clearer?
To move things along, I updated Georgios' patch:
Rename DeflateCompressorData() to DeflateCompressorCommon();
Hmmm, I don't find "common" any clearer than "data" :-( There needs to
at least be a comment explaining what "common" does.
Rearrange functions to their original order allowing a cleaner diff to the prior code;
OK. I wasn't very enthusiastic about this initially, but after thinking
about it a bit I think it's meaningful to make diffs clearer. But I
don't see much difference with/without the patch. The
git diff --diff-algorithm=minimal -w
e9960732a~:src/bin/pg_dump/compress_io.c src/bin/pg_dump/compress_gzip.c
Produces ~25k diff with/without the patch. What am I doing wrong?
Change pg_fatal() to an assertion+comment;
Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
could add such protections against "impossible" stuff to a zillion other
places and the confusion likely outweighs the benefits.
Update the commit message and fix a few typos;
Thanks. I don't want to annoy you too much, but could you split the
patch into the "empty-data" fix and all the other changes (rearranging
functions etc.)? I'd rather not mix those in the same commit.
Also, InitCompressorGzip says this:
/*
* If the caller has defined a write function, prepare the necessary
* state. Avoid initializing during the first write call, because End
* may be called without ever writing any data.
*/
if (cs->writeF)
DeflateCompressorInit(cs);
Does it actually make sense to not have writeF defined in some cases?
InitCompressor is being called for either reading or writing, either of
which could be null:
src/bin/pg_dump/pg_backup_custom.c: ctx->cs = AllocateCompressor(AH->compression_spec,
src/bin/pg_dump/pg_backup_custom.c- NULL,
src/bin/pg_dump/pg_backup_custom.c- _CustomWriteFunc);
--
src/bin/pg_dump/pg_backup_custom.c: cs = AllocateCompressor(AH->compression_spec,
src/bin/pg_dump/pg_backup_custom.c- _CustomReadFunc, NULL);
It's confusing that the comment says "Avoid initializing...". What it
really means is "Initialize eagerly...". But that makes more sense in
the context of the commit message for this bugfix than in a comment. So
I changed that too.
+ /* If deflation was initialized, finalize it */
+ if (cs->private_data)
+ DeflateCompressorEnd(AH, cs);
Maybe it'd be more clear if this used "if (cs->writeF)", like in the
init function?
Yeah, if the two checks are equivalent, it'd be better to stick to the
same check everywhere.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Monday, March 13th, 2023 at 10:47 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
Change pg_fatal() to an assertion+comment;
Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
could add such protections against "impossible" stuff to a zillion other
places and the confusion likely outweighs the benefits.
A minor note to add is to not ignore the lessons learned from a7885c9bb.
For example, as the testing framework stands, one can not test that the
contents of the custom format are indeed compressed. One can infer it by
examining the header of the produced dump and searching for the
compression flag. The code responsible for writing the header and the
code responsible for actually dealing with data, is not the same. Also,
the compression library itself will happily read and write uncompressed
data.
A pg_fatal, assertion, or similar, is the only guard rail against this
kind of error. Without it, the tests will continue passing even after
e.g. a wrong initialization of the API. It was such a case that lead to
a7885c9bb in the first place. I do think that we wish it to be an
"impossible" case. Also it will be an untested case with some history
without such a guard rail.
Of course I will not object to removing it, if you think that is more
confusing than useful.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Monday, March 13th, 2023 at 9:21 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 3/11/23 11:50, gkokolatos@pm.me wrote:
------- Original Message -------
On Saturday, March 11th, 2023 at 7:00 AM, Alexander Lakhin <exclusion@gmail.com> wrote:
Hello,
23.02.2023 23:24, Tomas Vondra wrote:
Thanks for the patch.
I did look if there are other places that might have the same issue, and
I think there are - see attached 0002. For example LZ4File_write is
declared to return size_t, but then it also does
if (LZ4F_isError(status))
{
fs->errcode = status;
return -1;
}
That won't work :-(
You are right. It is confusing.
And these issues may not be restricted to lz4 code - Gzip_write is
declared to return size_t, but it does
return gzwrite(gzfp, ptr, size);
and gzwrite returns int. Although, maybe that's correct, because
gzwrite() is "0 on error" so maybe this is fine ...
However, Gzip_read assigns gzread() to size_t, and that does not seem
great. It probably will still trigger the following pg_fatal() because
it'd be very lucky to match the expected 'size', but it's confusing.
Agreed.
I wonder whether CompressorState should use int or size_t for the
read_func/write_func callbacks. I guess no option is perfect, i.e. no
data type will work for all compression libraries we might use (lz4 uses
int while lz4f uses size_t, so there's that).

It's a bit weird that the "open" functions return int and the read/write
size_t. Maybe we should stick to int, which is what the old functions
(cfwrite etc.) did.
You are right. These functions are modeled on open/fread/
fwrite etc., and they have kept the return types of those. Their
callers do check the return value of read_func and write_func against
the requested number of bytes to be transferred.
But I think the actual problem here is that the API does not clearly
define how errors are communicated. I mean, it's nice to return the
value returned by the library function without "mangling" it by
conversion to size_t, but what if the libraries communicate errors in
different ways? Some may return "0" while others may return "-1".
Agreed.
I think the right approach is to handle all library errors and not just
let them through. So Gzip_write() needs to check the return value, and
either call pg_fatal() or translate it to an error defined by the API.
It makes sense. It will change some of the behaviour of the callers,
mostly on what constitutes an error, and what error message is emitted.
This is a reasonable change though.
For example we might say "returns 0 on error" and then translate all
library-specific errors to that.
Ok.
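A minimal sketch of that translation, with hypothetical wrapper names (not the actual pg_dump functions), might look like this:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch of the convention discussed above: translate each
 * library's error signalling ("0 on error" for gzwrite-style calls,
 * "negative on error" for lz4-style calls) into one API-level return
 * value, 0 on success and non-zero on error.  Names are illustrative.
 */

/* zlib-style result: 0 (or a negative value) signals an error */
static int
gz_to_api(int gzret, size_t want, size_t *rsize)
{
	/* ignores the gzeof() short-read subtlety for brevity */
	if (gzret <= 0 && want > 0)
		return 1;
	*rsize = (size_t) gzret;
	return 0;
}

/* lz4-style result: a negative value signals an error */
static int
lz4_to_api(int lz4ret, size_t *rsize)
{
	if (lz4ret < 0)
		return 1;
	*rsize = (size_t) lz4ret;
	return 0;
}
```

The caller then only ever tests for zero/non-zero, regardless of which library sits underneath.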
While looking at the code I realized a couple of function comments don't
say what's returned in case of error, etc. So 0004 adds those.

0003 is a couple of minor assorted comments/questions:
- Should we move ZLIB_OUT_SIZE/ZLIB_IN_SIZE to compress_gzip.c?
It would make things clearer.
- Why are LZ4 buffer sizes different (ZLIB has both 4kB)?
Clearly some comments are needed, if the difference makes sense.
- I wonder if we actually need LZ4F_HEADER_SIZE_MAX? Is it even possible
for LZ4F_compressBound to return a value this small (especially for 16kB
input buffer)?
I would recommend keeping it. Earlier versions of the library do not
define LZ4F_HEADER_SIZE_MAX, and later versions advise using it.
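The compatibility guard in question can be sketched as follows (the fallback value is illustrative; the real lz4frame.h defines the authoritative one):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of the guard discussed above.  When building against an lz4
 * older than 1.7.5, the header does not provide LZ4F_HEADER_SIZE_MAX,
 * so a fallback is defined; the value here is illustrative only.
 */
#ifndef LZ4F_HEADER_SIZE_MAX
#define LZ4F_HEADER_SIZE_MAX 32
#endif

/* Ensure the output buffer can always hold the LZ4 frame header. */
static size_t
clamp_buflen(size_t compress_bound)
{
	if (compress_bound < LZ4F_HEADER_SIZE_MAX)
		return LZ4F_HEADER_SIZE_MAX;
	return compress_bound;
}
```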
Would you mind me trying to come up with a patch to address your points?
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 3/14/23 16:18, gkokolatos@pm.me wrote:
...> Would you mind me trying to come up with a patch to address your points?
That'd be great, thanks. Please keep it split into smaller patches - two
might work, with one patch for "cosmetic" changes and the other tweaking
the API error-handling stuff.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 3/14/23 12:07, gkokolatos@pm.me wrote:
------- Original Message -------
On Monday, March 13th, 2023 at 10:47 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

Change pg_fatal() to an assertion+comment;
Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
could add such protections against "impossible" stuff to a zillion other
places and the confusion likely outweighs the benefits.

A minor note to add is to not ignore the lessons learned from a7885c9bb.
For example, as the testing framework stands, one cannot test that the
contents of the custom format are indeed compressed. One can infer it by
examining the header of the produced dump and searching for the
compression flag. The code responsible for writing the header and the
code responsible for actually dealing with data are not the same. Also,
the compression library itself will happily read and write uncompressed
data.

A pg_fatal, assertion, or similar is the only guard rail against this
kind of error. Without it, the tests will continue passing even after,
e.g., a wrong initialization of the API. It was such a case that led to
a7885c9bb in the first place. I do agree that we wish it to be an
"impossible" case, but without such a guard rail it would also be an
untested case, one with a history of going wrong.
So is the pg_fatal() a dead code or not? My understanding was it's not
really reachable, and the main purpose is to remind people this is not
possible. Or am I mistaken/confused?
If it's reachable, can we test it? AFAICS we don't, per the coverage
reports.
If it's just a protection against incorrect API initialization, then an
assert is the right solution, I think. With proper comment. But can't we
actually verify that *during* the initialization?
Also, how come WriteDataToArchiveLZ4() doesn't need this protection too?
Or is that due to gzip being the default compression method?
Of course I will not object to removing it, if you think that is more
confusing than useful.
Not sure, I have a feeling I don't quite understand in what situation
this actually helps.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Mar 13, 2023 at 10:47:12PM +0100, Tomas Vondra wrote:
Rearrange functions to their original order allowing a cleaner diff to the prior code;
OK. I wasn't very enthusiastic about this initially, but after thinking
about it a bit I think it's meaningful to make diffs clearer. But I
don't see much difference with/without the patch. The command

    git diff --diff-algorithm=minimal -w e9960732a~:src/bin/pg_dump/compress_io.c src/bin/pg_dump/compress_gzip.c

produces a ~25k diff with/without the patch. What am I doing wrong?
Do you mean 25 kB of diff? I agree that the statistics of the diff
output don't change a lot:
1 file changed, 201 insertions(+), 570 deletions(-)
1 file changed, 198 insertions(+), 548 deletions(-)
But try reading the diff while looking for the cause of a bug. It's the
difference between reading 50 two-line changes and reading a hunk that
replaces 100 lines with a different 100 lines, with empty/unrelated
lines randomly thrown in as context.
When the diff is readable, the pg_fatal() also stands out.
Change pg_fatal() to an assertion+comment;
Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
could add such protections against "impossible" stuff to a zillion other
places and the confusion likely outweighs the benefits.

Update the commit message and fix a few typos;
Thanks. I don't want to annoy you too much, but could you split the
patch into the "empty-data" fix and all the other changes (rearranging
functions etc.)? I'd rather not mix those in the same commit.
I don't know if that makes sense? The "empty-data" fix creates a new
function called DeflateCompressorInit(). My proposal was to add the new
function in the same place in the file as it used to be.
The patch also moves the pg_fatal() that's being removed. I don't think
it's going to look any cleaner to read a history involving the
pg_fatal() first being added, then moved, then removed. Anyway, I'll
wait while the community continues discussion about the pg_fatal().
--
Justin
------- Original Message -------
On Tuesday, March 14th, 2023 at 4:32 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 3/14/23 16:18, gkokolatos@pm.me wrote:
...> Would you mind me trying to come up with a patch to address your points?
That'd be great, thanks. Please keep it split into smaller patches - two
might work, with one patch for "cosmetic" changes and the other tweaking
the API error-handling stuff.
Please find attached a set for it. I will admit that the split of the
series might not be ideal, nor exactly what you requested. It is split
into what seemed like logical units. Please advise on how a better
split might look.
0001 is unifying types and return values on the API
0002 is addressing the constant definitions
0003 is your previous 0004 adding comments
As far as the error handling is concerned, you had said upthread:
I think the right approach is to handle all library errors and not just
let them through. So Gzip_write() needs to check the return value, and
either call pg_fatal() or translate it to an error defined by the API.
While working on it, I thought it would be clearer and more consistent
for the pg_fatal() to be called by the caller of the individual functions.
Each individual function can keep track of the specifics of the error
internally. Then the caller upon detecting that there was an error by
checking the return value, can call pg_fatal() with a uniform error
message and then add the specifics by calling the get_error_func().
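As a rough illustration of that division of labour (hypothetical names, not the patch itself): the callback records the specifics internally, and the caller produces the uniform message:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch of the proposed pattern: write_func returns
 * non-zero on error and keeps the specifics internal; the caller
 * checks the return value and builds one uniform message via
 * get_error_func().  pg_fatal() would fire where the message is
 * returned here.
 */
typedef struct FileHandle FileHandle;
struct FileHandle
{
	int			errcode;
	int			(*write_func) (const void *ptr, size_t size, FileHandle *fh);
	const char *(*get_error_func) (FileHandle *fh);
};

static int
failing_write(const void *ptr, size_t size, FileHandle *fh)
{
	(void) ptr;
	(void) size;
	fh->errcode = 42;			/* library-specific detail kept internal */
	return 1;					/* uniform "non-zero on error" */
}

static const char *
get_error(FileHandle *fh)
{
	return fh->errcode ? "simulated library error" : "no error";
}

/* Caller side: returns the formatted message on error, NULL on success. */
static const char *
write_or_report(FileHandle *fh, const void *buf, size_t len,
				char *msg, size_t msglen)
{
	if (fh->write_func(buf, len, fh) != 0)
	{
		snprintf(msg, msglen, "could not write to output file: %s",
				 fh->get_error_func(fh));
		return msg;
	}
	return NULL;
}
```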
Thoughts?
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v2-0002-Clean-up-constants-in-pg_dump-s-compression-API.patch (text/x-patch)
From 4aa7603d891c62bf9d95af9910b8fb4b0fe2fb10 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 16 Mar 2023 16:30:09 +0000
Subject: [PATCH v2 2/3] Clean up constants in pg_dump's compression API.
Prior to the introduction of the API, pg_dump would use the ZLIB_[IN|OUT]_SIZE
constants to handle buffer sizes throughout. This behaviour is confusing after
the introduction of the API. Amend it by introducing a DEFAULT_IO_BUFFER_SIZE
constant to use when appropriate while giving the opportunity to specific
compression implementations to use their own.
With the help and guidance of Tomas Vondra.
---
src/bin/pg_dump/compress_gzip.c | 22 +++++++++++-----------
src/bin/pg_dump/compress_io.h | 5 ++---
src/bin/pg_dump/compress_lz4.c | 17 +++++++++--------
src/bin/pg_dump/compress_none.c | 4 ++--
src/bin/pg_dump/pg_backup_directory.c | 4 ++--
5 files changed, 26 insertions(+), 26 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 29e2fd8d50..4106d4c866 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -120,8 +120,8 @@ WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
* actually allocate one extra byte because some routines want to
* append a trailing zero byte to the zlib output.
*/
- gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
- gzipcs->outsize = ZLIB_OUT_SIZE;
+ gzipcs->outsize = DEFAULT_IO_BUFFER_SIZE;
+ gzipcs->outbuf = pg_malloc(gzipcs->outsize + 1);
/*
* A level of zero simply copies the input one block at the time. This
@@ -158,10 +158,10 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
zp->zfree = Z_NULL;
zp->opaque = Z_NULL;
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
+ buf = pg_malloc(buflen);
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
+ out = pg_malloc(DEFAULT_IO_BUFFER_SIZE + 1);
if (inflateInit(zp) != Z_OK)
pg_fatal("could not initialize compression library: %s",
@@ -176,14 +176,14 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
while (zp->avail_in > 0)
{
zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ zp->avail_out = DEFAULT_IO_BUFFER_SIZE;
res = inflate(zp, 0);
if (res != Z_OK && res != Z_STREAM_END)
pg_fatal("could not uncompress data: %s", zp->msg);
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ out[DEFAULT_IO_BUFFER_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, DEFAULT_IO_BUFFER_SIZE - zp->avail_out, AH);
}
}
@@ -192,13 +192,13 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
while (res != Z_STREAM_END)
{
zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ zp->avail_out = DEFAULT_IO_BUFFER_SIZE;
res = inflate(zp, 0);
if (res != Z_OK && res != Z_STREAM_END)
pg_fatal("could not uncompress data: %s", zp->msg);
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ out[DEFAULT_IO_BUFFER_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, DEFAULT_IO_BUFFER_SIZE - zp->avail_out, AH);
}
if (inflateEnd(zp) != Z_OK)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index b03d5b325b..60e1735834 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -17,9 +17,8 @@
#include "pg_backup_archiver.h"
-/* Initial buffer sizes used in zlib compression. */
-#define ZLIB_OUT_SIZE 4096
-#define ZLIB_IN_SIZE 4096
+/* Default size used for IO buffers */
+#define DEFAULT_IO_BUFFER_SIZE 4096
extern char *supports_compression(const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index df2b4c9546..0c0eb09d68 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -20,9 +20,6 @@
#include <lz4.h>
#include <lz4frame.h>
-#define LZ4_OUT_SIZE (4 * 1024)
-#define LZ4_IN_SIZE (16 * 1024)
-
/*
* LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
* Redefine it for installations with a lesser version.
@@ -57,7 +54,7 @@ ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
size_t buflen;
size_t cnt;
- buflen = LZ4_IN_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
buf = pg_malloc(buflen);
decbuf = pg_malloc(buflen);
@@ -206,7 +203,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (fs->compressing)
{
- fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ fs->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &fs->prefs);
if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
fs->buflen = LZ4F_HEADER_SIZE_MAX;
@@ -242,9 +239,11 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
return 1;
}
- fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ if (size > DEFAULT_IO_BUFFER_SIZE)
+ fs->buflen = size;
+ else
+ fs->buflen = DEFAULT_IO_BUFFER_SIZE;
fs->buffer = pg_malloc(fs->buflen);
-
fs->overflowalloclen = fs->buflen;
fs->overflowbuf = pg_malloc(fs->overflowalloclen);
fs->overflowlen = 0;
@@ -421,8 +420,10 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
while (remaining > 0)
{
- int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+ int chunk = DEFAULT_IO_BUFFER_SIZE;
+ if (remaining < DEFAULT_IO_BUFFER_SIZE)
+ chunk = remaining;
remaining -= chunk;
status = LZ4F_compressUpdate(fs->ctx, fs->buffer, fs->buflen,
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
index bd479fde59..f4a7d7c193 100644
--- a/src/bin/pg_dump/compress_none.c
+++ b/src/bin/pg_dump/compress_none.c
@@ -33,8 +33,8 @@ ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
char *buf;
size_t buflen;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
+ buf = pg_malloc(buflen);
while ((cnt = cs->readF(AH, &buf, &buflen)))
{
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 1cd9805ef7..8b92f42ac5 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -394,8 +394,8 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
+ buf = pg_malloc(buflen);
while (CFH->read_func(buf, buflen, &cnt, CFH) == 0 && cnt > 0)
{
--
2.34.1
v2-0003-Improve-compress_lz4-documentation.patch (text/x-patch)
From 61d29e828f33163b1750ffe1a0ac1823044d34a9 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 16 Mar 2023 15:43:29 +0000
Subject: [PATCH v2 3/3] Improve compress_lz4 documentation.
Author: Tomas Vondra
---
src/bin/pg_dump/compress_lz4.c | 25 +++++++++++++++++++++----
1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 0c0eb09d68..f4c0910afc 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -185,10 +185,15 @@ LZ4File_get_error(CompressFileHandle *CFH)
}
/*
- * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls (either
+ * compression or decompression).
*
- * It creates the necessary contexts for the operations. When compressing,
- * it additionally writes the LZ4 header in the output stream.
+ * It creates the necessary contexts for the operations. When compressing data
+ * (indicated by compressing=true), it additionally writes the LZ4 header in the
+ * output stream.
+ *
+ * Returns 0 on success. In case of a failure returns 1, and stores the error
+ * code in fs->errcode.
*/
static int
LZ4File_init(LZ4File *fs, int size, bool compressing)
@@ -201,9 +206,15 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
fs->compressing = compressing;
fs->inited = true;
+ /* When compressing, write LZ4 header to the output stream. */
if (fs->compressing)
{
fs->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &fs->prefs);
+
+ /*
+ * LZ4F_compressBegin requires a buffer that is greater or equal to
+ * LZ4F_HEADER_SIZE_MAX. Verify that the requirement is met.
+ */
if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
fs->buflen = LZ4F_HEADER_SIZE_MAX;
@@ -255,9 +266,12 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
/*
* Read already decompressed content from the overflow buffer into 'ptr' up to
* 'size' bytes, if available. If the eol_flag is set, then stop at the first
- * occurrence of the new line char prior to 'size' bytes.
+ * occurrence of the newline char prior to 'size' bytes.
*
* Any unread content in the overflow buffer is moved to the beginning.
+ *
+ * Returns the number of bytes read from the overflow buffer (and copied into
+ * the 'ptr' buffer), or 0 if the overflow buffer is empty.
*/
static int
LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
@@ -297,6 +311,9 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
* at an overflow buffer within LZ4File. Of course, when the function is
* called, it will first try to consume any decompressed content already
* present in the overflow buffer, before decompressing new content.
+ *
+ * Returns the number of bytes of decompressed data copied into the ptr
+ * buffer, or -1 in case of error.
*/
static int
LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
--
2.34.1
v2-0001-Improve-type-handling-in-pg_dump-s-compress-file-.patch (text/x-patch)
From 94ca77e1fbdcf063f9a4f0957c03ef7cf1829cc4 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Thu, 16 Mar 2023 16:06:00 +0000
Subject: [PATCH v2 1/3] Improve type handling in pg_dump's compress file API
The function LZ4File_gets() was storing the return value of
LZ4File_read_internal in a variable of the wrong type, disregarding signedness.
As a consequence, LZ4File_gets() would not take the error path when it should.
In an attempt to improve readability and code uniformity, change the return type
of the API's read and write functions to integer from size_t. Along with it,
homogenize the return values of the relevant functions of this API.
This change helps the specific compression implementations handle the return
types of their corresponding libraries internally and not expose them to the
API caller.
With the help and guidance of Tomas Vondra.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
---
src/bin/pg_dump/compress_gzip.c | 24 ++++++++++-----
src/bin/pg_dump/compress_io.h | 28 ++++++++++++++---
src/bin/pg_dump/compress_lz4.c | 43 ++++++++++++++-------------
src/bin/pg_dump/compress_none.c | 19 ++++++++----
src/bin/pg_dump/pg_backup_archiver.c | 3 +-
src/bin/pg_dump/pg_backup_directory.c | 14 ++++-----
6 files changed, 85 insertions(+), 46 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 0af65afeb4..29e2fd8d50 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -233,14 +233,14 @@ InitCompressorGzip(CompressorState *cs,
*----------------------
*/
-static size_t
-Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+static int
+Gzip_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
- size_t ret;
+ size_t gzret;
- ret = gzread(gzfp, ptr, size);
- if (ret != size && !gzeof(gzfp))
+ gzret = gzread(gzfp, ptr, size);
+ if (gzret != size && !gzeof(gzfp))
{
int errnum;
const char *errmsg = gzerror(gzfp, &errnum);
@@ -249,15 +249,23 @@ Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
errnum == Z_ERRNO ? strerror(errno) : errmsg);
}
- return ret;
+ if (rsize)
+ *rsize = gzret;
+
+ return 0;
}
-static size_t
+static int
Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
+ size_t gzret;
+
+ gzret = gzwrite(gzfp, ptr, size);
+ if (gzret != size)
+ return 1;
- return gzwrite(gzfp, ptr, size);
+ return 0;
}
static int
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index cdb15951ea..b03d5b325b 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -100,6 +100,8 @@ struct CompressFileHandle
* Pass either 'path' or 'fd' depending on whether a file path or a file
* descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
* 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ *
+ * Returns zero on success and non-zero on error.
*/
int (*open_func) (const char *path, int fd, const char *mode,
CompressFileHandle *CFH);
@@ -109,19 +111,27 @@ struct CompressFileHandle
*
* 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
* initialized CompressFileHandle.
+ *
+ * Returns zero on success and non-zero on error.
*/
int (*open_write_func) (const char *path, const char *mode,
CompressFileHandle *CFH);
/*
* Read 'size' bytes of data from the file and store them into 'ptr'.
+ * Optionally it will store the number of bytes read in 'rsize'.
+ *
+ * Returns zero on success and non-zero on error.
*/
- size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ int (*read_func) (void *ptr, size_t size, size_t *rsize,
+ CompressFileHandle *CFH);
/*
* Write 'size' bytes of data into the file from 'ptr'.
+ *
+ * Returns zero on success and non-zero on error.
*/
- size_t (*write_func) (const void *ptr, size_t size,
+ int (*write_func) (const void *ptr, size_t size,
struct CompressFileHandle *CFH);
/*
@@ -130,28 +140,38 @@ struct CompressFileHandle
*
* Stop if an EOF or a newline is found first. 's' is always null
* terminated and contains the newline if it was found.
+ *
+ * Returns 's' on success, and NULL on error or when end of file occurs
+ * while no characters have been read.
*/
char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
/*
* Read the next character from the compress file handle as 'unsigned
* char' cast into 'int'.
+ *
+ * Returns the character read on success and throws an internal error
+ * otherwise. It treats EOF as error.
*/
int (*getc_func) (CompressFileHandle *CFH);
/*
* Test if EOF is reached in the compress file handle.
+ *
+ * Returns non-zero if it is reached.
*/
int (*eof_func) (CompressFileHandle *CFH);
/*
* Close an open file handle.
+ *
+ * Returns zero on success and non-zero on error.
*/
int (*close_func) (CompressFileHandle *CFH);
/*
- * Get a pointer to a string that describes an error that occurred during a
- * compress file handle operation.
+ * Get a pointer to a string that describes an error that occurred during
+ * a compress file handle operation.
*/
const char *(*get_error_func) (CompressFileHandle *CFH);
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 63e794cdc6..df2b4c9546 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -302,9 +302,9 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
static int
LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
{
- size_t dsize = 0;
- size_t rsize;
- size_t size = ptrsize;
+ int dsize = 0;
+ int rsize;
+ int size = ptrsize;
bool eol_found = false;
void *readbuf;
@@ -398,17 +398,17 @@ LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
fs->overflowlen += outlen;
}
}
- } while (rsize == size && dsize < size && eol_found == 0);
+ } while (rsize == size && dsize < size && eol_found == false);
pg_free(readbuf);
- return (int) dsize;
+ return dsize;
}
/*
* Compress size bytes from ptr and write them to the stream.
*/
-static size_t
+static int
LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
@@ -417,7 +417,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
/* Lazy init */
if (LZ4File_init(fs, size, true))
- return -1;
+ return 1;
while (remaining > 0)
{
@@ -430,7 +430,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return -1;
+ return 1;
}
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
@@ -440,23 +440,25 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
}
}
- return size;
+ return 0;
}
/*
* fread() equivalent implementation for LZ4 compressed files.
*/
-static size_t
-LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+static int
+LZ4File_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
int ret;
- ret = LZ4File_read_internal(fs, ptr, size, false);
- if (ret != size && !LZ4File_eof(CFH))
+ if ((ret = LZ4File_read_internal(fs, ptr, size, false)) < 0)
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
- return ret;
+ if (rsize)
+ *rsize = (size_t) ret;
+
+ return 0;
}
/*
@@ -468,7 +470,7 @@ LZ4File_getc(CompressFileHandle *CFH)
LZ4File *fs = (LZ4File *) CFH->private_data;
unsigned char c;
- if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ if (LZ4File_read_internal(fs, &c, 1, false) <= 0)
{
if (!LZ4File_eof(CFH))
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
@@ -486,14 +488,14 @@ static char *
LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
- size_t dsize;
+ int ret;
- dsize = LZ4File_read_internal(fs, ptr, size, true);
- if (dsize < 0)
+ ret = LZ4File_read_internal(fs, ptr, size, true);
+ if (ret < 0 || (ret == 0 && !LZ4File_eof(CFH)))
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
/* Done reading */
- if (dsize == 0)
+ if (ret == 0)
return NULL;
return ptr;
@@ -509,7 +511,6 @@ LZ4File_close(CompressFileHandle *CFH)
FILE *fp;
LZ4File *fs = (LZ4File *) CFH->private_data;
size_t status;
- int ret;
fp = fs->fp;
if (fs->inited)
@@ -520,7 +521,7 @@ LZ4File_close(CompressFileHandle *CFH)
if (LZ4F_isError(status))
pg_fatal("failed to end compression: %s",
LZ4F_getErrorName(status));
- else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ else if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
errno = (errno) ? errno : ENOSPC;
WRITE_ERROR_EXIT;
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
index ecbcf4b04a..bd479fde59 100644
--- a/src/bin/pg_dump/compress_none.c
+++ b/src/bin/pg_dump/compress_none.c
@@ -83,8 +83,8 @@ InitCompressorNone(CompressorState *cs,
* Private routines
*/
-static size_t
-read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+static int
+read_none(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
FILE *fp = (FILE *) CFH->private_data;
size_t ret;
@@ -97,13 +97,22 @@ read_none(void *ptr, size_t size, CompressFileHandle *CFH)
pg_fatal("could not read from input file: %s",
strerror(errno));
- return ret;
+ if (rsize)
+ *rsize = ret;
+
+ return 0;
}
-static size_t
+static int
write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+ size_t ret;
+
+ ret = fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+ if (ret != size)
+ return 1;
+
+ return 0;
}
static const char *
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 61ebb8fe85..138ea158f1 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -1694,7 +1694,8 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
{
CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
- bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ if (!CFH->write_func(ptr, size * nmemb, CFH))
+ bytes_written = size * nmemb;
}
if (bytes_written != size * nmemb)
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 41c2b733e3..1cd9805ef7 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -348,7 +348,7 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
+ if (dLen > 0 && CFH->write_func(data, dLen, CFH))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
@@ -382,7 +382,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintFileData(ArchiveHandle *AH, char *filename)
{
- size_t cnt;
+ size_t cnt = 0;
char *buf;
size_t buflen;
CompressFileHandle *CFH;
@@ -397,7 +397,7 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = CFH->read_func(buf, buflen, CFH)))
+ while (CFH->read_func(buf, buflen, &cnt, CFH) == 0 && cnt > 0)
{
ahwrite(buf, 1, cnt, AH);
}
@@ -491,7 +491,7 @@ _WriteByte(ArchiveHandle *AH, const int i)
CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (CFH->write_func(&c, 1, CFH) != 1)
+ if (CFH->write_func(&c, 1, CFH))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
@@ -529,7 +529,7 @@ _WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (CFH->write_func(buf, len, CFH) != len)
+ if (CFH->write_func(buf, len, CFH))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
@@ -554,7 +554,7 @@ _ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
* If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (CFH->read_func(buf, len, CFH) != len)
+ if (CFH->read_func(buf, len, NULL, CFH))
pg_fatal("could not read from input file: end of file");
}
@@ -696,7 +696,7 @@ _EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (CFH->write_func(buf, len, CFH) != len)
+ if (CFH->write_func(buf, len, CFH))
pg_fatal("could not write to LOs TOC file");
}
--
2.34.1
On 3/16/23 18:04, gkokolatos@pm.me wrote:
------- Original Message -------
On Tuesday, March 14th, 2023 at 4:32 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

On 3/14/23 16:18, gkokolatos@pm.me wrote:
...> Would you mind me trying to come up with a patch to address your points?
That'd be great, thanks. Please keep it split into smaller patches - two
might work, with one patch for "cosmetic" changes and the other tweaking
the API error-handling stuff.

Please find attached a set for it. I will admit that the split of the
series might not be ideal, nor exactly what you requested. It is split
into what seemed like logical units. Please advise on how a better
split might look.

0001 is unifying types and return values on the API
0002 is addressing the constant definitions
0003 is your previous 0004 adding comments
Thanks. I think the split seems reasonable - the goal was to not mix
different changes, and from that POV it works.
I'm not sure I understand the Gzip_read/Gzip_write changes in 0001. I
mean, gzread/gzwrite returns int, so how does renaming the size_t
variable solve the issue of negative values for errors? I mean, this
- size_t ret;
+ size_t gzret;
- ret = gzread(gzfp, ptr, size);
+ gzret = gzread(gzfp, ptr, size);
means we still lost the information gzread() returned a negative value,
no? We'll still probably trigger an error, but it's a bit weird.
ISTM all this kinda assumes we're processing chunks of memory small
enough that we'll never actually overflow int - I did check what the
code in 15 does, and it seems use int and size_t quite arbitrarily.
For example cfread() seems quite sane:
int
cfread(void *ptr, int size, cfp *fp)
{
int ret;
...
ret = gzread(fp->compressedfp, ptr, size);
...
return ret;
}
but then _PrintFileData() happily stashes it into a size_t, ignoring the
signedness. Surely, if
static void
_PrintFileData(ArchiveHandle *AH, char *filename)
{
size_t cnt;
...
while ((cnt = cfread(buf, buflen, cfp)))
{
ahwrite(buf, 1, cnt, AH);
}
...
}
Unless I'm missing something, if gzread() ever returns -1 or some other
negative error value, we'll cast it to size_t, while condition will
evaluate to "true" and we'll happily chew on some random chunk of data.
So the confusion is (at least partially) a preexisting issue ...
For gzwrite() it seems to be fine, because that only returns 0 on error.
OTOH it's defined to take 'int size' but then we happily pass size_t
values to it.
As I wrote earlier, this apparently assumes we never need to deal with
buffers larger than int, and I don't think we have the ambition to relax
that (I'm not sure it's even needed / possible).
I see the read/write functions are now defined as int, but we only ever
return 0/1 from them, and then interpret that as bool. Why not define
it like that? I don't think we need to adhere to the custom that
everything returns "int". This is an internal API. Or if we want to
stick to int, I'd define meaningful "nice" constants for 0/1.
0002 seems fine to me. I see you've ditched the idea of having two
separate buffers, and replaced them with DEFAULT_IO_BUFFER_SIZE. Fine
with me, although I wonder if this might have negative impact on
performance or something (but I doubt that).
0003 seems fine too.
As far as the error handling is concerned, you had said upthread:
I think the right approach is to handle all library errors and not just
let them through. So Gzip_write() needs to check the return value, and
either call pg_fatal() or translate it to an error defined by the API.
While working on it, I thought it would be clearer and more consistent
for the pg_fatal() to be called by the caller of the individual functions.
Each individual function can keep track of the specifics of the error
internally. Then the caller upon detecting that there was an error by
checking the return value, can call pg_fatal() with a uniform error
message and then add the specifics by calling the get_error_func().
I agree it's cleaner the way you did it.
I was thinking that with each compression function handling error
internally, the callers would not need to do that. But I hadn't
realized there's logic to detect ENOSPC and so on, and we'd need to
duplicate that in every compression func.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 3/16/23 01:20, Justin Pryzby wrote:
On Mon, Mar 13, 2023 at 10:47:12PM +0100, Tomas Vondra wrote:
Rearrange functions to their original order allowing a cleaner diff to the prior code;
OK. I wasn't very enthusiastic about this initially, but after thinking
about it a bit I think it's meaningful to make diffs clearer. But I
don't see much difference with/without the patch. The
git diff --diff-algorithm=minimal -w e9960732a~:src/bin/pg_dump/compress_io.c src/bin/pg_dump/compress_gzip.c
Produces ~25k diff with/without the patch. What am I doing wrong?
Do you mean 25 kB of diff ?
Yes, if you redirect the git-diff to a file, it's a 25kB file.
I agree that the statistics of the diff output don't change a lot:
1 file changed, 201 insertions(+), 570 deletions(-)
1 file changed, 198 insertions(+), 548 deletions(-)
But try reading the diff while looking for the cause of a bug. It's the
difference between reading 50, two-line changes, and reading a hunk that
replaces 100 lines with a different 100 lines, with empty/unrelated
lines randomly thrown in as context.
When the diff is readable, the pg_fatal() also stands out.
I don't know, maybe I'm doing something wrong or maybe I just am bad at
looking at diffs, but if I apply the patch you submitted on 8/3 and do
the git-diff above (output attached), it seems pretty incomprehensible
to me :-( I don't see 50 two-line changes (I certainly wouldn't be able
to identify the root cause of the bug based on that).
Change pg_fatal() to an assertion+comment;
Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
could add such protections against "impossible" stuff to a zillion other
places and the confusion likely outweighs the benefits.
Update the commit message and fix a few typos;
Thanks. I don't want to annoy you too much, but could you split the
patch into the "empty-data" fix and all the other changes (rearranging
functions etc.)? I'd rather not mix those in the same commit.
I don't know if that makes sense? The "empty-data" fix creates a new
function called DeflateCompressorInit(). My proposal was to add the new
function in the same place in the file as it used to be.
Got it. In that case I agree it's fine to do that in a single commit.
The patch also moves the pg_fatal() that's being removed. I don't think
it's going to look any cleaner to read a history involving the
pg_fatal() first being added, then moved, then removed. Anyway, I'll
wait while the community continues discussion about the pg_fatal().
I think the agreement was to replace the pg_fatal with an assert, and I
see your patch already does that.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
git-diff.txttext/plain; charset=UTF-8; name=git-diff.txtDownload
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_gzip.c
index 5ac21f091f0..3c9ac55c266 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -1,285 +1,118 @@
/*-------------------------------------------------------------------------
*
- * compress_io.c
- * Routines for archivers to write an uncompressed or compressed data
- * stream.
+ * compress_gzip.c
+ * Routines for archivers to read or write a gzip compressed data stream.
*
* Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
- * This file includes two APIs for dealing with compressed data. The first
- * provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
- * friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
- *
- * Compressor API
- * --------------
- *
- * The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed streams.
- *
- * Compressed stream API
- * ----------------------
- *
- * The compressed stream API is a wrapper around the C standard fopen() and
- * libz's gzopen() APIs. It allows you to use the same functions for
- * compressed and uncompressed streams. cfopen_read() first tries to open
- * the file with given name, and if it fails, it tries to open the same
- * file with the .gz suffix. cfopen_write() opens a file for writing, an
- * extra argument specifies if the file should be compressed, and adds the
- * .gz suffix to the filename if so. This allows you to easily handle both
- * compressed and uncompressed files.
- *
* IDENTIFICATION
- * src/bin/pg_dump/compress_io.c
+ * src/bin/pg_dump/compress_gzip.c
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
+#include <unistd.h>
-#include "compress_io.h"
+#include "compress_gzip.h"
#include "pg_backup_utils.h"
#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
-/*----------------------
- * Generic functions
- *----------------------
- */
-
-/*
- * Checks whether a compression algorithm is supported.
- *
- * On success returns NULL, otherwise returns a malloc'ed string which can be
- * used by the caller in an error message.
- */
-char *
-supports_compression(const pg_compress_specification compression_spec)
-{
- const pg_compress_algorithm algorithm = compression_spec.algorithm;
- bool supported = false;
-
- if (algorithm == PG_COMPRESSION_NONE)
- supported = true;
-#ifdef HAVE_LIBZ
- if (algorithm == PG_COMPRESSION_GZIP)
- supported = true;
-#endif
-
- if (!supported)
- return psprintf("this build does not support compression with %s",
- get_compress_algorithm_name(algorithm));
-
- return NULL;
-}
+#include "zlib.h"
/*----------------------
* Compressor API
*----------------------
*/
-
-/* typedef appears in compress_io.h */
-struct CompressorState
+typedef struct GzipCompressorState
{
- pg_compress_specification compression_spec;
- WriteFunc writeF;
-
-#ifdef HAVE_LIBZ
z_streamp zp;
- char *zlibOut;
- size_t zlibOutSize;
-#endif
-};
-/* Routines that support zlib compressed data I/O */
-#ifdef HAVE_LIBZ
-static void InitCompressorZlib(CompressorState *cs, int level);
-static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
- bool flush);
-static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
-#endif
-
-/* Routines that support uncompressed data I/O */
-static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
-static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen);
-
-/* Public interface routines */
-
-/* Allocate a new compressor */
-CompressorState *
-AllocateCompressor(const pg_compress_specification compression_spec,
- WriteFunc writeF)
-{
- CompressorState *cs;
-
-#ifndef HAVE_LIBZ
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
-
- cs = (CompressorState *) pg_malloc0(sizeof(CompressorState));
- cs->writeF = writeF;
- cs->compression_spec = compression_spec;
-
- /*
- * Perform compression algorithm specific initialization.
- */
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- InitCompressorZlib(cs, cs->compression_spec.level);
-#endif
-
- return cs;
-}
-
-/*
- * Read all compressed data from the input stream (via readF) and print it
- * out with ahwrite().
- */
-void
-ReadDataFromArchive(ArchiveHandle *AH,
- const pg_compress_specification compression_spec,
- ReadFunc readF)
-{
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- ReadDataFromArchiveNone(AH, readF);
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- ReadDataFromArchiveZlib(AH, readF);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- }
-}
-
-/*
- * Compress and write data to the output stream (via writeF).
- */
-void
-WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen)
-{
- switch (cs->compression_spec.algorithm)
- {
- case PG_COMPRESSION_GZIP:
-#ifdef HAVE_LIBZ
- WriteDataToArchiveZlib(AH, cs, data, dLen);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
- break;
- case PG_COMPRESSION_NONE:
- WriteDataToArchiveNone(AH, cs, data, dLen);
- break;
- case PG_COMPRESSION_LZ4:
- /* fallthrough */
- case PG_COMPRESSION_ZSTD:
- pg_fatal("invalid compression method");
- break;
- }
-}
-
-/*
- * Terminate compression library context and flush its buffers.
- */
-void
-EndCompressor(ArchiveHandle *AH, CompressorState *cs)
-{
-#ifdef HAVE_LIBZ
- if (cs->compression_spec.algorithm == PG_COMPRESSION_GZIP)
- EndCompressorZlib(AH, cs);
-#endif
- free(cs);
-}
+ void *outbuf;
+ size_t outsize;
+} GzipCompressorState;
-/* Private routines, specific to each compression method. */
-#ifdef HAVE_LIBZ
-/*
- * Functions for zlib compressed output.
- */
+/* Private routines that support gzip compressed data I/O */
+static void DeflateCompressorInit(CompressorState *cs);
+static void DeflateCompressorEnd(ArchiveHandle *AH, CompressorState *cs);
+static void DeflateCompressorCommon(ArchiveHandle *AH, CompressorState *cs,
+ bool flush);
+static void EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs);
+static void WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen);
+static void ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs);
static void
-InitCompressorZlib(CompressorState *cs, int level)
+DeflateCompressorInit(CompressorState *cs)
{
+ GzipCompressorState *gzipcs;
z_streamp zp;
- zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+ gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+ zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
zp->zalloc = Z_NULL;
zp->zfree = Z_NULL;
zp->opaque = Z_NULL;
/*
- * zlibOutSize is the buffer size we tell zlib it can output to. We
- * actually allocate one extra byte because some routines want to append a
- * trailing zero byte to the zlib output.
+ * outsize is the buffer size we tell zlib it can output to. We actually
+ * allocate one extra byte because some routines want to append a trailing
+ * zero byte to the zlib output.
*/
- cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
- cs->zlibOutSize = ZLIB_OUT_SIZE;
+ gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+ gzipcs->outsize = ZLIB_OUT_SIZE;
- if (deflateInit(zp, level) != Z_OK)
- pg_fatal("could not initialize compression library: %s",
- zp->msg);
+ /* -Z 0 uses the "None" compressor -- not zlib with no compression */
+ Assert(cs->compression_spec.level != 0);
+
+ if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+ pg_fatal("could not initialize compression library: %s", zp->msg);
/* Just be paranoid - maybe End is called after Start, with no Write */
- zp->next_out = (void *) cs->zlibOut;
- zp->avail_out = cs->zlibOutSize;
+ zp->next_out = gzipcs->outbuf;
+ zp->avail_out = gzipcs->outsize;
+
+ /* Keep track of gzipcs */
+ cs->private_data = gzipcs;
}
static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+DeflateCompressorEnd(ArchiveHandle *AH, CompressorState *cs)
{
- z_streamp zp = cs->zp;
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp;
+ zp = gzipcs->zp;
zp->next_in = NULL;
zp->avail_in = 0;
/* Flush any remaining data from zlib buffer */
- DeflateCompressorZlib(AH, cs, true);
+ DeflateCompressorCommon(AH, cs, true);
if (deflateEnd(zp) != Z_OK)
pg_fatal("could not close compression stream: %s", zp->msg);
- free(cs->zlibOut);
- free(cs->zp);
+ pg_free(gzipcs->outbuf);
+ pg_free(gzipcs->zp);
+ pg_free(gzipcs);
+ cs->private_data = NULL;
}
static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+DeflateCompressorCommon(ArchiveHandle *AH, CompressorState *cs, bool flush)
{
- z_streamp zp = cs->zp;
- char *out = cs->zlibOut;
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+ z_streamp zp = gzipcs->zp;
+ void *out = gzipcs->outbuf;
int res = Z_OK;
- while (cs->zp->avail_in != 0 || flush)
+ while (gzipcs->zp->avail_in != 0 || flush)
{
res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
if (res == Z_STREAM_ERROR)
pg_fatal("could not compress data: %s", zp->msg);
- if ((flush && (zp->avail_out < cs->zlibOutSize))
+ if ((flush && (zp->avail_out < gzipcs->outsize))
|| (zp->avail_out == 0)
|| (zp->avail_in != 0)
)
@@ -289,18 +122,18 @@ DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
* chunk is the EOF marker in the custom format. This should never
* happen but ...
*/
- if (zp->avail_out < cs->zlibOutSize)
+ if (zp->avail_out < gzipcs->outsize)
{
/*
* Any write function should do its own error checking but to
* make sure we do a check here as well ...
*/
- size_t len = cs->zlibOutSize - zp->avail_out;
+ size_t len = gzipcs->outsize - zp->avail_out;
- cs->writeF(AH, out, len);
+ cs->writeF(AH, (char *) out, len);
}
- zp->next_out = (void *) out;
- zp->avail_out = cs->zlibOutSize;
+ zp->next_out = out;
+ zp->avail_out = gzipcs->outsize;
}
if (res == Z_STREAM_END)
@@ -309,16 +142,26 @@ DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
}
static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
{
- cs->zp->next_in = (void *) unconstify(char *, data);
- cs->zp->avail_in = dLen;
- DeflateCompressorZlib(AH, cs, false);
+ /* If deflation was initialized, finalize it */
+ if (cs->private_data)
+ DeflateCompressorEnd(AH, cs);
}
static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+ const void *data, size_t dLen)
+{
+ GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+
+ gzipcs->zp->next_in = (void *) unconstify(void *, data);
+ gzipcs->zp->avail_in = dLen;
+ DeflateCompressorCommon(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
{
z_streamp zp;
char *out;
@@ -342,7 +185,7 @@ ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
zp->msg);
/* no minimal chunk size for zlib */
- while ((cnt = readF(AH, &buf, &buflen)))
+ while ((cnt = cs->readF(AH, &buf, &buflen)))
{
zp->next_in = (void *) buf;
zp->avail_in = cnt;
@@ -382,389 +225,196 @@ ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
free(out);
free(zp);
}
-#endif /* HAVE_LIBZ */
-
-/*
- * Functions for uncompressed output.
- */
-
-static void
-ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF)
-{
- size_t cnt;
- char *buf;
- size_t buflen;
-
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
-
- while ((cnt = readF(AH, &buf, &buflen)))
+/* Public routines that support gzip compressed data I/O */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
{
- ahwrite(buf, 1, cnt, AH);
- }
+ cs->readData = ReadDataFromArchiveGzip;
+ cs->writeData = WriteDataToArchiveGzip;
+ cs->end = EndCompressorGzip;
- free(buf);
-}
+ cs->compression_spec = compression_spec;
-static void
-WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
- const char *data, size_t dLen)
-{
- cs->writeF(AH, data, dLen);
+ /*
+ * If the caller has defined a write function, prepare the necessary
+ * state. Note that if the data is empty, End may be called immediately
+ * after Init, without ever calling Write.
+ */
+ if (cs->writeF)
+ DeflateCompressorInit(cs);
}
/*----------------------
- * Compressed stream API
+ * Compress File API
*----------------------
*/
-/*
- * cfp represents an open stream, wrapping the underlying FILE or gzFile
- * pointer. This is opaque to the callers.
- */
-struct cfp
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
{
- FILE *uncompressedfp;
-#ifdef HAVE_LIBZ
- gzFile compressedfp;
-#endif
-};
-
-#ifdef HAVE_LIBZ
-static int hasSuffix(const char *filename, const char *suffix);
-#endif
+ gzFile gzfp = (gzFile) CFH->private_data;
+ size_t ret;
-/* free() without changing errno; useful in several places below */
-static void
-free_keep_errno(void *p)
-{
- int save_errno = errno;
-
- free(p);
- errno = save_errno;
-}
-
-/*
- * Open a file for reading. 'path' is the file to open, and 'mode' should
- * be either "r" or "rb".
- *
- * If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
- * doesn't already have it) and try again. So if you pass "foo" as 'path',
- * this will open either "foo" or "foo.gz".
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_read(const char *path, const char *mode)
+ ret = gzread(gzfp, ptr, size);
+ if (ret != size && !gzeof(gzfp))
{
- cfp *fp;
-
- pg_compress_specification compression_spec = {0};
+ int errnum;
+ const char *errmsg = gzerror(gzfp, &errnum);
-#ifdef HAVE_LIBZ
- if (hasSuffix(path, ".gz"))
- {
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(path, mode, compression_spec);
+ pg_fatal("could not read from input file: %s",
+ errnum == Z_ERRNO ? strerror(errno) : errmsg);
}
- else
-#endif
- {
- compression_spec.algorithm = PG_COMPRESSION_NONE;
- fp = cfopen(path, mode, compression_spec);
-#ifdef HAVE_LIBZ
- if (fp == NULL)
- {
- char *fname;
- fname = psprintf("%s.gz", path);
- compression_spec.algorithm = PG_COMPRESSION_GZIP;
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
- }
-#endif
- }
- return fp;
+ return ret;
}
-/*
- * Open a file for writing. 'path' indicates the path name, and 'mode' must
- * be a filemode as accepted by fopen() and gzopen() that indicates writing
- * ("w", "wb", "a", or "ab").
- *
- * If 'compression_spec.algorithm' is GZIP, a gzip compressed stream is opened,
- * and 'compression_spec.level' used. The ".gz" suffix is automatically added to
- * 'path' in that case.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen_write(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static size_t
+Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- cfp *fp;
-
- if (compression_spec.algorithm == PG_COMPRESSION_NONE)
- fp = cfopen(path, mode, compression_spec);
- else
- {
-#ifdef HAVE_LIBZ
- char *fname;
+ gzFile gzfp = (gzFile) CFH->private_data;
- fname = psprintf("%s.gz", path);
- fp = cfopen(fname, mode, compression_spec);
- free_keep_errno(fname);
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
- fp = NULL; /* keep compiler quiet */
-#endif
+ return gzwrite(gzfp, ptr, size);
}
- return fp;
-}
-
-/*
- * This is the workhorse for cfopen() or cfdopen(). It opens file 'path' or
- * associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'. The
- * descriptor is not dup'ed and it is the caller's responsibility to do so.
- * The caller must verify that the 'compress_algorithm' is supported by the
- * current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-static cfp *
-cfopen_internal(const char *path, int fd, const char *mode,
- pg_compress_specification compression_spec)
-{
- cfp *fp = pg_malloc0(sizeof(cfp));
- if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
- {
-#ifdef HAVE_LIBZ
- if (compression_spec.level != Z_DEFAULT_COMPRESSION)
+static int
+Gzip_getc(CompressFileHandle *CFH)
{
- /* user has specified a compression level, so tell zlib to use it */
- char mode_compression[32];
+ gzFile gzfp = (gzFile) CFH->private_data;
+ int ret;
- snprintf(mode_compression, sizeof(mode_compression), "%s%d",
- mode, compression_spec.level);
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode_compression);
- else
- fp->compressedfp = gzopen(path, mode_compression);
- }
- else
+ errno = 0;
+ ret = gzgetc(gzfp);
+ if (ret == EOF)
{
- /* don't specify a level, just use the zlib default */
- if (fd >= 0)
- fp->compressedfp = gzdopen(fd, mode);
+ if (!gzeof(gzfp))
+ pg_fatal("could not read from input file: %s", strerror(errno));
else
- fp->compressedfp = gzopen(path, mode);
+ pg_fatal("could not read from input file: end of file");
}
- if (fp->compressedfp == NULL)
- {
- free_keep_errno(fp);
- fp = NULL;
- }
-#else
- pg_fatal("this build does not support compression with %s", "gzip");
-#endif
+ return ret;
}
- else
- {
- if (fd >= 0)
- fp->uncompressedfp = fdopen(fd, mode);
- else
- fp->uncompressedfp = fopen(path, mode);
- if (fp->uncompressedfp == NULL)
+static char *
+Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
{
- free_keep_errno(fp);
- fp = NULL;
- }
- }
+ gzFile gzfp = (gzFile) CFH->private_data;
- return fp;
+ return gzgets(gzfp, ptr, size);
}
-/*
- * Opens file 'path' in 'mode' and compression as defined in
- * compression_spec. The caller must verify that the compression
- * is supported by the current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfopen(const char *path, const char *mode,
- const pg_compress_specification compression_spec)
+static int
+Gzip_close(CompressFileHandle *CFH)
{
- return cfopen_internal(path, -1, mode, compression_spec);
-}
+ gzFile gzfp = (gzFile) CFH->private_data;
-/*
- * Associates a stream 'fd', if 'fd' is a valid descriptor, in 'mode'
- * and compression as defined in compression_spec. The caller must
- * verify that the compression is supported by the current build.
- *
- * On failure, return NULL with an error code in errno.
- */
-cfp *
-cfdopen(int fd, const char *mode,
- const pg_compress_specification compression_spec)
-{
- return cfopen_internal(NULL, fd, mode, compression_spec);
+ CFH->private_data = NULL;
+
+ return gzclose(gzfp);
}
-int
-cfread(void *ptr, int size, cfp *fp)
+static int
+Gzip_eof(CompressFileHandle *CFH)
{
- int ret;
+ gzFile gzfp = (gzFile) CFH->private_data;
- if (size == 0)
- return 0;
+ return gzeof(gzfp);
+}
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzread(fp->compressedfp, ptr, size);
- if (ret != size && !gzeof(fp->compressedfp))
+static const char *
+Gzip_get_error(CompressFileHandle *CFH)
{
+ gzFile gzfp = (gzFile) CFH->private_data;
+ const char *errmsg;
int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
- pg_fatal("could not read from input file: %s",
- errnum == Z_ERRNO ? strerror(errno) : errmsg);
- }
- }
- else
-#endif
- {
- ret = fread(ptr, 1, size, fp->uncompressedfp);
- if (ret != size && !feof(fp->uncompressedfp))
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
- return ret;
-}
+ errmsg = gzerror(gzfp, &errnum);
+ if (errnum == Z_ERRNO)
+ errmsg = strerror(errno);
-int
-cfwrite(const void *ptr, int size, cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzwrite(fp->compressedfp, ptr, size);
- else
-#endif
- return fwrite(ptr, 1, size, fp->uncompressedfp);
+ return errmsg;
}
-int
-cfgetc(cfp *fp)
+static int
+Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
- int ret;
+ gzFile gzfp;
+ char mode_compression[32];
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- ret = gzgetc(fp->compressedfp);
- if (ret == EOF)
+ if (CFH->compression_spec.level != Z_DEFAULT_COMPRESSION)
{
- if (!gzeof(fp->compressedfp))
- pg_fatal("could not read from input file: %s", strerror(errno));
- else
- pg_fatal("could not read from input file: end of file");
- }
+ /*
+ * user has specified a compression level, so tell zlib to use it
+ */
+ snprintf(mode_compression, sizeof(mode_compression), "%s%d",
+ mode, CFH->compression_spec.level);
}
else
-#endif
- {
- ret = fgetc(fp->uncompressedfp);
- if (ret == EOF)
- READ_ERROR_EXIT(fp->uncompressedfp);
- }
-
- return ret;
-}
+ strcpy(mode_compression, mode);
-char *
-cfgets(cfp *fp, char *buf, int len)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzgets(fp->compressedfp, buf, len);
+ if (fd >= 0)
+ gzfp = gzdopen(dup(fd), mode_compression);
else
-#endif
- return fgets(buf, len, fp->uncompressedfp);
-}
+ gzfp = gzopen(path, mode_compression);
-int
-cfclose(cfp *fp)
-{
- int result;
+ if (gzfp == NULL)
+ return 1;
- if (fp == NULL)
- {
- errno = EBADF;
- return EOF;
- }
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- result = gzclose(fp->compressedfp);
- fp->compressedfp = NULL;
- }
- else
-#endif
- {
- result = fclose(fp->uncompressedfp);
- fp->uncompressedfp = NULL;
- }
- free_keep_errno(fp);
+ CFH->private_data = gzfp;
- return result;
+ return 0;
}
-int
-cfeof(cfp *fp)
+static int
+Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- return gzeof(fp->compressedfp);
- else
-#endif
- return feof(fp->uncompressedfp);
-}
+ char *fname;
+ int ret;
+ int save_errno;
-const char *
-get_cfp_error(cfp *fp)
-{
-#ifdef HAVE_LIBZ
- if (fp->compressedfp)
- {
- int errnum;
- const char *errmsg = gzerror(fp->compressedfp, &errnum);
+ fname = psprintf("%s.gz", path);
+ ret = CFH->open_func(fname, -1, mode, CFH);
- if (errnum != Z_ERRNO)
- return errmsg;
- }
-#endif
- return strerror(errno);
+ save_errno = errno;
+ pg_free(fname);
+ errno = save_errno;
+
+ return ret;
}
-#ifdef HAVE_LIBZ
-static int
-hasSuffix(const char *filename, const char *suffix)
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
{
- int filenamelen = strlen(filename);
- int suffixlen = strlen(suffix);
+ CFH->open_func = Gzip_open;
+ CFH->open_write_func = Gzip_open_write;
+ CFH->read_func = Gzip_read;
+ CFH->write_func = Gzip_write;
+ CFH->gets_func = Gzip_gets;
+ CFH->getc_func = Gzip_getc;
+ CFH->close_func = Gzip_close;
+ CFH->eof_func = Gzip_eof;
+ CFH->get_error_func = Gzip_get_error;
- if (filenamelen < suffixlen)
- return 0;
+ CFH->compression_spec = compression_spec;
- return memcmp(&filename[filenamelen - suffixlen],
- suffix,
- suffixlen) == 0;
+ CFH->private_data = NULL;
+}
+#else /* HAVE_LIBZ */
+void
+InitCompressorGzip(CompressorState *cs,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
}
-#endif
+void
+InitCompressFileHandleGzip(CompressFileHandle *CFH,
+ const pg_compress_specification compression_spec)
+{
+ pg_fatal("this build does not support compression with %s", "gzip");
+}
+#endif /* HAVE_LIBZ */
On Thu, Mar 16, 2023 at 11:30:50PM +0100, Tomas Vondra wrote:
On 3/16/23 01:20, Justin Pryzby wrote:
But try reading the diff while looking for the cause of a bug. It's the
difference between reading 50, two-line changes, and reading a hunk that
replaces 100 lines with a different 100 lines, with empty/unrelated
lines randomly thrown in as context.
I don't know, maybe I'm doing something wrong or maybe I just am bad at
looking at diffs, but if I apply the patch you submitted on 8/3 and do
the git-diff above (output attached), it seems pretty incomprehensible
to me :-( I don't see 50 two-line changes (I certainly wouldn't be able
to identify the root cause of the bug based on that).
It's true that most of the diff is still incomprehensible...
But look at the part relevant to the "empty-data" bug:
[... incomprehensible changes elided ...]
 static void
-InitCompressorZlib(CompressorState *cs, int level)
+DeflateCompressorInit(CompressorState *cs)
 {
+	GzipCompressorState *gzipcs;
 	z_streamp	zp;
 
-	zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
+	gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState));
+	zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
 	zp->zalloc = Z_NULL;
 	zp->zfree = Z_NULL;
 	zp->opaque = Z_NULL;
 
 	/*
-	 * zlibOutSize is the buffer size we tell zlib it can output to. We
-	 * actually allocate one extra byte because some routines want to append a
-	 * trailing zero byte to the zlib output.
+	 * outsize is the buffer size we tell zlib it can output to. We actually
+	 * allocate one extra byte because some routines want to append a trailing
+	 * zero byte to the zlib output.
 	 */
-	cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
-	cs->zlibOutSize = ZLIB_OUT_SIZE;
+	gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
+	gzipcs->outsize = ZLIB_OUT_SIZE;
 
-	if (deflateInit(zp, level) != Z_OK)
-		pg_fatal("could not initialize compression library: %s",
-				 zp->msg);
+	/* -Z 0 uses the "None" compressor -- not zlib with no compression */
+	Assert(cs->compression_spec.level != 0);
+
+	if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
+		pg_fatal("could not initialize compression library: %s", zp->msg);
 
 	/* Just be paranoid - maybe End is called after Start, with no Write */
-	zp->next_out = (void *) cs->zlibOut;
-	zp->avail_out = cs->zlibOutSize;
+	zp->next_out = gzipcs->outbuf;
+	zp->avail_out = gzipcs->outsize;
+
+	/* Keep track of gzipcs */
+	cs->private_data = gzipcs;
 }
 
 static void
-EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
+DeflateCompressorEnd(ArchiveHandle *AH, CompressorState *cs)
 {
-	z_streamp	zp = cs->zp;
+	GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+	z_streamp	zp;
 
+	zp = gzipcs->zp;
 	zp->next_in = NULL;
 	zp->avail_in = 0;
 
 	/* Flush any remaining data from zlib buffer */
-	DeflateCompressorZlib(AH, cs, true);
+	DeflateCompressorCommon(AH, cs, true);
 
 	if (deflateEnd(zp) != Z_OK)
 		pg_fatal("could not close compression stream: %s", zp->msg);
 
-	free(cs->zlibOut);
-	free(cs->zp);
+	pg_free(gzipcs->outbuf);
+	pg_free(gzipcs->zp);
+	pg_free(gzipcs);
+	cs->private_data = NULL;
 }
 
 static void
-DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
+DeflateCompressorCommon(ArchiveHandle *AH, CompressorState *cs, bool flush)
 {
-	z_streamp	zp = cs->zp;
-	char	   *out = cs->zlibOut;
+	GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+	z_streamp	zp = gzipcs->zp;
+	void	   *out = gzipcs->outbuf;
 	int			res = Z_OK;
 
-	while (cs->zp->avail_in != 0 || flush)
+	while (gzipcs->zp->avail_in != 0 || flush)
 	{
 		res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
 		if (res == Z_STREAM_ERROR)
 			pg_fatal("could not compress data: %s", zp->msg);
-		if ((flush && (zp->avail_out < cs->zlibOutSize))
+		if ((flush && (zp->avail_out < gzipcs->outsize))
 			|| (zp->avail_out == 0)
 			|| (zp->avail_in != 0)
 			)
@@ -289,18 +122,18 @@ DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
 			 * chunk is the EOF marker in the custom format. This should never
 			 * happen but ...
 			 */
-			if (zp->avail_out < cs->zlibOutSize)
+			if (zp->avail_out < gzipcs->outsize)
 			{
 				/*
 				 * Any write function should do its own error checking but to
 				 * make sure we do a check here as well ...
 				 */
-				size_t		len = cs->zlibOutSize - zp->avail_out;
+				size_t		len = gzipcs->outsize - zp->avail_out;
 
-				cs->writeF(AH, out, len);
+				cs->writeF(AH, (char *) out, len);
 			}
-			zp->next_out = (void *) out;
-			zp->avail_out = cs->zlibOutSize;
+			zp->next_out = out;
+			zp->avail_out = gzipcs->outsize;
 		}
 
 		if (res == Z_STREAM_END)
@@ -309,16 +142,26 @@ DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
 }
 
 static void
-WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
-					   const char *data, size_t dLen)
+EndCompressorGzip(ArchiveHandle *AH, CompressorState *cs)
 {
-	cs->zp->next_in = (void *) unconstify(char *, data);
-	cs->zp->avail_in = dLen;
-	DeflateCompressorZlib(AH, cs, false);
+	/* If deflation was initialized, finalize it */
+	if (cs->private_data)
+		DeflateCompressorEnd(AH, cs);
 }
 
 static void
-ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
+WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
+					   const void *data, size_t dLen)
+{
+	GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
+
+	gzipcs->zp->next_in = (void *) unconstify(void *, data);
+	gzipcs->zp->avail_in = dLen;
+	DeflateCompressorCommon(AH, cs, false);
+}
+
+static void
+ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
 {
 	z_streamp	zp;
 	char	   *out;
@@ -342,7 +185,7 @@ ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
 				 zp->msg);
 
 	/* no minimal chunk size for zlib */
-	while ((cnt = readF(AH, &buf, &buflen)))
+	while ((cnt = cs->readF(AH, &buf, &buflen)))
 	{
 		zp->next_in = (void *) buf;
 		zp->avail_in = cnt;
@@ -382,389 +225,196 @@ ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF)
 	free(out);
 	free(zp);
 }
[... more incomprehensible changes elided ...]
On 3/16/23 23:58, Justin Pryzby wrote:
On Thu, Mar 16, 2023 at 11:30:50PM +0100, Tomas Vondra wrote:
On 3/16/23 01:20, Justin Pryzby wrote:
But try reading the diff while looking for the cause of a bug. It's the
difference between reading 50, two-line changes, and reading a hunk that
replaces 100 lines with a different 100 lines, with empty/unrelated
lines randomly thrown in as context.

I don't know, maybe I'm doing something wrong or maybe I just am bad at
looking at diffs, but if I apply the patch you submitted on 8/3 and do
the git-diff above (output attached), it seems pretty incomprehensible
to me :-( I don't see 50 two-line changes (I certainly wouldn't be able
to identify the root cause of the bug based on that).

It's true that most of the diff is still incomprehensible...
But look at the part relevant to the "empty-data" bug:
Well, yeah. If you know where to look, and if you squint just the right
way, then you can see any bug. I don't think I'd be able to spot the bug
in the diff unless I knew in advance what the bug is.
That being said, I don't object to moving the function etc. Unless there
are alternative ideas how to fix the empty-data issue, I'll get this
committed after playing with it a bit more.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Thursday, March 16th, 2023 at 10:20 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 3/16/23 18:04, gkokolatos@pm.me wrote:
------- Original Message -------
On Tuesday, March 14th, 2023 at 4:32 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

On 3/14/23 16:18, gkokolatos@pm.me wrote:
... Would you mind me trying to come up with a patch to address your points?
That'd be great, thanks. Please keep it split into smaller patches - two
might work, with one patch for "cosmetic" changes and the other tweaking
the API error-handling stuff.

Please find attached a set for it. I will admit that the splitting of the
series might not be ideal or quite what you requested. It is split into what
seemed like logical units. Please advise on how a better split could look.

0001 unifies types and return values across the API
0002 addresses the constant definitions
0003 is your previous 0004, adding comments

Thanks. I think the split seems reasonable - the goal was to not mix
different changes, and from that POV it works.

I'm not sure I understand the Gzip_read/Gzip_write changes in 0001. I
mean, gzread/gzwrite returns int, so how does renaming the size_t
variable solve the issue of negative values for errors? I mean, this

-	size_t		ret;
+	size_t		gzret;

-	ret = gzread(gzfp, ptr, size);
+	gzret = gzread(gzfp, ptr, size);

means we still lost the information gzread() returned a negative value,
no? We'll still probably trigger an error, but it's a bit weird.
You are obviously correct. My bad, I misread the return type of gzread().
Please find an amended version attached.
Unless I'm missing something, if gzread() ever returns -1 or some other
negative error value, we'll cast it to size_t, while condition will
evaluate to "true" and we'll happily chew on some random chunk of data.

So the confusion is (at least partially) a preexisting issue ...
For gzwrite() it seems to be fine, because that only returns 0 on error.
OTOH it's defined to take 'int size' but then we happily pass size_t
values to it.

As I wrote earlier, this apparently assumes we never need to deal with
buffers larger than int, and I don't think we have the ambition to relax
that (I'm not sure it's even needed / possible).
Agreed.
I see the read/write functions are now defined as int, but we only ever
return 0/1 from them, and then interpret that as bool. Why not define
it like that? I don't think we need to adhere to the custom that
everything returns "int". This is an internal API. Or if we want to
stick to int, I'd define meaningful "nice" constants for 0/1.
The return types are now booleans and the callers have been made aware.
0002 seems fine to me. I see you've ditched the idea of having two
separate buffers, and replaced them with DEFAULT_IO_BUFFER_SIZE. Fine
with me, although I wonder if this might have negative impact on
performance or something (but I doubt that).
I doubt that too. Thank you.
0003 seems fine too.
Thank you.
As far as the error handling is concerned, you had said upthread:
I think the right approach is to handle all library errors and not just
let them through. So Gzip_write() needs to check the return value, and
either call pg_fatal() or translate it to an error defined by the API.

While working on it, I thought it would be clearer and more consistent
for the pg_fatal() to be called by the caller of the individual functions.
Each individual function can keep track of the specifics of the error
internally. Then the caller upon detecting that there was an error by
checking the return value, can call pg_fatal() with a uniform error
message and then add the specifics by calling the get_error_func().

I agree it's cleaner the way you did it.
I was thinking that with each compression function handling error
internally, the callers would not need to do that. But I haven't
realized there's logic to detect ENOSPC and so on, and we'd need to
duplicate that in every compression func.
If you agree, I can prepare a patch to improve on the error handling
aspect of the API as a separate thread, since here we are trying to
focus on correctness.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v3-0003-Improve-compress_lz4-documentation.patch (text/x-patch)
From eeac82b647dc4021e1dcf22d8cc59840fbde8847 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 17 Mar 2023 15:29:05 +0000
Subject: [PATCH v3 3/3] Improve compress_lz4 documentation.
Author: Tomas Vondra
---
src/bin/pg_dump/compress_lz4.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 2f3e552f51..fc2f4e116d 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -185,12 +185,15 @@ LZ4File_get_error(CompressFileHandle *CFH)
}
/*
- * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls (either
+ * compression or decompression).
*
- * It creates the necessary contexts for the operations. When compressing,
- * it additionally writes the LZ4 header in the output stream.
+ * It creates the necessary contexts for the operations. When compressing data
+ * (indicated by compressing=true), it additionally writes the LZ4 header in the
+ * output stream.
*
- * Returns true on success and false on error.
+ * Returns true on success. In case of a failure returns false, and stores the
+ * error code in fs->errcode.
*/
static bool
LZ4File_init(LZ4File *fs, int size, bool compressing)
@@ -203,9 +206,15 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
fs->compressing = compressing;
fs->inited = true;
+ /* When compressing, write LZ4 header to the output stream. */
if (fs->compressing)
{
fs->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &fs->prefs);
+
+ /*
+ * LZ4F_compressBegin requires a buffer that is greater or equal to
+ * LZ4F_HEADER_SIZE_MAX. Verify that the requirement is met.
+ */
if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
fs->buflen = LZ4F_HEADER_SIZE_MAX;
@@ -255,9 +264,12 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
/*
* Read already decompressed content from the overflow buffer into 'ptr' up to
* 'size' bytes, if available. If the eol_flag is set, then stop at the first
- * occurrence of the new line char prior to 'size' bytes.
+ * occurrence of the newline char prior to 'size' bytes.
*
* Any unread content in the overflow buffer is moved to the beginning.
+ *
+ * Returns the number of bytes read from the overflow buffer (and copied into
+ * the 'ptr' buffer), or 0 if the overflow buffer is empty.
*/
static int
LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
@@ -297,6 +309,9 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
* at an overflow buffer within LZ4File. Of course, when the function is
* called, it will first try to consume any decompressed content already
* present in the overflow buffer, before decompressing new content.
+ *
+ * Returns the number of bytes of decompressed data copied into the ptr
+ * buffer, or -1 in case of error.
*/
static int
LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
--
2.34.1
v3-0002-Clean-up-constants-in-pg_dump-s-compression-API.patch (text/x-patch)
From 4d8643e3e0081973c001e56cc9a5ecd4ff0083b4 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 17 Mar 2023 15:06:22 +0000
Subject: [PATCH v3 2/3] Clean up constants in pg_dump's compression API.
Prior to the introduction of the API, pg_dump would use the ZLIB_[IN|OUT]_SIZE
constants to handle buffer sizes throughout. This behaviour is confusing after
the introduction of the API. Amend it by introducing a DEFAULT_IO_BUFFER_SIZE
constant to use when appropriate while giving the opportunity to specific
compression implementations to use their own.
With the help and guidance of Tomas Vondra.
---
src/bin/pg_dump/compress_gzip.c | 22 +++++++++++-----------
src/bin/pg_dump/compress_io.h | 5 ++---
src/bin/pg_dump/compress_lz4.c | 11 ++++-------
src/bin/pg_dump/compress_none.c | 4 ++--
src/bin/pg_dump/pg_backup_directory.c | 4 ++--
5 files changed, 21 insertions(+), 25 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 79ed2cf371..61d15907eb 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -120,8 +120,8 @@ WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
* actually allocate one extra byte because some routines want to
* append a trailing zero byte to the zlib output.
*/
- gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
- gzipcs->outsize = ZLIB_OUT_SIZE;
+ gzipcs->outsize = DEFAULT_IO_BUFFER_SIZE;
+ gzipcs->outbuf = pg_malloc(gzipcs->outsize + 1);
/*
* A level of zero simply copies the input one block at the time. This
@@ -158,10 +158,10 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
zp->zfree = Z_NULL;
zp->opaque = Z_NULL;
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
+ buf = pg_malloc(buflen);
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
+ out = pg_malloc(DEFAULT_IO_BUFFER_SIZE + 1);
if (inflateInit(zp) != Z_OK)
pg_fatal("could not initialize compression library: %s",
@@ -176,14 +176,14 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
while (zp->avail_in > 0)
{
zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ zp->avail_out = DEFAULT_IO_BUFFER_SIZE;
res = inflate(zp, 0);
if (res != Z_OK && res != Z_STREAM_END)
pg_fatal("could not uncompress data: %s", zp->msg);
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ out[DEFAULT_IO_BUFFER_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, DEFAULT_IO_BUFFER_SIZE - zp->avail_out, AH);
}
}
@@ -192,13 +192,13 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
while (res != Z_STREAM_END)
{
zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ zp->avail_out = DEFAULT_IO_BUFFER_SIZE;
res = inflate(zp, 0);
if (res != Z_OK && res != Z_STREAM_END)
pg_fatal("could not uncompress data: %s", zp->msg);
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ out[DEFAULT_IO_BUFFER_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, DEFAULT_IO_BUFFER_SIZE - zp->avail_out, AH);
}
if (inflateEnd(zp) != Z_OK)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 7c2f9b5668..fd8752db0d 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -17,9 +17,8 @@
#include "pg_backup_archiver.h"
-/* Initial buffer sizes used in zlib compression. */
-#define ZLIB_OUT_SIZE 4096
-#define ZLIB_IN_SIZE 4096
+/* Default size used for IO buffers */
+#define DEFAULT_IO_BUFFER_SIZE 4096
extern char *supports_compression(const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 278f262162..2f3e552f51 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -20,9 +20,6 @@
#include <lz4.h>
#include <lz4frame.h>
-#define LZ4_OUT_SIZE (4 * 1024)
-#define LZ4_IN_SIZE (16 * 1024)
-
/*
* LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
* Redefine it for installations with a lesser version.
@@ -57,7 +54,7 @@ ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
size_t buflen;
size_t cnt;
- buflen = LZ4_IN_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
buf = pg_malloc(buflen);
decbuf = pg_malloc(buflen);
@@ -208,7 +205,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (fs->compressing)
{
- fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ fs->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &fs->prefs);
if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
fs->buflen = LZ4F_HEADER_SIZE_MAX;
@@ -244,7 +241,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
return false;
}
- fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buflen = Max(size, DEFAULT_IO_BUFFER_SIZE);
fs->buffer = pg_malloc(fs->buflen);
fs->overflowalloclen = fs->buflen;
@@ -423,7 +420,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
while (remaining > 0)
{
- int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+ int chunk = Min(remaining, DEFAULT_IO_BUFFER_SIZE);
remaining -= chunk;
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
index 18f3514d11..736a7957bc 100644
--- a/src/bin/pg_dump/compress_none.c
+++ b/src/bin/pg_dump/compress_none.c
@@ -33,8 +33,8 @@ ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
char *buf;
size_t buflen;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
+ buf = pg_malloc(buflen);
while ((cnt = cs->readF(AH, &buf, &buflen)))
{
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 525dbf9bf0..abaaa3b10e 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -394,8 +394,8 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
+ buf = pg_malloc(buflen);
while (CFH->read_func(buf, buflen, &cnt, CFH) && cnt > 0)
{
--
2.34.1
v3-0001-Improve-type-handling-in-pg_dump-s-compress-file-.patch (text/x-patch)
From a174cdff4ec8aad59f5bcc7e8d52218a14fe56fc Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 17 Mar 2023 14:45:58 +0000
Subject: [PATCH v3 1/3] Improve type handling in pg_dump's compress file API
The function LZ4File_gets() was storing the return value of
LZ4File_read_internal in a variable of the wrong type, disregarding signedness.
As a consequence, LZ4File_gets() would not take the error path when it should.
In an attempt to improve readability and code uniformity, change the return type
of the API's read and write functions to bool from size_t. Along with it,
homogenize the return values of the relevant functions of this API.
This change, helps the specific compression implementations handle the return
types of their corresponding libraries internally and not expose them to the
API caller.
In passing save the appropriate errno in LZ4File_open_write in case that the
caller is not using the API's get_error_func.
With the help and guidance of Tomas Vondra.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
---
src/bin/pg_dump/compress_gzip.c | 37 ++++++------
src/bin/pg_dump/compress_io.c | 8 +--
src/bin/pg_dump/compress_io.h | 38 +++++++++---
src/bin/pg_dump/compress_lz4.c | 85 +++++++++++++++------------
src/bin/pg_dump/compress_none.c | 41 ++++++++-----
src/bin/pg_dump/pg_backup_archiver.c | 18 ++----
src/bin/pg_dump/pg_backup_directory.c | 36 ++++++------
7 files changed, 148 insertions(+), 115 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 0af65afeb4..79ed2cf371 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -233,14 +233,14 @@ InitCompressorGzip(CompressorState *cs,
*----------------------
*/
-static size_t
-Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+static bool
+Gzip_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
- size_t ret;
+ int gzret;
- ret = gzread(gzfp, ptr, size);
- if (ret != size && !gzeof(gzfp))
+ gzret = gzread(gzfp, ptr, size);
+ if (gzret <= 0 && !gzeof(gzfp))
{
int errnum;
const char *errmsg = gzerror(gzfp, &errnum);
@@ -249,15 +249,18 @@ Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
errnum == Z_ERRNO ? strerror(errno) : errmsg);
}
- return ret;
+ if (rsize)
+ *rsize = (size_t) gzret;
+
+ return true;
}
-static size_t
+static bool
Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
- return gzwrite(gzfp, ptr, size);
+ return gzwrite(gzfp, ptr, size) > 0;
}
static int
@@ -287,22 +290,22 @@ Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
return gzgets(gzfp, ptr, size);
}
-static int
+static bool
Gzip_close(CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
CFH->private_data = NULL;
- return gzclose(gzfp);
+ return gzclose(gzfp) == Z_OK;
}
-static int
+static bool
Gzip_eof(CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
- return gzeof(gzfp);
+ return gzeof(gzfp) == 1;
}
static const char *
@@ -319,7 +322,7 @@ Gzip_get_error(CompressFileHandle *CFH)
return errmsg;
}
-static int
+static bool
Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
gzFile gzfp;
@@ -342,19 +345,19 @@ Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
gzfp = gzopen(path, mode_compression);
if (gzfp == NULL)
- return 1;
+ return false;
CFH->private_data = gzfp;
- return 0;
+ return true;
}
-static int
+static bool
Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
{
char *fname;
- int ret;
int save_errno;
+ bool ret;
fname = psprintf("%s.gz", path);
ret = CFH->open_func(fname, -1, mode, CFH);
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ce06f1eac9..8f32cb4385 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -262,7 +262,7 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
}
CFH = InitCompressFileHandle(compression_spec);
- if (CFH->open_func(fname, -1, mode, CFH))
+ if (!CFH->open_func(fname, -1, mode, CFH))
{
free_keep_errno(CFH);
CFH = NULL;
@@ -275,12 +275,12 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
/*
* Close an open file handle and release its memory.
*
- * On failure, returns an error value and sets errno appropriately.
+ * On failure, returns false and sets errno appropriately.
*/
-int
+bool
EndCompressFileHandle(CompressFileHandle *CFH)
{
- int ret = 0;
+ bool ret = 0;
if (CFH->private_data)
ret = CFH->close_func(CFH);
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index cdb15951ea..7c2f9b5668 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -100,8 +100,10 @@ struct CompressFileHandle
* Pass either 'path' or 'fd' depending on whether a file path or a file
* descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
* 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ *
+ * Returns true on success and false on error.
*/
- int (*open_func) (const char *path, int fd, const char *mode,
+ bool (*open_func) (const char *path, int fd, const char *mode,
CompressFileHandle *CFH);
/*
@@ -109,19 +111,27 @@ struct CompressFileHandle
*
* 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
* initialized CompressFileHandle.
+ *
+ * Returns true on success and false on error.
*/
- int (*open_write_func) (const char *path, const char *mode,
+ bool (*open_write_func) (const char *path, const char *mode,
CompressFileHandle *CFH);
/*
* Read 'size' bytes of data from the file and store them into 'ptr'.
+ * Optionally it will store the number of bytes read in 'rsize'.
+ *
+ * Returns true on success and throws an internal error otherwise.
*/
- size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ bool (*read_func) (void *ptr, size_t size, size_t *rsize,
+ CompressFileHandle *CFH);
/*
* Write 'size' bytes of data into the file from 'ptr'.
+ *
+ * Returns true on success and false on error.
*/
- size_t (*write_func) (const void *ptr, size_t size,
+ bool (*write_func) (const void *ptr, size_t size,
struct CompressFileHandle *CFH);
/*
@@ -130,28 +140,38 @@ struct CompressFileHandle
*
* Stop if an EOF or a newline is found first. 's' is always null
* terminated and contains the newline if it was found.
+ *
+ * Returns 's' on success, and NULL on error or when end of file occurs
+ * while no characters have been read.
*/
char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
/*
* Read the next character from the compress file handle as 'unsigned
* char' cast into 'int'.
+ *
+ * Returns the character read on success and throws an internal error
+ * otherwise. It treats EOF as error.
*/
int (*getc_func) (CompressFileHandle *CFH);
/*
* Test if EOF is reached in the compress file handle.
+ *
+ * Returns true if it is reached.
*/
- int (*eof_func) (CompressFileHandle *CFH);
+ bool (*eof_func) (CompressFileHandle *CFH);
/*
* Close an open file handle.
+ *
+ * Returns true on success and false on error.
*/
- int (*close_func) (CompressFileHandle *CFH);
+ bool (*close_func) (CompressFileHandle *CFH);
/*
- * Get a pointer to a string that describes an error that occurred during a
- * compress file handle operation.
+ * Get a pointer to a string that describes an error that occurred during
+ * a compress file handle operation.
*/
const char *(*get_error_func) (CompressFileHandle *CFH);
@@ -178,5 +198,5 @@ extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specificatio
*/
extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
const char *mode);
-extern int EndCompressFileHandle(CompressFileHandle *CFH);
+extern bool EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 63e794cdc6..278f262162 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -165,7 +165,7 @@ typedef struct LZ4File
* decompressed output in the overflow buffer and the end of the backing file
* is reached.
*/
-static int
+static bool
LZ4File_eof(CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
@@ -192,14 +192,16 @@ LZ4File_get_error(CompressFileHandle *CFH)
*
* It creates the necessary contexts for the operations. When compressing,
* it additionally writes the LZ4 header in the output stream.
+ *
+ * Returns true on success and false on error.
*/
-static int
+static bool
LZ4File_init(LZ4File *fs, int size, bool compressing)
{
size_t status;
if (fs->inited)
- return 0;
+ return true;
fs->compressing = compressing;
fs->inited = true;
@@ -214,7 +216,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return 1;
+ return false;
}
fs->buffer = pg_malloc(fs->buflen);
@@ -224,13 +226,13 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return 1;
+ return false;
}
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
errno = (errno) ? errno : ENOSPC;
- return 1;
+ return false;
}
}
else
@@ -239,7 +241,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return 1;
+ return false;
}
fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
@@ -250,7 +252,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
fs->overflowlen = 0;
}
- return 0;
+ return true;
}
/*
@@ -302,15 +304,15 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
static int
LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
{
- size_t dsize = 0;
- size_t rsize;
- size_t size = ptrsize;
+ int dsize = 0;
+ int rsize;
+ int size = ptrsize;
bool eol_found = false;
void *readbuf;
/* Lazy init */
- if (LZ4File_init(fs, size, false /* decompressing */ ))
+ if (!LZ4File_init(fs, size, false /* decompressing */ ))
return -1;
/* Verify that there is enough space in the outbuf */
@@ -398,17 +400,17 @@ LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
fs->overflowlen += outlen;
}
}
- } while (rsize == size && dsize < size && eol_found == 0);
+ } while (rsize == size && dsize < size && eol_found == false);
pg_free(readbuf);
- return (int) dsize;
+ return dsize;
}
/*
* Compress size bytes from ptr and write them to the stream.
*/
-static size_t
+static bool
LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
@@ -416,8 +418,8 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
int remaining = size;
/* Lazy init */
- if (LZ4File_init(fs, size, true))
- return -1;
+ if (!LZ4File_init(fs, size, true))
+ return false;
while (remaining > 0)
{
@@ -430,33 +432,35 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return -1;
+ return false;
}
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
errno = (errno) ? errno : ENOSPC;
- return 1;
+ return false;
}
}
- return size;
+ return true;
}
/*
* fread() equivalent implementation for LZ4 compressed files.
*/
-static size_t
-LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+static bool
+LZ4File_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
int ret;
- ret = LZ4File_read_internal(fs, ptr, size, false);
- if (ret != size && !LZ4File_eof(CFH))
+ if ((ret = LZ4File_read_internal(fs, ptr, size, false)) < 0)
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
- return ret;
+ if (rsize)
+ *rsize = (size_t) ret;
+
+ return true;
}
/*
@@ -468,7 +472,7 @@ LZ4File_getc(CompressFileHandle *CFH)
LZ4File *fs = (LZ4File *) CFH->private_data;
unsigned char c;
- if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ if (LZ4File_read_internal(fs, &c, 1, false) <= 0)
{
if (!LZ4File_eof(CFH))
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
@@ -486,14 +490,14 @@ static char *
LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
- size_t dsize;
+ int ret;
- dsize = LZ4File_read_internal(fs, ptr, size, true);
- if (dsize < 0)
+ ret = LZ4File_read_internal(fs, ptr, size, true);
+ if (ret < 0 || (ret == 0 && !LZ4File_eof(CFH)))
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
/* Done reading */
- if (dsize == 0)
+ if (ret == 0)
return NULL;
return ptr;
@@ -503,13 +507,12 @@ LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
* Finalize (de)compression of a stream. When compressing it will write any
* remaining content and/or generated footer from the LZ4 API.
*/
-static int
+static bool
LZ4File_close(CompressFileHandle *CFH)
{
FILE *fp;
LZ4File *fs = (LZ4File *) CFH->private_data;
size_t status;
- int ret;
fp = fs->fp;
if (fs->inited)
@@ -520,7 +523,7 @@ LZ4File_close(CompressFileHandle *CFH)
if (LZ4F_isError(status))
pg_fatal("failed to end compression: %s",
LZ4F_getErrorName(status));
- else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ else if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
errno = (errno) ? errno : ENOSPC;
WRITE_ERROR_EXIT;
@@ -545,10 +548,10 @@ LZ4File_close(CompressFileHandle *CFH)
pg_free(fs);
- return fclose(fp);
+ return fclose(fp) == 0;
}
-static int
+static bool
LZ4File_open(const char *path, int fd, const char *mode,
CompressFileHandle *CFH)
{
@@ -562,23 +565,27 @@ LZ4File_open(const char *path, int fd, const char *mode,
if (fp == NULL)
{
lz4fp->errcode = errno;
- return 1;
+ return false;
}
lz4fp->fp = fp;
- return 0;
+ return true;
}
-static int
+static bool
LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
{
char *fname;
- int ret;
+ int save_errno;
+ bool ret;
fname = psprintf("%s.lz4", path);
ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
pg_free(fname);
+ errno = save_errno;
return ret;
}
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
index ecbcf4b04a..18f3514d11 100644
--- a/src/bin/pg_dump/compress_none.c
+++ b/src/bin/pg_dump/compress_none.c
@@ -83,27 +83,36 @@ InitCompressorNone(CompressorState *cs,
* Private routines
*/
-static size_t
-read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+static bool
+read_none(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
FILE *fp = (FILE *) CFH->private_data;
size_t ret;
if (size == 0)
- return 0;
+ return true;
ret = fread(ptr, 1, size, fp);
if (ret != size && !feof(fp))
pg_fatal("could not read from input file: %s",
strerror(errno));
- return ret;
+ if (rsize)
+ *rsize = ret;
+
+ return true;
}
-static size_t
+static bool
write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+ size_t ret;
+
+ ret = fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+ if (ret != size)
+ return false;
+
+ return true;
}
static const char *
@@ -136,7 +145,7 @@ getc_none(CompressFileHandle *CFH)
return ret;
}
-static int
+static bool
close_none(CompressFileHandle *CFH)
{
FILE *fp = (FILE *) CFH->private_data;
@@ -147,16 +156,16 @@ close_none(CompressFileHandle *CFH)
if (fp)
ret = fclose(fp);
- return ret;
+ return ret == 0;
}
-static int
+static bool
eof_none(CompressFileHandle *CFH)
{
- return feof((FILE *) CFH->private_data);
+ return feof((FILE *) CFH->private_data) != 0;
}
-static int
+static bool
open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
Assert(CFH->private_data == NULL);
@@ -167,21 +176,21 @@ open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
CFH->private_data = fopen(path, mode);
if (CFH->private_data == NULL)
- return 1;
+ return false;
- return 0;
+ return true;
}
-static int
+static bool
open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
{
Assert(CFH->private_data == NULL);
CFH->private_data = fopen(path, mode);
if (CFH->private_data == NULL)
- return 1;
+ return false;
- return 0;
+ return true;
}
/*
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 61ebb8fe85..6f3a85fe20 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -265,16 +265,13 @@ OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
void
CloseArchive(Archive *AHX)
{
- int res = 0;
ArchiveHandle *AH = (ArchiveHandle *) AHX;
AH->ClosePtr(AH);
/* Close the output */
errno = 0;
- res = EndCompressFileHandle(AH->OF);
-
- if (res != 0)
+ if (!EndCompressFileHandle(AH->OF))
pg_fatal("could not close output file: %m");
}
@@ -1529,7 +1526,7 @@ SetOutput(ArchiveHandle *AH, const char *filename,
CFH = InitCompressFileHandle(compression_spec);
- if (CFH->open_func(filename, fn, mode, CFH))
+ if (!CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
@@ -1549,12 +1546,8 @@ SaveOutput(ArchiveHandle *AH)
static void
RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
- int res;
-
errno = 0;
- res = EndCompressFileHandle(AH->OF);
-
- if (res != 0)
+ if (!EndCompressFileHandle(AH->OF))
pg_fatal("could not close output file: %m");
AH->OF = savedOutput;
@@ -1694,7 +1687,8 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
{
CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
- bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ if (CFH->write_func(ptr, size * nmemb, CFH))
+ bytes_written = size * nmemb;
}
if (bytes_written != size * nmemb)
@@ -2243,7 +2237,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
CFH = InitCompressFileHandle(out_compress_spec);
- if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ if (!CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
pg_fatal("could not open stdout for appending: %m");
AH->OF = CFH;
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 41c2b733e3..525dbf9bf0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -217,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (EndCompressFileHandle(tocFH) != 0)
+ if (!EndCompressFileHandle(tocFH))
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -328,7 +328,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
- if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
+ if (!ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -348,7 +348,7 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
+ if (dLen > 0 && !CFH->write_func(data, dLen, CFH))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (EndCompressFileHandle(ctx->dataFH) != 0)
+ if (!EndCompressFileHandle(ctx->dataFH))
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -382,7 +382,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintFileData(ArchiveHandle *AH, char *filename)
{
- size_t cnt;
+ size_t cnt = 0;
char *buf;
size_t buflen;
CompressFileHandle *CFH;
@@ -397,13 +397,13 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = CFH->read_func(buf, buflen, CFH)))
+ while (CFH->read_func(buf, buflen, &cnt, CFH) && cnt > 0)
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (EndCompressFileHandle(CFH) != 0)
+ if (!EndCompressFileHandle(CFH))
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -468,7 +468,7 @@ _LoadLOs(ArchiveHandle *AH)
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ if (!EndCompressFileHandle(ctx->LOsTocFH))
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -491,7 +491,7 @@ _WriteByte(ArchiveHandle *AH, const int i)
CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (CFH->write_func(&c, 1, CFH) != 1)
+ if (!CFH->write_func(&c, 1, CFH))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
@@ -529,7 +529,7 @@ _WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (CFH->write_func(buf, len, CFH) != len)
+ if (!CFH->write_func(buf, len, CFH))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
@@ -554,7 +554,7 @@ _ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
* If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (CFH->read_func(buf, len, CFH) != len)
+ if (!CFH->read_func(buf, len, NULL, CFH))
pg_fatal("could not read from input file: end of file");
}
@@ -589,7 +589,7 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
tocFH = InitCompressFileHandle(compression_spec);
- if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
+ if (!tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -602,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (EndCompressFileHandle(tocFH) != 0)
+ if (!EndCompressFileHandle(tocFH))
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -654,7 +654,7 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
- if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
+ if (!ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -672,7 +672,7 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
- if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
+ if (!ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -690,13 +690,13 @@ _EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
int len;
/* Close the BLOB data file itself */
- if (EndCompressFileHandle(ctx->dataFH) != 0)
+ if (!EndCompressFileHandle(ctx->dataFH))
pg_fatal("could not close LO data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (CFH->write_func(buf, len, CFH) != len)
+ if (!CFH->write_func(buf, len, CFH))
pg_fatal("could not write to LOs TOC file");
}
@@ -710,7 +710,7 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ if (!EndCompressFileHandle(ctx->LOsTocFH))
pg_fatal("could not close LOs TOC file: %m");
ctx->LOsTocFH = NULL;
}
--
2.34.1
On 3/17/23 16:43, gkokolatos@pm.me wrote:
...
I agree it's cleaner the way you did it.
I was thinking that with each compression function handling errors
internally, the callers would not need to do that. But I hadn't
realized there's logic to detect ENOSPC and so on, and we'd need to
duplicate that in every compression func.

If you agree, I can prepare a patch to improve on the error handling
aspect of the API as a separate thread, since here we are trying to
focus on correctness.
Yes, that makes sense. There are far too many patches in this thread
already ...
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
I was preparing to get the 3 cleanup patches pushed, so I
updated/reworded the commit messages a bit (attached, please check).
But I noticed the commit message for 0001 said:
In passing save the appropriate errno in LZ4File_open_write in
case that the caller is not using the API's get_error_func.
I think that's far too low-level for a commit message, it'd be much more
appropriate for a comment at the function.
However, do we even need this behavior? I was looking for code calling
this function without using get_error_func(), but no luck. And if there
is such caller, shouldn't we fix it to use get_error_func()?
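For context, the behavior under discussion is the classic cleanup-clobbers-errno pattern: ISO C allows library functions, including free(), to set errno, so the error from open() must be saved across the cleanup or a caller reporting "%m" could print the wrong message. A minimal standalone sketch (hypothetical open_with_suffix, not the actual pg_dump code) of the pattern:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch mirroring the pattern in LZ4File_open_write: build
 * a temporary filename, try to open it, then free the buffer.  Since
 * free() may clobber errno, the open() error is saved and restored
 * around the cleanup so that the caller still sees the real cause.
 */
static int
open_with_suffix(const char *path)
{
	size_t		len = strlen(path) + sizeof(".lz4");
	char	   *fname = malloc(len);
	int			fd;
	int			save_errno;

	if (fname == NULL)
		return -1;
	snprintf(fname, len, "%s.lz4", path);

	fd = open(fname, O_RDONLY); /* may fail and set errno */

	save_errno = errno;			/* preserve open()'s errno ... */
	free(fname);				/* ... across cleanup that may reset it */
	errno = save_errno;

	return fd;
}
```

If every caller really does go through get_error_func(), the save/restore is redundant; it only matters for callers that report via errno directly.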
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v4-0001-Improve-type-handling-in-pg_dump-s-compress-file-.patch (text/x-patch, charset UTF-8)
From 6a3d5d743f022ffcd0fcaf3d6e9ba711e2e785e7 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 17 Mar 2023 14:45:58 +0000
Subject: [PATCH v4 1/3] Improve type handling in pg_dump's compress file API
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
After 0da243fed0 got committed, we've received a report about a compiler
warning, related to the new LZ4File_gets() function:
compress_lz4.c: In function ‘LZ4File_gets’:
compress_lz4.c:492:19: warning: comparison of unsigned expression in
‘< 0’ is always false [-Wtype-limits]
492 | if (dsize < 0)
The reason is very simple - dsize is declared as size_t, which is an
unsigned integer, and thus the check is pointless and we might fail to
notice an error in some cases (or fail in a strange way a bit later).
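The pitfall is easy to reproduce standalone. A minimal sketch (not pg_dump code) of why a "< 0" check on a size_t is dead code:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Sketch of the bug class: a library's negative error return, stored in
 * an unsigned size_t, wraps to a huge value, so the "< 0" check can
 * never fire and the failure goes unnoticed.
 */
static bool
error_detected(int lib_ret)
{
	size_t		dsize = lib_ret;	/* -1 converts to SIZE_MAX */

	return dsize < 0;			/* always false; gcc -Wtype-limits warns */
}
```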
The warning could have been silenced by simply changing the type, but we
realized the API mostly assumes all the libraries use the same types and
report errors the same way (e.g. by returning 0 and/or negative value).
But we can't make this assumption - the gzip/lz4 libraries already
disagree on some of this, and even if they did a library added in the
future might not.
The right solution is to define what the API does, and translate the
library-specific behavior in a consistent way (so that the internal errors
are not exposed to users of our compression API).
For that reason, this commit adjusts the data types in a couple places,
so that we don't miss library errors, and unifies the error reporting to
simply return true/false (instead of e.g. size_t).
Author: Georgios Kokolatos
Reported-by: Alexander Lakhin
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com
---
src/bin/pg_dump/compress_gzip.c | 37 ++++++------
src/bin/pg_dump/compress_io.c | 8 +--
src/bin/pg_dump/compress_io.h | 38 +++++++++---
src/bin/pg_dump/compress_lz4.c | 85 +++++++++++++++------------
src/bin/pg_dump/compress_none.c | 41 ++++++++-----
src/bin/pg_dump/pg_backup_archiver.c | 18 ++----
src/bin/pg_dump/pg_backup_directory.c | 36 ++++++------
7 files changed, 148 insertions(+), 115 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 0af65afeb4..d9c3969332 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -233,14 +233,14 @@ InitCompressorGzip(CompressorState *cs,
*----------------------
*/
-static size_t
-Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
+static bool
+Gzip_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
- size_t ret;
+ int gzret;
- ret = gzread(gzfp, ptr, size);
- if (ret != size && !gzeof(gzfp))
+ gzret = gzread(gzfp, ptr, size);
+ if (gzret <= 0 && !gzeof(gzfp))
{
int errnum;
const char *errmsg = gzerror(gzfp, &errnum);
@@ -249,15 +249,18 @@ Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
errnum == Z_ERRNO ? strerror(errno) : errmsg);
}
- return ret;
+ if (rsize)
+ *rsize = (size_t) gzret;
+
+ return true;
}
-static size_t
+static bool
Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
- return gzwrite(gzfp, ptr, size);
+ return gzwrite(gzfp, ptr, size) > 0;
}
static int
@@ -287,22 +290,22 @@ Gzip_gets(char *ptr, int size, CompressFileHandle *CFH)
return gzgets(gzfp, ptr, size);
}
-static int
+static bool
Gzip_close(CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
CFH->private_data = NULL;
- return gzclose(gzfp);
+ return gzclose(gzfp) == Z_OK;
}
-static int
+static bool
Gzip_eof(CompressFileHandle *CFH)
{
gzFile gzfp = (gzFile) CFH->private_data;
- return gzeof(gzfp);
+ return gzeof(gzfp) == 1;
}
static const char *
@@ -319,7 +322,7 @@ Gzip_get_error(CompressFileHandle *CFH)
return errmsg;
}
-static int
+static bool
Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
gzFile gzfp;
@@ -342,18 +345,18 @@ Gzip_open(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
gzfp = gzopen(path, mode_compression);
if (gzfp == NULL)
- return 1;
+ return false;
CFH->private_data = gzfp;
- return 0;
+ return true;
}
-static int
+static bool
Gzip_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
{
char *fname;
- int ret;
+ bool ret;
int save_errno;
fname = psprintf("%s.gz", path);
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index ce06f1eac9..8f32cb4385 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -262,7 +262,7 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
}
CFH = InitCompressFileHandle(compression_spec);
- if (CFH->open_func(fname, -1, mode, CFH))
+ if (!CFH->open_func(fname, -1, mode, CFH))
{
free_keep_errno(CFH);
CFH = NULL;
@@ -275,12 +275,12 @@ InitDiscoverCompressFileHandle(const char *path, const char *mode)
/*
* Close an open file handle and release its memory.
*
- * On failure, returns an error value and sets errno appropriately.
+ * On failure, returns false and sets errno appropriately.
*/
-int
+bool
EndCompressFileHandle(CompressFileHandle *CFH)
{
- int ret = 0;
+ bool ret = false;
if (CFH->private_data)
ret = CFH->close_func(CFH);
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index cdb15951ea..7c2f9b5668 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -100,8 +100,10 @@ struct CompressFileHandle
* Pass either 'path' or 'fd' depending on whether a file path or a file
* descriptor is available. 'mode' can be one of 'r', 'rb', 'w', 'wb',
* 'a', and 'ab'. Requires an already initialized CompressFileHandle.
+ *
+ * Returns true on success and false on error.
*/
- int (*open_func) (const char *path, int fd, const char *mode,
+ bool (*open_func) (const char *path, int fd, const char *mode,
CompressFileHandle *CFH);
/*
@@ -109,19 +111,27 @@ struct CompressFileHandle
*
* 'mode' can be one of 'w', 'wb', 'a', and 'ab'. Requires an already
* initialized CompressFileHandle.
+ *
+ * Returns true on success and false on error.
*/
- int (*open_write_func) (const char *path, const char *mode,
+ bool (*open_write_func) (const char *path, const char *mode,
CompressFileHandle *CFH);
/*
* Read 'size' bytes of data from the file and store them into 'ptr'.
+ * Optionally it will store the number of bytes read in 'rsize'.
+ *
+ * Returns true on success and throws an internal error otherwise.
*/
- size_t (*read_func) (void *ptr, size_t size, CompressFileHandle *CFH);
+ bool (*read_func) (void *ptr, size_t size, size_t *rsize,
+ CompressFileHandle *CFH);
/*
* Write 'size' bytes of data into the file from 'ptr'.
+ *
+ * Returns true on success and false on error.
*/
- size_t (*write_func) (const void *ptr, size_t size,
+ bool (*write_func) (const void *ptr, size_t size,
struct CompressFileHandle *CFH);
/*
@@ -130,28 +140,38 @@ struct CompressFileHandle
*
* Stop if an EOF or a newline is found first. 's' is always null
* terminated and contains the newline if it was found.
+ *
+ * Returns 's' on success, and NULL on error or when end of file occurs
+ * while no characters have been read.
*/
char *(*gets_func) (char *s, int size, CompressFileHandle *CFH);
/*
* Read the next character from the compress file handle as 'unsigned
* char' cast into 'int'.
+ *
+ * Returns the character read on success and throws an internal error
+ * otherwise. It treats EOF as error.
*/
int (*getc_func) (CompressFileHandle *CFH);
/*
* Test if EOF is reached in the compress file handle.
+ *
+ * Returns true if it is reached.
*/
- int (*eof_func) (CompressFileHandle *CFH);
+ bool (*eof_func) (CompressFileHandle *CFH);
/*
* Close an open file handle.
+ *
+ * Returns true on success and false on error.
*/
- int (*close_func) (CompressFileHandle *CFH);
+ bool (*close_func) (CompressFileHandle *CFH);
/*
- * Get a pointer to a string that describes an error that occurred during a
- * compress file handle operation.
+ * Get a pointer to a string that describes an error that occurred during
+ * a compress file handle operation.
*/
const char *(*get_error_func) (CompressFileHandle *CFH);
@@ -178,5 +198,5 @@ extern CompressFileHandle *InitCompressFileHandle(const pg_compress_specificatio
*/
extern CompressFileHandle *InitDiscoverCompressFileHandle(const char *path,
const char *mode);
-extern int EndCompressFileHandle(CompressFileHandle *CFH);
+extern bool EndCompressFileHandle(CompressFileHandle *CFH);
#endif
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 63e794cdc6..278f262162 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -165,7 +165,7 @@ typedef struct LZ4File
* decompressed output in the overflow buffer and the end of the backing file
* is reached.
*/
-static int
+static bool
LZ4File_eof(CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
@@ -192,14 +192,16 @@ LZ4File_get_error(CompressFileHandle *CFH)
*
* It creates the necessary contexts for the operations. When compressing,
* it additionally writes the LZ4 header in the output stream.
+ *
+ * Returns true on success and false on error.
*/
-static int
+static bool
LZ4File_init(LZ4File *fs, int size, bool compressing)
{
size_t status;
if (fs->inited)
- return 0;
+ return true;
fs->compressing = compressing;
fs->inited = true;
@@ -214,7 +216,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return 1;
+ return false;
}
fs->buffer = pg_malloc(fs->buflen);
@@ -224,13 +226,13 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return 1;
+ return false;
}
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
errno = (errno) ? errno : ENOSPC;
- return 1;
+ return false;
}
}
else
@@ -239,7 +241,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return 1;
+ return false;
}
fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
@@ -250,7 +252,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
fs->overflowlen = 0;
}
- return 0;
+ return true;
}
/*
@@ -302,15 +304,15 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
static int
LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
{
- size_t dsize = 0;
- size_t rsize;
- size_t size = ptrsize;
+ int dsize = 0;
+ int rsize;
+ int size = ptrsize;
bool eol_found = false;
void *readbuf;
/* Lazy init */
- if (LZ4File_init(fs, size, false /* decompressing */ ))
+ if (!LZ4File_init(fs, size, false /* decompressing */ ))
return -1;
/* Verify that there is enough space in the outbuf */
@@ -398,17 +400,17 @@ LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
fs->overflowlen += outlen;
}
}
- } while (rsize == size && dsize < size && eol_found == 0);
+ } while (rsize == size && dsize < size && eol_found == false);
pg_free(readbuf);
- return (int) dsize;
+ return dsize;
}
/*
* Compress size bytes from ptr and write them to the stream.
*/
-static size_t
+static bool
LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
@@ -416,8 +418,8 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
int remaining = size;
/* Lazy init */
- if (LZ4File_init(fs, size, true))
- return -1;
+ if (!LZ4File_init(fs, size, true))
+ return false;
while (remaining > 0)
{
@@ -430,33 +432,35 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
if (LZ4F_isError(status))
{
fs->errcode = status;
- return -1;
+ return false;
}
if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
errno = (errno) ? errno : ENOSPC;
- return 1;
+ return false;
}
}
- return size;
+ return true;
}
/*
* fread() equivalent implementation for LZ4 compressed files.
*/
-static size_t
-LZ4File_read(void *ptr, size_t size, CompressFileHandle *CFH)
+static bool
+LZ4File_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
int ret;
- ret = LZ4File_read_internal(fs, ptr, size, false);
- if (ret != size && !LZ4File_eof(CFH))
+ if ((ret = LZ4File_read_internal(fs, ptr, size, false)) < 0)
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
- return ret;
+ if (rsize)
+ *rsize = (size_t) ret;
+
+ return true;
}
/*
@@ -468,7 +472,7 @@ LZ4File_getc(CompressFileHandle *CFH)
LZ4File *fs = (LZ4File *) CFH->private_data;
unsigned char c;
- if (LZ4File_read_internal(fs, &c, 1, false) != 1)
+ if (LZ4File_read_internal(fs, &c, 1, false) <= 0)
{
if (!LZ4File_eof(CFH))
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
@@ -486,14 +490,14 @@ static char *
LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
{
LZ4File *fs = (LZ4File *) CFH->private_data;
- size_t dsize;
+ int ret;
- dsize = LZ4File_read_internal(fs, ptr, size, true);
- if (dsize < 0)
+ ret = LZ4File_read_internal(fs, ptr, size, true);
+ if (ret < 0 || (ret == 0 && !LZ4File_eof(CFH)))
pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
/* Done reading */
- if (dsize == 0)
+ if (ret == 0)
return NULL;
return ptr;
@@ -503,13 +507,12 @@ LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
* Finalize (de)compression of a stream. When compressing it will write any
* remaining content and/or generated footer from the LZ4 API.
*/
-static int
+static bool
LZ4File_close(CompressFileHandle *CFH)
{
FILE *fp;
LZ4File *fs = (LZ4File *) CFH->private_data;
size_t status;
- int ret;
fp = fs->fp;
if (fs->inited)
@@ -520,7 +523,7 @@ LZ4File_close(CompressFileHandle *CFH)
if (LZ4F_isError(status))
pg_fatal("failed to end compression: %s",
LZ4F_getErrorName(status));
- else if ((ret = fwrite(fs->buffer, 1, status, fs->fp)) != status)
+ else if (fwrite(fs->buffer, 1, status, fs->fp) != status)
{
errno = (errno) ? errno : ENOSPC;
WRITE_ERROR_EXIT;
@@ -545,10 +548,10 @@ LZ4File_close(CompressFileHandle *CFH)
pg_free(fs);
- return fclose(fp);
+ return fclose(fp) == 0;
}
-static int
+static bool
LZ4File_open(const char *path, int fd, const char *mode,
CompressFileHandle *CFH)
{
@@ -562,23 +565,27 @@ LZ4File_open(const char *path, int fd, const char *mode,
if (fp == NULL)
{
lz4fp->errcode = errno;
- return 1;
+ return false;
}
lz4fp->fp = fp;
- return 0;
+ return true;
}
-static int
+static bool
LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
{
char *fname;
- int ret;
+ int save_errno;
+ bool ret;
fname = psprintf("%s.lz4", path);
ret = CFH->open_func(fname, -1, mode, CFH);
+
+ save_errno = errno;
pg_free(fname);
+ errno = save_errno;
return ret;
}
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
index ecbcf4b04a..18f3514d11 100644
--- a/src/bin/pg_dump/compress_none.c
+++ b/src/bin/pg_dump/compress_none.c
@@ -83,27 +83,36 @@ InitCompressorNone(CompressorState *cs,
* Private routines
*/
-static size_t
-read_none(void *ptr, size_t size, CompressFileHandle *CFH)
+static bool
+read_none(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
FILE *fp = (FILE *) CFH->private_data;
size_t ret;
if (size == 0)
- return 0;
+ return true;
ret = fread(ptr, 1, size, fp);
if (ret != size && !feof(fp))
pg_fatal("could not read from input file: %s",
strerror(errno));
- return ret;
+ if (rsize)
+ *rsize = ret;
+
+ return true;
}
-static size_t
+static bool
write_none(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- return fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+ size_t ret;
+
+ ret = fwrite(ptr, 1, size, (FILE *) CFH->private_data);
+ if (ret != size)
+ return false;
+
+ return true;
}
static const char *
@@ -136,7 +145,7 @@ getc_none(CompressFileHandle *CFH)
return ret;
}
-static int
+static bool
close_none(CompressFileHandle *CFH)
{
FILE *fp = (FILE *) CFH->private_data;
@@ -147,16 +156,16 @@ close_none(CompressFileHandle *CFH)
if (fp)
ret = fclose(fp);
- return ret;
+ return ret == 0;
}
-static int
+static bool
eof_none(CompressFileHandle *CFH)
{
- return feof((FILE *) CFH->private_data);
+ return feof((FILE *) CFH->private_data) != 0;
}
-static int
+static bool
open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
{
Assert(CFH->private_data == NULL);
@@ -167,21 +176,21 @@ open_none(const char *path, int fd, const char *mode, CompressFileHandle *CFH)
CFH->private_data = fopen(path, mode);
if (CFH->private_data == NULL)
- return 1;
+ return false;
- return 0;
+ return true;
}
-static int
+static bool
open_write_none(const char *path, const char *mode, CompressFileHandle *CFH)
{
Assert(CFH->private_data == NULL);
CFH->private_data = fopen(path, mode);
if (CFH->private_data == NULL)
- return 1;
+ return false;
- return 0;
+ return true;
}
/*
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 3337d34e40..ab77e373e9 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -266,16 +266,13 @@ OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
void
CloseArchive(Archive *AHX)
{
- int res = 0;
ArchiveHandle *AH = (ArchiveHandle *) AHX;
AH->ClosePtr(AH);
/* Close the output */
errno = 0;
- res = EndCompressFileHandle(AH->OF);
-
- if (res != 0)
+ if (!EndCompressFileHandle(AH->OF))
pg_fatal("could not close output file: %m");
}
@@ -1580,7 +1577,7 @@ SetOutput(ArchiveHandle *AH, const char *filename,
CFH = InitCompressFileHandle(compression_spec);
- if (CFH->open_func(filename, fn, mode, CFH))
+ if (!CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
@@ -1600,12 +1597,8 @@ SaveOutput(ArchiveHandle *AH)
static void
RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
- int res;
-
errno = 0;
- res = EndCompressFileHandle(AH->OF);
-
- if (res != 0)
+ if (!EndCompressFileHandle(AH->OF))
pg_fatal("could not close output file: %m");
AH->OF = savedOutput;
@@ -1745,7 +1738,8 @@ ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
{
CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
- bytes_written = CFH->write_func(ptr, size * nmemb, CFH);
+ if (CFH->write_func(ptr, size * nmemb, CFH))
+ bytes_written = size * nmemb;
}
if (bytes_written != size * nmemb)
@@ -2294,7 +2288,7 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
CFH = InitCompressFileHandle(out_compress_spec);
- if (CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
+ if (!CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
pg_fatal("could not open stdout for appending: %m");
AH->OF = CFH;
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 41c2b733e3..525dbf9bf0 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -217,7 +217,7 @@ InitArchiveFmt_Directory(ArchiveHandle *AH)
ReadToc(AH);
/* Nothing else in the file, so close it again... */
- if (EndCompressFileHandle(tocFH) != 0)
+ if (!EndCompressFileHandle(tocFH))
pg_fatal("could not close TOC file: %m");
ctx->dataFH = NULL;
}
@@ -328,7 +328,7 @@ _StartData(ArchiveHandle *AH, TocEntry *te)
ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
- if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
+ if (!ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -348,7 +348,7 @@ _WriteData(ArchiveHandle *AH, const void *data, size_t dLen)
CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (dLen > 0 && CFH->write_func(data, dLen, CFH) != dLen)
+ if (dLen > 0 && !CFH->write_func(data, dLen, CFH))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
@@ -370,7 +370,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
lclContext *ctx = (lclContext *) AH->formatData;
/* Close the file */
- if (EndCompressFileHandle(ctx->dataFH) != 0)
+ if (!EndCompressFileHandle(ctx->dataFH))
pg_fatal("could not close data file: %m");
ctx->dataFH = NULL;
@@ -382,7 +382,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
static void
_PrintFileData(ArchiveHandle *AH, char *filename)
{
- size_t cnt;
+ size_t cnt = 0;
char *buf;
size_t buflen;
CompressFileHandle *CFH;
@@ -397,13 +397,13 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
buf = pg_malloc(ZLIB_OUT_SIZE);
buflen = ZLIB_OUT_SIZE;
- while ((cnt = CFH->read_func(buf, buflen, CFH)))
+ while (CFH->read_func(buf, buflen, &cnt, CFH) && cnt > 0)
{
ahwrite(buf, 1, cnt, AH);
}
free(buf);
- if (EndCompressFileHandle(CFH) != 0)
+ if (!EndCompressFileHandle(CFH))
pg_fatal("could not close data file \"%s\": %m", filename);
}
@@ -468,7 +468,7 @@ _LoadLOs(ArchiveHandle *AH)
pg_fatal("error reading large object TOC file \"%s\"",
tocfname);
- if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ if (!EndCompressFileHandle(ctx->LOsTocFH))
pg_fatal("could not close large object TOC file \"%s\": %m",
tocfname);
@@ -491,7 +491,7 @@ _WriteByte(ArchiveHandle *AH, const int i)
CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (CFH->write_func(&c, 1, CFH) != 1)
+ if (!CFH->write_func(&c, 1, CFH))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
@@ -529,7 +529,7 @@ _WriteBuf(ArchiveHandle *AH, const void *buf, size_t len)
CompressFileHandle *CFH = ctx->dataFH;
errno = 0;
- if (CFH->write_func(buf, len, CFH) != len)
+ if (!CFH->write_func(buf, len, CFH))
{
/* if write didn't set errno, assume problem is no disk space */
if (errno == 0)
@@ -554,7 +554,7 @@ _ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
* If there was an I/O error, we already exited in readF(), so here we
* exit on short reads.
*/
- if (CFH->read_func(buf, len, CFH) != len)
+ if (!CFH->read_func(buf, len, NULL, CFH))
pg_fatal("could not read from input file: end of file");
}
@@ -589,7 +589,7 @@ _CloseArchive(ArchiveHandle *AH)
/* The TOC is always created uncompressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
tocFH = InitCompressFileHandle(compression_spec);
- if (tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
+ if (!tocFH->open_write_func(fname, PG_BINARY_W, tocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
ctx->dataFH = tocFH;
@@ -602,7 +602,7 @@ _CloseArchive(ArchiveHandle *AH)
WriteHead(AH);
AH->format = archDirectory;
WriteToc(AH);
- if (EndCompressFileHandle(tocFH) != 0)
+ if (!EndCompressFileHandle(tocFH))
pg_fatal("could not close TOC file: %m");
WriteDataChunks(AH, ctx->pstate);
@@ -654,7 +654,7 @@ _StartLOs(ArchiveHandle *AH, TocEntry *te)
/* The LO TOC file is never compressed */
compression_spec.algorithm = PG_COMPRESSION_NONE;
ctx->LOsTocFH = InitCompressFileHandle(compression_spec);
- if (ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
+ if (!ctx->LOsTocFH->open_write_func(fname, "ab", ctx->LOsTocFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -672,7 +672,7 @@ _StartLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
snprintf(fname, MAXPGPATH, "%s/blob_%u.dat", ctx->directory, oid);
ctx->dataFH = InitCompressFileHandle(AH->compression_spec);
- if (ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
+ if (!ctx->dataFH->open_write_func(fname, PG_BINARY_W, ctx->dataFH))
pg_fatal("could not open output file \"%s\": %m", fname);
}
@@ -690,13 +690,13 @@ _EndLO(ArchiveHandle *AH, TocEntry *te, Oid oid)
int len;
/* Close the BLOB data file itself */
- if (EndCompressFileHandle(ctx->dataFH) != 0)
+ if (!EndCompressFileHandle(ctx->dataFH))
pg_fatal("could not close LO data file: %m");
ctx->dataFH = NULL;
/* register the LO in blobs.toc */
len = snprintf(buf, sizeof(buf), "%u blob_%u.dat\n", oid, oid);
- if (CFH->write_func(buf, len, CFH) != len)
+ if (!CFH->write_func(buf, len, CFH))
pg_fatal("could not write to LOs TOC file");
}
@@ -710,7 +710,7 @@ _EndLOs(ArchiveHandle *AH, TocEntry *te)
{
lclContext *ctx = (lclContext *) AH->formatData;
- if (EndCompressFileHandle(ctx->LOsTocFH) != 0)
+ if (!EndCompressFileHandle(ctx->LOsTocFH))
pg_fatal("could not close LOs TOC file: %m");
ctx->LOsTocFH = NULL;
}
--
2.39.2
Attachment: v4-0002-Unify-buffer-sizes-in-pg_dump-compression-API.patch (text/x-patch)
From e299fca9d50b6cab203dc02db8c04d904fd00c82 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 17 Mar 2023 15:06:22 +0000
Subject: [PATCH v4 2/3] Unify buffer sizes in pg_dump compression API
Prior to the introduction of the compression API in e9960732a9, pg_dump
would use the ZLIB_IN_SIZE/ZLIB_OUT_SIZE to size input/output buffers.
Commit 0da243fed0 introduced similar constants for LZ4, but while gzip
defined both buffers to be 4kB, LZ4 used 4kB and 16kB without any clear
reasoning why that's desirable.
Furthermore, parts of the code unaware of which compression is used
(e.g. pg_backup_directory.c) continued to use ZLIB_OUT_SIZE directly.
Simplify by replacing the various constants with DEFAULT_IO_BUFFER_SIZE,
set to 4kB. The compression implementations still have an option to use
a custom value, but considering 4kB was fine for 20+ years, I find that
unlikely (and we'd probably just increase the default buffer size).
Author: Georgios Kokolatos
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com
---
src/bin/pg_dump/compress_gzip.c | 22 +++++++++++-----------
src/bin/pg_dump/compress_io.h | 5 ++---
src/bin/pg_dump/compress_lz4.c | 11 ++++-------
src/bin/pg_dump/compress_none.c | 4 ++--
src/bin/pg_dump/pg_backup_directory.c | 4 ++--
5 files changed, 21 insertions(+), 25 deletions(-)
diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index d9c3969332..cec0b41fce 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -120,8 +120,8 @@ WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
* actually allocate one extra byte because some routines want to
* append a trailing zero byte to the zlib output.
*/
- gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
- gzipcs->outsize = ZLIB_OUT_SIZE;
+ gzipcs->outsize = DEFAULT_IO_BUFFER_SIZE;
+ gzipcs->outbuf = pg_malloc(gzipcs->outsize + 1);
/*
* A level of zero simply copies the input one block at the time. This
@@ -158,10 +158,10 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
zp->zfree = Z_NULL;
zp->opaque = Z_NULL;
- buf = pg_malloc(ZLIB_IN_SIZE);
- buflen = ZLIB_IN_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
+ buf = pg_malloc(buflen);
- out = pg_malloc(ZLIB_OUT_SIZE + 1);
+ out = pg_malloc(DEFAULT_IO_BUFFER_SIZE + 1);
if (inflateInit(zp) != Z_OK)
pg_fatal("could not initialize compression library: %s",
@@ -176,14 +176,14 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
while (zp->avail_in > 0)
{
zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ zp->avail_out = DEFAULT_IO_BUFFER_SIZE;
res = inflate(zp, 0);
if (res != Z_OK && res != Z_STREAM_END)
pg_fatal("could not uncompress data: %s", zp->msg);
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ out[DEFAULT_IO_BUFFER_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, DEFAULT_IO_BUFFER_SIZE - zp->avail_out, AH);
}
}
@@ -192,13 +192,13 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
while (res != Z_STREAM_END)
{
zp->next_out = (void *) out;
- zp->avail_out = ZLIB_OUT_SIZE;
+ zp->avail_out = DEFAULT_IO_BUFFER_SIZE;
res = inflate(zp, 0);
if (res != Z_OK && res != Z_STREAM_END)
pg_fatal("could not uncompress data: %s", zp->msg);
- out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
- ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+ out[DEFAULT_IO_BUFFER_SIZE - zp->avail_out] = '\0';
+ ahwrite(out, 1, DEFAULT_IO_BUFFER_SIZE - zp->avail_out, AH);
}
if (inflateEnd(zp) != Z_OK)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 7c2f9b5668..fd8752db0d 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -17,9 +17,8 @@
#include "pg_backup_archiver.h"
-/* Initial buffer sizes used in zlib compression. */
-#define ZLIB_OUT_SIZE 4096
-#define ZLIB_IN_SIZE 4096
+/* Default size used for IO buffers */
+#define DEFAULT_IO_BUFFER_SIZE 4096
extern char *supports_compression(const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 278f262162..2f3e552f51 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -20,9 +20,6 @@
#include <lz4.h>
#include <lz4frame.h>
-#define LZ4_OUT_SIZE (4 * 1024)
-#define LZ4_IN_SIZE (16 * 1024)
-
/*
* LZ4F_HEADER_SIZE_MAX first appeared in v1.7.5 of the library.
* Redefine it for installations with a lesser version.
@@ -57,7 +54,7 @@ ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
size_t buflen;
size_t cnt;
- buflen = LZ4_IN_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
buf = pg_malloc(buflen);
decbuf = pg_malloc(buflen);
@@ -208,7 +205,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
if (fs->compressing)
{
- fs->buflen = LZ4F_compressBound(LZ4_IN_SIZE, &fs->prefs);
+ fs->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &fs->prefs);
if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
fs->buflen = LZ4F_HEADER_SIZE_MAX;
@@ -244,7 +241,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
return false;
}
- fs->buflen = size > LZ4_OUT_SIZE ? size : LZ4_OUT_SIZE;
+ fs->buflen = Max(size, DEFAULT_IO_BUFFER_SIZE);
fs->buffer = pg_malloc(fs->buflen);
fs->overflowalloclen = fs->buflen;
@@ -423,7 +420,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
while (remaining > 0)
{
- int chunk = remaining < LZ4_IN_SIZE ? remaining : LZ4_IN_SIZE;
+ int chunk = Min(remaining, DEFAULT_IO_BUFFER_SIZE);
remaining -= chunk;
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
index 18f3514d11..736a7957bc 100644
--- a/src/bin/pg_dump/compress_none.c
+++ b/src/bin/pg_dump/compress_none.c
@@ -33,8 +33,8 @@ ReadDataFromArchiveNone(ArchiveHandle *AH, CompressorState *cs)
char *buf;
size_t buflen;
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
+ buf = pg_malloc(buflen);
while ((cnt = cs->readF(AH, &buf, &buflen)))
{
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 525dbf9bf0..abaaa3b10e 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -394,8 +394,8 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
if (!CFH)
pg_fatal("could not open input file \"%s\": %m", filename);
- buf = pg_malloc(ZLIB_OUT_SIZE);
- buflen = ZLIB_OUT_SIZE;
+ buflen = DEFAULT_IO_BUFFER_SIZE;
+ buf = pg_malloc(buflen);
while (CFH->read_func(buf, buflen, &cnt, CFH) && cnt > 0)
{
--
2.39.2
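The 0002 patch above replaces the LZ4-specific chunking constants with Min()/Max() over a single DEFAULT_IO_BUFFER_SIZE. A minimal, self-contained sketch of that chunked-write loop (the `count_chunks` helper is hypothetical, standing in for the real compress-and-fwrite step in LZ4File_write):

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_IO_BUFFER_SIZE 4096
#define Min(a,b) ((a) < (b) ? (a) : (b))

/*
 * Illustrative version of the chunking loop after the buffer-size
 * unification: each iteration hands at most DEFAULT_IO_BUFFER_SIZE
 * bytes to the compressor, regardless of the caller's write size.
 */
static int
count_chunks(size_t size)
{
	size_t		remaining = size;
	int			nchunks = 0;

	while (remaining > 0)
	{
		size_t		chunk = Min(remaining, DEFAULT_IO_BUFFER_SIZE);

		/* real code would compress and fwrite 'chunk' bytes here */
		remaining -= chunk;
		nchunks++;
	}
	return nchunks;
}
```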
Attachment: v4-0003-Minor-comment-improvements-for-compress_lz4.patch (text/x-patch)
From d94190973e9deb064abc2e75ca296253b46be151 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 17 Mar 2023 15:29:05 +0000
Subject: [PATCH v4 3/3] Minor comment improvements for compress_lz4
Author: Tomas Vondra
Reviewed-by: Georgios Kokolatos
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com
---
src/bin/pg_dump/compress_lz4.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 2f3e552f51..fc2f4e116d 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -185,12 +185,15 @@ LZ4File_get_error(CompressFileHandle *CFH)
}
/*
- * Prepare an already alloc'ed LZ4File struct for subsequent calls.
+ * Prepare an already alloc'ed LZ4File struct for subsequent calls (either
+ * compression or decompression).
*
- * It creates the necessary contexts for the operations. When compressing,
- * it additionally writes the LZ4 header in the output stream.
+ * It creates the necessary contexts for the operations. When compressing data
+ * (indicated by compressing=true), it additionally writes the LZ4 header in the
+ * output stream.
*
- * Returns true on success and false on error.
+ * Returns true on success. In case of a failure returns false, and stores the
+ * error code in fs->errcode.
*/
static bool
LZ4File_init(LZ4File *fs, int size, bool compressing)
@@ -203,9 +206,15 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
fs->compressing = compressing;
fs->inited = true;
+ /* When compressing, write LZ4 header to the output stream. */
if (fs->compressing)
{
fs->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &fs->prefs);
+
+ /*
+ * LZ4F_compressBegin requires a buffer that is greater or equal to
+ * LZ4F_HEADER_SIZE_MAX. Verify that the requirement is met.
+ */
if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
fs->buflen = LZ4F_HEADER_SIZE_MAX;
@@ -255,9 +264,12 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
/*
* Read already decompressed content from the overflow buffer into 'ptr' up to
* 'size' bytes, if available. If the eol_flag is set, then stop at the first
- * occurrence of the new line char prior to 'size' bytes.
+ * occurrence of the newline char prior to 'size' bytes.
*
* Any unread content in the overflow buffer is moved to the beginning.
+ *
+ * Returns the number of bytes read from the overflow buffer (and copied into
+ * the 'ptr' buffer), or 0 if the overflow buffer is empty.
*/
static int
LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
@@ -297,6 +309,9 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
* at an overflow buffer within LZ4File. Of course, when the function is
* called, it will first try to consume any decompressed content already
* present in the overflow buffer, before decompressing new content.
+ *
+ * Returns the number of bytes of decompressed data copied into the ptr
+ * buffer, or -1 in case of error.
*/
static int
LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
--
2.39.2
On Fri, Mar 17, 2023 at 03:43:58PM +0000, gkokolatos@pm.me wrote:
From a174cdff4ec8aad59f5bcc7e8d52218a14fe56fc Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 17 Mar 2023 14:45:58 +0000
Subject: [PATCH v3 1/3] Improve type handling in pg_dump's compress file API
-int
+bool
 EndCompressFileHandle(CompressFileHandle *CFH)
 {
-    int ret = 0;
+    bool ret = 0;
Should say "= false" ?
 /*
  * Write 'size' bytes of data into the file from 'ptr'.
+ *
+ * Returns true on success and false on error.
+ */
+ bool (*write_func) (const void *ptr, size_t size,
- * Get a pointer to a string that describes an error that occurred during a
- * compress file handle operation.
+ * Get a pointer to a string that describes an error that occurred during
+ * a compress file handle operation.
  */
 const char *(*get_error_func) (CompressFileHandle *CFH);
This should mention that the error accessible in error_func() applies (only) to
write_func() ?
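The calling convention under discussion (bool-returning write_func, with get_error_func describing the last failure) can be sketched as follows; MiniFileHandle and its fields are hypothetical miniatures of the real CompressFileHandle, not the actual pg_dump struct:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical miniature of the compress file handle API after the v3
 * type-handling patch: write_func reports success as a bool, and
 * get_error_func returns a description of the last recorded failure.
 */
typedef struct MiniFileHandle
{
	bool		(*write_func) (const void *ptr, size_t size,
							   struct MiniFileHandle *CFH);
	const char *(*get_error_func) (struct MiniFileHandle *CFH);
	const char *last_error;
	size_t		written;
	size_t		capacity;
} MiniFileHandle;

static bool
mini_write(const void *ptr, size_t size, MiniFileHandle *CFH)
{
	(void) ptr;
	if (CFH->written + size > CFH->capacity)
	{
		CFH->last_error = "out of space";
		return false;			/* callers test the bool, not a byte count */
	}
	CFH->written += size;
	return true;
}

static const char *
mini_get_error(MiniFileHandle *CFH)
{
	return CFH->last_error ? CFH->last_error : "no error";
}
```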
As long as this touches pg_backup_directory.c you could update the
header comment to refer to "compressed extensions", not just .gz.
I noticed that EndCompressorLZ4() tests "if (LZ4cs)", but that should
always be true.
I was able to convert the zstd patch to this new API with no issue.
--
Justin
Hi,
I looked at this again, and I realized I misunderstood the bit about
errno in LZ4File_open_write. I now see it simply brings the
function in line with Gzip_open_write(), so that the callers can just do
pg_fatal("%m"). I still think the special "errno" handling in this one
place feels a bit random, and handling it by get_error_func() would be
nicer, but we can leave that for a separate patch - no need to block
these changes because of that.
So pushed all three parts, after updating the commit messages a bit.
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.
On 3/20/23 23:40, Justin Pryzby wrote:
On Fri, Mar 17, 2023 at 03:43:58PM +0000, gkokolatos@pm.me wrote:
From a174cdff4ec8aad59f5bcc7e8d52218a14fe56fc Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 17 Mar 2023 14:45:58 +0000
Subject: [PATCH v3 1/3] Improve type handling in pg_dump's compress file API

-int
+bool
 EndCompressFileHandle(CompressFileHandle *CFH)
 {
-    int ret = 0;
+    bool ret = 0;

Should say "= false" ?
Right, fixed.
 /*
  * Write 'size' bytes of data into the file from 'ptr'.
+ *
+ * Returns true on success and false on error.
+ */
+ bool (*write_func) (const void *ptr, size_t size,

- * Get a pointer to a string that describes an error that occurred during a
- * compress file handle operation.
+ * Get a pointer to a string that describes an error that occurred during
+ * a compress file handle operation.
  */
 const char *(*get_error_func) (CompressFileHandle *CFH);

This should mention that the error accessible in error_func() applies (only) to
write_func() ?

As long as this touches pg_backup_directory.c you could update the
header comment to refer to "compressed extensions", not just .gz.

I noticed that EndCompressorLZ4() tests "if (LZ4cs)", but that should
always be true.
I haven't done these two things. We can/should do that, but it didn't
fit into the three patches.
I was able to convert the zstd patch to this new API with no issue.
Good to hear.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Thursday, March 23rd, 2023 at 6:10 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
So pushed all three parts, after updating the commit messages a bit.
Thank you very much.
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.
Please expect promptly a patch for the switch to frames.
Cheers,
//Georgios
------- Original Message -------
On Thursday, March 16th, 2023 at 11:30 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 3/16/23 01:20, Justin Pryzby wrote:
On Mon, Mar 13, 2023 at 10:47:12PM +0100, Tomas Vondra wrote:
Thanks. I don't want to annoy you too much, but could you split the
patch into the "empty-data" fix and all the other changes (rearranging
functions etc.)? I'd rather not mix those in the same commit.I don't know if that makes sense? The "empty-data" fix creates a new
function called DeflateCompressorInit(). My proposal was to add the new
function in the same place in the file as it used to be.Got it. In that case I agree it's fine to do that in a single commit.
For what it's worth, I think this patch should get a +1 and get in. It
solves the empty-writes problem and includes a test for a previously
untested case.
Cheers,
//Georgios
The patch also moves the pg_fatal() that's being removed. I don't think
it's going to look any cleaner to read a history involving the
pg_fatal() first being added, then moved, then removed. Anyway, I'll
wait while the community continues discussion about the pg_fatal().

I think the agreement was to replace the pg_fatal with an assert, and I
see your patch already does that.

regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Friday, March 24th, 2023 at 10:30 AM, gkokolatos@pm.me <gkokolatos@pm.me> wrote:
------- Original Message -------
On Thursday, March 23rd, 2023 at 6:10 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.

Please expect promptly a patch for the switch to frames.
Please find the expected patch attached. Note that the bulk of the
patch is code unification, variable renaming to something more
appropriate, and comment addition. These are changes that are not
strictly necessary to switch to LZ4F. I do believe they are
essential for code hygiene after the switch and they do belong
on the same commit.
Cheers,
//Georgios
Cheers,
//Georgios
Attachments:
Attachment: v1-0001-Use-LZ4-frames-in-pg_dump-s-compressor-API.patch (text/x-patch)
From c289fb8d49b680ad180a44b20fff1dc9553b7494 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Tue, 28 Mar 2023 15:48:06 +0000
Subject: [PATCH v1] Use LZ4 frames in pg_dump's compressor API.
This change allows for greater compression of data, especially in very narrow
relations, by avoiding at least a compression header and footer per row. Since
LZ4 frames are now used by both compression APIs, some code deduplication
opportunities have become obvious and are also implemented.
Reported by: Justin Pryzby
---
src/bin/pg_dump/compress_lz4.c | 358 ++++++++++++++++++++++-----------
1 file changed, 244 insertions(+), 114 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index fc2f4e116d..078dc35cd6 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -17,7 +17,6 @@
#include "compress_lz4.h"
#ifdef USE_LZ4
-#include <lz4.h>
#include <lz4frame.h>
/*
@@ -29,102 +28,279 @@
#endif
/*----------------------
- * Compressor API
- *----------------------
+ * Common to both APIs
*/
-typedef struct LZ4CompressorState
+/*
+ * State used for LZ4 (de)compression by both APIs.
+ */
+typedef struct LZ4State
{
- char *outbuf;
- size_t outsize;
-} LZ4CompressorState;
+ /*
+ * Used by the File API to keep track of the file stream.
+ */
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ /*
+ * Used by the File API's lazy initialization.
+ */
+ bool inited;
+
+ /*
+ * Used by the File API to distinguish between compression
+ * and decompression operations.
+ */
+ bool compressing;
+
+ /*
+ * Used by the Compressor API to mark if the compression
+ * headers have been written after initialization.
+ */
+ bool needs_header_flush;
+
+ size_t buflen;
+ char *buffer;
+
+ /*
+ * Used by the File API to store already uncompressed
+ * data that the caller has not consumed.
+ */
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ /*
+ * Used by both APIs to keep track of the compressed
+ * data length stored in the buffer.
+ */
+ size_t compressedlen;
+
+ /*
+ * Used by both APIs to keep track of error codes.
+ */
+ size_t errcode;
+} LZ4State;
+
+/*
+ * Initialize the required LZ4State members for compression. Write the LZ4 frame
+ * header in a buffer keeping track of its length. Users of this function can
+ * choose when and how to write the header to a file stream.
+ *
+ * Returns true on success. In case of a failure returns false, and stores the
+ * error code in state->errcode.
+ */
+static bool
+LZ4_compression_state_init(LZ4State *state)
+{
+ size_t status;
+
+ state->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &state->prefs);
+
+ /*
+ * LZ4F_compressBegin requires a buffer that is greater or equal to
+ * LZ4F_HEADER_SIZE_MAX. Verify that the requirement is met.
+ */
+ if (state->buflen < LZ4F_HEADER_SIZE_MAX)
+ state->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&state->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ state->errcode = status;
+ return false;
+ }
+
+ state->buffer = pg_malloc(state->buflen);
+ status = LZ4F_compressBegin(state->ctx,
+ state->buffer, state->buflen,
+ &state->prefs);
+ if (LZ4F_isError(status))
+ {
+ state->errcode = status;
+ return false;
+ }
+
+ state->compressedlen = status;
+
+ return true;
+}
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
/* Private routines that support LZ4 compressed data I/O */
-static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
-static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
-static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
static void
ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
{
- LZ4_streamDecode_t lz4StreamDecode;
- char *buf;
- char *decbuf;
- size_t buflen;
- size_t cnt;
-
- buflen = DEFAULT_IO_BUFFER_SIZE;
- buf = pg_malloc(buflen);
- decbuf = pg_malloc(buflen);
+ size_t r;
+ size_t readbuflen;
+ char *outbuf;
+ char *readbuf;
+ LZ4F_decompressionContext_t ctx = NULL;
+ LZ4F_decompressOptions_t dec_opt;
+ LZ4F_errorCode_t status;
+
+ memset(&dec_opt, 0, sizeof(dec_opt));
+ status = LZ4F_createDecompressionContext(&ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ pg_fatal("could not create LZ4 decompression context: %s",
+ LZ4F_getErrorName(status));
+
+ outbuf = pg_malloc0(DEFAULT_IO_BUFFER_SIZE);
+ readbuf = pg_malloc0(DEFAULT_IO_BUFFER_SIZE);
+ readbuflen = DEFAULT_IO_BUFFER_SIZE;
+ while ((r = cs->readF(AH, &readbuf, &readbuflen)) > 0)
+ {
+ char *readp;
+ char *readend;
- LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+ /* Process one chunk */
+ readp = readbuf;
+ readend = readbuf + r;
+ while (readp < readend)
+ {
+ size_t out_size = DEFAULT_IO_BUFFER_SIZE;
+ size_t read_size = readend - readp;
- while ((cnt = cs->readF(AH, &buf, &buflen)))
- {
- int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
- buf, decbuf,
- cnt, buflen);
+ memset(outbuf, 0, DEFAULT_IO_BUFFER_SIZE);
+ status = LZ4F_decompress(ctx, outbuf, &out_size,
+ readp, &read_size, &dec_opt);
+ if (LZ4F_isError(status))
+ pg_fatal("could not decompress: %s",
+ LZ4F_getErrorName(status));
- ahwrite(decbuf, 1, decBytes, AH);
+ ahwrite(outbuf, 1, out_size, AH);
+ readp += read_size;
+ }
}
- pg_free(buf);
- pg_free(decbuf);
+ pg_free(outbuf);
+ pg_free(readbuf);
+
+ status = LZ4F_freeDecompressionContext(ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("could not free LZ4 decompression context: %s",
+ LZ4F_getErrorName(status));
}
static void
WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
- LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
- size_t compressed;
- size_t requiredsize = LZ4_compressBound(dLen);
+ LZ4State *state = (LZ4State *) cs->private_data;
+ size_t remaining = dLen;
+ size_t status;
+ size_t chunk;
- if (requiredsize > LZ4cs->outsize)
+ /* Write the header if not yet written. */
+ if (state->needs_header_flush)
{
- LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
- LZ4cs->outsize = requiredsize;
+ cs->writeF(AH, state->buffer, state->compressedlen);
+ state->needs_header_flush = false;
}
- compressed = LZ4_compress_default(data, LZ4cs->outbuf,
- dLen, LZ4cs->outsize);
+ while (remaining > 0)
+ {
+
+ if (remaining > DEFAULT_IO_BUFFER_SIZE)
+ chunk = DEFAULT_IO_BUFFER_SIZE;
+ else
+ chunk = remaining;
+
+ remaining -= chunk;
+ status = LZ4F_compressUpdate(state->ctx,
+ state->buffer, state->buflen,
+ data, chunk, NULL);
+
+ if (LZ4F_isError(status))
+ pg_fatal("failed to LZ4 compress data: %s",
+ LZ4F_getErrorName(status));
- if (compressed <= 0)
- pg_fatal("failed to LZ4 compress data");
+ cs->writeF(AH, state->buffer, status);
- cs->writeF(AH, LZ4cs->outbuf, compressed);
+ data = ((char *) data) + chunk;
+ }
}
static void
EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
{
- LZ4CompressorState *LZ4cs;
-
- LZ4cs = (LZ4CompressorState *) cs->private_data;
- if (LZ4cs)
- {
- pg_free(LZ4cs->outbuf);
- pg_free(LZ4cs);
- cs->private_data = NULL;
- }
+ LZ4State *state = (LZ4State *) cs->private_data;
+ size_t status;
+
+ /* Nothing needs to be done */
+ if (!state)
+ return;
+
+ /*
+ * Write the header if not yet written. The caller is not required to
+ * call writeData if the relation does not contain any data. Thus it is
+ * possible to reach here without having flushed the header. Do it before
+ * ending the compression.
+ */
+ if (state->needs_header_flush)
+ cs->writeF(AH, state->buffer, state->compressedlen);
+
+ status = LZ4F_compressEnd(state->ctx,
+ state->buffer, state->buflen,
+ NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+
+ cs->writeF(AH, state->buffer, status);
+
+ status = LZ4F_freeCompressionContext(state->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+
+ pg_free(state->buffer);
+ pg_free(state);
+
+ cs->private_data = NULL;
}
-
/*
* Public routines that support LZ4 compressed data I/O
*/
void
InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
{
+ LZ4State *state;
+
cs->readData = ReadDataFromArchiveLZ4;
cs->writeData = WriteDataToArchiveLZ4;
cs->end = EndCompressorLZ4;
cs->compression_spec = compression_spec;
- /* Will be lazy init'd */
- cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+ /*
+ * Read operations have access to the whole input. No state needs
+ * to be carried between calls.
+ */
+ if (cs->readF)
+ return;
+
+ state = pg_malloc0(sizeof(*state));
+ if (cs->compression_spec.level >= 0)
+ state->prefs.compressionLevel = cs->compression_spec.level;
+
+ if (!LZ4_compression_state_init(state))
+ pg_fatal("could not initialize LZ4 compression: %s",
+ LZ4F_getErrorName(state->errcode));
+
+ /* Remember that the header has not been written. */
+ state->needs_header_flush = true;
+ cs->private_data = state;
}
/*----------------------
@@ -132,30 +308,6 @@ InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compressi
*----------------------
*/
-/*
- * State needed for LZ4 (de)compression using the CompressFileHandle API.
- */
-typedef struct LZ4File
-{
- FILE *fp;
-
- LZ4F_preferences_t prefs;
-
- LZ4F_compressionContext_t ctx;
- LZ4F_decompressionContext_t dtx;
-
- bool inited;
- bool compressing;
-
- size_t buflen;
- char *buffer;
-
- size_t overflowalloclen;
- size_t overflowlen;
- char *overflowbuf;
-
- size_t errcode;
-} LZ4File;
/*
* LZ4 equivalent to feof() or gzeof(). Return true iff there is no
@@ -165,7 +317,7 @@ typedef struct LZ4File
static bool
LZ4File_eof(CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
return fs->overflowlen == 0 && feof(fs->fp);
}
@@ -173,7 +325,7 @@ LZ4File_eof(CompressFileHandle *CFH)
static const char *
LZ4File_get_error(CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
const char *errmsg;
if (LZ4F_isError(fs->errcode))
@@ -185,7 +337,7 @@ LZ4File_get_error(CompressFileHandle *CFH)
}
/*
- * Prepare an already alloc'ed LZ4File struct for subsequent calls (either
+ * Prepare an already alloc'ed LZ4State struct for subsequent calls (either
* compression or decompression).
*
* It creates the necessary contexts for the operations. When compressing data
@@ -196,7 +348,7 @@ LZ4File_get_error(CompressFileHandle *CFH)
* error code in fs->errcode.
*/
static bool
-LZ4File_init(LZ4File *fs, int size, bool compressing)
+LZ4File_init(LZ4State *fs, int size, bool compressing)
{
size_t status;
@@ -209,33 +361,11 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
/* When compressing, write LZ4 header to the output stream. */
if (fs->compressing)
{
- fs->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &fs->prefs);
- /*
- * LZ4F_compressBegin requires a buffer that is greater or equal to
- * LZ4F_HEADER_SIZE_MAX. Verify that the requirement is met.
- */
- if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
- fs->buflen = LZ4F_HEADER_SIZE_MAX;
-
- status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
- if (LZ4F_isError(status))
- {
- fs->errcode = status;
- return false;
- }
-
- fs->buffer = pg_malloc(fs->buflen);
- status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
- &fs->prefs);
-
- if (LZ4F_isError(status))
- {
- fs->errcode = status;
+ if (!LZ4_compression_state_init(fs))
return false;
- }
- if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ if (fwrite(fs->buffer, 1, fs->compressedlen, fs->fp) != fs->compressedlen)
{
errno = (errno) ? errno : ENOSPC;
return false;
@@ -272,7 +402,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
* the 'ptr' buffer), or 0 if the overflow buffer is empty.
*/
static int
-LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
+LZ4File_read_overflow(LZ4State *fs, void *ptr, int size, bool eol_flag)
{
char *p;
int readlen = 0;
@@ -306,7 +436,7 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
* char if found first when the eol_flag is set. It is possible that the
* decompressed output generated by reading any compressed input via the
* LZ4F API, exceeds 'ptrsize'. Any exceeding decompressed content is stored
- * at an overflow buffer within LZ4File. Of course, when the function is
+ * at an overflow buffer within LZ4State. Of course, when the function is
* called, it will first try to consume any decompressed content already
* present in the overflow buffer, before decompressing new content.
*
@@ -314,7 +444,7 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
* buffer, or -1 in case of error.
*/
static int
-LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
+LZ4File_read_internal(LZ4State *fs, void *ptr, int ptrsize, bool eol_flag)
{
int dsize = 0;
int rsize;
@@ -425,7 +555,7 @@ LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
static bool
LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
size_t status;
int remaining = size;
@@ -463,7 +593,7 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
static bool
LZ4File_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
int ret;
if ((ret = LZ4File_read_internal(fs, ptr, size, false)) < 0)
@@ -481,7 +611,7 @@ LZ4File_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
static int
LZ4File_getc(CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
unsigned char c;
if (LZ4File_read_internal(fs, &c, 1, false) <= 0)
@@ -501,7 +631,7 @@ LZ4File_getc(CompressFileHandle *CFH)
static char *
LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
int ret;
ret = LZ4File_read_internal(fs, ptr, size, true);
@@ -523,7 +653,7 @@ static bool
LZ4File_close(CompressFileHandle *CFH)
{
FILE *fp;
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
size_t status;
fp = fs->fp;
@@ -568,7 +698,7 @@ LZ4File_open(const char *path, int fd, const char *mode,
CompressFileHandle *CFH)
{
FILE *fp;
- LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+ LZ4State *lz4fp = (LZ4State *) CFH->private_data;
if (fd >= 0)
fp = fdopen(fd, mode);
@@ -609,7 +739,7 @@ void
InitCompressFileHandleLZ4(CompressFileHandle *CFH,
const pg_compress_specification compression_spec)
{
- LZ4File *lz4fp;
+ LZ4State *lz4fp;
CFH->open_func = LZ4File_open;
CFH->open_write_func = LZ4File_open_write;
--
2.34.1
On 3/28/23 18:07, gkokolatos@pm.me wrote:
------- Original Message -------
On Friday, March 24th, 2023 at 10:30 AM, gkokolatos@pm.me <gkokolatos@pm.me> wrote:
------- Original Message -------
On Thursday, March 23rd, 2023 at 6:10 PM, Tomas Vondra tomas.vondra@enterprisedb.com wrote:
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.
Please expect promptly a patch for the switch to frames.
Please find the expected patch attached. Note that the bulk of the
patch is code unification, variable renaming to something more
appropriate, and comment addition. These are changes that are not
strictly necessary to switch to LZ4F. I do believe they are
essential for code hygiene after the switch and they do belong
on the same commit.
Thanks!
I agree the renames & cleanup are appropriate - it'd be silly to stick
to misleading naming etc. Would it make sense to split the patch into
two, to separate the renames and the switch to lz4f?
That'd make the changes necessary for the lz4f switch clearer.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Mar 28, 2023 at 06:40:03PM +0200, Tomas Vondra wrote:
On 3/28/23 18:07, gkokolatos@pm.me wrote:
------- Original Message -------
On Friday, March 24th, 2023 at 10:30 AM, gkokolatos@pm.me <gkokolatos@pm.me> wrote:
------- Original Message -------
On Thursday, March 23rd, 2023 at 6:10 PM, Tomas Vondra tomas.vondra@enterprisedb.com wrote:
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.
Please expect promptly a patch for the switch to frames.
Please find the expected patch attached. Note that the bulk of the
patch is code unification, variable renaming to something more
appropriate, and comment addition. These are changes that are not
strictly necessary to switch to LZ4F. I do believe they are
essential for code hygiene after the switch and they do belong
on the same commit.
Thanks!
I agree the renames & cleanup are appropriate - it'd be silly to stick
to misleading naming etc. Would it make sense to split the patch into
two, to separate the renames and the switch to lz4f?
That'd make the changes necessary for the lz4f switch clearer.
I don't think so. Did you mean separate commits only for review ?
The patch is pretty readable - the File API has just some renames, and
the compressor API is what's being replaced, which isn't going to be any
more clear.
@Georgios: did you consider using a C union in LZ4State, to separate the
parts used by the different APIs?
--
Justin
On 3/28/23 18:07, gkokolatos@pm.me wrote:
------- Original Message -------
On Friday, March 24th, 2023 at 10:30 AM, gkokolatos@pm.me <gkokolatos@pm.me> wrote:
------- Original Message -------
On Thursday, March 23rd, 2023 at 6:10 PM, Tomas Vondra tomas.vondra@enterprisedb.com wrote:
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.
Please expect promptly a patch for the switch to frames.
Please find the expected patch attached. Note that the bulk of the
patch is code unification, variable renaming to something more
appropriate, and comment addition. These are changes that are not
strictly necessary to switch to LZ4F. I do believe they are
essential for code hygiene after the switch and they do belong
on the same commit.
I think the patch is fine, but I'm wondering if the renames shouldn't go
a bit further. It removes references to the LZ4File struct, but there's a
bunch of functions with the LZ4File_ prefix. Why not simply use the LZ4_
prefix? We don't have GzipFile either.
Sure, it might be a bit confusing because lz4.h uses LZ4_ prefix, but
then we probably should not define LZ4_compressor_init ...
Also, maybe the comments shouldn't use "File API" when compress_io.c
calls that "Compressed stream API".
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 3/28/23 00:34, gkokolatos@pm.me wrote:
...
Got it. In that case I agree it's fine to do that in a single commit.
For what it's worth, I think that this patch should get a +1 and get in. It
solves the empty writes problem and includes a test to a previous untested
case.
Pushed, after updating / rewording the commit message a little bit.
Thanks!
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Wednesday, March 29th, 2023 at 12:02 AM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 3/28/23 18:07, gkokolatos@pm.me wrote:
------- Original Message -------
On Friday, March 24th, 2023 at 10:30 AM, gkokolatos@pm.me gkokolatos@pm.me wrote:
------- Original Message -------
On Thursday, March 23rd, 2023 at 6:10 PM, Tomas Vondra tomas.vondra@enterprisedb.com wrote:
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.
Please expect promptly a patch for the switch to frames.
Please find the expected patch attached. Note that the bulk of the
patch is code unification, variable renaming to something more
appropriate, and comment addition. These are changes that are not
strictly necessary to switch to LZ4F. I do believe they are
essential for code hygiene after the switch and they do belong
on the same commit.
I think the patch is fine, but I'm wondering if the renames shouldn't go
a bit further. It removes references to the LZ4File struct, but there's a
bunch of functions with the LZ4File_ prefix. Why not simply use the LZ4_
prefix? We don't have GzipFile either.
Sure, it might be a bit confusing because lz4.h uses the LZ4_ prefix, but
then we probably should not define LZ4_compressor_init ...
This is a good point. The initial thought was that since lz4.h is now
removed, such ambiguity will not be present. In v2 of the patch the
function is renamed to `LZ4State_compression_init` since this name
better describes its purpose. It initializes the LZ4State for
compression.
As for the LZ4File_ prefix, I have no objections. Please find the
prefix changed to LZ4Stream_. For the record, the word 'File' is not
unique to the lz4 implementation. The common data structure used by
the API in compress_io.h:
typedef struct CompressFileHandle CompressFileHandle;
The public functions for this API are named:
InitCompressFileHandle
InitDiscoverCompressFileHandle
EndCompressFileHandle
And within InitCompressFileHandle the pattern is:
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
InitCompressFileHandleNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressFileHandleGzip(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
InitCompressFileHandleLZ4(CFH, compression_spec);
It was felt that a prefix was required due to the inclusion of the 'lz4.h'
header, where naming functions as 'LZ4_' would be wrong. The prefix
'LZ4File_' seemed to be in line with the naming of the rest of
the relevant functions and structures. Other compressions, gzip and
none, did not face the same issue.
To conclude, I think that having a prefix is slightly preferred
over not having one. Since the prefix `LZ4File_` is not desired,
I propose `LZ4Stream_` in v2.
I will not object to dismissing the argument and drop `File` from
the prefix, if so requested.
Also, maybe the comments shouldn't use "File API" when compress_io.c
calls that "Compressed stream API".
Done.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v2-0001-Use-LZ4-frames-in-pg_dump-s-compressor-API.patch (text/x-patch)
From b17b60cc1ff608f85c6c75ab19ad40c0863cfa93 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 31 Mar 2023 09:16:52 +0000
Subject: [PATCH v2] Use LZ4 frames in pg_dump's compressor API.
This change allows for greater compaction of data, especially in very narrow
relations, by avoiding at least a compression header and footer per row. Since
LZ4 frames are now used by both compression APIs, some code deduplication
opportunities have become obvious and are also implemented.
While at it, rename LZ4File* functions to LZ4Stream* to improve readability.
Reported by: Justin Pryzby
---
src/bin/pg_dump/compress_lz4.c | 420 +++++++++++++++++++++------------
1 file changed, 275 insertions(+), 145 deletions(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index fc2f4e116d..7023b11a2c 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -17,7 +17,6 @@
#include "compress_lz4.h"
#ifdef USE_LZ4
-#include <lz4.h>
#include <lz4frame.h>
/*
@@ -29,133 +28,286 @@
#endif
/*----------------------
- * Compressor API
- *----------------------
+ * Common to both APIs
*/
-typedef struct LZ4CompressorState
+/*
+ * State used for LZ4 (de)compression by both APIs.
+ */
+typedef struct LZ4State
{
- char *outbuf;
- size_t outsize;
-} LZ4CompressorState;
+ /*
+ * Used by the Stream API to keep track of the file stream.
+ */
+ FILE *fp;
+
+ LZ4F_preferences_t prefs;
+
+ LZ4F_compressionContext_t ctx;
+ LZ4F_decompressionContext_t dtx;
+
+ /*
+ * Used by the Stream API's lazy initialization.
+ */
+ bool inited;
+
+ /*
+ * Used by the Stream API to distinguish between compression
+ * and decompression operations.
+ */
+ bool compressing;
+
+ /*
+ * Used by the Compressor API to mark if the compression
+ * headers have been written after initialization.
+ */
+ bool needs_header_flush;
+
+ size_t buflen;
+ char *buffer;
+
+ /*
+ * Used by the Stream API to store already uncompressed
+ * data that the caller has not consumed.
+ */
+ size_t overflowalloclen;
+ size_t overflowlen;
+ char *overflowbuf;
+
+ /*
+ * Used by both APIs to keep track of the compressed
+ * data length stored in the buffer.
+ */
+ size_t compressedlen;
+
+ /*
+ * Used by both APIs to keep track of error codes.
+ */
+ size_t errcode;
+} LZ4State;
+
+/*
+ * Initialize the required LZ4State members for compression. Write the LZ4 frame
+ * header in a buffer keeping track of its length. Users of this function can
+ * choose when and how to write the header to a file stream.
+ *
+ * Returns true on success. In case of a failure returns false, and stores the
+ * error code in state->errcode.
+ */
+static bool
+LZ4State_compression_init(LZ4State *state)
+{
+ size_t status;
+
+ state->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &state->prefs);
+
+ /*
+ * LZ4F_compressBegin requires a buffer that is greater or equal to
+ * LZ4F_HEADER_SIZE_MAX. Verify that the requirement is met.
+ */
+ if (state->buflen < LZ4F_HEADER_SIZE_MAX)
+ state->buflen = LZ4F_HEADER_SIZE_MAX;
+
+ status = LZ4F_createCompressionContext(&state->ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ {
+ state->errcode = status;
+ return false;
+ }
+
+ state->buffer = pg_malloc(state->buflen);
+ status = LZ4F_compressBegin(state->ctx,
+ state->buffer, state->buflen,
+ &state->prefs);
+ if (LZ4F_isError(status))
+ {
+ state->errcode = status;
+ return false;
+ }
+
+ state->compressedlen = status;
+
+ return true;
+}
+
+/*----------------------
+ * Compressor API
+ *----------------------
+ */
/* Private routines that support LZ4 compressed data I/O */
-static void ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs);
-static void WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
- const void *data, size_t dLen);
-static void EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs);
static void
ReadDataFromArchiveLZ4(ArchiveHandle *AH, CompressorState *cs)
{
- LZ4_streamDecode_t lz4StreamDecode;
- char *buf;
- char *decbuf;
- size_t buflen;
- size_t cnt;
-
- buflen = DEFAULT_IO_BUFFER_SIZE;
- buf = pg_malloc(buflen);
- decbuf = pg_malloc(buflen);
+ size_t r;
+ size_t readbuflen;
+ char *outbuf;
+ char *readbuf;
+ LZ4F_decompressionContext_t ctx = NULL;
+ LZ4F_decompressOptions_t dec_opt;
+ LZ4F_errorCode_t status;
+
+ memset(&dec_opt, 0, sizeof(dec_opt));
+ status = LZ4F_createDecompressionContext(&ctx, LZ4F_VERSION);
+ if (LZ4F_isError(status))
+ pg_fatal("could not create LZ4 decompression context: %s",
+ LZ4F_getErrorName(status));
+
+ outbuf = pg_malloc0(DEFAULT_IO_BUFFER_SIZE);
+ readbuf = pg_malloc0(DEFAULT_IO_BUFFER_SIZE);
+ readbuflen = DEFAULT_IO_BUFFER_SIZE;
+ while ((r = cs->readF(AH, &readbuf, &readbuflen)) > 0)
+ {
+ char *readp;
+ char *readend;
- LZ4_setStreamDecode(&lz4StreamDecode, NULL, 0);
+ /* Process one chunk */
+ readp = readbuf;
+ readend = readbuf + r;
+ while (readp < readend)
+ {
+ size_t out_size = DEFAULT_IO_BUFFER_SIZE;
+ size_t read_size = readend - readp;
- while ((cnt = cs->readF(AH, &buf, &buflen)))
- {
- int decBytes = LZ4_decompress_safe_continue(&lz4StreamDecode,
- buf, decbuf,
- cnt, buflen);
+ memset(outbuf, 0, DEFAULT_IO_BUFFER_SIZE);
+ status = LZ4F_decompress(ctx, outbuf, &out_size,
+ readp, &read_size, &dec_opt);
+ if (LZ4F_isError(status))
+ pg_fatal("could not decompress: %s",
+ LZ4F_getErrorName(status));
- ahwrite(decbuf, 1, decBytes, AH);
+ ahwrite(outbuf, 1, out_size, AH);
+ readp += read_size;
+ }
}
- pg_free(buf);
- pg_free(decbuf);
+ pg_free(outbuf);
+ pg_free(readbuf);
+
+ status = LZ4F_freeDecompressionContext(ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("could not free LZ4 decompression context: %s",
+ LZ4F_getErrorName(status));
}
static void
WriteDataToArchiveLZ4(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen)
{
- LZ4CompressorState *LZ4cs = (LZ4CompressorState *) cs->private_data;
- size_t compressed;
- size_t requiredsize = LZ4_compressBound(dLen);
+ LZ4State *state = (LZ4State *) cs->private_data;
+ size_t remaining = dLen;
+ size_t status;
+ size_t chunk;
- if (requiredsize > LZ4cs->outsize)
+ /* Write the header if not yet written. */
+ if (state->needs_header_flush)
{
- LZ4cs->outbuf = pg_realloc(LZ4cs->outbuf, requiredsize);
- LZ4cs->outsize = requiredsize;
+ cs->writeF(AH, state->buffer, state->compressedlen);
+ state->needs_header_flush = false;
}
- compressed = LZ4_compress_default(data, LZ4cs->outbuf,
- dLen, LZ4cs->outsize);
+ while (remaining > 0)
+ {
+
+ if (remaining > DEFAULT_IO_BUFFER_SIZE)
+ chunk = DEFAULT_IO_BUFFER_SIZE;
+ else
+ chunk = remaining;
+
+ remaining -= chunk;
+ status = LZ4F_compressUpdate(state->ctx,
+ state->buffer, state->buflen,
+ data, chunk, NULL);
+
+ if (LZ4F_isError(status))
+ pg_fatal("failed to LZ4 compress data: %s",
+ LZ4F_getErrorName(status));
- if (compressed <= 0)
- pg_fatal("failed to LZ4 compress data");
+ cs->writeF(AH, state->buffer, status);
- cs->writeF(AH, LZ4cs->outbuf, compressed);
+ data = ((char *) data) + chunk;
+ }
}
static void
EndCompressorLZ4(ArchiveHandle *AH, CompressorState *cs)
{
- LZ4CompressorState *LZ4cs;
-
- LZ4cs = (LZ4CompressorState *) cs->private_data;
- if (LZ4cs)
- {
- pg_free(LZ4cs->outbuf);
- pg_free(LZ4cs);
- cs->private_data = NULL;
- }
+ LZ4State *state = (LZ4State *) cs->private_data;
+ size_t status;
+
+ /* Nothing needs to be done */
+ if (!state)
+ return;
+
+ /*
+ * Write the header if not yet written. The caller is not required to
+ * call writeData if the relation does not contain any data. Thus it is
+ * possible to reach here without having flushed the header. Do it before
+ * ending the compression.
+ */
+ if (state->needs_header_flush)
+ cs->writeF(AH, state->buffer, state->compressedlen);
+
+ status = LZ4F_compressEnd(state->ctx,
+ state->buffer, state->buflen,
+ NULL);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+
+ cs->writeF(AH, state->buffer, status);
+
+ status = LZ4F_freeCompressionContext(state->ctx);
+ if (LZ4F_isError(status))
+ pg_fatal("failed to end compression: %s",
+ LZ4F_getErrorName(status));
+
+ pg_free(state->buffer);
+ pg_free(state);
+
+ cs->private_data = NULL;
}
-
/*
* Public routines that support LZ4 compressed data I/O
*/
void
InitCompressorLZ4(CompressorState *cs, const pg_compress_specification compression_spec)
{
+ LZ4State *state;
+
cs->readData = ReadDataFromArchiveLZ4;
cs->writeData = WriteDataToArchiveLZ4;
cs->end = EndCompressorLZ4;
cs->compression_spec = compression_spec;
- /* Will be lazy init'd */
- cs->private_data = pg_malloc0(sizeof(LZ4CompressorState));
+ /*
+ * Read operations have access to the whole input. No state needs
+ * to be carried between calls.
+ */
+ if (cs->readF)
+ return;
+
+ state = pg_malloc0(sizeof(*state));
+ if (cs->compression_spec.level >= 0)
+ state->prefs.compressionLevel = cs->compression_spec.level;
+
+ if (!LZ4State_compression_init(state))
+ pg_fatal("could not initialize LZ4 compression: %s",
+ LZ4F_getErrorName(state->errcode));
+
+ /* Remember that the header has not been written. */
+ state->needs_header_flush = true;
+ cs->private_data = state;
}
/*----------------------
- * Compress File API
+ * Compress Stream API
*----------------------
*/
-/*
- * State needed for LZ4 (de)compression using the CompressFileHandle API.
- */
-typedef struct LZ4File
-{
- FILE *fp;
-
- LZ4F_preferences_t prefs;
-
- LZ4F_compressionContext_t ctx;
- LZ4F_decompressionContext_t dtx;
-
- bool inited;
- bool compressing;
-
- size_t buflen;
- char *buffer;
-
- size_t overflowalloclen;
- size_t overflowlen;
- char *overflowbuf;
-
- size_t errcode;
-} LZ4File;
/*
* LZ4 equivalent to feof() or gzeof(). Return true iff there is no
@@ -163,17 +315,17 @@ typedef struct LZ4File
* is reached.
*/
static bool
-LZ4File_eof(CompressFileHandle *CFH)
+LZ4Stream_eof(CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
return fs->overflowlen == 0 && feof(fs->fp);
}
static const char *
-LZ4File_get_error(CompressFileHandle *CFH)
+LZ4Stream_get_error(CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
const char *errmsg;
if (LZ4F_isError(fs->errcode))
@@ -185,7 +337,7 @@ LZ4File_get_error(CompressFileHandle *CFH)
}
/*
- * Prepare an already alloc'ed LZ4File struct for subsequent calls (either
+ * Prepare an already alloc'ed LZ4State struct for subsequent calls (either
* compression or decompression).
*
* It creates the necessary contexts for the operations. When compressing data
@@ -196,7 +348,7 @@ LZ4File_get_error(CompressFileHandle *CFH)
* error code in fs->errcode.
*/
static bool
-LZ4File_init(LZ4File *fs, int size, bool compressing)
+LZ4Stream_init(LZ4State *fs, int size, bool compressing)
{
size_t status;
@@ -209,33 +361,11 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
/* When compressing, write LZ4 header to the output stream. */
if (fs->compressing)
{
- fs->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &fs->prefs);
- /*
- * LZ4F_compressBegin requires a buffer that is greater or equal to
- * LZ4F_HEADER_SIZE_MAX. Verify that the requirement is met.
- */
- if (fs->buflen < LZ4F_HEADER_SIZE_MAX)
- fs->buflen = LZ4F_HEADER_SIZE_MAX;
-
- status = LZ4F_createCompressionContext(&fs->ctx, LZ4F_VERSION);
- if (LZ4F_isError(status))
- {
- fs->errcode = status;
- return false;
- }
-
- fs->buffer = pg_malloc(fs->buflen);
- status = LZ4F_compressBegin(fs->ctx, fs->buffer, fs->buflen,
- &fs->prefs);
-
- if (LZ4F_isError(status))
- {
- fs->errcode = status;
+ if (!LZ4State_compression_init(fs))
return false;
- }
- if (fwrite(fs->buffer, 1, status, fs->fp) != status)
+ if (fwrite(fs->buffer, 1, fs->compressedlen, fs->fp) != fs->compressedlen)
{
errno = (errno) ? errno : ENOSPC;
return false;
@@ -272,7 +402,7 @@ LZ4File_init(LZ4File *fs, int size, bool compressing)
* the 'ptr' buffer), or 0 if the overflow buffer is empty.
*/
static int
-LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
+LZ4Stream_read_overflow(LZ4State *fs, void *ptr, int size, bool eol_flag)
{
char *p;
int readlen = 0;
@@ -306,7 +436,7 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
* char if found first when the eol_flag is set. It is possible that the
* decompressed output generated by reading any compressed input via the
* LZ4F API, exceeds 'ptrsize'. Any exceeding decompressed content is stored
- * at an overflow buffer within LZ4File. Of course, when the function is
+ * at an overflow buffer within LZ4State. Of course, when the function is
* called, it will first try to consume any decompressed content already
* present in the overflow buffer, before decompressing new content.
*
@@ -314,7 +444,7 @@ LZ4File_read_overflow(LZ4File *fs, void *ptr, int size, bool eol_flag)
* buffer, or -1 in case of error.
*/
static int
-LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
+LZ4Stream_read_internal(LZ4State *fs, void *ptr, int ptrsize, bool eol_flag)
{
int dsize = 0;
int rsize;
@@ -324,7 +454,7 @@ LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
void *readbuf;
/* Lazy init */
- if (!LZ4File_init(fs, size, false /* decompressing */ ))
+ if (!LZ4Stream_init(fs, size, false /* decompressing */ ))
return -1;
/* Verify that there is enough space in the outbuf */
@@ -335,7 +465,7 @@ LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
}
/* use already decompressed content if available */
- dsize = LZ4File_read_overflow(fs, ptr, size, eol_flag);
+ dsize = LZ4Stream_read_overflow(fs, ptr, size, eol_flag);
if (dsize == size || (eol_flag && memchr(ptr, '\n', dsize)))
return dsize;
@@ -423,14 +553,14 @@ LZ4File_read_internal(LZ4File *fs, void *ptr, int ptrsize, bool eol_flag)
* Compress size bytes from ptr and write them to the stream.
*/
static bool
-LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
+LZ4Stream_write(const void *ptr, size_t size, CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
size_t status;
int remaining = size;
/* Lazy init */
- if (!LZ4File_init(fs, size, true))
+ if (!LZ4Stream_init(fs, size, true))
return false;
while (remaining > 0)
@@ -461,13 +591,13 @@ LZ4File_write(const void *ptr, size_t size, CompressFileHandle *CFH)
* fread() equivalent implementation for LZ4 compressed files.
*/
static bool
-LZ4File_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
+LZ4Stream_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
int ret;
- if ((ret = LZ4File_read_internal(fs, ptr, size, false)) < 0)
- pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ if ((ret = LZ4Stream_read_internal(fs, ptr, size, false)) < 0)
+ pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
if (rsize)
*rsize = (size_t) ret;
@@ -479,15 +609,15 @@ LZ4File_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
* fgetc() equivalent implementation for LZ4 compressed files.
*/
static int
-LZ4File_getc(CompressFileHandle *CFH)
+LZ4Stream_getc(CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
unsigned char c;
- if (LZ4File_read_internal(fs, &c, 1, false) <= 0)
+ if (LZ4Stream_read_internal(fs, &c, 1, false) <= 0)
{
- if (!LZ4File_eof(CFH))
- pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ if (!LZ4Stream_eof(CFH))
+ pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
else
pg_fatal("could not read from input file: end of file");
}
@@ -499,14 +629,14 @@ LZ4File_getc(CompressFileHandle *CFH)
* fgets() equivalent implementation for LZ4 compressed files.
*/
static char *
-LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
+LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
{
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
int ret;
- ret = LZ4File_read_internal(fs, ptr, size, true);
- if (ret < 0 || (ret == 0 && !LZ4File_eof(CFH)))
- pg_fatal("could not read from input file: %s", LZ4File_get_error(CFH));
+ ret = LZ4Stream_read_internal(fs, ptr, size, true);
+ if (ret < 0 || (ret == 0 && !LZ4Stream_eof(CFH)))
+ pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
/* Done reading */
if (ret == 0)
@@ -520,10 +650,10 @@ LZ4File_gets(char *ptr, int size, CompressFileHandle *CFH)
* remaining content and/or generated footer from the LZ4 API.
*/
static bool
-LZ4File_close(CompressFileHandle *CFH)
+LZ4Stream_close(CompressFileHandle *CFH)
{
FILE *fp;
- LZ4File *fs = (LZ4File *) CFH->private_data;
+ LZ4State *fs = (LZ4State *) CFH->private_data;
size_t status;
fp = fs->fp;
@@ -564,11 +694,11 @@ LZ4File_close(CompressFileHandle *CFH)
}
static bool
-LZ4File_open(const char *path, int fd, const char *mode,
+LZ4Stream_open(const char *path, int fd, const char *mode,
CompressFileHandle *CFH)
{
FILE *fp;
- LZ4File *lz4fp = (LZ4File *) CFH->private_data;
+ LZ4State *lz4fp = (LZ4State *) CFH->private_data;
if (fd >= 0)
fp = fdopen(fd, mode);
@@ -586,7 +716,7 @@ LZ4File_open(const char *path, int fd, const char *mode,
}
static bool
-LZ4File_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
+LZ4Stream_open_write(const char *path, const char *mode, CompressFileHandle *CFH)
{
char *fname;
int save_errno;
@@ -609,17 +739,17 @@ void
InitCompressFileHandleLZ4(CompressFileHandle *CFH,
const pg_compress_specification compression_spec)
{
- LZ4File *lz4fp;
-
- CFH->open_func = LZ4File_open;
- CFH->open_write_func = LZ4File_open_write;
- CFH->read_func = LZ4File_read;
- CFH->write_func = LZ4File_write;
- CFH->gets_func = LZ4File_gets;
- CFH->getc_func = LZ4File_getc;
- CFH->eof_func = LZ4File_eof;
- CFH->close_func = LZ4File_close;
- CFH->get_error_func = LZ4File_get_error;
+ LZ4State *lz4fp;
+
+ CFH->open_func = LZ4Stream_open;
+ CFH->open_write_func = LZ4Stream_open_write;
+ CFH->read_func = LZ4Stream_read;
+ CFH->write_func = LZ4Stream_write;
+ CFH->gets_func = LZ4Stream_gets;
+ CFH->getc_func = LZ4Stream_getc;
+ CFH->eof_func = LZ4Stream_eof;
+ CFH->close_func = LZ4Stream_close;
+ CFH->get_error_func = LZ4Stream_get_error;
CFH->compression_spec = compression_spec;
lz4fp = pg_malloc0(sizeof(*lz4fp));
--
2.34.1
On 3/31/23 11:19, gkokolatos@pm.me wrote:
...
I think the patch is fine, but I'm wondering if the renames shouldn't go
a bit further. It removes references to the LZ4File struct, but there's a
bunch of functions with the LZ4File_ prefix. Why not simply use the LZ4_
prefix? We don't have GzipFile either.
Sure, it might be a bit confusing because lz4.h uses the LZ4_ prefix, but
then we probably should not define LZ4_compressor_init ...
This is a good point. The initial thought was that since lz4.h is now
removed, such ambiguity will not be present. In v2 of the patch the
function is renamed to `LZ4State_compression_init` since this name
better describes its purpose. It initializes the LZ4State for
compression.
As for the LZ4File_ prefix, I have no objections. Please find the
prefix changed to LZ4Stream_. For the record, the word 'File' is not
unique to the lz4 implementation. The common data structure used by
the API in compress_io.h:
typedef struct CompressFileHandle CompressFileHandle;
The public functions for this API are named:
InitCompressFileHandle
InitDiscoverCompressFileHandle
EndCompressFileHandle
And within InitCompressFileHandle the pattern is:
if (compression_spec.algorithm == PG_COMPRESSION_NONE)
InitCompressFileHandleNone(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_GZIP)
InitCompressFileHandleGzip(CFH, compression_spec);
else if (compression_spec.algorithm == PG_COMPRESSION_LZ4)
InitCompressFileHandleLZ4(CFH, compression_spec);
It was felt that a prefix was required due to the inclusion of the 'lz4.h'
header, where naming functions as 'LZ4_' would be wrong. The prefix
'LZ4File_' seemed to be in line with the naming of the rest of
the relevant functions and structures. Other compressions, gzip and
none, did not face the same issue.
To conclude, I think that having a prefix is slightly preferred
over not having one. Since the prefix `LZ4File_` is not desired,
I propose `LZ4Stream_` in v2.
I will not object to dismissing the argument and drop `File` from
the prefix, if so requested.
Thanks.
I think the LZ4Stream prefix is reasonable, so let's roll with that. I
cleaned up the patch a little bit (mostly comment tweaks, etc.), updated
the commit message and pushed it.
The main tweak I did is renaming all the LZ4State variables from "fs" to
state. The old name referred to the now abandoned "file state", but
after the rename to LZ4State that seems confusing. Some of the places
already used "state" and it's easier to know "state" is always LZ4State,
so let's keep it consistent.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
23.03.2023 20:10, Tomas Vondra wrote:
So pushed all three parts, after updating the commit messages a bit.
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.
I'm sorry that I haven't noticed/checked that before, but when trying to
perform check-world with Valgrind I've discovered another issue presumably
related to LZ4File_gets().
When running under Valgrind:
PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
I get:
...
[07:07:11.683](0.000s) ok 1939 - compression_lz4_dir: glob check for
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
# Running: pg_restore --jobs=2 --file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir
==00:00:00:00.579 2811926== Conditional jump or move depends on uninitialised value(s)
==00:00:00:00.579 2811926== at 0x4853376: rawmemchr (vg_replace_strmem.c:1548)
==00:00:00:00.579 2811926== by 0x4C96A67: _IO_str_init_static_internal (strops.c:41)
==00:00:00:00.579 2811926== by 0x4C693A2: _IO_strfile_read (strfile.h:95)
==00:00:00:00.579 2811926== by 0x4C693A2: __isoc99_sscanf (isoc99_sscanf.c:28)
==00:00:00:00.579 2811926== by 0x11DB6F: _LoadLOs (pg_backup_directory.c:458)
==00:00:00:00.579 2811926== by 0x11DD1E: _PrintTocData (pg_backup_directory.c:422)
==00:00:00:00.579 2811926== by 0x118484: restore_toc_entry (pg_backup_archiver.c:882)
==00:00:00:00.579 2811926== by 0x1190CC: RestoreArchive (pg_backup_archiver.c:699)
==00:00:00:00.579 2811926== by 0x10F25D: main (pg_restore.c:414)
==00:00:00:00.579 2811926==
...
It looks like the line variable returned by gets_func() here is not
null-terminated:
while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
...
if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
...
And Valgrind doesn't like it.
Best regards,
Alexander
------- Original Message -------
On Friday, May 5th, 2023 at 8:00 AM, Alexander Lakhin <exclusion@gmail.com> wrote:
23.03.2023 20:10, Tomas Vondra wrote:
So pushed all three parts, after updating the commit messages a bit.
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.

I'm sorry that I haven't noticed/checked that before, but when trying to
perform check-world with Valgrind I've discovered another issue presumably
related to LZ4File_gets().
When running under Valgrind:
PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
I get:
...
07:07:11.683 ok 1939 - compression_lz4_dir: glob check for
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
# Running: pg_restore --jobs=2 --file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir
==00:00:00:00.579 2811926== Conditional jump or move depends on uninitialised value(s)
==00:00:00:00.579 2811926== at 0x4853376: rawmemchr (vg_replace_strmem.c:1548)
==00:00:00:00.579 2811926== by 0x4C96A67: _IO_str_init_static_internal (strops.c:41)
==00:00:00:00.579 2811926== by 0x4C693A2: _IO_strfile_read (strfile.h:95)
==00:00:00:00.579 2811926== by 0x4C693A2: __isoc99_sscanf (isoc99_sscanf.c:28)
==00:00:00:00.579 2811926== by 0x11DB6F: _LoadLOs (pg_backup_directory.c:458)
==00:00:00:00.579 2811926== by 0x11DD1E: _PrintTocData (pg_backup_directory.c:422)
==00:00:00:00.579 2811926== by 0x118484: restore_toc_entry (pg_backup_archiver.c:882)
==00:00:00:00.579 2811926== by 0x1190CC: RestoreArchive (pg_backup_archiver.c:699)
==00:00:00:00.579 2811926== by 0x10F25D: main (pg_restore.c:414)
==00:00:00:00.579 2811926==
...

It looks like the line variable returned by gets_func() here is not
null-terminated:
while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
...
if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
...
And Valgrind doesn't like it.
Valgrind is correct to not like it. LZ4Stream_gets() got modeled after
gets() when it should have been modeled after fgets().
Please find a patch attached to address it.
Cheers,
//Georgios
Best regards,
Alexander
Attachments:
0001-Null-terminate-the-output-buffer-of-LZ4Stream_gets.patch (text/x-patch)
From 587873da2b563c59b281051c2636cda667abf099 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 5 May 2023 09:47:02 +0000
Subject: [PATCH] Null terminate the output buffer of LZ4Stream_gets
LZ4Stream_gets did not null terminate its output buffer. Its callers expected
the buffer to be null terminated so they passed it around to functions such as
sscanf with unintended consequences.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
---
src/bin/pg_dump/compress_lz4.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 423e1b7976..26c9a8b280 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -459,6 +459,10 @@ LZ4Stream_read_internal(LZ4State *state, void *ptr, int ptrsize, bool eol_flag)
if (!LZ4Stream_init(state, size, false /* decompressing */ ))
return -1;
+ /* No work needs to be done for a zero-sized output buffer */
+ if (size <= 0)
+ return 0;
+
/* Verify that there is enough space in the outbuf */
if (size > state->buflen)
{
@@ -636,7 +640,12 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
LZ4State *state = (LZ4State *) CFH->private_data;
int ret;
- ret = LZ4Stream_read_internal(state, ptr, size, true);
+ Assert(size > 1);
+
+ /* Our caller expects the return string to be NULL terminated */
+ memset(ptr, '\0', size);
+
+ ret = LZ4Stream_read_internal(state, ptr, size - 1, true);
if (ret < 0 || (ret == 0 && !LZ4Stream_eof(CFH)))
pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
--
2.34.1
On 2023-05-05 Fr 06:02, gkokolatos@pm.me wrote:
------- Original Message -------
On Friday, May 5th, 2023 at 8:00 AM, Alexander Lakhin <exclusion@gmail.com> wrote:

23.03.2023 20:10, Tomas Vondra wrote:
So pushed all three parts, after updating the commit messages a bit.
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.

I'm sorry that I haven't noticed/checked that before, but when trying to
perform check-world with Valgrind I've discovered another issue presumably
related to LZ4File_gets().
When running under Valgrind:
PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
I get:
...
07:07:11.683 ok 1939 - compression_lz4_dir: glob check for
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
# Running: pg_restore --jobs=2 --file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir
==00:00:00:00.579 2811926== Conditional jump or move depends on uninitialised value(s)
==00:00:00:00.579 2811926== at 0x4853376: rawmemchr (vg_replace_strmem.c:1548)
==00:00:00:00.579 2811926== by 0x4C96A67: _IO_str_init_static_internal (strops.c:41)
==00:00:00:00.579 2811926== by 0x4C693A2: _IO_strfile_read (strfile.h:95)
==00:00:00:00.579 2811926== by 0x4C693A2: __isoc99_sscanf (isoc99_sscanf.c:28)
==00:00:00:00.579 2811926== by 0x11DB6F: _LoadLOs (pg_backup_directory.c:458)
==00:00:00:00.579 2811926== by 0x11DD1E: _PrintTocData (pg_backup_directory.c:422)
==00:00:00:00.579 2811926== by 0x118484: restore_toc_entry (pg_backup_archiver.c:882)
==00:00:00:00.579 2811926== by 0x1190CC: RestoreArchive (pg_backup_archiver.c:699)
==00:00:00:00.579 2811926== by 0x10F25D: main (pg_restore.c:414)
==00:00:00:00.579 2811926==
...

It looks like the line variable returned by gets_func() here is not
null-terminated:
while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
...
if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
...
And Valgrind doesn't like it.

Valgrind is correct to not like it. LZ4Stream_gets() got modeled after
gets() when it should have been modeled after fgets().

Please find a patch attached to address it.
Isn't using memset here a bit wasteful? Why not just put a null at the
end after calling LZ4Stream_read_internal(), which tells you how many
bytes it has written?
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com
------- Original Message -------
On Friday, May 5th, 2023 at 3:23 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
On 2023-05-05 Fr 06:02, gkokolatos@pm.me wrote:
------- Original Message -------
On Friday, May 5th, 2023 at 8:00 AM, Alexander Lakhin
<exclusion@gmail.com> wrote:

23.03.2023 20:10, Tomas Vondra wrote:
So pushed all three parts, after updating the commit messages a bit.
This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.

I'm sorry that I haven't noticed/checked that before, but when trying to
perform check-world with Valgrind I've discovered another issue presumably
related to LZ4File_gets().
When running under Valgrind:
PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
I get:
...
07:07:11.683 ok 1939 - compression_lz4_dir: glob check for
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
# Running: pg_restore --jobs=2 --file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir
==00:00:00:00.579 2811926== Conditional jump or move depends on uninitialised value(s)
==00:00:00:00.579 2811926== at 0x4853376: rawmemchr (vg_replace_strmem.c:1548)
==00:00:00:00.579 2811926== by 0x4C96A67: _IO_str_init_static_internal (strops.c:41)
==00:00:00:00.579 2811926== by 0x4C693A2: _IO_strfile_read (strfile.h:95)
==00:00:00:00.579 2811926== by 0x4C693A2: __isoc99_sscanf (isoc99_sscanf.c:28)
==00:00:00:00.579 2811926== by 0x11DB6F: _LoadLOs (pg_backup_directory.c:458)
==00:00:00:00.579 2811926== by 0x11DD1E: _PrintTocData (pg_backup_directory.c:422)
==00:00:00:00.579 2811926== by 0x118484: restore_toc_entry (pg_backup_archiver.c:882)
==00:00:00:00.579 2811926== by 0x1190CC: RestoreArchive (pg_backup_archiver.c:699)
==00:00:00:00.579 2811926== by 0x10F25D: main (pg_restore.c:414)
==00:00:00:00.579 2811926==
...

It looks like the line variable returned by gets_func() here is not
null-terminated:
while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
{
...
if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
...
And Valgrind doesn't like it.

Valgrind is correct to not like it. LZ4Stream_gets() got modeled after
gets() when it should have been modeled after fgets().

Please find a patch attached to address it.
Isn't using memset here a bit wasteful? Why not just put a null at the end after calling LZ4Stream_read_internal(), which tells you how many bytes it has written?
Good point. I thought about it before submitting the patch. I concluded that given the complexity and operations involved in LZ4Stream_read_internal() and the rest of the pg_dump/pg_restore code, the memset() call will be negligible. However from the readability point of view, the function is a bit cleaner with the memset().
I will not object to any suggestion though, as this is a very trivial point. Please find attached a v2 of the patch following the suggested approach.
Cheers,
//Georgios
cheers
andrew
--
Andrew Dunstan
EDB:
https://www.enterprisedb.com
Attachments:
v2-0001-Null-terminate-the-output-buffer-of-LZ4Stream_get.patch (text/x-patch)
From 65dbce1eb81597e3dd44eff62d8d667b0a3322da Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Fri, 5 May 2023 09:47:02 +0000
Subject: [PATCH v2] Null terminate the output buffer of LZ4Stream_gets
LZ4Stream_gets did not null terminate its output buffer. Its callers expected
the buffer to be null terminated so they passed it around to functions such as
sscanf with unintended consequences.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
---
src/bin/pg_dump/compress_lz4.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 423e1b7976..0f447919b2 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -459,6 +459,10 @@ LZ4Stream_read_internal(LZ4State *state, void *ptr, int ptrsize, bool eol_flag)
if (!LZ4Stream_init(state, size, false /* decompressing */ ))
return -1;
+ /* No work needs to be done for a zero-sized output buffer */
+ if (size <= 0)
+ return 0;
+
/* Verify that there is enough space in the outbuf */
if (size > state->buflen)
{
@@ -636,7 +640,9 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
LZ4State *state = (LZ4State *) CFH->private_data;
int ret;
- ret = LZ4Stream_read_internal(state, ptr, size, true);
+ Assert(size > 1);
+
+ ret = LZ4Stream_read_internal(state, ptr, size - 1, true);
if (ret < 0 || (ret == 0 && !LZ4Stream_eof(CFH)))
pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
@@ -644,6 +650,12 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
if (ret == 0)
return NULL;
+ /*
+ * Our caller expects the return string to be NULL terminated
+ * and we know that ret is greater than zero.
+ */
+ ptr[ret - 1] = '\0';
+
return ptr;
}
--
2.34.1
On Fri, May 05, 2023 at 02:13:28PM +0000, gkokolatos@pm.me wrote:
Good point. I thought about it before submitting the patch. I
concluded that given the complexity and operations involved in
LZ4Stream_read_internal() and the rest of the pg_dump/pg_restore
code, the memset() call will be negligible. However from the
readability point of view, the function is a bit cleaner with the
memset().

I will not object to any suggestion though, as this is a very
trivial point. Please find attached a v2 of the patch following the
suggested approach.
Please note that an open item has been added for this stuff.
--
Michael
On Sat, May 6, 2023 at 04:51, Michael Paquier <michael@paquier.xyz> wrote:
On Fri, May 05, 2023 at 02:13:28PM +0000, gkokolatos@pm.me wrote:
Good point. I thought about it before submitting the patch. I
concluded that given the complexity and operations involved in
LZ4Stream_read_internal() and the rest of the pg_dump/pg_restore
code, the memset() call will be negligible. However from the
readability point of view, the function is a bit cleaner with the
memset().

I will not object to any suggestion though, as this is a very
trivial point. Please find attached a v2 of the patch following the
suggested approach.

Please note that an open item has been added for this stuff.
Thank you but I am not certain I know what that means. Can you please explain?
Cheers,
//Georgios
--
Michael
On Sun, May 07, 2023 at 03:01:52PM +0000, gkokolatos@pm.me wrote:
Thank you but I am not certain I know what that means. Can you please explain?
It means that this thread has been added to the following list:
https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items#Open_Issues
pg_dump/compress_lz4.c is new as of PostgreSQL 16, and this patch is
fixing a deficiency. That's just a way outside of the commit fest to
track any problems and make sure these are fixed before the release
happens.
--
Michael
On Fri, May 05, 2023 at 02:13:28PM +0000, gkokolatos@pm.me wrote:
Good point. I thought about it before submitting the patch. I
concluded that given the complexity and operations involved in
LZ4Stream_read_internal() and the rest of the pg_dump/pg_restore
code, the memset() call will be negligible. However from the
readability point of view, the function is a bit cleaner with the
memset().

I will not object to any suggestion though, as this is a very
trivial point. Please find attached a v2 of the patch following the
suggested approach.
Hmm. I was looking at this patch, and what you are trying to do
sounds rather right to keep a parallel with the gzip and zstd code
paths.
Looking at the code of gzread.c, gzgets() enforces a null-termination
on the string read. Still, isn't that something we'd better enforce
in read_none() as well? compress_io.h lists this as a requirement of
the callback, and Zstd_gets() does so already. read_none() does not
enforce that, unfortunately.
+ /* No work needs to be done for a zero-sized output buffer */
+ if (size <= 0)
+ return 0;
Indeed. This should be OK.
- ret = LZ4Stream_read_internal(state, ptr, size, true);
+ Assert(size > 1);
The addition of this assertion is a bit surprising, and this is
inconsistent with Zstd_gets where a length of 1 is authorized. We
should be more consistent across all the callbacks, IMO, not less, so
as we apply the same API contract across all the compression methods.
While testing this patch, I have triggered an error pointing out that
the decompression path of LZ4 is broken for table data. I can
reproduce that with a dump of the regression database, as of:
make installcheck
pg_dump --format=d --file=dump_lz4 --compress=lz4 regression
createdb regress_lz4
pg_restore --format=d -d regress_lz4 dump_lz4
pg_restore: error: COPY failed for table "clstr_tst": ERROR: extra data after last expected column
CONTEXT: COPY clstr_tst, line 15: "32 6 seis xyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzy..."
pg_restore: warning: errors ignored on restore: 1
This does not show up with gzip or zstd, and the patch does not
influence the result. In short it shows up with and without the
patch, on HEAD. That does not look really stable :/
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
While testing this patch, I have triggered an error pointing out that
the decompression path of LZ4 is broken for table data. I can
reproduce that with a dump of the regression database, as of:
make installcheck
pg_dump --format=d --file=dump_lz4 --compress=lz4 regression
createdb regress_lz4
pg_restore --format=d -d regress_lz4 dump_lz4
pg_restore: error: COPY failed for table "clstr_tst": ERROR: extra data after last expected column
CONTEXT: COPY clstr_tst, line 15: "32 6 seis xyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzy..."
pg_restore: warning: errors ignored on restore: 1
Ugh. Reproduced here ... so we need an open item for this.
regards, tom lane
On Sun, May 07, 2023 at 09:09:25PM -0400, Tom Lane wrote:
Ugh. Reproduced here ... so we need an open item for this.
Yep. Already added.
--
Michael
I wrote:
Michael Paquier <michael@paquier.xyz> writes:
While testing this patch, I have triggered an error pointing out that
the decompression path of LZ4 is broken for table data. I can
reproduce that with a dump of the regression database, as of:
make installcheck
pg_dump --format=d --file=dump_lz4 --compress=lz4 regression
Ugh. Reproduced here ... so we need an open item for this.
BTW, it seems to work with --format=c.
regards, tom lane
On 5/7/23 17:01, gkokolatos@pm.me wrote:
On Sat, May 6, 2023 at 04:51, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, May 05, 2023 at 02:13:28PM +0000, gkokolatos@pm.me wrote:
Good point. I thought about it before submitting the patch. I
concluded that given the complexity and operations involved in
LZ4Stream_read_internal() and the rest of the pg_dump/pg_restore
code, the memset() call will be negligible. However from the
readability point of view, the function is a bit cleaner with the
memset().

I will not object to any suggestion though, as this is a very
trivial point. Please find attached a v2 of the patch following the
suggested approach.

Please note that an open item has been added for this stuff.
Thank you but I am not certain I know what that means. Can you please
explain?
It means it was added to the list of items we need to fix before PG16
gets out:
https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Monday, May 8th, 2023 at 3:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
Michael Paquier michael@paquier.xyz writes:
While testing this patch, I have triggered an error pointing out that
the decompression path of LZ4 is broken for table data. I can
reproduce that with a dump of the regression database, as of:
make installcheck
pg_dump --format=d --file=dump_lz4 --compress=lz4 regression

Ugh. Reproduced here ... so we need an open item for this.
BTW, it seems to work with --format=c.
Thank you for the extra tests. It seems that a gap exists in the test
coverage. Please find attached a patch that addresses the issue and
attempts to provide tests for it.
Cheers,
//Georgios
regards, tom lane
Attachments:
v1-0001-Advance-input-pointer-when-LZ4-compressing-data.patch (text/x-patch)
From 8c6c86c362820e93f066992ede6e5ca23f128807 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 8 May 2023 15:25:25 +0000
Subject: [PATCH v1] Advance input pointer when LZ4 compressing data
LZ4File_write() did not advance the input pointer on subsequent invocations of
LZ4F_compressUpdate(). As a result the generated compressed output would be a
compressed version of the same input chunk.
WriteDataToArchiveLZ4() which is also using LZ4F_compressUpdate() did not suffer
from this omission. Tests failed to catch this error because all of their input
would comfortably fit within the same input chunk. Tests have been added to
provide adequate coverage.
---
src/bin/pg_dump/compress_lz4.c | 5 +++-
src/bin/pg_dump/t/002_pg_dump.pl | 44 ++++++++++++++++++++++++++++++++
2 files changed, 48 insertions(+), 1 deletion(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index f97b7550d1..76211c82f0 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -564,6 +564,7 @@ LZ4Stream_write(const void *ptr, size_t size, CompressFileHandle *CFH)
LZ4State *state = (LZ4State *) CFH->private_data;
size_t status;
int remaining = size;
+ const void *in = ptr;
/* Lazy init */
if (!LZ4Stream_init(state, size, true))
@@ -576,7 +577,7 @@ LZ4Stream_write(const void *ptr, size_t size, CompressFileHandle *CFH)
remaining -= chunk;
status = LZ4F_compressUpdate(state->ctx, state->buffer, state->buflen,
- ptr, chunk, NULL);
+ in, chunk, NULL);
if (LZ4F_isError(status))
{
state->errcode = status;
@@ -588,6 +589,8 @@ LZ4Stream_write(const void *ptr, size_t size, CompressFileHandle *CFH)
errno = (errno) ? errno : ENOSPC;
return false;
}
+
+ in = ((const char *) in) + chunk;
}
return true;
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 93e24d5145..c6b1225815 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -3108,6 +3108,50 @@ my %tests = (
},
},
+ 'CREATE TABLE test_compression_method' => {
+ create_order => 110,
+ create_sql => 'CREATE TABLE dump_test.test_compression_method (
+ col1 text
+ );',
+ regexp => qr/^
+ \QCREATE TABLE dump_test.test_compression_method (\E\n
+ \s+\Qcol1 text\E\n
+ \Q);\E
+ /xm,
+ like => {
+ %full_runs,
+ %dump_test_schema_runs,
+ section_pre_data => 1,
+ },
+ unlike => {
+ exclude_dump_test_schema => 1,
+ only_dump_measurement => 1,
+ },
+ },
+
+ 'COPY test_compression_method' => {
+ create_order => 111,
+ create_sql => 'INSERT INTO dump_test.test_compression_method (col1) '
+ . 'SELECT string_agg(a::text, \'\') FROM generate_series(1,4096) a;',
+ regexp => qr/^
+ \QCOPY dump_test.test_compression_method (col1) FROM stdin;\E
+ \n(?:\d{15277}\n){1}\\\.\n
+ /xm,
+ like => {
+ %full_runs,
+ data_only => 1,
+ section_data => 1,
+ only_dump_test_schema => 1,
+ test_schema_plus_large_objects => 1,
+ },
+ unlike => {
+ binary_upgrade => 1,
+ exclude_dump_test_schema => 1,
+ schema_only => 1,
+ only_dump_measurement => 1,
+ },
+ },
+
'CREATE TABLE fk_reference_test_table' => {
create_order => 21,
create_sql => 'CREATE TABLE dump_test.fk_reference_test_table (
--
2.34.1
On 5/8/23 03:16, Tom Lane wrote:
I wrote:
Michael Paquier <michael@paquier.xyz> writes:
While testing this patch, I have triggered an error pointing out that
the decompression path of LZ4 is broken for table data. I can
reproduce that with a dump of the regression database, as of:
make installcheck
pg_dump --format=d --file=dump_lz4 --compress=lz4 regression

Ugh. Reproduced here ... so we need an open item for this.
BTW, it seems to work with --format=c.
The LZ4Stream_write() forgot to move the pointer to the next chunk, so
it was happily decompressing the initial chunk over and over. A bit
embarrassing oversight :-(
The custom format calls WriteDataToArchiveLZ4(), which was correct.
The attached patch fixes this for me.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
pg-dump-lz4-fix.patch (text/x-patch)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 423e1b7976f..43c4b9187ef 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -584,6 +584,8 @@ LZ4Stream_write(const void *ptr, size_t size, CompressFileHandle *CFH)
errno = (errno) ? errno : ENOSPC;
return false;
}
+
+ ptr = ((char *) ptr) + chunk;
}
return true;
On 5/8/23 18:19, gkokolatos@pm.me wrote:
------- Original Message -------
On Monday, May 8th, 2023 at 3:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:
Michael Paquier michael@paquier.xyz writes:
While testing this patch, I have triggered an error pointing out that
the decompression path of LZ4 is broken for table data. I can
reproduce that with a dump of the regression database, as of:
make installcheck
pg_dump --format=d --file=dump_lz4 --compress=lz4 regression

Ugh. Reproduced here ... so we need an open item for this.
BTW, it seems to work with --format=c.
Thank you for the extra tests. It seems that a gap exists in the test
coverage. Please find attached a patch that addresses the issue and
attempts to provide tests for it.

Seems I'm getting messages with a delay - this is mostly the same fix I
Seems I'm getting messages with a delay - this is mostly the same fix I
ended up with, not realizing you already posted a fix.
I don't think we need the local "in" variable - the pointer parameter is
local in the function, so we can modify it directly (with a cast).
WriteDataToArchiveLZ4 does it that way too.
The tests are definitely a good idea. I wonder if we should add a
comment to DEFAULT_IO_BUFFER_SIZE mentioning that if we choose to
increase the value in the future, we need to tweak the tests too to use
more data in order to exercise the buffering etc. Maybe it's obvious?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Monday, May 8th, 2023 at 8:20 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 5/8/23 18:19, gkokolatos@pm.me wrote:
------- Original Message -------
On Monday, May 8th, 2023 at 3:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:
Michael Paquier michael@paquier.xyz writes:
While testing this patch, I have triggered an error pointing out that
the decompression path of LZ4 is broken for table data. I can
reproduce that with a dump of the regression database, as of:
make installcheck
pg_dump --format=d --file=dump_lz4 --compress=lz4 regression

Ugh. Reproduced here ... so we need an open item for this.
BTW, it seems to work with --format=c.
Thank you for the extra tests. It seems that a gap exists in the test
coverage. Please find attached a patch that addresses the issue and
attempts to provide tests for it.

Seems I'm getting messages with a delay - this is mostly the same fix I
ended up with, not realizing you already posted a fix.
Thank you very much for looking.
I don't think we need the local "in" variable - the pointer parameter is
local in the function, so we can modify it directly (with a cast).
WriteDataToArchiveLZ4 does it that way too.
Sure, patch updated.
The tests are definitely a good idea.
Thank you.
I wonder if we should add a
comment to DEFAULT_IO_BUFFER_SIZE mentioning that if we choose to
increase the value in the future, we need to tweak the tests too to use
more data in order to exercise the buffering etc. Maybe it's obvious?
You are right. Added a comment both in the header and in the test.
I hope v2 gets closer to closing the open item for this.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v2-0001-Advance-input-pointer-when-LZ4-compressing-data.patch (text/x-patch)
From 89e7066d6c3c6a7eeb147c3f2d345c3046a4d155 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 8 May 2023 19:48:03 +0000
Subject: [PATCH v2] Advance input pointer when LZ4 compressing data
LZ4File_write() did not advance the input pointer on subsequent invocations of
LZ4F_compressUpdate(). As a result the generated compressed output would be a
compressed version of the same input chunk.
WriteDataToArchiveLZ4() which is also using LZ4F_compressUpdate() did not suffer
from this omission. Tests failed to catch this error because all of their input
would comfortably fit within the same input chunk. Tests have been added to
provide adequate coverage.
---
src/bin/pg_dump/compress_io.h | 7 ++++-
src/bin/pg_dump/compress_lz4.c | 2 ++
src/bin/pg_dump/t/002_pg_dump.pl | 46 ++++++++++++++++++++++++++++++++
3 files changed, 54 insertions(+), 1 deletion(-)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index fd8752db0d..e8efc57f1a 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -17,7 +17,12 @@
#include "pg_backup_archiver.h"
-/* Default size used for IO buffers */
+/*
+ * Default size used for IO buffers
+ *
+ * When altering this value it might be useful to verify that the relevant tests
+ * cases are meaningfully updated to provide coverage.
+ */
#define DEFAULT_IO_BUFFER_SIZE 4096
extern char *supports_compression(const pg_compress_specification compression_spec);
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index f97b7550d1..b869780c0b 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -588,6 +588,8 @@ LZ4Stream_write(const void *ptr, size_t size, CompressFileHandle *CFH)
errno = (errno) ? errno : ENOSPC;
return false;
}
+
+ ptr = ((const char *) ptr) + chunk;
}
return true;
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 93e24d5145..d66f3b42ea 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -3108,6 +3108,52 @@ my %tests = (
},
},
+ 'CREATE TABLE test_compression_method' => {
+ create_order => 110,
+ create_sql => 'CREATE TABLE dump_test.test_compression_method (
+ col1 text
+ );',
+ regexp => qr/^
+ \QCREATE TABLE dump_test.test_compression_method (\E\n
+ \s+\Qcol1 text\E\n
+ \Q);\E
+ /xm,
+ like => {
+ %full_runs,
+ %dump_test_schema_runs,
+ section_pre_data => 1,
+ },
+ unlike => {
+ exclude_dump_test_schema => 1,
+ only_dump_measurement => 1,
+ },
+ },
+
+ # Insert enough data to surpass DEFAULT_IO_BUFFER_SIZE during
+ # (de)compression operations
+ 'COPY test_compression_method' => {
+ create_order => 111,
+ create_sql => 'INSERT INTO dump_test.test_compression_method (col1) '
+ . 'SELECT string_agg(a::text, \'\') FROM generate_series(1,4096) a;',
+ regexp => qr/^
+ \QCOPY dump_test.test_compression_method (col1) FROM stdin;\E
+ \n(?:\d{15277}\n){1}\\\.\n
+ /xm,
+ like => {
+ %full_runs,
+ data_only => 1,
+ section_data => 1,
+ only_dump_test_schema => 1,
+ test_schema_plus_large_objects => 1,
+ },
+ unlike => {
+ binary_upgrade => 1,
+ exclude_dump_test_schema => 1,
+ schema_only => 1,
+ only_dump_measurement => 1,
+ },
+ },
+
'CREATE TABLE fk_reference_test_table' => {
create_order => 21,
create_sql => 'CREATE TABLE dump_test.fk_reference_test_table (
--
2.34.1
On Mon, May 08, 2023 at 08:00:39PM +0200, Tomas Vondra wrote:
The LZ4Stream_write() forgot to move the pointer to the next chunk, so
it was happily decompressing the initial chunk over and over. A bit
embarrassing oversight :-(

The custom format calls WriteDataToArchiveLZ4(), which was correct.
The attached patch fixes this for me.
Ouch. So this was corrupting the dumps and the compression when
trying to write more than two chunks at once, not the decompression
steps. That addresses the issue here as well, thanks!
--
Michael
On 5/9/23 00:10, Michael Paquier wrote:
On Mon, May 08, 2023 at 08:00:39PM +0200, Tomas Vondra wrote:
The LZ4Stream_write() forgot to move the pointer to the next chunk, so
it was happily decompressing the initial chunk over and over. A bit
embarrassing oversight :-(
The custom format calls WriteDataToArchiveLZ4(), which was correct.
The attached patch fixes this for me.
Ouch. So this was corrupting the dumps and the compression when
trying to write more than two chunks at once, not the decompression
steps. That addresses the issue here as well, thanks!
Yeah. Thanks for the report, should have been found during review.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
------- Original Message -------
On Tuesday, May 9th, 2023 at 2:54 PM, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 5/9/23 00:10, Michael Paquier wrote:
On Mon, May 08, 2023 at 08:00:39PM +0200, Tomas Vondra wrote:
The LZ4Stream_write() forgot to move the pointer to the next chunk, so
it was happily decompressing the initial chunk over and over. A bit
embarrassing oversight :-(
The custom format calls WriteDataToArchiveLZ4(), which was correct.
The attached patch fixes this for me.
Ouch. So this was corrupting the dumps and the compression when
trying to write more than two chunks at once, not the decompression
steps. That addresses the issue here as well, thanks!
Yeah. Thanks for the report, should have been found during review.
Thank you both for looking. A small consolation is that now there are
tests for this case.
Moving on to the other open item for this, please find attached v2
of the patch as requested.
Cheers,
//Georgios
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v2-0001-Null-terminate-the-output-buffer-of-LZ4Stream_get.patchtext/x-patch; name=v2-0001-Null-terminate-the-output-buffer-of-LZ4Stream_get.patchDownload
From cb0a229be59dffe09cc0ceceececdbd06a559d3f Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos <gkokolatos@pm.me>
Date: Mon, 8 May 2023 11:58:57 +0000
Subject: [PATCH v2] Null terminate the output buffer of LZ4Stream_gets
LZ4Stream_gets did not null terminate its output buffer. Its callers expected
the buffer to be null terminated so they passed it around to functions such as
sscanf with unintended consequences.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
---
src/bin/pg_dump/compress_lz4.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 423e1b7976..f97b7550d1 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -459,6 +459,10 @@ LZ4Stream_read_internal(LZ4State *state, void *ptr, int ptrsize, bool eol_flag)
if (!LZ4Stream_init(state, size, false /* decompressing */ ))
return -1;
+ /* No work needs to be done for a zero-sized output buffer */
+ if (size <= 0)
+ return 0;
+
/* Verify that there is enough space in the outbuf */
if (size > state->buflen)
{
@@ -636,7 +640,7 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
LZ4State *state = (LZ4State *) CFH->private_data;
int ret;
- ret = LZ4Stream_read_internal(state, ptr, size, true);
+ ret = LZ4Stream_read_internal(state, ptr, size - 1, true);
if (ret < 0 || (ret == 0 && !LZ4Stream_eof(CFH)))
pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
@@ -644,6 +648,12 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
if (ret == 0)
return NULL;
+ /*
+ * Our caller expects the return string to be NULL terminated
+ * and we know that ret is greater than zero.
+ */
+ ptr[ret - 1] = '\0';
+
return ptr;
}
--
2.34.1
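As a simplified sketch of the fgets-style contract this v2 patch restores: read at most size - 1 bytes so a terminator always fits, then NUL-terminate before handing the buffer to callers. Here read_internal() is a hypothetical stand-in for LZ4Stream_read_internal(), and the indexing is simplified relative to the real function's return-value convention (the patch uses ptr[ret - 1]):

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for LZ4Stream_read_internal(): copies up to n
 * bytes from src into dst and returns the number of bytes written,
 * not counting any terminator.
 */
static int
read_internal(const char *src, char *dst, int n)
{
	int			len = (int) strlen(src);

	if (len > n)
		len = n;
	memcpy(dst, src, len);
	return len;
}

/*
 * fgets-like wrapper: reserve one byte for the terminator, then
 * NUL-terminate so callers can safely pass the buffer to sscanf() etc.
 */
static char *
gets_like(const char *src, char *ptr, int size)
{
	int			ret = read_internal(src, ptr, size - 1);	/* leave room for '\0' */

	if (ret <= 0)
		return NULL;

	ptr[ret] = '\0';
	return ptr;
}
```

Without the termination step the buffer contents past the bytes actually read are indeterminate, which is how the reported sscanf misbehavior arose.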
On Tue, May 09, 2023 at 02:12:44PM +0000, gkokolatos@pm.me wrote:
Thank you both for looking. A small consolation is that now there are
tests for this case.
+1, noticing that was pure luck ;)
Worth noting that the patch posted in [1] has these tests, not the
version posted in [2].
+ create_sql => 'INSERT INTO dump_test.test_compression_method (col1) '
+ . 'SELECT string_agg(a::text, \'\') FROM generate_series(1,4096) a;',
Yep, good and cheap idea to check for longer chunks. That should be
enough to loop twice.
[1]: /messages/by-id/SYTRcNgtAbzyn3y3IInh1x-UfNTKMNpnFvI3mr6SyqyVf3PkaDsMy_cpKKgsl3_HdLy2MFAH4zwjxDmFfiLO8rWtSiJWBtqT06OMjeNo4GA=@pm.me
[2]: /messages/by-id/f735df01-0bb4-2fbc-1297-73a520cfc534@enterprisedb.com
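Incidentally, the \d{15277} in the test's regexp is the total digit count of 1..4096 concatenated by string_agg(), comfortably past DEFAULT_IO_BUFFER_SIZE so the chunk loop runs more than once. A quick sketch checking that count:

```c
#include <assert.h>

/* Total number of digits when 1..n are concatenated as decimal text. */
static int
concat_digits(int n)
{
	int			total = 0;

	for (int i = 1; i <= n; i++)
	{
		int			v = i;

		do
		{
			total++;
			v /= 10;
		} while (v > 0);
	}
	return total;
}
```

concat_digits(4096) is 9*1 + 90*2 + 900*3 + 3097*4 = 15277, matching the pattern in 002_pg_dump.pl.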
Moving on to the other open item for this, please find attached v2
of the patch as requested.
Did you notice the comments of [3] about the second patch that aims to
add the null termination in the line from the LZ4 fgets() callback?
[3]: /messages/by-id/ZFhCyn4Gm2eu60rB@paquier.xyz
--
Michael
On Tue, May 09, 2023 at 02:54:31PM +0200, Tomas Vondra wrote:
Yeah. Thanks for the report, should have been found during review.
Tomas, are you planning to do something by the end of this week for
beta1? Or do you need some help of any kind?
--
Michael
On 5/17/23 08:18, Michael Paquier wrote:
On Tue, May 09, 2023 at 02:54:31PM +0200, Tomas Vondra wrote:
Yeah. Thanks for the report, should have been found during review.
Tomas, are you planning to do something by the end of this week for
beta1? Or do you need some help of any kind?
I'll take care of it.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 5/17/23 10:59, Tomas Vondra wrote:
On 5/17/23 08:18, Michael Paquier wrote:
On Tue, May 09, 2023 at 02:54:31PM +0200, Tomas Vondra wrote:
Yeah. Thanks for the report, should have been found during review.
Tomas, are you planning to do something by the end of this week for
beta1? Or do you need some help of any kind?I'll take care of it.
FWIW I've pushed fixes for both open issues associated with the pg_dump
compression. I'll keep an eye on the buildfarm, but hopefully that'll do
it for beta1.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company