tablespace_map code cleanup
Hi,
I think it's not good that do_pg_start_backup() takes a flag which
tells it to call back into basebackup.c's sendTablespace(). This means
that details which ought to be private to basebackup.c leak out and
become visible to other parts of the code. This seems to have
originated in commit 72d422a5227ef6f76f412486a395aba9f53bf3f0, and it
looks like there was some discussion of the issue at the time. I think
that patch was right to want only a single iteration over the
tablespace list; if not, the list of tablespaces returned by the
backup could be different from the list that is included in the
tablespace map, which does seem like a good thing to avoid.
However, it doesn't follow that sendTablespace() needs to be called
from do_pg_start_backup(). It's not actually sending the tablespace at
that point, just calculating the size, because the sizeonly argument
is passed as true. And, there's no reason that I can see why that
needs to be done from within do_pg_start_backup(). It can equally well
be done after that function returns, as in the attached 0001. I
believe that this is functionally equivalent but more elegant,
although there is a notable behavior difference: today,
sendTablespaces() is called in sizeonly mode with "fullpath" as the
argument, which I think is pg_tblspc/$OID, and in non-sizeonly mode
with ti->path as an argument, which seems to be the path to which the
symlink points. With the patch, it would be called with the latter in
both cases. It looks to me like that should be OK, and it definitely
seems more consistent.
While I was poking around in this area, I found some other code which
I thought could stand a bit of improvement also. The attached 0002
slightly modifies some tablespace_map related code and comments in
perform_base_backup(), so that instead of having two very similar
calls to sendDir() right next to each other that differ only in the
value passed for the fifth argument, we have just one call with the
fifth argument being a variable. Although this is a minor change I
think it's a good cleanup that reduces the chances of future mistakes
in this area.
Comments?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
0001-Don-t-export-basebackup.c-s-sendTablespace.patchapplication/octet-stream; name=0001-Don-t-export-basebackup.c-s-sendTablespace.patchDownload
From b7371a62b350954a6c5d967e10056753c3888ff3 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 29 Apr 2020 11:57:56 -0400
Subject: [PATCH 1/2] Don't export basebackup.c's sendTablespace().
Commit 72d422a5227ef6f76f412486a395aba9f53bf3f0 made xlog.c call
sendTablespace() with the 'sizeonly' argument set to true, which
required basebackup.c to export sendTablespace(). However, that's
kind of ugly, so instead defer the call to sendTablespace() until
basebackup.c regains control. That way, it can still be a static
function.
---
src/backend/access/transam/xlog.c | 6 ++----
src/backend/access/transam/xlogfuncs.c | 4 ++--
src/backend/replication/basebackup.c | 18 +++++++++++-------
src/include/access/xlog.h | 2 +-
src/include/replication/basebackup.h | 6 ------
5 files changed, 16 insertions(+), 20 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 065eb275b1..220dcf1053 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10468,8 +10468,7 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
XLogRecPtr
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
StringInfo labelfile, List **tablespaces,
- StringInfo tblspcmapfile, bool infotbssize,
- bool needtblspcmapfile)
+ StringInfo tblspcmapfile, bool needtblspcmapfile)
{
bool exclusive = (labelfile == NULL);
bool backup_started_in_recovery = false;
@@ -10761,8 +10760,7 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
ti->oid = pstrdup(de->d_name);
ti->path = pstrdup(buflinkpath.data);
ti->rpath = relpath ? pstrdup(relpath) : NULL;
- ti->size = infotbssize ?
- sendTablespace(fullpath, ti->oid, true, NULL) : -1;
+ ti->size = -1;
if (tablespaces)
*tablespaces = lappend(*tablespaces, ti);
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 00e1b33ed5..290658b22c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -76,7 +76,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
if (exclusive)
{
startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL,
- NULL, NULL, false, true);
+ NULL, NULL, true);
}
else
{
@@ -94,7 +94,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
register_persistent_abort_backup_handler();
startpoint = do_pg_start_backup(backupidstr, fast, NULL, label_file,
- NULL, tblspc_map_file, false, true);
+ NULL, tblspc_map_file, true);
}
PG_RETURN_LSN(startpoint);
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index fbdc28ec39..b83c1edab0 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -58,6 +58,8 @@ typedef struct
pg_checksum_type manifest_checksum_type;
} basebackup_options;
+static int64 sendTablespace(char *path, char *oid, bool sizeonly,
+ struct backup_manifest_info *manifest);
static int64 sendDir(const char *path, int basepathlen, bool sizeonly,
List *tablespaces, bool sendtblspclinks,
backup_manifest_info *manifest, const char *spcoid);
@@ -307,8 +309,7 @@ perform_base_backup(basebackup_options *opt)
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT);
startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
labelfile, &tablespaces,
- tblspc_map_file,
- opt->progress, opt->sendtblspcmapfile);
+ tblspc_map_file, opt->sendtblspcmapfile);
/*
* Once do_pg_start_backup has been called, ensure that any failure causes
@@ -337,10 +338,7 @@ perform_base_backup(basebackup_options *opt)
/* Add a node for the base directory at the end */
ti = palloc0(sizeof(tablespaceinfo));
- if (opt->progress)
- ti->size = sendDir(".", 1, true, tablespaces, true, NULL, NULL);
- else
- ti->size = -1;
+ ti->size = -1;
tablespaces = lappend(tablespaces, ti);
/*
@@ -353,6 +351,12 @@ perform_base_backup(basebackup_options *opt)
{
tablespaceinfo *tmp = (tablespaceinfo *) lfirst(lc);
+ if (tmp->path == NULL)
+ tmp->size = sendDir(".", 1, true, tablespaces, true, NULL,
+ NULL);
+ else
+ tmp->size = sendTablespace(tmp->path, tmp->oid, true,
+ NULL);
backup_total += tmp->size;
}
}
@@ -1141,7 +1145,7 @@ sendFileWithContent(const char *filename, const char *content,
*
* Only used to send auxiliary tablespaces, not PGDATA.
*/
-int64
+static int64
sendTablespace(char *path, char *spcoid, bool sizeonly,
backup_manifest_info *manifest)
{
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 0a12afb59e..26504c633a 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -372,7 +372,7 @@ typedef enum SessionBackupState
extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
TimeLineID *starttli_p, StringInfo labelfile,
- List **tablespaces, StringInfo tblspcmapfile, bool infotbssize,
+ List **tablespaces, StringInfo tblspcmapfile,
bool needtblspcmapfile);
extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
TimeLineID *stoptli_p);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 923a651cac..f5f044dacd 100644
--- a/src/include/replication/basebackup.h
+++ b/src/include/replication/basebackup.h
@@ -14,9 +14,6 @@
#include "nodes/replnodes.h"
-struct backup_manifest_info; /* avoid including backup_manifest.h */
-
-
/*
* Minimum and maximum values of MAX_RATE option in BASE_BACKUP command.
*/
@@ -33,7 +30,4 @@ typedef struct
extern void SendBaseBackup(BaseBackupCmd *cmd);
-extern int64 sendTablespace(char *path, char *oid, bool sizeonly,
- struct backup_manifest_info *manifest);
-
#endif /* _BASEBACKUP_H */
--
2.24.2 (Apple Git-127)
0002-Minor-code-cleanup-for-perform_base_backup.patchapplication/octet-stream; name=0002-Minor-code-cleanup-for-perform_base_backup.patchDownload
From 6e8274a0600fdd0b1b65f45a593a37cd3742d0e4 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 29 Apr 2020 12:09:07 -0400
Subject: [PATCH 2/2] Minor code cleanup for perform_base_backup().
Merge two calls to sendDir() that are exactly the same except for
the fifth argument. Adjust comments to match.
This seems a little bit clearer than the old way.
---
src/backend/replication/basebackup.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index b83c1edab0..150e382a40 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -421,25 +421,23 @@ perform_base_backup(basebackup_options *opt)
if (ti->path == NULL)
{
struct stat statbuf;
+ bool sendtblspclinks = true;
/* In the main tar, include the backup_label first... */
sendFileWithContent(BACKUP_LABEL_FILE, labelfile->data,
&manifest);
- /*
- * Send tablespace_map file if required and then the bulk of
- * the files.
- */
+ /* Then the tablespace_map file, if required... */
if (tblspc_map_file && opt->sendtblspcmapfile)
{
sendFileWithContent(TABLESPACE_MAP, tblspc_map_file->data,
&manifest);
- sendDir(".", 1, false, tablespaces, false,
- &manifest, NULL);
+ sendtblspclinks = false;
}
- else
- sendDir(".", 1, false, tablespaces, true,
- &manifest, NULL);
+
+ /* Then the bulk of the files... */
+ sendDir(".", 1, false, tablespaces, sendtblspclinks,
+ &manifest, NULL);
/* ... and pg_control after everything else. */
if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--
2.24.2 (Apple Git-127)
On Wed, Apr 29, 2020 at 9:57 PM Robert Haas <robertmhaas@gmail.com> wrote:
Hi,
I think it's not good that do_pg_start_backup() takes a flag which
tells it to call back into basebackup.c's sendTablespace(). This means
that details which ought to be private to basebackup.c leak out and
become visible to other parts of the code. This seems to have
originated in commit 72d422a5227ef6f76f412486a395aba9f53bf3f0, and it
looks like there was some discussion of the issue at the time. I think
that patch was right to want only a single iteration over the
tablespace list; if not, the list of tablespaces returned by the
backup could be different from the list that is included in the
tablespace map, which does seem like a good thing to avoid.However, it doesn't follow that sendTablespace() needs to be called
from do_pg_start_backup(). It's not actually sending the tablespace at
that point, just calculating the size, because the sizeonly argument
is passed as true. And, there's no reason that I can see why that
needs to be done from within do_pg_start_backup(). It can equally well
be done after that function returns, as in the attached 0001. I
believe that this is functionally equivalent but more elegant,
although there is a notable behavior difference: today,
sendTablespaces() is called in sizeonly mode with "fullpath" as the
argument, which I think is pg_tblspc/$OID, and in non-sizeonly mode
with ti->path as an argument, which seems to be the path to which the
symlink points. With the patch, it would be called with the latter in
both cases. It looks to me like that should be OK, and it definitely
seems more consistent.
If we want to move the calculation of size for tablespaces in the
caller then I think we also need to do something about the progress
reporting related to phase
PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE.
While I was poking around in this area, I found some other code which
I thought could stand a bit of improvement also. The attached 0002
slightly modifies some tablespace_map related code and comments in
perform_base_backup(), so that instead of having two very similar
calls to sendDir() right next to each other that differ only in the
value passed for the fifth argument, we have just one call with the
fifth argument being a variable. Although this is a minor change I
think it's a good cleanup that reduces the chances of future mistakes
in this area.
The 0002 patch looks good to me.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Mon, May 4, 2020 at 5:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
If we want to move the calculation of size for tablespaces in the
caller then I think we also need to do something about the progress
reporting related to phase
PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE.
Oh, good point. v2 attached.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v2-0001-Don-t-export-basebackup.c-s-sendTablespace.patchapplication/octet-stream; name=v2-0001-Don-t-export-basebackup.c-s-sendTablespace.patchDownload
From a730ccb616e5a1f206667744c94dae82233cad33 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 29 Apr 2020 11:57:56 -0400
Subject: [PATCH v2 1/2] Don't export basebackup.c's sendTablespace().
Commit 72d422a5227ef6f76f412486a395aba9f53bf3f0 made xlog.c call
sendTablespace() with the 'sizeonly' argument set to true, which
required basebackup.c to export sendTablespace(). However, that's
kind of ugly, so instead defer the call to sendTablespace() until
basebackup.c regains control. That way, it can still be a static
function.
Patch by me, reviewed by Amit Kapila.
Discussion: http://postgr.es/m/CA+TgmoYq+59SJ2zBbP891ngWPA9fymOqntqYcweSDYXS2a620A@mail.gmail.com
---
src/backend/access/transam/xlog.c | 14 ++------------
src/backend/access/transam/xlogfuncs.c | 4 ++--
src/backend/replication/basebackup.c | 21 ++++++++++++++-------
src/include/access/xlog.h | 2 +-
src/include/replication/basebackup.h | 6 ------
5 files changed, 19 insertions(+), 28 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0d3d670928..85eabbceb0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10468,8 +10468,7 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
XLogRecPtr
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
StringInfo labelfile, List **tablespaces,
- StringInfo tblspcmapfile, bool infotbssize,
- bool needtblspcmapfile)
+ StringInfo tblspcmapfile, bool needtblspcmapfile)
{
bool exclusive = (labelfile == NULL);
bool backup_started_in_recovery = false;
@@ -10689,14 +10688,6 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
datadirpathlen = strlen(DataDir);
- /*
- * Report that we are now estimating the total backup size
- * if we're streaming base backup as requested by pg_basebackup
- */
- if (tablespaces)
- pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
- PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE);
-
/* Collect information about all tablespaces */
tblspcdir = AllocateDir("pg_tblspc");
while ((de = ReadDir(tblspcdir, "pg_tblspc")) != NULL)
@@ -10761,8 +10752,7 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
ti->oid = pstrdup(de->d_name);
ti->path = pstrdup(buflinkpath.data);
ti->rpath = relpath ? pstrdup(relpath) : NULL;
- ti->size = infotbssize ?
- sendTablespace(fullpath, ti->oid, true, NULL) : -1;
+ ti->size = -1;
if (tablespaces)
*tablespaces = lappend(*tablespaces, ti);
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 00e1b33ed5..290658b22c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -76,7 +76,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
if (exclusive)
{
startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL,
- NULL, NULL, false, true);
+ NULL, NULL, true);
}
else
{
@@ -94,7 +94,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
register_persistent_abort_backup_handler();
startpoint = do_pg_start_backup(backupidstr, fast, NULL, label_file,
- NULL, tblspc_map_file, false, true);
+ NULL, tblspc_map_file, true);
}
PG_RETURN_LSN(startpoint);
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index fbdc28ec39..98be1b854b 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -58,6 +58,8 @@ typedef struct
pg_checksum_type manifest_checksum_type;
} basebackup_options;
+static int64 sendTablespace(char *path, char *oid, bool sizeonly,
+ struct backup_manifest_info *manifest);
static int64 sendDir(const char *path, int basepathlen, bool sizeonly,
List *tablespaces, bool sendtblspclinks,
backup_manifest_info *manifest, const char *spcoid);
@@ -307,8 +309,7 @@ perform_base_backup(basebackup_options *opt)
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT);
startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
labelfile, &tablespaces,
- tblspc_map_file,
- opt->progress, opt->sendtblspcmapfile);
+ tblspc_map_file, opt->sendtblspcmapfile);
/*
* Once do_pg_start_backup has been called, ensure that any failure causes
@@ -337,10 +338,7 @@ perform_base_backup(basebackup_options *opt)
/* Add a node for the base directory at the end */
ti = palloc0(sizeof(tablespaceinfo));
- if (opt->progress)
- ti->size = sendDir(".", 1, true, tablespaces, true, NULL, NULL);
- else
- ti->size = -1;
+ ti->size = -1;
tablespaces = lappend(tablespaces, ti);
/*
@@ -349,10 +347,19 @@ perform_base_backup(basebackup_options *opt)
*/
if (opt->progress)
{
+ pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
+ PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE);
+
foreach(lc, tablespaces)
{
tablespaceinfo *tmp = (tablespaceinfo *) lfirst(lc);
+ if (tmp->path == NULL)
+ tmp->size = sendDir(".", 1, true, tablespaces, true, NULL,
+ NULL);
+ else
+ tmp->size = sendTablespace(tmp->path, tmp->oid, true,
+ NULL);
backup_total += tmp->size;
}
}
@@ -1141,7 +1148,7 @@ sendFileWithContent(const char *filename, const char *content,
*
* Only used to send auxiliary tablespaces, not PGDATA.
*/
-int64
+static int64
sendTablespace(char *path, char *spcoid, bool sizeonly,
backup_manifest_info *manifest)
{
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index e917dfe92d..347a38f57c 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -372,7 +372,7 @@ typedef enum SessionBackupState
extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
TimeLineID *starttli_p, StringInfo labelfile,
- List **tablespaces, StringInfo tblspcmapfile, bool infotbssize,
+ List **tablespaces, StringInfo tblspcmapfile,
bool needtblspcmapfile);
extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
TimeLineID *stoptli_p);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 923a651cac..f5f044dacd 100644
--- a/src/include/replication/basebackup.h
+++ b/src/include/replication/basebackup.h
@@ -14,9 +14,6 @@
#include "nodes/replnodes.h"
-struct backup_manifest_info; /* avoid including backup_manifest.h */
-
-
/*
* Minimum and maximum values of MAX_RATE option in BASE_BACKUP command.
*/
@@ -33,7 +30,4 @@ typedef struct
extern void SendBaseBackup(BaseBackupCmd *cmd);
-extern int64 sendTablespace(char *path, char *oid, bool sizeonly,
- struct backup_manifest_info *manifest);
-
#endif /* _BASEBACKUP_H */
--
2.24.2 (Apple Git-127)
v2-0002-Minor-code-cleanup-for-perform_base_backup.patchapplication/octet-stream; name=v2-0002-Minor-code-cleanup-for-perform_base_backup.patchDownload
From a84f7427b4fa888718ed99e048de81266de8b0cc Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 29 Apr 2020 12:09:07 -0400
Subject: [PATCH v2 2/2] Minor code cleanup for perform_base_backup().
Merge two calls to sendDir() that are exactly the same except for
the fifth argument. Adjust comments to match.
This seems a little bit clearer than the old way.
Patch by me, reviewed by Amit Kapila.
Discussion: http://postgr.es/m/CA+TgmoYq+59SJ2zBbP891ngWPA9fymOqntqYcweSDYXS2a620A@mail.gmail.com
---
src/backend/replication/basebackup.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 98be1b854b..d9afcc0b90 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -424,25 +424,23 @@ perform_base_backup(basebackup_options *opt)
if (ti->path == NULL)
{
struct stat statbuf;
+ bool sendtblspclinks = true;
/* In the main tar, include the backup_label first... */
sendFileWithContent(BACKUP_LABEL_FILE, labelfile->data,
&manifest);
- /*
- * Send tablespace_map file if required and then the bulk of
- * the files.
- */
+ /* Then the tablespace_map file, if required... */
if (tblspc_map_file && opt->sendtblspcmapfile)
{
sendFileWithContent(TABLESPACE_MAP, tblspc_map_file->data,
&manifest);
- sendDir(".", 1, false, tablespaces, false,
- &manifest, NULL);
+ sendtblspclinks = false;
}
- else
- sendDir(".", 1, false, tablespaces, true,
- &manifest, NULL);
+
+ /* Then the bulk of the files... */
+ sendDir(".", 1, false, tablespaces, sendtblspclinks,
+ &manifest, NULL);
/* ... and pg_control after everything else. */
if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--
2.24.2 (Apple Git-127)
On Wed, May 6, 2020 at 11:15 AM Robert Haas <robertmhaas@gmail.com> wrote:
Oh, good point. v2 attached.
Here's v3, with one more small cleanup. I noticed tblspc_map_file is
initialized to NULL and then unconditionally reset to the return value
of makeStringInfo(), and then later tested to see whether it is NULL.
It can't be, because makeStringInfo() doesn't return NULL. So the
attached version deletes the superfluous initialization and the
superfluous test.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v3-0001-Don-t-export-basebackup.c-s-sendTablespace.patchapplication/octet-stream; name=v3-0001-Don-t-export-basebackup.c-s-sendTablespace.patchDownload
From a730ccb616e5a1f206667744c94dae82233cad33 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 29 Apr 2020 11:57:56 -0400
Subject: [PATCH v3 1/2] Don't export basebackup.c's sendTablespace().
Commit 72d422a5227ef6f76f412486a395aba9f53bf3f0 made xlog.c call
sendTablespace() with the 'sizeonly' argument set to true, which
required basebackup.c to export sendTablespace(). However, that's
kind of ugly, so instead defer the call to sendTablespace() until
basebackup.c regains control. That way, it can still be a static
function.
Patch by me, reviewed by Amit Kapila.
Discussion: http://postgr.es/m/CA+TgmoYq+59SJ2zBbP891ngWPA9fymOqntqYcweSDYXS2a620A@mail.gmail.com
---
src/backend/access/transam/xlog.c | 14 ++------------
src/backend/access/transam/xlogfuncs.c | 4 ++--
src/backend/replication/basebackup.c | 21 ++++++++++++++-------
src/include/access/xlog.h | 2 +-
src/include/replication/basebackup.h | 6 ------
5 files changed, 19 insertions(+), 28 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0d3d670928..85eabbceb0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10468,8 +10468,7 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
XLogRecPtr
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
StringInfo labelfile, List **tablespaces,
- StringInfo tblspcmapfile, bool infotbssize,
- bool needtblspcmapfile)
+ StringInfo tblspcmapfile, bool needtblspcmapfile)
{
bool exclusive = (labelfile == NULL);
bool backup_started_in_recovery = false;
@@ -10689,14 +10688,6 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
datadirpathlen = strlen(DataDir);
- /*
- * Report that we are now estimating the total backup size
- * if we're streaming base backup as requested by pg_basebackup
- */
- if (tablespaces)
- pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
- PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE);
-
/* Collect information about all tablespaces */
tblspcdir = AllocateDir("pg_tblspc");
while ((de = ReadDir(tblspcdir, "pg_tblspc")) != NULL)
@@ -10761,8 +10752,7 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
ti->oid = pstrdup(de->d_name);
ti->path = pstrdup(buflinkpath.data);
ti->rpath = relpath ? pstrdup(relpath) : NULL;
- ti->size = infotbssize ?
- sendTablespace(fullpath, ti->oid, true, NULL) : -1;
+ ti->size = -1;
if (tablespaces)
*tablespaces = lappend(*tablespaces, ti);
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 00e1b33ed5..290658b22c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -76,7 +76,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
if (exclusive)
{
startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL,
- NULL, NULL, false, true);
+ NULL, NULL, true);
}
else
{
@@ -94,7 +94,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
register_persistent_abort_backup_handler();
startpoint = do_pg_start_backup(backupidstr, fast, NULL, label_file,
- NULL, tblspc_map_file, false, true);
+ NULL, tblspc_map_file, true);
}
PG_RETURN_LSN(startpoint);
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index fbdc28ec39..98be1b854b 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -58,6 +58,8 @@ typedef struct
pg_checksum_type manifest_checksum_type;
} basebackup_options;
+static int64 sendTablespace(char *path, char *oid, bool sizeonly,
+ struct backup_manifest_info *manifest);
static int64 sendDir(const char *path, int basepathlen, bool sizeonly,
List *tablespaces, bool sendtblspclinks,
backup_manifest_info *manifest, const char *spcoid);
@@ -307,8 +309,7 @@ perform_base_backup(basebackup_options *opt)
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT);
startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
labelfile, &tablespaces,
- tblspc_map_file,
- opt->progress, opt->sendtblspcmapfile);
+ tblspc_map_file, opt->sendtblspcmapfile);
/*
* Once do_pg_start_backup has been called, ensure that any failure causes
@@ -337,10 +338,7 @@ perform_base_backup(basebackup_options *opt)
/* Add a node for the base directory at the end */
ti = palloc0(sizeof(tablespaceinfo));
- if (opt->progress)
- ti->size = sendDir(".", 1, true, tablespaces, true, NULL, NULL);
- else
- ti->size = -1;
+ ti->size = -1;
tablespaces = lappend(tablespaces, ti);
/*
@@ -349,10 +347,19 @@ perform_base_backup(basebackup_options *opt)
*/
if (opt->progress)
{
+ pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
+ PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE);
+
foreach(lc, tablespaces)
{
tablespaceinfo *tmp = (tablespaceinfo *) lfirst(lc);
+ if (tmp->path == NULL)
+ tmp->size = sendDir(".", 1, true, tablespaces, true, NULL,
+ NULL);
+ else
+ tmp->size = sendTablespace(tmp->path, tmp->oid, true,
+ NULL);
backup_total += tmp->size;
}
}
@@ -1141,7 +1148,7 @@ sendFileWithContent(const char *filename, const char *content,
*
* Only used to send auxiliary tablespaces, not PGDATA.
*/
-int64
+static int64
sendTablespace(char *path, char *spcoid, bool sizeonly,
backup_manifest_info *manifest)
{
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index e917dfe92d..347a38f57c 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -372,7 +372,7 @@ typedef enum SessionBackupState
extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
TimeLineID *starttli_p, StringInfo labelfile,
- List **tablespaces, StringInfo tblspcmapfile, bool infotbssize,
+ List **tablespaces, StringInfo tblspcmapfile,
bool needtblspcmapfile);
extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
TimeLineID *stoptli_p);
diff --git a/src/include/replication/basebackup.h b/src/include/replication/basebackup.h
index 923a651cac..f5f044dacd 100644
--- a/src/include/replication/basebackup.h
+++ b/src/include/replication/basebackup.h
@@ -14,9 +14,6 @@
#include "nodes/replnodes.h"
-struct backup_manifest_info; /* avoid including backup_manifest.h */
-
-
/*
* Minimum and maximum values of MAX_RATE option in BASE_BACKUP command.
*/
@@ -33,7 +30,4 @@ typedef struct
extern void SendBaseBackup(BaseBackupCmd *cmd);
-extern int64 sendTablespace(char *path, char *oid, bool sizeonly,
- struct backup_manifest_info *manifest);
-
#endif /* _BASEBACKUP_H */
--
2.24.2 (Apple Git-127)
v3-0002-Minor-code-cleanup-for-perform_base_backup.patchapplication/octet-stream; name=v3-0002-Minor-code-cleanup-for-perform_base_backup.patchDownload
From 10665fd4419b2bb76c2559e0d3e4806ba699246f Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 6 May 2020 17:50:33 -0400
Subject: [PATCH v3 2/2] Minor code cleanup for perform_base_backup().
Merge two calls to sendDir() that are exactly the same except for
the fifth argument. Adjust comments to match.
Also, don't bother checking whether tblspc_map_file is NULL. We
initialize it in all cases, so it can't be.
Patch by me, reviewed by Amit Kapila.
Discussion: http://postgr.es/m/CA+TgmoYq+59SJ2zBbP891ngWPA9fymOqntqYcweSDYXS2a620A@mail.gmail.com
---
src/backend/replication/basebackup.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 98be1b854b..8a491384dd 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -424,25 +424,23 @@ perform_base_backup(basebackup_options *opt)
if (ti->path == NULL)
{
struct stat statbuf;
+ bool sendtblspclinks = true;
/* In the main tar, include the backup_label first... */
sendFileWithContent(BACKUP_LABEL_FILE, labelfile->data,
&manifest);
- /*
- * Send tablespace_map file if required and then the bulk of
- * the files.
- */
- if (tblspc_map_file && opt->sendtblspcmapfile)
+ /* Then the tablespace_map file, if required... */
+ if (opt->sendtblspcmapfile)
{
sendFileWithContent(TABLESPACE_MAP, tblspc_map_file->data,
&manifest);
- sendDir(".", 1, false, tablespaces, false,
- &manifest, NULL);
+ sendtblspclinks = false;
}
- else
- sendDir(".", 1, false, tablespaces, true,
- &manifest, NULL);
+
+ /* Then the bulk of the files... */
+ sendDir(".", 1, false, tablespaces, sendtblspclinks,
+ &manifest, NULL);
/* ... and pg_control after everything else. */
if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
--
2.24.2 (Apple Git-127)
On Thu, May 7, 2020 at 9:44 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, May 6, 2020 at 11:15 AM Robert Haas <robertmhaas@gmail.com> wrote:
Oh, good point. v2 attached.
While looking at this, I noticed that caller (perform_base_backup) of
do_pg_start_backup, sets the backup phase as
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT whereas, in
do_pg_start_backup, we do collect the information about all
tablespaces after the checkpoint. I am not sure if it is long enough
that we consider having a separate phase for it. Without your patch,
it was covered under PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE
phase which doesn't appear to be a bad idea.
Here's v3, with one more small cleanup. I noticed tblspc_map_file is
initialized to NULL and then unconditionally reset to the return value
of makeStringInfo(), and then later tested to see whether it is NULL.
It can't be, because makeStringInfo() doesn't return NULL. So the
attached version deletes the superfluous initialization and the
superfluous test.
This change looks fine to me.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Tue, May 12, 2020 at 2:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
While looking at this, I noticed that caller (perform_base_backup) of
do_pg_start_backup, sets the backup phase as
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT whereas, in
do_pg_start_backup, we do collect the information about all
tablespaces after the checkpoint. I am not sure if it is long enough
that we consider having a separate phase for it. Without your patch,
it was covered under PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE
phase which doesn't appear to be a bad idea.
Maybe I'm confused here, but I think the size estimation still *is*
covered under PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE. It's
just that now that happens a bit later. I'm assuming that listing the
tablespaces is pretty cheap, but sizing them is expensive, as you'd
have to iterate over all the files and stat() each one.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, May 13, 2020 at 1:54 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, May 12, 2020 at 2:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
While looking at this, I noticed that caller (perform_base_backup) of
do_pg_start_backup, sets the backup phase as
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT whereas, in
do_pg_start_backup, we do collect the information about all
tablespaces after the checkpoint. I am not sure if it is long enough
that we consider having a separate phase for it. Without your patch,
it was covered under PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE
phase which doesn't appear to be a bad idea.Maybe I'm confused here, but I think the size estimation still *is*
covered under PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE. It's
just that now that happens a bit later.
There is no problem with this part.
I'm assuming that listing the
tablespaces is pretty cheap, but sizing them is expensive, as you'd
have to iterate over all the files and stat() each one.
I was trying to say that tablespace listing will happen under
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT phase which could be a
problem if it is a costly operation but as you said it is pretty cheap
so I think we don't need to bother about that.
Apart from the above point which I think we don't need to bother, both
your patches look good to me.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Tue, May 12, 2020 at 10:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I was trying to say that tablespace listing will happen under
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT phase which could be a
problem if it is a costly operation but as you said it is pretty cheap
so I think we don't need to bother about that.Apart from the above point which I think we don't need to bother, both
your patches look good to me.
OK, good. Let's see if anyone else feels differently about this issue
or wants to raise anything else. If not, I'll plan to commit these
patches after we branch. Thanks for the review.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
At Wed, 13 May 2020 15:10:30 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
On Tue, May 12, 2020 at 10:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I was trying to say that tablespace listing will happen under
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT phase which could be a
problem if it is a costly operation but as you said it is pretty cheap
so I think we don't need to bother about that.Apart from the above point which I think we don't need to bother, both
your patches look good to me.OK, good. Let's see if anyone else feels differently about this issue
or wants to raise anything else. If not, I'll plan to commit these
patches after we branch. Thanks for the review.
Table space listing needs only one or few 512k pages, which should be
on OS file cache, which cannot take long time unless the system is
facing a severe trouble. (I believe that is the same on Windows.)
I'm fine that WAIT_CHECKPOINT contains the time to enumerate
tablespace directories.
0001 looks good to me. The progress information gets
About 0002,
+ bool sendtblspclinks = true;
The boolean seems to me useless since it is always the inverse of
opt->sendtblspcmapfile when it is used.
Everything looks fine to me except the above.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, May 13, 2020 at 10:43 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
About 0002,
+ bool sendtblspclinks = true;
The boolean seems to me useless since it is always the inverse of
opt->sendtblspcmapfile when it is used.
Well, I think it might have some documentation value, to clarify that
whether or not we send tablespace links is the opposite of whether we
send the tablespace map file.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
At Wed, 27 May 2020 07:57:38 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
On Wed, May 13, 2020 at 10:43 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:About 0002,
+ bool sendtblspclinks = true;
The boolean seems to me useless since it is always the inverse of
opt->sendtblspcmapfile when it is used.Well, I think it might have some documentation value, to clarify that
whether or not we send tablespace links is the opposite of whether we
send the tablespace map file.
That makes sense to me. Thanks!
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, May 27, 2020 at 8:11 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
That makes sense to me. Thanks!
Great. Thanks to you and Amit for reviewing. I have committed the patches.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company