Exclude pg_internal.init from base backup
Hackers,
The cache in pg_internal.init was reused in days of yore but has been
rebuilt on postmaster startup since v8.1. It appears there is no reason
for this file to be backed up.
I also moved the RELCACHE_INIT_FILENAME constant to relcache.h to avoid
duplicating the string.
I'll add this to the 2017-11 CF.
Thanks,
--
-David
david@pgmasters.net
Attachments:
pg_basebackup-exclusion-v1.patch (text/plain)
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 95aeb35507..c3e6c30eba 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1130,6 +1130,12 @@ SELECT pg_stop_backup();
</para>
<para>
+ The <filename>pg_internal.init</filename> file can be omitted from the
+ backup no matter what directory it appears in. This file contains a
+ relation cache that is always rebuilt on startup.
+ </para>
+
+ <para>
The backup label
file includes the label string you gave to <function>pg_start_backup</>,
as well as the time at which <function>pg_start_backup</> was run, and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 2bb4e38a9d..46748bfad0 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2384,6 +2384,11 @@ The commands accepted in walsender mode are:
</listitem>
<listitem>
<para>
+ <filename>pg_internal.init</>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
Various temporary files and directories created during the operation
of the PostgreSQL server, such as any file or directory beginning
with <filename>pgsql_tmp</>.
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 9776858f03..2518774fb7 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -36,6 +36,7 @@
#include "utils/builtins.h"
#include "utils/elog.h"
#include "utils/ps_status.h"
+#include "utils/relcache.h"
#include "utils/timestamp.h"
@@ -151,6 +152,9 @@ static const char *excludeFiles[] =
/* Skip current log file temporary file */
LOG_METAINFO_DATAFILE_TMP,
+ /* Skip relation cache because it is rebuilt on startup */
+ RELCACHE_INIT_FILENAME,
+
/*
* If there's a backup_label or tablespace_map file, it belongs to a
* backup started by the user with pg_start_backup(). It is *not* correct
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index b8e37809b0..5015719915 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -87,11 +87,6 @@
#include "utils/tqual.h"
-/*
- * name of relcache init file(s), used to speed up backend startup
- */
-#define RELCACHE_INIT_FILENAME "pg_internal.init"
-
#define RELCACHE_INIT_FILEMAGIC 0x573266 /* version ID value */
/*
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index a00f7b0e1a..d95ea3e0d5 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@ use Cwd;
use Config;
use PostgresNode;
use TestLib;
-use Test::More tests => 72;
+use Test::More tests => 73;
program_help_ok('pg_basebackup');
program_version_ok('pg_basebackup');
@@ -61,6 +61,11 @@ foreach my $filename (
close $file;
}
+# Connect to a database to create global/pg_internal.init. If this is removed
+# the test to ensure global/pg_internal.init is not copied will return a false
+# positive.
+$node->safe_psql('postgres', 'SELECT 1;');
+
$node->command_ok([ 'pg_basebackup', '-D', "$tempdir/backup", '-X', 'none' ],
'pg_basebackup runs');
ok(-f "$tempdir/backup/PG_VERSION", 'backup was created');
@@ -84,7 +89,8 @@ foreach my $dirname (
# These files should not be copied.
foreach my $filename (
- qw(postgresql.auto.conf.tmp postmaster.opts postmaster.pid tablespace_map current_logfiles.tmp)
+ qw(postgresql.auto.conf.tmp postmaster.opts postmaster.pid tablespace_map current_logfiles.tmp
+ global/pg_internal.init)
)
{
ok(!-f "$tempdir/backup/$filename", "$filename not copied");
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 3c53cefe4b..29c6d9bae3 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -18,6 +18,11 @@
#include "nodes/bitmapset.h"
+/*
+ * Name of relcache init file(s), used to speed up backend startup
+ */
+#define RELCACHE_INIT_FILENAME "pg_internal.init"
+
typedef struct RelationData *Relation;
/* ----------------
Hi,
On 02/09/17 21:08, David Steele wrote:
Hackers,
The cache in pg_internal.init was reused in days of yore but has been
rebuilt on postmaster startup since v8.1. It appears there is no reason
for this file to be backed up.
Makes sense.
I also moved the RELCACHE_INIT_FILENAME constant to relcache.h to avoid
duplicating the string.
+1
+++ b/doc/src/sgml/protocol.sgml
@@ -2384,6 +2384,11 @@ The commands accepted in walsender mode are:
</listitem>
<listitem>
<para>
+ <filename>pg_internal.init</>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
Not a problem specific to this patch, but I wonder if it should be made
clearer that those files (there are a couple above the one you added)
are skipped no matter which directory they reside in.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Nov 3, 2017 at 4:04 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Not a problem specific to this patch, but I wonder if it should be made
clearer that those files (there are a couple above the one you added)
are skipped no matter which directory they reside in.
Agreed, it is a good idea to describe in the docs how this behaves. We
could always change things so that the comparison is based on the full
path, as is done for pg_control, but that does not seem worth
complicating the code.
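To illustrate the distinction discussed here, the following is a minimal sketch, not the actual basebackup.c code. It assumes paths are relative and '/'-separated. The names exclude_names, last_component, excluded_by_name and matches_exact_path are hypothetical and exist only for this example. The name-based check looks at the final path component only, so a listed file is skipped in whichever directory it appears, while the path-based check matches a single location.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of names skipped wherever they appear (hypothetical list). */
static const char *const exclude_names[] = {
    "postmaster.pid",
    "postmaster.opts",
    "pg_internal.init",
    NULL
};

/* Return the final component of a '/'-separated relative path. */
const char *
last_component(const char *path)
{
    const char *slash = strrchr(path, '/');

    return slash ? slash + 1 : path;
}

/* Name-based check: true if the final path component is in exclude_names. */
bool
excluded_by_name(const char *path)
{
    assert(path != NULL);

    const char *name = last_component(path);

    for (size_t i = 0; exclude_names[i] != NULL; i++)
    {
        if (strcmp(name, exclude_names[i]) == 0)
            return true;
    }
    return false;
}

/* Path-based check: true only when the whole path equals one exact location. */
bool
matches_exact_path(const char *path, const char *exact)
{
    assert(path != NULL && exact != NULL);
    return strcmp(path, exact) == 0;
}
```

With this sketch, excluded_by_name() is true for both "global/pg_internal.init" and "base/1/pg_internal.init", whereas matches_exact_path() with "./global/pg_control" is true for that one location only. An external tool that wants to mirror the name-based behavior would need the first kind of check.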
--
Michael
On Sat, Nov 4, 2017 at 4:04 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Fri, Nov 3, 2017 at 4:04 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Not a problem specific to this patch, but I wonder if it should be made
clearer that those files (there are a couple above the one you added)
are skipped no matter which directory they reside in.
Agreed, it is a good idea to describe in the docs how this behaves. We
could always change things so that the comparison is based on the full
path, as is done for pg_control, but that does not seem worth
complicating the code.
pg_internal.init files can, and do, appear in multiple different directories.
pg_control is always in the same place. So they're not the same thing.
So +1 for documenting the difference in how these are handled, as this is
important to know for somebody writing an external tool for it.
It also seems the list in the documentation is not in sync with the code.
AFAICT the docs do not mention the current_logfile. This seems to be an
omission in 19dc233c32f?
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On 5 November 2017 at 11:55, Magnus Hagander <magnus@hagander.net> wrote:
On Sat, Nov 4, 2017 at 4:04 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Fri, Nov 3, 2017 at 4:04 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Not a problem specific to this patch, but I wonder if it should be made
clearer that those files (there are a couple above the one you added)
are skipped no matter which directory they reside in.
Agreed, it is a good idea to describe in the docs how this behaves. We
could always change things so that the comparison is based on the full
path, as is done for pg_control, but that does not seem worth
complicating the code.
pg_internal.init files can, and do, appear in multiple different directories.
pg_control is always in the same place. So they're not the same thing.
So +1 for documenting the difference in how these are handled, as this is
important to know for somebody writing an external tool for it.
Changes made, moving to commit the attached patch.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
pg_basebackup-exclusion-v2.patch (application/octet-stream)
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index dd9c1bff5b..fa3a701631 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1130,6 +1130,12 @@ SELECT pg_stop_backup();
</para>
<para>
+ <filename>pg_internal.init</filename> files can be omitted from the
+ backup whenever a file of that name is found. These files contain
+ relation cache data that is always rebuilt when recovering.
+ </para>
+
+ <para>
The backup label
file includes the label string you gave to <function>pg_start_backup</function>,
as well as the time at which <function>pg_start_backup</function> was run, and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 15108baf71..f82affd0c5 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2501,6 +2501,11 @@ The commands accepted in walsender mode are:
</listitem>
<listitem>
<para>
+ <filename>pg_internal.init (found in multiple directories)</>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
Various temporary files and directories created during the operation
of the PostgreSQL server, such as any file or directory beginning
with <filename>pgsql_tmp</filename>.
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 75029b0def..1411c14e92 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -36,6 +36,7 @@
#include "utils/builtins.h"
#include "utils/elog.h"
#include "utils/ps_status.h"
+#include "utils/relcache.h"
#include "utils/timestamp.h"
@@ -151,6 +152,9 @@ static const char *excludeFiles[] =
/* Skip current log file temporary file */
LOG_METAINFO_DATAFILE_TMP,
+ /* Skip relation cache because it is rebuilt on startup */
+ RELCACHE_INIT_FILENAME,
+
/*
* If there's a backup_label or tablespace_map file, it belongs to a
* backup started by the user with pg_start_backup(). It is *not* correct
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index b8e37809b0..5015719915 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -87,11 +87,6 @@
#include "utils/tqual.h"
-/*
- * name of relcache init file(s), used to speed up backend startup
- */
-#define RELCACHE_INIT_FILENAME "pg_internal.init"
-
#define RELCACHE_INIT_FILEMAGIC 0x573266 /* version ID value */
/*
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 6a8be09f4c..cdf4f5be37 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@ use Cwd;
use Config;
use PostgresNode;
use TestLib;
-use Test::More tests => 78;
+use Test::More tests => 79;
program_help_ok('pg_basebackup');
program_version_ok('pg_basebackup');
@@ -61,6 +61,11 @@ foreach my $filename (
close $file;
}
+# Connect to a database to create global/pg_internal.init. If this is removed
+# the test to ensure global/pg_internal.init is not copied will return a false
+# positive.
+$node->safe_psql('postgres', 'SELECT 1;');
+
$node->command_ok([ 'pg_basebackup', '-D', "$tempdir/backup", '-X', 'none' ],
'pg_basebackup runs');
ok(-f "$tempdir/backup/PG_VERSION", 'backup was created');
@@ -84,7 +89,8 @@ foreach my $dirname (
# These files should not be copied.
foreach my $filename (
- qw(postgresql.auto.conf.tmp postmaster.opts postmaster.pid tablespace_map current_logfiles.tmp)
+ qw(postgresql.auto.conf.tmp postmaster.opts postmaster.pid tablespace_map current_logfiles.tmp
+ global/pg_internal.init)
)
{
ok(!-f "$tempdir/backup/$filename", "$filename not copied");
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 3c53cefe4b..29c6d9bae3 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -18,6 +18,11 @@
#include "nodes/bitmapset.h"
+/*
+ * Name of relcache init file(s), used to speed up backend startup
+ */
+#define RELCACHE_INIT_FILENAME "pg_internal.init"
+
typedef struct RelationData *Relation;
/* ----------------
On 11/7/17 11:03 AM, Simon Riggs wrote:
On 5 November 2017 at 11:55, Magnus Hagander <magnus@hagander.net> wrote:
So +1 for documenting the difference in how these are handled, as this is
important to know for somebody writing an external tool for it.
Changes made, moving to commit the attached patch.
Thanks, Simon! This was on my to do list today -- glad I checked my
email first.
--
-David
david@pgmasters.net
On Wed, Nov 8, 2017 at 1:42 AM, David Steele <david@pgmasters.net> wrote:
On 11/7/17 11:03 AM, Simon Riggs wrote:
On 5 November 2017 at 11:55, Magnus Hagander <magnus@hagander.net> wrote:
So +1 for documenting the difference in how these are handled, as this is
important to know for somebody writing an external tool for it.
Changes made, moving to commit the attached patch.
Thanks, Simon! This was on my to do list today -- glad I checked my
email first.
<para>
+ <filename>pg_internal.init</filename> files can be omitted from the
+ backup whenever a file of that name is found. These files contain
+ relation cache data that is always rebuilt when recovering.
+ </para>
Do we want to mention in the docs that the same decision-making is
done for *all* files with matching names, i.e. that a listed file is
skipped even if it is found in a sub-folder? postmaster.opts and
similar friends are unlikely to be found anywhere but the root of
the data folder. Still, the upthread argument for documenting precisely
what basebackup.c does sounded rather convincing to me.
--
Michael
On Wed, Nov 8, 2017 at 3:03 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 5 November 2017 at 11:55, Magnus Hagander <magnus@hagander.net> wrote:
On Sat, Nov 4, 2017 at 4:04 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Fri, Nov 3, 2017 at 4:04 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
Not a problem specific to this patch, but I wonder if it should be made
clearer that those files (there are a couple above the one you added)
are skipped no matter which directory they reside in.
Agreed, it is a good idea to describe in the docs how this behaves. We
could always change things so that the comparison is based on the full
path, as is done for pg_control, but that does not seem worth
complicating the code.
pg_internal.init files can, and do, appear in multiple different directories.
pg_control is always in the same place. So they're not the same thing.
So +1 for documenting the difference in how these are handled, as this is
important to know for somebody writing an external tool for it.
Changes made, moving to commit the attached patch.
Commit 98267e left an empty SGML end tag in place; the attached patch
fixes it.
Regards,
Hari Babu
Fujitsu Australia
Attachments:
sgml_empty_tag_fix.patch (application/octet-stream)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index f82affd0c5..b587ef71c1 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2501,7 +2501,7 @@ The commands accepted in walsender mode are:
</listitem>
<listitem>
<para>
- <filename>pg_internal.init (found in multiple directories)</>
+ <filename>pg_internal.init (found in multiple directories)</filename>
</para>
</listitem>
<listitem>
On Wed, Nov 8, 2017 at 9:04 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
Commit 98267e left an empty SGML end tag in place; the attached patch
fixes it.
<listitem>
<para>
- <filename>pg_internal.init (found in multiple directories)</>
+ <filename>pg_internal.init (found in multiple directories)</filename>
</para>
</listitem>
What has been committed in 98267ee and what is proposed here both look
incorrect to me. The markup filename ought to be used only with file
names, so "(found in multiple directories)" should not be within it.
Simon's commit is not wrong with the markup usage by the way.
--
Michael
On Wed, Nov 8, 2017 at 11:11 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Wed, Nov 8, 2017 at 9:04 AM, Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
Commit 98267e left an empty SGML end tag in place; the attached patch
fixes it.
<listitem>
<para>
- <filename>pg_internal.init (found in multiple directories)</>
+ <filename>pg_internal.init (found in multiple directories)</filename>
</para>
</listitem>
What has been committed in 98267ee and what is proposed here both look
incorrect to me. The markup filename ought to be used only with file
names, so "(found in multiple directories)" should not be within it.
Simon's commit is not wrong with the markup usage by the way.
Thanks for the correction. I was not very familiar with SGML markup usage.
While building the documentation, it raises a warning message of "empty
end-tag".
So I just added the end tag. Attached is the updated patch with the suggested
correction.
Regards,
Hari Babu
Fujitsu Australia
Attachments:
sgml_empty_tag_fix_v2.patch (application/octet-stream)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index f82affd0c5..6d4dcf83ac 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2501,7 +2501,7 @@ The commands accepted in walsender mode are:
</listitem>
<listitem>
<para>
- <filename>pg_internal.init (found in multiple directories)</>
+ <filename>pg_internal.init</filename> (found in multiple directories)
</para>
</listitem>
<listitem>
On Wed, Nov 8, 2017 at 9:50 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
Thanks for the correction. I was not very familiar with SGML markup usage.
While building the documentation, it raises a warning message of "empty
end-tag".
So I just added the end tag. Attached is the updated patch with the suggested
correction.
Ah, I can see the warning as well. Using empty tags is forbidden since
c29c5789, which is really recent. Sorry for missing it. Evidently Simon got
caught by that as well. Your patch looks good to me.
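For anyone who wants to catch this before building the docs, here is a rough sketch of a check for the empty end tag. It is not how the documentation build detects the problem. The function name count_empty_end_tags is hypothetical, and the check is a plain substring search that knows nothing about SGML structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count occurrences of the empty end tag "</>" in a NUL-terminated string.
 * This is a plain substring scan, not an SGML parser.
 */
size_t
count_empty_end_tags(const char *text)
{
    assert(text != NULL);

    size_t count = 0;
    const char *p = text;

    while ((p = strstr(p, "</>")) != NULL)
    {
        count++;
        p += 3;                 /* skip past the match just counted */
    }
    return count;
}
```

Run against the line from the committed patch, `<filename>pg_internal.init (found in multiple directories)</>`, this finds one empty end tag; with the end tag spelled out as `</filename>` it finds none.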
--
Michael
On 11/7/17 19:58, Michael Paquier wrote:
On Wed, Nov 8, 2017 at 9:50 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
Thanks for the correction. I was not very familiar with SGML markup usage.
While building the documentation, it raises a warning message of "empty
end-tag".
So I just added the end tag. Attached is the updated patch with the suggested
correction.
Ah, I can see the warning as well. Using empty tags is forbidden since
c29c5789, which is really recent. Sorry for missing it. Evidently Simon got
caught by that as well. Your patch looks good to me.
fixed
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 9, 2017 at 1:03 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 11/7/17 19:58, Michael Paquier wrote:
On Wed, Nov 8, 2017 at 9:50 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
Thanks for the correction. I was not very familiar with SGML markup usage.
While building the documentation, it raises a warning message of "empty
end-tag".
So I just added the end tag. Attached is the updated patch with the suggested
correction.
Ah, I can see the warning as well. Using empty tags is forbidden since
c29c5789, which is really recent. Sorry for missing it. Evidently Simon got
caught by that as well. Your patch looks good to me.
fixed
Thanks.
--
Michael